Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery

by Hongqiu Wang, et al.

Robot-assisted surgery has made significant progress, and instrument segmentation is a critical factor in the quality of surgical intervention. It serves as a building block for surgical robot navigation and surgical education in the next generation of operating intelligence. Although existing methods achieve accurate instrument segmentation, they generate segmentation masks for all instruments simultaneously and cannot single out a specified target object or support interactive use. This work explores a new task, Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the surgical instruments that correspond to a given language expression. To this end, we devise a novel Video-Instrument Synergistic Network (VIS-Net) that learns both video-level and instrument-level knowledge to boost performance, whereas previous work used only video-level information. We also design a Graph-based Relation-aware Module (GRM) that models the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. We are also the first to produce two RSVIS datasets to promote related research. Our method is verified on these datasets, and experimental results show that VIS-Net significantly outperforms existing state-of-the-art referring segmentation methods. Our code and datasets will be released upon publication of this work.




ISINet: An Instance-Based Approach for Surgical Instrument Segmentation

We study the task of semantic segmentation of surgical instruments in ro...

One to Many: Adaptive Instrument Segmentation via Meta Learning and Dynamic Online Adaptation in Robotic Surgical Video

Surgical instrument segmentation in robot-assisted surgery (RAS) - espec...

U-NetPlus: A Modified Encoder-Decoder U-Net Architecture for Semantic and Instance Segmentation of Surgical Instrument

Conventional therapy approaches limit surgeons' dexterity control due to...

Co-Generation and Segmentation for Generalized Surgical Instrument Segmentation on Unlabelled Data

Surgical instrument segmentation for robot-assisted surgery is needed fo...

Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip Segmentation in Robotic Surgeries

Accurate segmentation of surgical instrument tip is an important task fo...

Kinematic Parameter Optimization of a Miniaturized Surgical Instrument Based on Dexterous Workspace Determination

Miniaturized instruments are highly needed for robot assisted medical he...
