Surgical tool segmentation is becoming imperative to provide detailed information during intra-operative execution. These tools can obscure surgeons’ dexterity control due to narrow working space and visual field-of-view, which increases the risk of complications resulting from tissue injuries (e.g. tissue scars and tears). This paper demonstrates a novel application of segmenting and removing surgical instruments from laparoscopic/endoscopic video using digital inpainting algorithms. To segment the surgical instruments, we use a modified U-Net architecture (U-NetPlus) composed of a pre-trained VGG11 or VGG16 encoder and redesigned decoder. The decoder is modified by replacing the transposed convolution operation with an up-sampling operation based on nearest-neighbor (NN) interpolation. This modification removes the artifacts generated by the transposed convolution, and, furthermore, these new interpolation weights require no learning for the upsampling operation. The tool removal algorithms use the tool segmentation mask and either the instrument-free reference frames or previous instrument-containing frames to fill-in (i.e., inpaint) the instrument segmentation mask with the background tissue underneath. We have demonstrated the performance of the proposed surgical tool segmentation/removal algorithms on a robotic instrument dataset from the MICCAI 2015 EndoVis Challenge. We also showed successful performance of the tool removal algorithm from synthetically generated surgical instruments-containing videos obtained by embedding a moving surgical tool into surgical tool-free videos. Our application successfully segments and removes the surgical tool to unveil the background tissue view otherwise obstructed by the tool, producing visually comparable results to the ground truth.
|