Automated segmentation of tissue and cellular structure in H&E images is an important first step towards automated histopathology slide analysis. For example, nuclei segmentation can aid in detecting pleomorphism, and epithelium segmentation can aid in identifying tumor-infiltrating lymphocytes. Existing deep learning-based approaches are often trained organ-wise and lack the diversity of training data required for multi-organ segmentation networks. In this work, we propose to augment existing nuclei segmentation datasets using CycleGANs. We learn an unpaired mapping from perturbed, randomized polygon masks to pseudo-H&E images and generate synthetic H&E patches from several different organs for nuclei segmentation. We then segment nuclei with an adversarial U-Net that uses spectral normalization for increased training stability. This paired image-to-image translation style network not only learns the mapping from H&E patches to segmentation masks but also learns an optimal loss function. Such an approach eliminates the need for a hand-crafted loss, which has been a significant focus of prior work on nuclei segmentation. We demonstrate that the average accuracy for multi-organ nuclei segmentation increases to 94.43% with the proposed synthetic data generation and adversarial U-Net-based segmentation pipeline, compared to 79.81% when neither synthetic data nor an adversarial loss is used.
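The spectral normalization used to stabilize the adversarial U-Net constrains each weight matrix to have unit spectral norm, with the largest singular value typically estimated by power iteration. A minimal NumPy sketch of this idea (illustrative only, not the authors' implementation; the function name and iteration count are assumptions):

```python
import numpy as np

def spectral_norm(W, n_iters=100, eps=1e-12):
    """Estimate the largest singular value sigma of W via power
    iteration, then return the spectrally normalized weight W / sigma."""
    rng = np.random.default_rng(0)
    u = rng.standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= (np.linalg.norm(v) + eps)  # right singular vector estimate
        u = W @ v
        u /= (np.linalg.norm(u) + eps)  # left singular vector estimate
    sigma = u @ W @ v                   # Rayleigh-quotient estimate of sigma
    return W / sigma, sigma
```

In practice (e.g. in the SN-GAN formulation) a single power-iteration step per training update is usually enough, since the weights change slowly between updates.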
Colorectal cancer is the fourth leading cause of cancer deaths worldwide. The standard of care for detection and prevention is the identification and removal of premalignant lesions through optical colonoscopy, yet more than 60% of colorectal cancer cases are attributed to missed polyps. Current approaches to automated polyp detection are limited by the amount of data available for training, by the underrepresentation of non-polypoid and other inherently difficult-to-label lesions, and by the fact that they do not incorporate information about the topography of the lumen surface. It has been shown that information about the depth and topography of the lumen surface can boost subjective lesion detection. In this work, we add predicted depth information as an additional mode of data when training deep networks for polyp detection, segmentation, and classification. We use conditional GANs to predict depth from monocular endoscopy images and fuse these predicted depth maps with RGB white-light images in feature space. Our empirical analysis demonstrates state-of-the-art results for RGB-D polyp segmentation, with 98% accuracy on four different publicly available datasets. Moreover, we demonstrate an 87.24% accuracy on lesion classification. We also show that our networks can domain-adapt to a variety of data from different sources.
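Fusing predicted depth with RGB imagery in feature space can be as simple as concatenating the two modalities' feature maps along the channel axis and mixing them with a learned 1x1 convolution. A minimal NumPy sketch of such a fusion step (an illustrative assumption about the fusion operator, not the paper's exact network):

```python
import numpy as np

def fuse_rgbd_features(rgb_feat, depth_feat, w_mix):
    """Channel-wise concatenation of RGB and depth feature maps,
    followed by a 1x1 convolution (expressed as a matrix multiply)."""
    # rgb_feat: (C_rgb, H, W), depth_feat: (C_d, H, W)
    # w_mix:    (C_out, C_rgb + C_d) -- the 1x1 conv kernel
    fused = np.concatenate([rgb_feat, depth_feat], axis=0)
    c, h, w = fused.shape
    mixed = w_mix @ fused.reshape(c, h * w)  # 1x1 conv == per-pixel matmul
    return mixed.reshape(w_mix.shape[0], h, w)
```

Because the mixing happens per pixel, the operation preserves spatial resolution while letting the network learn how much weight to give depth versus appearance cues at each location.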
Colorectal cancer is the second leading cause of cancer deaths in the United States and causes over 50,000 deaths annually. The standard of care for colorectal cancer detection and prevention is an optical colonoscopy and polypectomy. However, over 20% of polyps are typically missed during a standard colonoscopy procedure, and 60% of colorectal cancer cases are attributed to these missed polyps. Surface topography plays a vital role in the identification and characterization of lesions, but topographic features often appear subtle to a conventional endoscope. Chromoendoscopy can highlight topographic features of the mucosa and has been shown to improve the lesion detection rate, but it requires dedicated training and increases procedure time. Photometric stereo endoscopy captures this topography but is only qualitative due to the unknown working distances from each point of the mucosa to the endoscope. In this work, we use deep learning to estimate a depth map from an endoscope camera with four alternating light sources. Since endoscopy videos with ground-truth depth maps are challenging to obtain, we generated synthetic data by graphical rendering from an anatomically realistic 3D colon model and a forward model of a virtual endoscope with alternating light sources. We propose an encoder-decoder style deep network in which the encoder is split into four branches of sub-encoder networks that simultaneously extract features from each of the four sources and fuse these feature maps as the network goes deeper. This is complemented by skip connections, which maintain spatial consistency when the features are decoded. We demonstrate that, compared to monocular depth estimation, this setup reduces the average NRMS error for depth estimation by 38% in a silicone colon phantom and by 31% in a pig colon.
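The reported gains are in terms of the normalized RMS error between predicted and ground-truth depth maps. One common convention, sketched here in NumPy, normalizes the RMS error by the ground-truth depth range (the abstract does not specify the exact normalization, so treat this as an assumption):

```python
import numpy as np

def nrms_error(pred_depth, true_depth):
    """RMS error between predicted and ground-truth depth maps,
    normalized by the ground-truth depth range (assumed convention)."""
    rmse = np.sqrt(np.mean((pred_depth - true_depth) ** 2))
    return rmse / (true_depth.max() - true_depth.min())
```

Normalizing by the depth range makes errors comparable across scenes with different working distances, which matters when evaluating on both a silicone phantom and ex vivo tissue.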