Deep learning-based approaches have been highly successful in the categorization of digitized biopsy samples. The common setting in these approaches is to employ convolutional neural networks for the classification of data sets consisting of images that all have the same size. However, clinical practice in breast histopathology necessitates multi-class categorization of regions of interest (ROI) in biopsy samples, where these regions can have arbitrary shapes and sizes. The typical solution to this problem is to aggregate the classification results of fixed-sized patches cropped from these images to obtain image-level classification scores. Another limitation of these approaches is the independent processing of individual patches, so the rich contextual information in the complex tissue structures is not sufficiently exploited. We propose a generic methodology to incorporate local inter-patch context through a graph convolutional network (GCN) that admits a graph-based ROI representation. The proposed GCN model propagates information over neighboring patches in a progressive manner towards classifying the whole ROI into a diagnostic class. Experiments on a challenging data set for a 4-class ROI-level classification task, together with comparisons against several baseline approaches, show that the proposed model, which incorporates spatial context by using graph convolutional layers, performs better than commonly used fusion rules.
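The core operation described above, propagating patch features over a neighborhood graph before classifying the whole ROI, can be sketched with a minimal graph convolutional layer in NumPy. The two-layer depth, feature dimensions, and mean pooling below are illustrative assumptions rather than the exact configuration of the proposed model:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph convolution: symmetrically normalized adjacency with
    self-loops, D^{-1/2} (A + I) D^{-1/2} X W, followed by ReLU."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

def roi_embedding(A, X, weights):
    """Stack GCN layers over the patch graph, then average-pool the patch
    embeddings into a single ROI-level vector (illustrative readout)."""
    H = X
    for W in weights:
        H = gcn_layer(A, H, W)
    return H.mean(axis=0)

# Toy example: 3 patches in a chain, 8-dim CNN features per patch.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
X = rng.normal(size=(3, 8))
Ws = [rng.normal(size=(8, 16)), rng.normal(size=(16, 4))]
emb = roi_embedding(A, X, Ws)  # 4-dim vector, e.g. pre-softmax class scores
```

In practice the adjacency would be built from the spatial grid of patch locations inside the ROI, and the final vector would feed a classifier head.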
We propose a framework for learning feature representations for variable-sized regions of interest (ROIs) in breast histopathology images from patch-level convolutional network properties. The proposed method involves fine-tuning a pre-trained convolutional neural network (CNN) using small fixed-sized patches sampled from the ROIs. The CNN is then used to extract a convolutional feature vector for each patch. The softmax class probabilities of a patch, also obtained from the CNN, are used as weights that are separately applied to the patch's feature vector. The final feature representation of a patch is the concatenation of these class-probability-weighted convolutional feature vectors. Finally, the feature representation of the ROI is computed by average pooling of the feature representations of its associated patches. The resulting ROI representation retains local information from the feature representations of its patches while encoding cues from the class distribution of the patch classification outputs. Experiments show the discriminative power of this representation in a 4-class ROI-level classification task on breast histopathology slides, where our method achieved an accuracy of 66.8% on a data set containing 437 ROIs with different sizes.
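The weighting and pooling steps described above can be sketched directly. The feature and class dimensions in the example are arbitrary, and the inputs are stand-ins for the fine-tuned CNN's actual feature vectors and softmax probabilities:

```python
import numpy as np

def patch_representation(feature, probs):
    """Concatenate class-probability-weighted copies of the patch feature
    vector: [p_1 * f, p_2 * f, ..., p_C * f], giving a C*d-dim vector."""
    return np.concatenate([p * feature for p in probs])

def roi_representation(features, probs_list):
    """Average-pool the weighted patch representations over all patches
    belonging to the ROI."""
    reps = [patch_representation(f, p) for f, p in zip(features, probs_list)]
    return np.mean(reps, axis=0)

# Toy example: one 2-dim feature vector with 2-class softmax output.
f = np.array([1.0, 2.0])
p = np.array([0.25, 0.75])
rep = roi_representation([f, f], [p, p])  # [0.25, 0.5, 0.75, 1.5]
```

The concatenation keeps per-class copies of the features separate, so the pooled ROI vector reflects both the patch appearance and the distribution of patch-level class decisions.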
Digitization of full biopsy slides using whole slide imaging technology has provided new opportunities for understanding the diagnostic process of pathologists and for developing more accurate computer-aided diagnosis systems. However, whole slide images also pose two new challenges for image analysis algorithms. The first is the need for simultaneous localization and classification of malignant areas in these large images, as different parts of the image may have different levels of diagnostic relevance. The second is the uncertainty regarding the correspondence between particular image areas and the diagnostic labels typically provided by the pathologists at the slide level. In this paper, we exploit a data set that consists of recorded actions of pathologists while they were interpreting whole slide images of breast biopsies to address these challenges. First, we extract candidate regions of interest (ROI) from the logs of pathologists' image screenings based on different actions corresponding to zoom events, panning motions, and fixations. Then, we model these ROIs using color and texture features. Next, we represent each slide as a bag of instances corresponding to the collection of candidate ROIs, and a set of slide-level labels extracted from the forms that the pathologists filled out according to what they saw during their screenings. Finally, we build classifiers using five different multi-instance multi-label learning algorithms, and evaluate their performances under different learning and validation scenarios involving various combinations of data from three expert pathologists.
Experiments that compared the slide-level predictions of the classifiers with the reference data showed average precision values up to 62% when the training and validation data came from the same individual pathologist's viewing logs, and an average precision of 64% was obtained when the candidate ROIs and the labels from all pathologists were combined for each slide.
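As one hedged illustration of the multi-instance multi-label setting described above, a slide can be treated as a bag of candidate-ROI feature vectors and scored per label under the standard multi-instance "max" assumption. The prototype-based scoring, label names, and threshold here are hypothetical simplifications for illustration, not one of the five MIML algorithms evaluated in the paper:

```python
import numpy as np

def bag_label_scores(instances, prototypes):
    """Per-label score for a bag: the best (maximum) similarity between any
    candidate ROI in the bag and that label's prototype vector. Similarity
    is negative Euclidean distance, a deliberately simple choice."""
    return {label: max(-np.linalg.norm(x - proto) for x in instances)
            for label, proto in prototypes.items()}

def predict_labels(instances, prototypes, threshold=-1.0):
    """Assign every label whose best-matching instance scores above the
    threshold; a slide can thus receive multiple diagnostic labels."""
    scores = bag_label_scores(instances, prototypes)
    return {lab for lab, s in scores.items() if s >= threshold}

# Toy example with two hypothetical labels and 2-dim ROI features.
prototypes = {"atypia": np.array([0.0, 0.0]), "dcis": np.array([5.0, 5.0])}
bag = [np.array([0.1, 0.0]), np.array([5.0, 4.9])]
labels = predict_labels(bag, prototypes)  # both labels fire for this bag
```

The point of the sketch is the bag structure: each ROI votes independently, and the slide-level label set aggregates the strongest per-label evidence.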
High spectral and high spatial resolution images acquired from new generation satellites have enabled new applications. However, the increasing amount of detail in these images also necessitates new algorithms for automatic analysis. This paper describes a new approach to discover compound structures such as different types of residential, commercial, and industrial areas that are composed of spatial arrangements of primitive objects such as buildings, roads, and trees. The proposed approach uses a robust Gaussian mixture model (GMM) where each Gaussian component models the spectral and shape content of a group of pixels corresponding to a primitive object. The algorithm can also incorporate spatial constraints on the layout of the primitive objects in terms of their relative positions. Given example structures of interest, a new learning algorithm fits a GMM to the image data, and this model can be used to detect other similar structures by grouping pixels that have high likelihoods of belonging to the Gaussian object models while satisfying the spatial layout constraints, without any requirement for region segmentation. Experiments using WorldView-2 data show that the proposed method can detect high-level structures that cannot be modeled using traditional techniques.
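The pixel-grouping step can be sketched as thresholding per-pixel likelihoods under the learned Gaussian object models. The component parameters and threshold below are illustrative, and the spatial layout constraints are omitted for brevity:

```python
import numpy as np

def gaussian_loglik(X, mean, cov):
    """Log-likelihood of each row of X under a multivariate Gaussian."""
    d = X.shape[1]
    diff = X - mean
    maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
    return -0.5 * (maha + d * np.log(2 * np.pi)
                   + np.log(np.linalg.det(cov)))

def detect_pixels(X, components, threshold):
    """Assign a pixel to a primitive-object component when its likelihood
    under the best-matching Gaussian exceeds the detection threshold;
    returns a detection mask and the index of the best component."""
    logliks = np.stack([gaussian_loglik(X, m, c) for m, c in components])
    return logliks.max(axis=0) > threshold, logliks.argmax(axis=0)

# Toy example: two object models in a 2-dim feature space.
comps = [(np.zeros(2), np.eye(2)), (np.full(2, 10.0), np.eye(2))]
X = np.array([[0.0, 0.0], [10.0, 10.0], [50.0, 50.0]])
mask, labels = detect_pixels(X, comps, threshold=-5.0)
```

In the full method each component would jointly model spectral values and shape cues, and detections would additionally have to satisfy the relative-position constraints.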
We describe a system for interactive classification and retrieval of microscopic tissue images. Our system models tissues at the pixel, region, and image levels. Pixel-level features are generated using unsupervised clustering of color and texture values. Region-level features include shape information and statistics of pixel-level feature values. Image-level features include statistics and spatial relationships of regions. To reduce the gap between low-level features and high-level expert knowledge, we define the concept of prototype regions. The system learns the prototype regions in an image collection using model-based clustering and density estimation. Different tissue types are modeled using spatial relationships of these regions. Spatial relationships are represented by fuzzy membership functions. The system automatically selects significant relationships from training data and builds models that can also be updated using user relevance feedback. A Bayesian framework is used to classify tissues based on these models. Preliminary experiments show that the spatial relationship models we developed provide a flexible and powerful framework for the classification and retrieval of tissue images.
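As a minimal sketch of a fuzzy spatial relationship, a membership function for "right of" can be computed from the angle between region centroids. The cos² form and centroid-based inputs are assumptions for illustration, not the exact membership functions used in the system:

```python
import numpy as np

def fuzzy_right_of(centroid_a, centroid_b):
    """Membership degree in [0, 1] for 'region A is to the right of region B',
    computed from the angle of the centroid-to-centroid vector: the degree
    peaks at 1 when the angle is exactly 0 (due east) and falls to 0 as the
    direction rotates past +/- 90 degrees."""
    dx, dy = np.asarray(centroid_a) - np.asarray(centroid_b)
    theta = np.arctan2(dy, dx)
    return np.cos(theta) ** 2 if abs(theta) <= np.pi / 2 else 0.0
```

Analogous functions for "above", "near", and similar relations would give graded truth values that a Bayesian classifier can combine across region pairs.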
We describe a system for interactive training of models for semantic labeling of land cover. The models are built based on three levels of features: 1) pixel-level, 2) region-level, and 3) scene-level features. We developed a Bayesian algorithm and a decision tree algorithm for interactive training. The Bayesian algorithm enables training based on pixel features. Scene-level summaries of pixel features are used for fast retrieval of scenes with high/low content of features and of scenes with low classification confidence. The decision tree algorithm is based on region-level features that are extracted from 1) spectral and textural characteristics of the image, 2) shape descriptors of regions that are created through a segmentation process, and 3) auxiliary information such as elevation data. The initial model can be created based on a database of ground truth and then refined based on feedback supplied by a data analyst who interactively trains the model using the system output and/or additional scenes. The combination of supervised and unsupervised methods provides a more complete exploration of the model space. A user may detect the inadequacy of the model space and add additional features to the model. The graphical tools for the exploration of decision trees allow insight into the interaction of features used in the construction of models. Preliminary experiments show that accurate models can be built in a short time for a variety of land covers. The scalable classification techniques allow for fast searches for a specific label over a large area.