Texture analysis applied to second harmonic generation image data for ovarian cancer classification

Abstract. Remodeling of the extracellular matrix has been implicated in ovarian cancer. To quantitate the remodeling, we implement a form of texture analysis to delineate the collagen fibrillar morphology observed in second harmonic generation microscopy images of human normal and high grade malignant ovarian tissues. In the learning stage, a dictionary of “textons”—frequently occurring texture features that are identified by measuring the image response to a filter bank of various shapes, sizes, and orientations—is created. By calculating a representative model based on the texton distribution for each tissue type using a training set of respective second harmonic generation images, we then perform classification between images of normal and high grade malignant ovarian tissues. By optimizing the number of textons and nearest neighbors, we achieved classification accuracy up to 97% based on the area under receiver operating characteristic curves (true positives versus false positives). The local analysis algorithm is a more general method to probe rapidly changing fibrillar morphologies than global analyses such as FFT. It is also more versatile than other texture approaches as the filter bank can be highly tailored to specific applications (e.g., different disease states) by creating customized libraries based on common image features.

Introduction
According to the American Cancer Society, in 2013 about 22,000 new cases of ovarian cancer were diagnosed and about 15,000 women died of the disease. 1 The survival rate for ovarian cancer has not significantly improved over the last two decades. With current screening and diagnostic abilities, about 70% of women with ovarian cancer are diagnosed in later stages, 2,3 leading to a low 5-year survival rate of about 25%. The major problem of current diagnostic methods is the lack of reliable screening/imaging tools to detect early malignancies in the ovary. CA125 is currently the best serum biomarker; however, its sensitivity and specificity are both low. 4 For example, about 20% of women with ovarian cancer do not have elevated CA125. 2 The achievable resolution of clinical modalities (computed tomography, positron emission tomography, ultrasound, magnetic resonance imaging) is limited (only ∼0.5 to 3 mm) and is not sufficient for imaging microscopic disease. This is especially important for ovarian cancer as metastasis can occur during early stages of tumor growth. 3 Because of these limitations in detection and the high mortality rate, there is a compelling need for new technologies that can image ovarian cancers with better resolution and specificity and improve the accuracy of diagnosis and prognosis.
Although traditional pathology focuses on cellular architecture, many recent studies have demonstrated that there is a close correlation between cancer initiation/progression and remodeling of the extracellular matrix (ECM) in the tumor microenvironment (TME). 5-9 For example, changes in collagen composition and morphology in the ECM have been documented for many cancers, including those of the ovary, breast, and colon. 7,10-12 It would then be advantageous to further develop collagen specific microscopic imaging modalities such as second harmonic generation (SHG) imaging microscopy 13 for this purpose.
SHG microscopy has already emerged as a highly sensitive/specific probe of collagen architecture changes in several diseases, including many cancers, 10,11,14-18 connective tissue disorders, 19,20 and fibroses. 21,22 All these diseases are characterized by alterations in collagen density, fibrillar organization, collagen isoform distribution, and combinations thereof. We previously utilized three-dimensional (3-D) imaging in combination with the measurement of bulk optical properties and Monte Carlo simulations to differentiate normal ovarian stroma and high grade serous carcinomas. 10 Collectively, the results indicated an increase in collagen organization at both the fibril (subresolution) and fiber levels of assembly.
Although successful in elucidating detailed structural differences, the method is labor intensive and requires several independent measurements. It would be desirable to have a quantitative, objective measurement that can classify the state (or class) of tissues and is easy to perform. A measure based on fibrillar architecture, i.e., fiber size and organization, is attractive in this regard in general, and particularly for ovarian cancer, because a profound remodeling occurs in the stroma. 10,18 As an example, single SHG optical sections from ex vivo normal stroma (b) and high grade serous cancer (a) tissues are shown in Fig. 1, where these are characterized by shorter mesh-like and longer curvy fibers, respectively. By inspection, we noted that these respective overall patterns are seen throughout tissues of each type and also between patients in each group. 10 This similarity suggests that a reliable image analysis approach could be developed into a system for automated classification of these images. Still, classification is challenging because there are large stochastic variations with no highly defined fiber organization within the image.
Several image processing techniques have been employed for quantitative analysis of the collagen morphology observed in SHG microscopy. The simplest approach is to use segmentation methods. For example, Schanne-Klein used a thresholding process of image segmentation of collagen fibers for scoring fibrosis in a mouse model of kidney disease. 23 Similarly, Tai et al. 24 applied Otsu's segmentation to score liver fibrosis in both mouse and human tissues. However, segmentation is most sensitive to brightness and the collagen area covered in the image and is not as sensitive to fibrillar alignment and organization, which are often more important markers of diseased states. To help alleviate this limitation, several researchers have explored the use of other signal processing concepts. For example, FFT analysis has been used in several studies for analysis of SHG images. 25-28 Although this is simple to implement and has been successful in some cases, it is a global approach, analyzing the frequency components that are present in the entire image. However, perceptually the morphology that often discriminates one type of tissue from another is composed of predominantly rapidly changing "local" features. Other transforms, such as wavelets and their variants, are more powerful for local analysis of the fibrillar morphology within such images. For example, we previously used wavelets to examine the length of sarcomeres in normal and optically cleared skeletal muscle and calculated the entropy as a measure of organization. 29 More recently, we used two-dimensional (2-D) wavelet transforms to delineate normal lung tissue from that diseased with idiopathic pulmonary fibrosis. 30 In a different approach, Keely et al. used curvelets, which are highly sensitive to edges, 31 to delineate tumor boundaries in different stages of breast cancer. 32 Although more applicable than FFT, these transforms, in their simplest implementation, still lack the ability to analyze more random patterns of collagen that are representative of the stroma of most ECM tissues (normal and diseased). For example, 2-D wavelet transforms were not successful in accurately classifying the ovarian tissues studied here (unpublished results).
To solve this problem for ovarian cancer, we utilized a form of texture analysis of SHG images as a classification tool. In computer vision, texture is an image property based on repetitive patterns with slowly varying local statistical properties. Texture analysis has the strong advantage of being insensitive to intensity and not requiring long range orientation (e.g., tens to hundreds of microns). Rather, it probes the environment around small individual regions in the image and, using computer vision, extracts common features that are present. Our implementation applies the method developed by Varma and Zisserman. 33,34 Specifically, we focused on the collagen fiber distribution of the image by convolving filter patches in different directions and scales. Instead of extracting visually apparent features like angular distribution, fiber length, or area covered, as has been more commonly done, we trained on large sets of cancer and normal SHG images by clustering the filter responses within small groups of pixels using statistical methods to find common features among each tissue type. This is an important distinction, as in real tissues it is often difficult to discretize all of the individual fibers, which leads to a loss of information.

Tissues
We conducted an institutional review board-approved study of ex vivo ovarian tissues from 5 normal patients and 5 patients with high-grade serous ovarian cancer from the University of Connecticut Health Center. The diagnoses for all tissues were confirmed by pathological analysis of biopsied tissues. Tissues were fixed in 4% formalin for 24 h, transferred to phosphate buffered saline, and sliced into 100 to 200-μm-thick sections with a vibrating microtome (Vibratome, Buffalo Grove, Illinois).

SHG Imaging
Tissues were imaged by SHG microscopy as previously described. 13 The excitation used 890-nm, 100-fs pulses from a commercial Ti:sapphire oscillator (Mira, Coherent, Santa Clara, California). The SHG laser scanning microscope was a modified Fluoview300 (Olympus, Center Valley, Pennsylvania) mounted on a fixed stage upright stand (Olympus BX61). All imaging was performed with a 40× (0.8 NA) water immersion objective lens with an average power of 20 to 50 mW at the focal plane. To excite all orientations equally, circularly polarized light was used throughout. This was achieved at the focal plane using the combination of a quarter wave plate and a half wave plate as a compensator. 13 The SHG was collected in the forward direction by a 0.9-NA condenser, isolated with a 20-nm bandwidth 445-nm bandpass filter (Semrock, Rochester, New York), and detected by a single photon counting photomultiplier tube module (Hamamatsu 7421, Hamamatsu City, Japan). Images were acquired at 3× zoom with a field-of-view of 170 μm by 170 μm and a field size of 512 by 512 pixels to sample at the Nyquist frequency.
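As a quick arithmetic check of the stated sampling, the pixel pitch follows directly from the field of view and the field size. Under the usual Nyquist criterion of at least two pixels per resolvable distance, this pitch is adequate for lateral features no finer than about twice its value. The lateral resolution itself is not quoted above, so the second figure is an inferred bound and not a measured value.

```latex
\[
\Delta x = \frac{170\ \mu\mathrm{m}}{512\ \text{pixels}} \approx 0.33\ \mu\mathrm{m}\ \text{per pixel},
\qquad
2\,\Delta x \approx 0.66\ \mu\mathrm{m}.
\]
```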

Training Images Selection
Machine learning is required for the texture analysis method to obtain a statistical distribution of repetitive collagen structure patterns. This is acquired from training images of both normal and cancer tissues. For the training image set, we randomly chose single-optical sections that had at least 60% collagen coverage from each image stack. Altogether, there were 1100 selected training images (550 images each from cancer and normal tissues). We normalized the overall image intensity of each optical section to the full 12-bit dynamic range to compensate for any artifacts arising from depth dependent attenuation introduced by scattering within the tissue slice. This also compensates for the increased brightness of the SHG from the tumor specimens relative to the normal tissues. 10
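The selection and normalization steps can be sketched as follows. This is a minimal illustration, not the code used in the study: the min-max stretch is one plausible reading of "normalized to the full 12-bit dynamic range", and the background threshold used to estimate collagen coverage is a hypothetical parameter, since the text does not say how coverage was measured.

```python
import numpy as np

MAX_12BIT = 4095.0   # top of the 12-bit range
MIN_COVERAGE = 0.60  # coverage criterion quoted in the text


def normalize_to_12bit(section):
    """Linearly stretch one optical section to [0, 4095] (assumed min-max scaling)."""
    img = np.asarray(section, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:  # flat image: nothing to stretch
        return np.zeros_like(img)
    return (img - lo) / (hi - lo) * MAX_12BIT


def collagen_coverage(section, threshold):
    """Fraction of pixels brighter than a hypothetical background threshold."""
    return float(np.mean(np.asarray(section) > threshold))


def is_candidate(section, threshold):
    """True if the section meets the 60% coverage criterion."""
    return collagen_coverage(section, threshold) >= MIN_COVERAGE


# Synthetic stand-in for a 512 x 512 SHG optical section.
rng = np.random.default_rng(0)
section = rng.integers(100, 900, size=(512, 512))
norm = normalize_to_12bit(section)
```

Because the stretch is applied per section, it removes both the depth-dependent attenuation and the overall brightness difference between tumor and normal tissue, which is consistent with the stated aim of classifying on morphology alone.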

Filter Selection and Image Model Construction
In the learning stage, we convolved all training images with the rotationally invariant filter bank MR8 [elements are shown in Fig. 2(a)]. This bank has 38 filters. It includes a Gaussian and a Laplacian of Gaussian filter, both of which are rotationally symmetric, as well as edge filters and bar filters at 3 different scales. Both the edge and bar filters are oriented at 6 orientations at each scale. Measuring the maximum response only across these orientations reduces the number of responses from 38 (6 orientations at 3 scales for 2 oriented filters, plus 2 isotropic) to 8 (3 scales for 2 filters, plus 2 isotropic). This provides rotation invariant behavior. 34 This is important as we do not know the orientation of the tumor relative to its point of removal and thus have no fiducial markers for placement on the microscope. Each pixel then generates an eight-dimensional response vector after convolution with the MR8 filter bank.
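The reduction from 38 filter responses to 8 can be illustrated with a short NumPy sketch. Several details here are assumptions that the text does not specify: the scale pairs (1, 3), (2, 6), (4, 12) and sigma = 10 for the isotropic filters are the values commonly quoted for MR8, the zero-mean L1 normalization is one conventional choice, and the FFT-based convolution wraps around at the image border.

```python
import numpy as np

SCALES = ((1, 3), (2, 6), (4, 12))  # assumed (sigma_short, sigma_long) pairs
N_ORIENT = 6                        # orientations per oriented filter
SUPPORT = 49                        # filter support in pixels


def _grid(size):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    return x, y


def _normalise(f):
    """Zero-mean, unit L1 norm (assumed normalization)."""
    f = f - f.mean()
    return f / np.abs(f).sum()


def gauss1d(x, sigma, order):
    """1-D Gaussian (order 0) or its first/second derivative."""
    g = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    if order == 1:
        return -g * x / sigma**2
    if order == 2:
        return g * (x**2 - sigma**2) / sigma**4
    return g


def oriented_filter(sigma_short, sigma_long, order, theta, size=SUPPORT):
    """Edge (order=1) or bar (order=2) filter rotated by theta."""
    x, y = _grid(size)
    u = x * np.cos(theta) + y * np.sin(theta)    # derivative direction
    v = -x * np.sin(theta) + y * np.cos(theta)   # elongated direction
    return _normalise(gauss1d(u, sigma_short, order) * gauss1d(v, sigma_long, 0))


def isotropic_filters(sigma=10.0, size=SUPPORT):
    """Gaussian and Laplacian of Gaussian, both rotationally symmetric."""
    x, y = _grid(size)
    r2 = x**2 + y**2
    g = np.exp(-r2 / (2 * sigma**2))
    log = g * (r2 - 2 * sigma**2) / sigma**4
    return _normalise(g), _normalise(log)


def fft_convolve(image, kernel):
    """Circular convolution via FFT; image must be at least as large as kernel."""
    padded = np.zeros(image.shape, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def mr8_responses(image):
    """Return an H x W x 8 array: 3 edge maxima, 3 bar maxima, Gaussian, LoG."""
    image = np.asarray(image, dtype=float)
    thetas = np.pi * np.arange(N_ORIENT) / N_ORIENT
    channels = []
    for order in (1, 2):                      # edge filters, then bar filters
        for s_short, s_long in SCALES:
            stack = [fft_convolve(image, oriented_filter(s_short, s_long, order, t))
                     for t in thetas]
            channels.append(np.max(stack, axis=0))  # keep max over 6 orientations
    for f in isotropic_filters():
        channels.append(fft_convolve(image, f))
    return np.stack(channels, axis=-1)
```

In total 2 x 3 x 6 + 2 = 38 convolutions are evaluated, but only 8 values per pixel are kept, which is what makes the later clustering step tractable.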
We randomly chose 10,000 pixel responses from each training image [see, e.g., Fig. 2(b)] to keep the computational cost feasible. These were analyzed in small "patches," with each composed of 49 × 49 pixels. The frequency of occurrence of individual patterns in an image (here, the histogram of textons) will then provide a so-called topic model for the image. But since the predominant filter responses (i.e., textons) are not known a priori, the standard approach in machine learning is to group the responses via K-means clustering [Fig. 2(c)]. The resulting cluster centers were then used to build an overall texton dictionary.
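The sampling and clustering step can be sketched with a plain Lloyd-style K-means written in NumPy. This is an illustrative sketch only: the function names are hypothetical, pooling the samples from all training images into a single clustering run is an assumption, and the dense distance matrix used here would need to be replaced by a batched or library implementation for the full data set.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Basic Lloyd iterations; returns the k cluster centers (the textons)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # leave empty clusters where they are
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers


def build_texton_dictionary(responses_per_image, k=40, samples_per_image=10_000, seed=0):
    """Sample pixel responses from each image, pool them, and cluster into k textons."""
    rng = np.random.default_rng(seed)
    sampled = []
    for resp in responses_per_image:
        flat = resp.reshape(-1, resp.shape[-1])          # (pixels, 8)
        n = min(samples_per_image, len(flat))
        sampled.append(flat[rng.choice(len(flat), size=n, replace=False)])
    return kmeans(np.vstack(sampled), k, seed=seed)


# Tiny synthetic demo: three fake 32 x 32 response maps with 8 channels each.
rng = np.random.default_rng(1)
demo_responses = [rng.normal(size=(32, 32, 8)) for _ in range(3)]
textons = build_texton_dictionary(demo_responses, k=4, samples_per_image=200)
```

The number of clusters `k` plays the role of the texton number that is varied later in the classification accuracy results.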
Finally, we built the classification model using the texton distribution histogram obtained for each training image [as represented in Fig. 2(d)]. Through this process, we identified representative structural features in normal and malignant tissues based on prelearned models of the respective SHG images.
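Building the image model from a texton dictionary amounts to labeling each pixel response with its nearest texton and counting the labels. The sketch below assumes Euclidean nearest-texton assignment and normalizes the histogram to unit sum; the normalization is a convenience introduced here, and raw counts would behave equivalently when all images have the same number of pixels.

```python
import numpy as np


def texton_histogram(responses, textons):
    """Normalized histogram of nearest-texton labels for one image.

    responses: H x W x D filter responses; textons: K x D dictionary.
    """
    flat = responses.reshape(-1, responses.shape[-1])
    d2 = ((flat[:, None, :] - textons[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)                      # nearest texton per pixel
    counts = np.bincount(labels, minlength=len(textons)).astype(float)
    return counts / counts.sum()


# Toy example: 2 textons in a 2-D response space, 4 pixels.
toy_textons = np.array([[0.0, 0.0], [10.0, 10.0]])
toy_responses = np.array([[[0.1, 0.0], [0.2, -0.1]],
                          [[9.8, 10.1], [-0.3, 0.2]]])
toy_model = texton_histogram(toy_responses, toy_textons)
```

In the toy example three of the four pixels lie closest to the first texton, so the model is [0.75, 0.25].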

Classification
In the classification stage, we built a model for testing images [Figs. 3(c) and 3(d)] based on the statistical distribution of the histogram of the texton distribution for each, as was performed for the training set [Figs. 3(a) and 3(b)]. Then, we adopted a $\chi^2$ nearest neighbor (NN) classification to determine the identity of the testing image based on the image model. We applied different thresholds for both the cancer and normal images based on the Gaussian weighted [$\exp(-d^2/2\sigma^2)$] distribution of NN distances for each case, where $\sigma$ is the width of the distribution and $d$ is the $\chi^2$ distance between training and testing images. We determined $\sigma$ from the fitted Gaussian distribution of all NN distances from all the training images, which afforded the classification of each test image by comparison with the most similar training images around it. We then used standard 10-fold cross validation, where we randomly divided the total number of images (550 cancers and 550 normal) into 10 groups (each group then had 110 images). In the cross validation procedure, each group serves as the test set once whereas the remaining nine are the training set. This is repeated for each group, i.e., 10 times in all. The summary scores reflect the mean over all folds. In this calculation, we held out a group of images to optimize the number of NNs and achieved the highest accuracy by applying NN = 10.
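A minimal sketch of the weighted nearest neighbor step is given below. Several choices are assumptions not fixed by the text: the $\chi^2$ distance is written in its common symmetric form, $\frac{1}{2}\sum_i (h_i - g_i)^2/(h_i + g_i)$; the score is the cancer share of the Gaussian weights among the k nearest training images; and `sigma` is passed in directly instead of being fitted to the distribution of NN distances. All function names are hypothetical.

```python
import numpy as np


def chi2_distance(h, g, eps=1e-12):
    """Symmetric chi-squared distance between two histograms (assumed form)."""
    h = np.asarray(h, dtype=float)
    g = np.asarray(g, dtype=float)
    return 0.5 * np.sum((h - g) ** 2 / (h + g + eps))


def weighted_knn_score(test_hist, train_hists, train_labels, k=10, sigma=1.0):
    """Share of Gaussian weight carried by cancer (label 1) neighbors, in [0, 1]."""
    d = np.array([chi2_distance(test_hist, t) for t in train_hists])
    nearest = np.argsort(d)[:k]                       # k most similar training images
    w = np.exp(-d[nearest] ** 2 / (2 * sigma ** 2))   # exp(-d^2 / 2 sigma^2)
    total = w.sum()
    if total == 0.0:                                  # all weights underflowed
        return 0.5
    labels = np.asarray(train_labels)[nearest]
    return float(np.sum(w * (labels == 1)) / total)


def ten_fold_indices(n_images, n_folds=10, seed=0):
    """Random partition of image indices into folds for cross validation."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_images), n_folds)


# Toy example with 2-bin histograms: label 1 = cancer, label 0 = normal.
train = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]
score = weighted_knn_score([0.95, 0.05], train, labels, k=2)
folds = ten_fold_indices(1100)
```

Sweeping a threshold over scores of this kind is one way to obtain the operating points of an ROC curve; with 1100 images the fold helper returns 10 groups of 110 indices, matching the split described above.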
We diagram the classification scheme in Fig. 3(e), where, purely for demonstration purposes, we assumed that 2 textons were selected to construct the image model. In the demonstration, the yellow circle represents the testing image model; the blue squares and red triangles are the normal and cancer image models, respectively; the x and y axes are the count numbers of each of the 2 textons in the model. The distances between image models are evaluated by the $\chi^2$ distance, which reflects the similarity of their respective statistical distributions, i.e., images that are more similar to each other will have a smaller $\chi^2$ distance. In this particular example, we chose the six training images nearest to the testing image. The classification of the testing image is then decided by the sum of all six weights around it.

Results of Classification Accuracy
We use the receiver operating characteristic (ROC) curve formalism 35,36 of true positives versus false positives (or sensitivity versus 1-specificity) to determine the accuracy of the classification, where the accuracy is defined as the area under the ROC curve (AUROC). We applied the optimized NN = 10 from held out samples (determined in Sec. 3.3), and systematically changed the texton number to best represent the features in each tissue type. Figure 4 shows the resulting ROC curves that were generated using a range of 5 to 400 textons. The discrimination threshold is crucial for ROC curve generation, and this was chosen by summing up the weighting of the 10 NNs around the testing images. Using 40 textons and NN = 10, we achieved a high accuracy with AUROC = 0.974. We found that if we chose 100 textons, the accuracy of classification decreased slightly to 0.963, since the frequency of each texton is then so low that there are insufficient counts to form a reliable statistical distribution for the image model. As a more extreme example of this result, 400 textons resulted in significantly worse discrimination (0.805). On the other hand, when the texton number is lower than 20, the accuracy of classification also decreases due to the lack of features differentiating normal and cancer tissues. For example, 5 textons yielded a significantly lower AUROC (0.919) than the optimal value obtained with 40 textons.
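The AUROC itself can be computed from per-image scores and true labels by sweeping the discrimination threshold and integrating the resulting curve with the trapezoid rule. The sketch below is generic and uses made-up scores; it is not tied to the values reported above, and the function names are illustrative.

```python
import numpy as np


def roc_curve_points(scores, labels):
    """False and true positive rates as the threshold sweeps from high to low."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    for t in np.unique(scores)[::-1]:          # descending thresholds
        pred = scores >= t
        tpr.append((pred & (labels == 1)).sum() / n_pos)
        fpr.append((pred & (labels == 0)).sum() / n_neg)
    return np.array(fpr), np.array(tpr)


def auroc(scores, labels):
    """Area under the ROC curve by the trapezoid rule."""
    fpr, tpr = roc_curve_points(scores, labels)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Made-up scores: higher means "more cancer-like"; label 1 = cancer.
perfect = auroc([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 1, 0, 0, 0])
mixed = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

A perfectly separating score gives an area of 1, whereas the second example, in which one normal image outscores one cancer image, gives 0.75.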

Discussion
The current image analysis methods utilized in SHG microscopy (summarized in Sec. 1) all probe the fiber organization either globally or relative to specific features, and are not completely general in their applicability. Texture based methods may be superior in this regard. The simplest form of texture analysis, the gray level co-occurrence matrix, compares the brightness of adjacent pixels and has been used for SHG image analysis. 25,37 However, it is also not a general approach as it is not as sensitive to morphology. As an alternative, and more versatile, form of texture analysis, Wang et al. 38 applied spectral moments to quantify intervertebral disk damage in a mouse model and successfully developed a linear discriminant classifier to differentiate loaded and sham-loaded tissues. This approach determines the 2-D frequency response and is independent of scale and orientation. As a result, it is a fairly general approach to image classification. Textons, on the other hand, may have some advantages as spectral moments may be "too general"; for instance, in our application, we expect that having a dependence on orientation (as could be probed using the full set of filters in the MR8 filter bank) may actually result in an additional discriminatory capability of the statistical model. For example, one may want to be able to record the orientation of the maximum response when this is relevant, as there is some clear alignment of fibers in the malignant tissues. This will yield higher order co-occurrence statistics on orientation dependent "topics" within the model, and such information may be critical in discriminating textures that may seem similar in an orientation-independent spectral moment analysis.
Further, topic models based on textons are known to yield better discrimination (at least for photographic images of naturally occurring objects) 34 than those based on the formulations where the algorithm transforms the data to a 2-D frequency space, where the latter loses potentially meaningful orientation information.
The texture analysis algorithm here successfully recognized the repetitive collagen fiber patterns by convolution with a standard filter bank composed of many shapes, sizes, and orientations. As a result, this approach also affords the classification and comparison of essentially any morphology present in image data, as long as unique features can be assigned to each class. This criterion was satisfied with normal and high grade malignant ovarian tissues. By optimizing the number of textons and NNs, we obtained an excellent classification accuracy of ∼97%. We stress that the tumors were all high grade serous malignancies and were not representative of all ovarian cancer types. Still, excellent discrimination was achieved with a small sample set because of the large change in morphology. We note that our previous analysis using 3-D SHG imaging, measurement of optical properties, and Monte Carlo simulations delineated normal stroma and high grade malignancies using a small sample size. 10 Although that study provided insight into subresolution structural changes in the latter, the method requires many measurements and simulations. The routine developed here can now be readily implemented on the SHG images of new tissues in a straightforward manner, as the dictionaries from training sets are already created.
Although the result of the feature extraction is sensitive to the original filter selection, it is straightforward to change the filter set until common features are located that can differentiate the tissues being compared. A drawback of this method is the large number of images required for the algorithm to extract the common features in each class. It is also not possible to directly associate textons with specific visual features such as fiber length and alignment. Still, our approach of comparison to a filter bank affords the specific tailoring of the feature selection to the desired application. For example, even using limited single-optical sections as inputs, the texture analysis employed here showed great potential for ovarian cancer classification.

Summary
We applied a texture analysis algorithm to evaluate the ECM structural changes in normal ovarian stroma and high grade ovarian serous cancer observed in SHG images. By optimizing the number of textons and NNs, we achieved high accuracy (97%) for classifying high grade cancer tissue and normal ovarian tissue using an ROC curve analysis. The classification algorithm is a relatively general method based on prelearned SHG images and is well suited for analysis of rapidly changing fibrillar features typical of most tissues. The application here was for the discrete case of normal and high grade serous malignancies, but the approach could be extended to other cases such as low grade and borderline tumors.
Fig. 4 Receiver operating characteristic curves for classification with 5 textons (pink pentagons) with 91.9% accuracy; 20 textons (turquoise diamonds) with 96.4% accuracy; 40 textons (black squares) with 97.4% accuracy; 100 textons (red circles) with 96.6% accuracy; and 400 textons (green stars) with 80.5% accuracy.