Comparison of contourlet transform and gray level co-occurrence matrix for analyzing cell-scattered patterns

Abstract. Distribution of scattered image patterns hinges on morphological and optical characteristics of cells. This paper applied a numerical method to simulate scattered images of real cell morphologies, which were reconstructed from confocal image stacks dyed by fluorescent stains. Two approaches, contourlet transform (CT) and gray level co-occurrence matrix (GLCM), were then used to analyze the simulated scattered images. The results showed that features extracted using GLCM contained more information than those extracted using CT. Higher classification accuracy could be achieved with a single GLCM parameter than CT and GLCM could achieve higher accuracy with fewer parameters than CT when using multiple parameters. Meanwhile, GLCM requires less computational cost. Thus, GLCM is more suitable and efficient than CT for the analysis of cell-scattered images.


Introduction
Mitochondrion is one of the most important organelles in most cells; it supplies the energy used by cells and plays a significant role in many cell activities, 1 including cell differentiation and apoptosis. Cell-scattered light stimulated by coherent light can provide deep insight into a cell's morphology. [2][3][4] Previously, polarization diffraction imaging flow cytometry was developed, which could simultaneously obtain cross-polarized scattered images or a polarization-scattered image (p-SI) pair per cell carried by core flow and illuminated by a laser. It has been shown that the spatial patterns in p-SI images correlate with morphological features of biological cells, [5][6][7][8] but it is difficult to distinguish scattered images obtained from different cell morphologies. Therefore, an effective assay algorithm for analyzing the scattered images corresponding to the cell morphology is necessary.
This paper reports a simulation method based on an accurate process of scattered light simulation 6,[9][10][11][12] and a reconstructed real cell morphology obtained in previous studies 6,[12][13][14] to simulate the polarized cell-scattered images. The main characteristic of scattered images is pattern distribution, which can be treated as a type of frequency information. Contourlet transform (CT) can decompose an input image into several subimages that contain different scales and direction frequency information. [15][16][17][18] Gray level co-occurrence matrix (GLCM) analyzes the probability of pixel pairs, which is also frequency information. 19,20 Thus, these two algorithms were applied to analyze scattered images simulated from real cell morphologies with different mitochondria volumes and refractive indices (RIs). The results our comparative study showed that features extracted using GLCM contain more information than those extracted using CT.
Higher classification accuracy could be achieved with a single GLCM parameter than CT, and GLCM could achieve high accuracy with fewer parameters than CT when using multiple parameters. Meanwhile, GLCM requires less computational cost. GLCM is a more appropriate assay method than CT for cell-scattered images.

Reconstructed Cell Three-Dimensional Structure and Refractive Index Model
The reconstructed cell three-dimensional (3-D) structures were acquired using a laser scanning confocal microscope (LSM510, Zeiss, Germany). Cells were stained with two fluorescent dyes, Syto 61 and Mito Tracker Orange CMTMRos (S11343 and M-7510, Life Technologies), using the same protocol as described in previous papers. 6,12 Syto 61 marks the nucleic acids mainly distributed inside the nucleus, and Mito Tracker Orange CMTMRos mostly adheres to mitochondria. Stimulated by two laser beams with wavelengths of 633 and 532 nm, two 12-bit fluorescent intensity images of the nucleus and mitochondria were saved in the red and green channels, respectively, of an image stack with a resolution of 0.07 μm in the Xand Y-directions and 0.5 μm in the Z-direction as F r ðrÞ and F g ðrÞ, where r is the voxel position inside the cell. Approximately 60 slices were acquired for each cell at different focal planes with a step size of 0.5 μm in one stack. Figure 1(a) shows some images at different focal planes of one normal human prostate epithelial cell, termed PCS (PCS440010, ATCC). After acquisition, the confocal image stack was processed with in-house software. The cell was segmented into three regions, nucleus, mitochondria, and cytoplasm, which were denoted as Ω n ; Ω m , and Ω c , respectively. Ω n was further divided into three subregions denoted as Ω nl , Ω nm , and Ω nh with relatively low, medium, and high RIs according to their pixel intensities, respectively; larger intensity means more nucleic acids, which should have a larger RI. To achieve accurate simulated diffraction images, cell 3-D structures should have the same resolution in all three directions. Thus, interpolation between confocal slices was performed. As Amsterdam discrete dipole approximation (ADDA) code treated scatterer as an array of voxel that has the same size in three directions, equal resolution in three dimensions is also necessary for accurate simulation. Because of the different resolution of confocal images between different directions, 0.07 μm in X and Y and 0.5 μm in Z, the 3-D cell structures were interpolated in Z direction to obtain the same resolution as in Xand Y-directions, and the precise cell structure used for ADDA code. Details of segmentation and interpolation have been described elsewhere. 12,14 The reconstructed 3-D cell structure is shown in Fig. 1(b) in which the nucleus, mitochondria, and cytoplasm can be clearly observed.
Because of the heterogeneity between and inside organelles, the cell RIs were modeled as the sum of a constant term, a mean RI value with a specific value in a specific region, and a randomly fluctuating term that indicates the heterogeneity inside organelle as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 6 3 ; 2 8 7 n α ðrÞ ¼ n α0 þ ðn α0 − n w Þ · A α · RND; ∀ r ∈ Ω α ; (1) where α ¼ c; m; nl; nm, or nh, which indicates the region or subregion, n α0 is the mean value of RI in the region of Ω α ; n w is the RI of water, A α is the fluctuation amplitude, and RND denotes random numbers uniformly distributed in [−1;1]. To simulate the heterogeneity inside organelle and distinguish different organelles, A α was set to 10% in this paper, which gave every organelle an independent RI range. Combining reconstructed cell 3-D structures with RI, a cell model was obtained. By varying RI or modifying the cell structure, a series of cell models was derived from one confocal stack.

Simulation of p-SIs Using a Cell Model
To obtain the scattered light distribution of a cell model in a host medium, an open source code of a parallel ADDA algorithm for discrete dipole approximation was applied to simulate angularly resolved Mueller matrixes S ij where i; j ¼ 1;2; 3;4. 11,21 The cell model was divided into several voxels in three directions, and each voxel was treated as a dipole. For example, the PCS cell mentioned above was treated as a 116 × 110 × 86 dipole array, and the number of dipoles per wavelength used for this cell was 3.8. S ij were calculated from all fields scattered by these dipoles that were excited by an incident light. With the incident wavelength of 532 nm and RI of host media (water) n w ¼ 1.334, the polarizabilities could be calculated. ADDA code could be executed on a parallel computing cluster to calculate Mueller matrixes (S ij ), which are functions of scattering polar angles (θ; ϕ). Mie theory was used to validate ADDA code with a 10-μm sphere, which was represented by a 126 × 126 × 126 dipole array, and the number of dipoles per wavelength was 10.08. The normalized S 11 calculated by Mie theory and ADDA are shown in Fig. 2(a). The absolute error and relative error between ADDA and Mie theory S 11 are shown in Fig. 2(b). The configuration of the simulation system is shown in Fig. 3(a). First, the scattered light was projected onto an input plane P in , which is an arbitrary plane defined between the scatterer and the objective to obtain an input p-SI denoted as I in−kl ðy; zÞ used for the simulation process. I in−kl ðy; zÞ was obtained using a linear combination of Mueller matrix elements, which carry the spatial distribution of scattered light for specific scattered polarization k ¼ p or s and incident polarization l ¼ p; s, or 45 deg. A total of three pairs of input p-SI were obtained with three types of incident polarizations. For example, the input p-SI I in−kl ðy; zÞ of scattered polarization p and incident polarization p was calculated by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 3 2 6 ; 2 4 3 Other equations of different polarizations can be found in Refs. 8 and 22.
A ray-tracing software program (Zemax-EE v2005, Zemax Development Corp.) was used to trace the rays of input p-SI I in−kl ðy; zÞ at the input plane P in to the imaging unit at image plane P im and obtain the p-SI I kl ðy; zÞ at P im , which is the conjugate image of the object plane at defocus position x ¼ 150 μm. Simulated p-SIs I kl ðy; zÞ are similar to the measured images obtained in our cell-scattered image measurement experiments with a microscope objective as indicated by their similar GLCM parameter distribution ranges (shown in Table 1). Some examples of simulated p-SI I in−kl ðy; zÞ and I kl ðy; zÞ and measured p-SI are shown in Figs. 3(b) and 3(c).

Analysis of Simulated p-SI with CT
CT is an algorithm that combines the merits of wavelet transform and directional filter banks. It can decompose the image into multiple subimages with different scales γ ¼ 1;2; 3; : : : ; γ max and directions δ ¼ 1;2; 3; : : : ; δ max , which enable simultaneous analysis of p-SI information at different scales and directions in the frequency domain. First, I kl ðy; zÞ, denoted as I kl;0−0 ðy; zÞ, which can be treated as the low-pass image with overview information in scale 0, is processed by the Laplacian pyramid algorithm. I kl ðy; zÞ is down-sampled by a weighted smoothing filter, so a low-pass image denoted as I kl;1−0 ðy; zÞ is acquired, carrying the overview information of I kl;0−0 ðy; zÞ in scale 1. The bandpass image I kl;1−d ðy; zÞ is then obtained by the difference of I kl;0−0 ðy; zÞ and the upsampled images of I kl;1−0 ðy; zÞ. By repeating this procedure on low-pass images in each scale, a series of low-pass and bandpass images are acquired at different scales denoted as I kl;γ−0 ðy; zÞ and I kl;γ−d ðy; zÞ, respectively, with overview and   Journal of Biomedical Optics 086013-3 August 2016 • Vol. 21 (8) detail information, where γ is the scale index. Second, bandpass images I kl;γ−d ðy; zÞ are processed by directional filter banks. I kl;γ−d ðy; zÞ is decomposed into several directional images denoted as I kl;γ−δ ðy; zÞ where δ ¼ 1;2; 3; : : : ; δ max , which contain the detail information in specific directions. This report used γ max ¼ 5 and δ max ¼ 2 3 for scales1-4 and δ max ¼ 2 2 for scale 5. In this study, 41 low-pass and bandpass subimages were obtained from one simulated p-SI, and 12 parameters were calculated from each subimage. These parameters were then tested by some stripe images with different periods or directions. The results showed that four of these 12 parameters varied significantly among the 41 subimages. Thus, these four parameters were chosen to characterize subimages, and the parameters energy E, contrast C, variance V, and fluctuation F were defined as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 3 ; 6 3 ; 5 8 4 where k indicates the polarization of scattered light from cells; l indicates the polarization of incident light; N y and N z are the pixel numbers of CT image I kl;γ−δ ðy; zÞ in yand z-directions, respectively; and squared local contrast SLC and mean value AVE are defined as

Analysis of Simulated p-SI with GLCM
GLCM is an image texture-analyzing algorithm based on the occurrence probability of gray level pairs. GLCM can be used to describe the orientation, amplitude, and period information of an image texture. 19 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 5 ; 3 2 6 ; 6 9 7 ði − jÞ 2 pði; j; ψÞ pði; j; ψÞ · log½pði; j; ψÞ p xþy ðh; ψÞ · log½p xþy ðh; ψÞ ½i − μðψÞ 2 pði; j; ψÞ ðh − SEN kl;ψ Þ 2 p xþy ðh; ψÞ To combine the four-directional characteristic of p-SI I kl ðy; zÞ, the average of these 17 parameters over four directions was calculated and denoted as COR kl , DIS kl , CON kl , IDM kl , ENT kl , SEN kl , DEN kl , ASM kl , VAR kl , SVA kl , DVA kl , MEA kl , SAV kl , CLS kl , CLP kl , MAP kl , and MIP kl . Together with three other parameters, strength STR kl , mass-z MSZ kl , and mass-y MSY kl , a total of 20 parameters were used to characterize one p-SI E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 7 ; 3 Results

Simulated p-SIs
Erosion and dilation of three pixels of the mitochondria regions inside the cell for each slice, including the interpolated ones, generated two new virtual cell models, denoted as PCS −3 and PCS 3 , whereas the original one was denoted as PCS 0 ; these models are shown in Figs. 4(a), 4(b), and 4(c), respectively. The morphology parameters of PCS −3 , PCS 0 , and PCS 3 are shown in Table 2. Furthermore, three virtual cell models of a human prostate cancer cell termed PC3 (CRL-1435, ATCC) were also derived, which are denoted as PC3 −3 , PC3 0 , and PC3 3 .
To eliminate the influence of cell orientation, 26 cell orientations, which could be treated as 26 virtual cells, were employed in the three virtual cell models described above by rotating the original one by specific Euler angles, which were uniformly distributed over the 4π solid angle range. Six sets of virtual cells with different mitochondria volume ratios were used to execute ADDA simulation. The actual RI of mitochondria could not be precisely measured. Several RI models have been tested for simulated images and these simulated images were compared with experimental images by GLCM parameters. Then, two of these RI models N a and N b that yielded simulated images similar to experimental images were used in the simulation process to study the influence of mitochondria RI for diffraction images. Two sets of diffraction images were simulated with the same cell structure and the two RI models N a and N b . The RI mean values and standard deviations for cell organelles of RI models N a and N b used by ADDA are shown in Table 3.
For each cell model, each orientation and each RI, six p-SIs I kl ðy; zÞ were simulated with different polarizations.   shows some simulated images with different cell structures, orientations, and RI models.

Classification Results for Simulated p-SIs Using CT and GLCM
For one p-SI pair with the same incident polarization and two scattered polarizations as described above, 82 subimages and four types of parameters were calculated for one subimage, so a total of 328 parameters were extracted to characterize one p-SI pair. Because too many parameters for just one p-SI pair consume a large amount of computational time, they were divided into four sets C; E; F, and V of 82 parameters to be analyzed separately. Simulated p-SIs with the same RI of N b ; different cell models-PCS 0 versus PCS 3 , PCS −3 versus PCS 3 and PCS 3 versus PC3 3 -and the same cell model PCS 0 ; and different RIs-N a versus N b -illuminated by a 532 nm wavelength laser with polarizations p; s, or 45 deg were classified by an SVM algorithm 24,25 with four different kernel functions: linear (Lin), polynomial (Pol), sigmoid (Sig), and radial basis functions (Rbf). The sample size for every simulated p-SI set was 26. Given the small sample size of simulated diffraction images, all samples were used as training sets and no test process was performed. The goal of this study was to test the ability of CT and GLCM for characterizing cell diffraction images. Thus, the SVM was used only to achieve classification accuracy. The brief classification results are shown in Table 4. All four sets of classification could achieve very high classification accuracy A, greater than 98%, which was the ratio of the correctly classified p-SI number to the total p-SI number. This result shows that E parameters could result in better performance with a larger A for each set than the other parameters. The three highest classification accuracy levels indicated by larger A values with a single E parameter corresponding to the classification results in Table 4 with E parameters are shown in Fig. 6.
With a specific incident polarization k, a total of 40 GLCM features were applied to characterize one p-SI pair described above. The same four sets of p-SIs were classified by SVM with four of kernel functions. All four sets of classification results could also achieve very high classification accuracy A, greater than 98%, as shown in Table 5. The three highest classification accuracy levels indicated by larger A values with a single GLCM parameter corresponding to the results in Table 5 are shown in Fig. 6.

Classification Results for Experimental p-SIs
Using CT and GLCM Three sets of experimental PCS and PC3 cell images were also analyzed and classified as simulated images. The experimental cell image sample size is shown in Table 6. As with the simulated images, all samples are used as training sets. Tables 7 and 8 show the classification results of experimental cell images with CT and GLCM features. Both CT and GLCM algorithms can also achieve high classification accuracy with experimental cell images. As with the simulated results, 1.5534 AE 0.0127 a n c , n nl , n nm , n nh , and n m are the mean value and standard deviation of the RIs in regions Ω c , Ω nl , Ω nm , Ω nh , and Ω m , respectively. E parameters resulted in better performance with a larger A than the other parameters. The three highest classification accuracy levels indicated by larger A values with a single E parameter corresponding to the classification results in Table 7 with E parameters and a single GLCM parameter corresponding to the results in Table 8 are shown in Fig. 7.

Discussion
The actual variation of mitochondria volume change by erosion or dilation was obtained by replacing the mitochondria RIs of eroded regions with cytoplasm RIs or setting the dilated regions with mitochondria RIs instead of cytoplasm RIs. Thus, volume change essentially reflects local RI change. Visually, it is clear The classification accuracy A of incident polarization p; s, and 45 deg with parameter numbers N p and the kernel function used in SVM. Fig. 6 The three highest classification accuracy levels indicated by larger A values with a single E parameter corresponding to the classification results in Table 4 with E parameters in blue lines and single GLCM parameter corresponding to Table 5 in red lines.
Journal of Biomedical Optics 086013-7 August 2016 • Vol. 21 (8) that a change in mitochondria volume or RI greatly influences the p-SIs. The diffraction images have larger spots with larger RIs or more mitochondria as shown in Fig. 5. Therefore, changes in cell RIs can be studied by analyzing corresponding p-SIs. In this report, two algorithms, CT and GLCM, were applied to analyze the simulated p-SIs, which have different mitochondria volumes or different mitochondria RIs, and experimental cell images. The results showed that both CT and GLCM algorithms could precisely classify the same simulated and experimental p-SI sets with high accuracy, demonstrating that both parameters extracted by these two methods can be used to characterize and analyze p-SIs.
CT yielded more parameters than GLCM for a single p-SI and GLCM achieved larger classification accuracy A with a single parameter as compared with the CT method, indicating that the parameters extracted by GLCM contain more information and are more efficient to describe both the simulated and experimental p-SI as shown in Figs. 6 and 7. CT has disintegrated the original p-SI into 41 subimages, which generate only very few features of the original p-SI in one subimage. Therefore, even though CT extracts many subimages and parameters, this process did not significantly improve the performance of p-SI analysis. A previous study also showed that the CT subimages carry more information at coarse scales γ ≥ 3 of p-SI patterns than the fine scales γ ¼ 1 or 2, and the former perform better than the latter in SVM classification. Meanwhile, a long period of time is needed for the calculation of these subimages and parameters. 14 CT consumed about 10 times computational cost than GLCM for both feature calculation and SVM classification. Therefore, the GLCM algorithm should be more suitable and efficient for analysis of cell p-SIs. GLCM coupled with cellscattered images establishes a new rapid method that will benefit cell basic research including cell assay and cell structure studies.
Although diffraction images with different mitochondria RIs or volumes can be classified by the SVM algorithm, it is only suitable for binary situations. By using SVM multiple times and establishing a database with various types of cells, diffraction images provide opportunities for the development of fast and label-free cell assay methods in the future.
Since cell diffraction images mainly contain relatively low frequency information, and many subimages and parameters obtained by CT reflect relative high frequency information of the analyzed images, CT should be more appropriate for analyzing images with rich frequency information. The classification accuracy A of incident polarization p; s, and 45 deg with parameter numbers N p and the kernel function used in SVM.  The classification accuracy A of incident polarization p; s, and 45 deg with parameter numbers N p and the kernel function used in SVM.

Summary
This paper used virtual cell models reconstructed and interpolated from confocal image stacks stained by two fluorescence dyes to simulate polarization-scattered images with different mitochondria volumes modified by erosion and dilation or different mitochondria RIs. These simulated images were then analyzed with two algorithms, CT and GLCM. The results show that GLCM can carry more information than CT with the same number of parameters. Analysis of experimental images yields similar results as the simulated situation. GLCM is a more suitable and efficient method for cell-scattered image analysis than CT.
Jun Zhang received his BS degree in biomedical engineering from Tianjin University in 2010 and he is a PhD candidate of biomedical engineering at Tianjin University. His current research interests include scattered image analysis, cell morphology, and image processing.
Gang Wang received his BS degree in biomedical engineering from Tianjin University in 2013 and he is a graduate student of biomedical engineering at Tianjin University. His current research interests include image processing and analysis.
Yuanming Feng is a professor of biomedical engineering at Tianjin University. His research interests include diffraction imaging flow cytometry and measurement technique of radiation-induced apoptosis.
Yu Sa received his BS and PhD degrees from Tianjin University and he is a lecturer in the Department of Biomedical Engineering at Tianjin University. His research interests include instrument development, diffraction imaging, and computational fluid dynamics modeling studies. Fig. 7 The three highest classification accuracy levels indicated by larger A values with a single E parameter corresponding to the classification results in Table 7 with E parameters in blue lines and single GLCM parameter corresponding to Table 8 in red lines.