## 1.

## Introduction

Mitochondrion is one of the most important organelles in most cells; it supplies the energy used by cells and plays a significant role in many cell activities,^{1} including cell differentiation and apoptosis. Cell-scattered light stimulated by coherent light can provide deep insight into a cell’s morphology.^{2}3.^{–}^{4} Previously, polarization diffraction imaging flow cytometry was developed, which could simultaneously obtain cross-polarized scattered images or a polarization-scattered image (p-SI) pair per cell carried by core flow and illuminated by a laser. It has been shown that the spatial patterns in p-SI images correlate with morphological features of biological cells,^{5}6.7.^{–}^{8} but it is difficult to distinguish scattered images obtained from different cell morphologies. Therefore, an effective assay algorithm for analyzing the scattered images corresponding to the cell morphology is necessary.

This paper reports a simulation method based on an accurate process of scattered light simulation^{6}^{,}^{9}10.11.^{–}^{12} and a reconstructed real cell morphology obtained in previous studies^{6}^{,}^{12}13.^{–}^{14} to simulate the polarized cell-scattered images. The main characteristic of scattered images is pattern distribution, which can be treated as a type of frequency information. Contourlet transform (CT) can decompose an input image into several subimages that contain different scales and direction frequency information.^{15}16.17.^{–}^{18} Gray level co-occurrence matrix (GLCM) analyzes the probability of pixel pairs, which is also frequency information.^{19}^{,}^{20} Thus, these two algorithms were applied to analyze scattered images simulated from real cell morphologies with different mitochondria volumes and refractive indices (RIs). The results our comparative study showed that features extracted using GLCM contain more information than those extracted using CT. Higher classification accuracy could be achieved with a single GLCM parameter than CT, and GLCM could achieve high accuracy with fewer parameters than CT when using multiple parameters. Meanwhile, GLCM requires less computational cost. GLCM is a more appropriate assay method than CT for cell-scattered images.

## 2.

## Materials and Methods

## 2.1.

### Reconstructed Cell Three-Dimensional Structure and Refractive Index Model

The reconstructed cell three-dimensional (3-D) structures were acquired using a laser scanning confocal microscope (LSM510, Zeiss, Germany). Cells were stained with two fluorescent dyes, Syto 61 and Mito Tracker Orange CMTMRos (S11343 and M-7510, Life Technologies), using the same protocol as described in previous papers.^{6}^{,}^{12} Syto 61 marks the nucleic acids mainly distributed inside the nucleus, and Mito Tracker Orange CMTMRos mostly adheres to mitochondria. Stimulated by two laser beams with wavelengths of 633 and 532 nm, two 12-bit fluorescent intensity images of the nucleus and mitochondria were saved in the red and green channels, respectively, of an image stack with a resolution of $0.07\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ in the $X$- and $Y$-directions and $0.5\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ in the $Z$-direction as ${F}_{\mathrm{r}}(r)$ and ${F}_{\mathrm{g}}(r)$, where $r$ is the voxel position inside the cell. Approximately 60 slices were acquired for each cell at different focal planes with a step size of $0.5\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ in one stack. Figure 1(a) shows some images at different focal planes of one normal human prostate epithelial cell, termed PCS (PCS440010, ATCC).

After acquisition, the confocal image stack was processed with in-house software. The cell was segmented into three regions, nucleus, mitochondria, and cytoplasm, which were denoted as ${\mathrm{\Omega}}_{\mathrm{n}},{\mathrm{\Omega}}_{\mathrm{m}}$, and ${\mathrm{\Omega}}_{\mathrm{c}}$, respectively. ${\mathrm{\Omega}}_{\mathrm{n}}$ was further divided into three subregions denoted as ${\mathrm{\Omega}}_{\mathrm{nl}}$, ${\mathrm{\Omega}}_{\mathrm{nm}}$, and ${\mathrm{\Omega}}_{\mathrm{nh}}$ with relatively low, medium, and high RIs according to their pixel intensities, respectively; larger intensity means more nucleic acids, which should have a larger RI. To achieve accurate simulated diffraction images, cell 3-D structures should have the same resolution in all three directions. Thus, interpolation between confocal slices was performed. As Amsterdam discrete dipole approximation (ADDA) code treated scatterer as an array of voxel that has the same size in three directions, equal resolution in three dimensions is also necessary for accurate simulation. Because of the different resolution of confocal images between different directions, $0.07\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ in $X$ and $Y$ and $0.5\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ in $Z$, the 3-D cell structures were interpolated in $Z$ direction to obtain the same resolution as in $X$- and $Y$-directions, and the precise cell structure used for ADDA code. Details of segmentation and interpolation have been described elsewhere.^{12}^{,}^{14} The reconstructed 3-D cell structure is shown in Fig. 1(b) in which the nucleus, mitochondria, and cytoplasm can be clearly observed.

Because of the heterogeneity between and inside organelles, the cell RIs were modeled as the sum of a constant term, a mean RI value with a specific value in a specific region, and a randomly fluctuating term that indicates the heterogeneity inside organelle as follows:

## (1)

$${n}_{\alpha}(r)={n}_{\alpha 0}+({n}_{\alpha 0}-{n}_{w})\xb7{A}_{\alpha}\xb7RND,\phantom{\rule[-0.0ex]{1em}{0.0ex}}\forall \text{\hspace{0.17em}}r\in {\mathrm{\Omega}}_{\alpha},$$## 2.2.

### Simulation of p-SIs Using a Cell Model

To obtain the scattered light distribution of a cell model in a host medium, an open source code of a parallel ADDA algorithm for discrete dipole approximation was applied to simulate angularly resolved Mueller matrixes ${S}_{ij}$ where $i,j=\mathrm{1,2},\mathrm{3,4}$.^{11}^{,}^{21} The cell model was divided into several voxels in three directions, and each voxel was treated as a dipole. For example, the PCS cell mentioned above was treated as a $116\times 110\times 86$ dipole array, and the number of dipoles per wavelength used for this cell was 3.8. ${S}_{ij}$ were calculated from all fields scattered by these dipoles that were excited by an incident light. With the incident wavelength of 532 nm and RI of host media (water) ${n}_{\mathrm{w}}=1.334$, the polarizabilities could be calculated. ADDA code could be executed on a parallel computing cluster to calculate Mueller matrixes (${S}_{ij}$), which are functions of scattering polar angles ($\theta ,\phi $). Mie theory was used to validate ADDA code with a $10\text{-}\mu \mathrm{m}$ sphere, which was represented by a $126\times 126\times 126$ dipole array, and the number of dipoles per wavelength was 10.08. The normalized ${S}_{11}$ calculated by Mie theory and ADDA are shown in Fig. 2(a). The absolute error and relative error between ADDA and Mie theory ${S}_{11}$ are shown in Fig. 2(b).

The configuration of the simulation system is shown in Fig. 3(a). First, the scattered light was projected onto an input plane ${P}_{\text{in}}$, which is an arbitrary plane defined between the scatterer and the objective to obtain an input p-SI denoted as ${I}_{\text{in}-\mathrm{kl}}(y,z)$ used for the simulation process. ${I}_{\text{in}-\mathrm{kl}}(y,z)$ was obtained using a linear combination of Mueller matrix elements, which carry the spatial distribution of scattered light for specific scattered polarization $k=p$ or $s$ and incident polarization $l=p,s$, or 45 deg. A total of three pairs of input p-SI were obtained with three types of incident polarizations. For example, the input p-SI ${I}_{\text{in}-\mathrm{kl}}(y,z)$ of scattered polarization $p$ and incident polarization $p$ was calculated by

Other equations of different polarizations can be found in Refs. 8 and 22.A ray-tracing software program (Zemax-EE v2005, Zemax Development Corp.) was used to trace the rays of input p-SI ${I}_{\text{in}-\mathrm{kl}}(y,z)$ at the input plane ${P}_{\text{in}}$ to the imaging unit at image plane ${P}_{\mathrm{im}}$ and obtain the p-SI ${I}_{\mathrm{kl}}(y,z)$ at ${P}_{\mathrm{im}}$, which is the conjugate image of the object plane at defocus position $x=150\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$. Simulated p-SIs ${I}_{\mathrm{kl}}(y,z)$ are similar to the measured images obtained in our cell-scattered image measurement experiments with a microscope objective as indicated by their similar GLCM parameter distribution ranges (shown in Table 1). Some examples of simulated p-SI ${I}_{\text{in}-\mathrm{kl}}(y,z)$ and ${I}_{\mathrm{kl}}(y,z)$ and measured p-SI are shown in Figs. 3(b) and 3(c).

## Table 1

Distribution range of GLCM parameter for simulated and measured images.

SVAp45 | SENp45 | MEAp45 | SVAs45 | SENs45 | MEAs45 | |
---|---|---|---|---|---|---|

Simulated images | $46.24\pm 8.09$ | $4.80\pm 0.18$ | $23.12\pm 4.04$ | $47.50\pm 12.19$ | $4.56\pm 0.38$ | $23.74\pm 6.09$ |

Measured images | $43.51\pm 12.77$ | $4.60\pm 0.40$ | $21.75\pm 6.39$ | $54.39\pm 15.38$ | $4.85\pm 0.30$ | $27.19\pm 7.69$ |

## 2.3.

### Analysis of Simulated p-SI with CT

CT is an algorithm that combines the merits of wavelet transform and directional filter banks. It can decompose the image into multiple subimages with different scales $\gamma =\mathrm{1,2},3,\dots ,{\gamma}_{\mathrm{max}}$ and directions $\delta =\mathrm{1,2},3,\dots ,{\delta}_{\mathrm{max}}$, which enable simultaneous analysis of p-SI information at different scales and directions in the frequency domain. First, ${I}_{\mathrm{kl}}(y,z)$, denoted as ${I}_{\mathrm{kl},0-0}(y,z)$, which can be treated as the low-pass image with overview information in scale 0, is processed by the Laplacian pyramid algorithm. ${I}_{\mathrm{kl}}(y,z)$ is down-sampled by a weighted smoothing filter, so a low-pass image denoted as ${I}_{\mathrm{kl},1-0}(y,z)$ is acquired, carrying the overview information of ${I}_{\mathrm{kl},0-0}(y,z)$ in scale 1. The bandpass image ${I}_{\mathrm{kl},1-d}(y,z)$ is then obtained by the difference of ${I}_{\mathrm{kl},0-0}(y,z)$ and the up-sampled images of ${I}_{\mathrm{kl},1-0}(y,z)$. By repeating this procedure on low-pass images in each scale, a series of low-pass and bandpass images are acquired at different scales denoted as ${I}_{\mathrm{kl},\gamma -0}(y,z)$ and ${I}_{\mathrm{kl},\gamma -d}(y,z)$, respectively, with overview and detail information, where $\gamma $ is the scale index. Second, bandpass images ${I}_{\mathrm{kl},\gamma -d}(y,z)$ are processed by directional filter banks. ${I}_{\mathrm{kl},\gamma -d}(y,z)$ is decomposed into several directional images denoted as ${I}_{\mathrm{kl},\gamma -\delta}(y,z)$ where $\delta =\mathrm{1,2},3,\dots ,{\delta}_{\mathrm{max}}$, which contain the detail information in specific directions. This report used ${\gamma}_{\mathrm{max}}=5$ and ${\delta}_{\mathrm{max}}={2}^{3}$ for scales1–4 and ${\delta}_{\mathrm{max}}={2}^{2}$ for scale 5. In this study, 41 low-pass and bandpass subimages were obtained from one simulated p-SI, and 12 parameters were calculated from each subimage. These parameters were then tested by some stripe images with different periods or directions. The results showed that four of these 12 parameters varied significantly among the 41 subimages. Thus, these four parameters were chosen to characterize subimages, and the parameters energy $E$, contrast $C$, variance $V$, and fluctuation $F$ were defined as

## (3)

$${E}_{\mathrm{kl},\gamma -\delta}=\sum _{y=1}^{{N}_{y}}\sum _{z=1}^{{N}_{z}}{I}_{\mathrm{kl},\gamma -\delta}^{\prime}{(y,z)}^{2}{C}_{\mathrm{kl},\gamma -\delta}=\sum _{y=1}^{{N}_{y}}\sum _{z=1}^{{N}_{z}}\frac{{\mathrm{SLC}}_{\mathrm{kl},\gamma -\delta}(y,z)}{4{N}_{y}{N}_{z}-2({N}_{y}+{N}_{z})}\phantom{\rule{0ex}{0ex}}{V}_{\mathrm{kl},\gamma -\delta}=\frac{1}{{N}_{y}{N}_{z}-1}\sum _{y=1}^{{N}_{y}}\sum _{z=1}^{{N}_{z}}{[{I}_{\mathrm{kl},\gamma -\delta}^{\prime}(y,z)-{\mathrm{AVE}}_{\mathrm{kl},\gamma -\delta}]}^{2}{F}_{\mathrm{kl},\gamma -\delta}=\frac{\sqrt{{V}_{kl,\gamma -\delta}}}{{\mathrm{AVE}}_{kl,\gamma -\delta}},$$## (4)

$${\mathrm{SLC}}_{\mathrm{kl},\gamma -\delta}(y,z)=\sum _{i=-\mathrm{1,1}}\{{[{I}_{\mathrm{kl},\gamma -\delta}^{\prime}(y+i,z)-{I}_{\mathrm{kl},\gamma -\delta}^{\prime}(y,z)]}^{2}+{[{I}_{\mathrm{kl},\gamma -\delta}^{\prime}(y,z+i)-{I}_{\mathrm{kl},\gamma -\delta}^{\prime}(y,z)]}^{2}\}\phantom{\rule{0ex}{0ex}}{\mathrm{AVE}}_{\mathrm{kl},\gamma -\delta}=\frac{1}{{N}_{y}{N}_{z}}\sum _{y=1}^{{N}_{y}}\sum _{z=1}^{{N}_{z}}{I}_{\mathrm{kl},\gamma -\delta}^{\prime}(y,z).$$## 2.4.

### Analysis of Simulated p-SI with GLCM

GLCM is an image texture-analyzing algorithm based on the occurrence probability of gray level pairs. GLCM can be used to describe the orientation, amplitude, and period information of an image texture.^{19}^{,}^{20} First, the GLCMs $P(i,j,d,\psi )$ are calculated by the numbers of repetitive pixel pairs ($i,j$) in the p-SI ${I}_{\mathrm{kl}}(y,z)$, which have an assigned distance $d$ at an assigned angle $\psi $ with intensities $i$ and $j$ where $i,j=\mathrm{1,2},3,\dots ,G$, $G$ is the maximum intensity, $d$ can be any integer smaller than the image size, and $\psi $ can be 0 deg, 45 deg, 90 deg, or 135 deg. Then, the normalized GLCMs $p(i,j,d,\psi )$ are obtained by normalizing $P(i,j,d,\psi )$ with the number of total pixel pairs. To study adjacent pixels, pixel pair distance $d$ was set to 1, and four normalized GLCMs $p(i,j,d,\psi )$ could be termed as $p(i,j,\psi )$ with $\psi =0\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{deg}$, 45 deg, 90 deg, and 135 deg. A total of 17 texture parameters (correlation COR, dissimilarity DIS, contrast CON, inverse difference moment IDM, entropy ENT, sum entropy SEN, difference entropy DEN, angular second moment ASM, variance VAR, sum variance SVA, difference variance DVA, mean MEA, sum average SAV, cluster shade CLS, cluster prominence CLP,^{23} maximum probability MAP, and minimum probability MIP) were extracted from each GLCM to characterize p-SI ${I}_{\mathrm{kl}}(y,z)$, defined as follows:

## (5)

$${\mathrm{COR}}_{\mathrm{kl},\psi}=\frac{1}{{\sigma}_{x}{\sigma}_{y}}\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}(ij)p(i,j,\psi )-{\mu}_{x}(\psi ){\mu}_{y}(\psi )\phantom{\rule{0ex}{0ex}}{\mathrm{DIS}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}|i-j|p(i,j,\psi ){\mathrm{CON}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}{(i-j)}^{2}p(i,j,\psi )\phantom{\rule{0ex}{0ex}}{\mathrm{IDM}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}\frac{1}{1+{(i-j)}^{2}}p(i,j,\psi ){\mathrm{ENT}}_{\mathrm{kl},\psi}=-\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}p(i,j,\psi )\xb7\mathrm{log}[p(i,j,\psi )]\phantom{\rule{0ex}{0ex}}{\mathrm{SEN}}_{\mathrm{kl},\psi}=-\sum _{h=0}^{2\text{\hspace{0.17em}\hspace{0.17em}}G-2}{p}_{x+y}(h,\psi )\xb7\mathrm{log}[{p}_{x+y}(h,\psi )]{\mathrm{DEN}}_{\mathrm{kl},\psi}=-\sum _{h=0}^{G-1}{p}_{x-y}(h,\psi )\xb7\mathrm{log}[{p}_{x-y}(h,\psi )]\phantom{\rule{0ex}{0ex}}{\mathrm{ASM}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}{[p(i,j,\psi )]}^{2}{\mathrm{VAR}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}{[i-\mu (\psi )]}^{2}p(i,j,\psi )\phantom{\rule{0ex}{0ex}}{\mathrm{SVA}}_{\mathrm{kl},\psi}=\sum _{h=0}^{2\text{\hspace{0.17em}\hspace{0.17em}}G-2}{(h-{\mathrm{SEN}}_{\mathrm{kl},\psi})}^{2}{p}_{x+y}(h,\psi ){\mathrm{DVA}}_{\mathrm{kl},\psi}=\frac{1}{G-1}\sum _{h=0}^{G-1}{[{p}_{x-y}(h,\psi )-{\overline{p}}_{x-y}(\psi )]}^{2}\phantom{\rule{0ex}{0ex}}{\mathrm{MEA}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}i\sum _{j=0}^{G-1}p(i,j,\psi )={\mu}_{x}(\psi ){\mathrm{SAV}}_{\mathrm{kl},\psi}=\sum _{h=0}^{2\text{\hspace{0.17em}\hspace{0.17em}}G-2}k{p}_{x+y}(h,\psi )\phantom{\rule{0ex}{0ex}}{\mathrm{CLS}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}{[i+j-2\mu (\psi )]}^{3}p(i,j,\psi ){\mathrm{CLP}}_{\mathrm{kl},\psi}=\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}{[i+j-2\mu (\psi )]}^{4}p(i,j,\psi )\phantom{\rule{0ex}{0ex}}{\mathrm{MIP}}_{\mathrm{kl},\psi}=\mathrm{min}[p(i,j,\psi )]{\mathrm{MAP}}_{\mathrm{kl},\psi}=\mathrm{max}[p(i,j,\psi )],$$## (6)

$${p}_{x}(i,\psi )=\sum _{j=0}^{G-1}p(i,j,\psi ){p}_{y}(j,\psi )=\sum _{i=0}^{G-1}p(i,j,\psi )\phantom{\rule{0ex}{0ex}}{p}_{x+y}(h,\psi )=\underset{i+j=h}{\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}}p(i,j,\psi ){p}_{x-y}(h,\psi )=\underset{|i-j|=h}{\sum _{i=0}^{G-1}\sum _{j=0}^{G-1}}p(i,j,\psi )\phantom{\rule{0ex}{0ex}}{\mu}_{x}(\psi )=\sum _{i=0}^{G-1}i{p}_{x}(i,\psi ){\mu}_{y}(\psi )=\sum _{j=0}^{G-1}j{p}_{y}(j,\psi )\phantom{\rule{0ex}{0ex}}{\sigma}_{x}{(\psi )}^{2}=\sum _{i=0}^{G-1}{[{p}_{x}(i,\psi )-{\mu}_{x}(\psi )]}^{2}{\sigma}_{y}{(\psi )}^{2}=\sum _{j=0}^{G-1}{[{p}_{y}(j,\psi )-{\mu}_{y}(\psi )]}^{2}\phantom{\rule{0ex}{0ex}}\mu (\psi )={\mu}_{x}(\psi )={\mu}_{y}(\psi )\mathrm{.}$$To combine the four-directional characteristic of p-SI ${I}_{\mathrm{kl}}(y,z)$, the average of these 17 parameters over four directions was calculated and denoted as ${\mathrm{COR}}_{\mathrm{kl}}$, ${\mathrm{DIS}}_{\mathrm{kl}}$, ${\mathrm{CON}}_{\mathrm{kl}}$, ${\mathrm{IDM}}_{\mathrm{kl}}$, ${\mathrm{ENT}}_{\mathrm{kl}}$, ${\mathrm{SEN}}_{\mathrm{kl}}$, ${\mathrm{DEN}}_{\mathrm{kl}}$, ${\mathrm{ASM}}_{\mathrm{kl}}$, ${\mathrm{VAR}}_{\mathrm{kl}}$, ${\mathrm{SVA}}_{\mathrm{kl}}$, ${\mathrm{DVA}}_{\mathrm{kl}}$, ${\mathrm{MEA}}_{\mathrm{kl}}$, ${\mathrm{SAV}}_{\mathrm{kl}}$, ${\mathrm{CLS}}_{\mathrm{kl}}$, ${\mathrm{CLP}}_{\mathrm{kl}}$, ${\mathrm{MAP}}_{\mathrm{kl}}$, and ${\mathrm{MIP}}_{\mathrm{kl}}$. Together with three other parameters, strength ${\mathrm{STR}}_{\mathrm{kl}}$, mass-z ${\mathrm{MSZ}}_{\mathrm{kl}}$, and mass-y ${\mathrm{MSY}}_{\mathrm{kl}}$, a total of 20 parameters were used to characterize one p-SI

## (7)

$${\mathrm{STR}}_{\mathrm{kl}}=\sum _{y=1}^{H}\sum _{z=1}^{W}{I}_{\mathrm{kl}}(y,z),\phantom{\rule{0ex}{0ex}}{\mathrm{MSZ}}_{\mathrm{kl}}=\frac{1}{{\mathrm{STR}}_{\mathrm{kl}}}\sum _{y=1}^{H}\sum _{z=1}^{W}{I}_{\mathrm{kl}}(y,z)\times z,\phantom{\rule{0ex}{0ex}}{\mathrm{MSY}}_{\mathrm{kl}}=\frac{1}{{\mathrm{STR}}_{\mathrm{kl}}}\sum _{y=1}^{H}\sum _{z=1}^{W}{I}_{\mathrm{kl}}(y,z)\times y.$$## 3.

## Results

## 3.1.

### Simulated p-SIs

Erosion and dilation of three pixels of the mitochondria regions inside the cell for each slice, including the interpolated ones, generated two new virtual cell models, denoted as ${\mathrm{PCS}}_{-3}$ and ${\mathrm{PCS}}_{3}$, whereas the original one was denoted as ${\mathrm{PCS}}_{0}$; these models are shown in Figs. 4(a), 4(b), and 4(c), respectively. The morphology parameters of ${\mathrm{PCS}}_{-3}$, ${\mathrm{PCS}}_{0}$, and ${\mathrm{PCS}}_{3}$ are shown in Table 2. Furthermore, three virtual cell models of a human prostate cancer cell termed PC3 (CRL-1435, ATCC) were also derived, which are denoted as ${\mathrm{PC}3}_{-3}$, ${\mathrm{PC}3}_{0}$, and ${\mathrm{PC}3}_{3}$.

## Table 2

Morphology parameters of cell models.

Cell model | Vca (μm3) | Vma (μm3) | Rmca (%) |
---|---|---|---|

${\mathrm{PCS}}_{-3}$ | 1392 | 54 | 3.88 |

${\mathrm{PCS}}_{0}$ | 149 | 10.70 | |

${\mathrm{PCS}}_{3}$ | 279 | 20.04 |

## a

Vc is the volume of the cell, Vm is the volume of the mitochondria, and Rmc is the volume ratio of the mitochondria.

To eliminate the influence of cell orientation, 26 cell orientations, which could be treated as 26 virtual cells, were employed in the three virtual cell models described above by rotating the original one by specific Euler angles, which were uniformly distributed over the $4\pi $ solid angle range. Six sets of virtual cells with different mitochondria volume ratios were used to execute ADDA simulation. The actual RI of mitochondria could not be precisely measured. Several RI models have been tested for simulated images and these simulated images were compared with experimental images by GLCM parameters. Then, two of these RI models ${N}_{\mathrm{a}}$ and ${N}_{\mathrm{b}}$ that yielded simulated images similar to experimental images were used in the simulation process to study the influence of mitochondria RI for diffraction images. Two sets of diffraction images were simulated with the same cell structure and the two RI models ${N}_{\mathrm{a}}$ and ${N}_{\mathrm{b}}$. The RI mean values and standard deviations for cell organelles of RI models ${N}_{\mathrm{a}}$ and ${N}_{\mathrm{b}}$ used by ADDA are shown in Table 3.

## Table 3

RI models of cell models.

RI | nca | nnla | nnma | nnha | nma |
---|---|---|---|---|---|

${N}_{\mathrm{a}}$ | $1.3675\pm 0.0019$ | $1.4183\pm 0.0049$ | $1.4692\pm 0.0078$ | $1.5200\pm 0.0107$ | $1.4200\pm 0.0050$ |

${N}_{\mathrm{b}}$ | $1.5534\pm 0.0127$ |

## a

nc, nnl, nnm, nnh, and nm are the mean value and standard deviation of the RIs in regions Ωc, Ωnl, Ωnm, Ωnh, and Ωm, respectively.

For each cell model, each orientation and each RI, six p-SIs ${I}_{\mathrm{kl}}(y,z)$ were simulated with different polarizations. Figure 5 shows some simulated images with different cell structures, orientations, and RI models.

## 3.2.

### Classification Results for Simulated p-SIs Using CT and GLCM

For one p-SI pair with the same incident polarization and two scattered polarizations as described above, 82 subimages and four types of parameters were calculated for one subimage, so a total of 328 parameters were extracted to characterize one p-SI pair. Because too many parameters for just one p-SI pair consume a large amount of computational time, they were divided into four sets $C,E,F$, and $V$ of 82 parameters to be analyzed separately. Simulated p-SIs with the same RI of ${N}_{\mathrm{b}}$; different cell models—${\mathrm{PCS}}_{0}$ versus ${\mathrm{PCS}}_{3}$, ${\mathrm{PCS}}_{-3}$ versus ${\mathrm{PCS}}_{3}$ and ${\mathrm{PCS}}_{3}$ versus ${\mathrm{PC}3}_{3}$—and the same cell model ${\mathrm{PCS}}_{0}$; and different RIs—${N}_{\mathrm{a}}$ versus ${N}_{\mathrm{b}}$—illuminated by a 532 nm wavelength laser with polarizations $p,s$, or 45 deg were classified by an SVM algorithm^{24}^{,}^{25} with four different kernel functions: linear (Lin), polynomial (Pol), sigmoid (Sig), and radial basis functions (Rbf). The sample size for every simulated p-SI set was 26. Given the small sample size of simulated diffraction images, all samples were used as training sets and no test process was performed. The goal of this study was to test the ability of CT and GLCM for characterizing cell diffraction images. Thus, the SVM was used only to achieve classification accuracy.

The brief classification results are shown in Table 4. All four sets of classification could achieve very high classification accuracy $A$, greater than 98%, which was the ratio of the correctly classified p-SI number to the total p-SI number. This result shows that $E$ parameters could result in better performance with a larger $A$ for each set than the other parameters. The three highest classification accuracy levels indicated by larger $A$ values with a single $E$ parameter corresponding to the classification results in Table 4 with $E$ parameters are shown in Fig. 6.

## Table 4

The SVM classification accuracy of different cell models or RIs with CT parameters.

Cell models | RI | Pa | (A, Np, kernel)-pb | (A, Np, kernel)-sb | (A, Np, kernel)-45 degb |
---|---|---|---|---|---|

${\mathrm{PCS}}_{0}$ versus ${\mathrm{PCS}}_{3}$ | ${N}_{\mathrm{b}}$ | $C$ | 100%, 50, Lin | 96.15%, 23, Lin | 100%, 8, Lin |

$E$ | 98.08%, 39, Lin | 100%, 22, Lin | 98.08%, 16, Lin | ||

$F$ | 88.46%, 22, Sig | 86.54%, 28, Lin | 84.62%, 7, Lin | ||

$V$ | 94.23%, 3, Lin | 90.38%, 33, Sig | 90.38%, 8, Lin | ||

${\mathrm{PCS}}_{-3}$ versus ${\mathrm{PCS}}_{3}$ | ${N}_{\mathrm{b}}$ | $C$ | 98.08%, 7, Lin | 96.15%, 6, Sig | 98.08%, 3, Lin |

$E$ | 98.08%, 63, Lin | 96.15%, 13, Sig | 94.23%, 20, Lin | ||

$F$ | 88.46%, 44, Rbf | 88.46%, 38, Lin | 78.85%, 14, Lin | ||

$V$ | 96.15%, 60, Pol | 90.38%, 15, Rbf | 88.46%, 24, Pol | ||

${\mathrm{PCS}}_{0}$ | ${N}_{\mathrm{a}}$ versus ${N}_{\mathrm{b}}$ | $C$ | 88.46%, 11, Lin | 84.62%, 28, Lin | 90.38%, 17, Lin |

$E$ | 98.08%, 20, Lin | 94.23%, 77, Lin | 96.15%, 43, Lin | ||

$F$ | 76.92%, 32, Lin | 78.85%, 7, Lin | 76.92%, 24, Lin | ||

$V$ | 88.46%, 31, Lin | 82.69%, 14, Sig | 80.77%, 8, Sig | ||

${\mathrm{PCS}}_{3}$ versus ${\mathrm{PC}3}_{3}$ | ${N}_{\mathrm{b}}$ | $C$ | 100%, 3, Lin | 100%, 5, Lin | 100%, 4, Lin |

$E$ | 98.08%, 18, Lin | 96.15%, 7, Lin | 98.08%, 9, Rbf | ||

$F$ | 92.31%, 25, Lin | 88.46%, 48, Lin | 82.69%, 12, Lin | ||

$V$ | 100%, 14, Lin | 94.23%, 17, Lin | 96.15%, 9, Sig |

## a

P indicates the CT parameters used for SVM C,E,F, or V.

## b

The classification accuracy A of incident polarization p,s, and 45 deg with parameter numbers Np and the kernel function used in SVM.

## Table 5

The SVM classification accuracy of different cell models or RIs with GLCM parameters.

Cell model | RI | (A, Np, kernel)-pa | (A, Np, kernel)-sa | (A, Np, kernel)-45 dega |
---|---|---|---|---|

${\mathrm{PCS}}_{0}$ versus ${\mathrm{PCS}}_{3}$ | ${N}_{\mathrm{b}}$ | 98.08%, 2, Lin | 100%, 32, Rbf | 100%, 26, Lin |

${\mathrm{PCS}}_{-3}$ versus $PC{S}_{3}$ | ${N}_{\mathrm{b}}$ | 96.15%, 34, Pol | 96.15%, 19, Lin | 98.08%, 25, Pol |

${\mathrm{PCS}}_{0}$ | ${N}_{\mathrm{a}}$ versus ${N}_{\mathrm{b}}$ | 98.08%, 25, Lin | 94.23%, 2, Rbf | 96.15%, 38, Pol |

${\mathrm{PCS}}_{3}$ versus ${\mathrm{PC}3}_{3}$ | ${N}_{\mathrm{b}}$ | 98.08%, 30, Lin | 98.08%, 2, Pol | 96.15%, 3, Rbf |

## a

The classification accuracy A of incident polarization p,s, and 45 deg with parameter numbers Np and the kernel function used in SVM.

With a specific incident polarization $k$, a total of 40 GLCM features were applied to characterize one p-SI pair described above. The same four sets of p-SIs were classified by SVM with four of kernel functions. All four sets of classification results could also achieve very high classification accuracy $A$, greater than 98%, as shown in Table 5. The three highest classification accuracy levels indicated by larger $A$ values with a single GLCM parameter corresponding to the results in Table 5 are shown in Fig. 6.

## 3.3.

### Classification Results for Experimental p-SIs Using CT and GLCM

Three sets of experimental PCS and PC3 cell images were also analyzed and classified as simulated images. The experimental cell image sample size is shown in Table 6. As with the simulated images, all samples are used as training sets.

## Table 6

Experimental sample size of cell images used for SVM.

Cell model | Sample size | ||
---|---|---|---|

p | s | 45 deg | |

PCS | 602 | 407 | 390 |

PC3 | 731 | 733 | 794 |

Tables 7 and 8 show the classification results of experimental cell images with CT and GLCM features. Both CT and GLCM algorithms can also achieve high classification accuracy with experimental cell images. As with the simulated results, $E$ parameters resulted in better performance with a larger $A$ than the other parameters. The three highest classification accuracy levels indicated by larger $A$ values with a single $E$ parameter corresponding to the classification results in Table 7 with $E$ parameters and a single GLCM parameter corresponding to the results in Table 8 are shown in Fig. 7.

## Table 7

The SVM classification accuracy of experimental PCS and PC3 cells with CT parameters.

Cell type | Pa | (A, Np, kernel)-Pb | (A, Np, kernel)-sb | (A, Np, kernel)-45 degb |
---|---|---|---|---|

PCS versus PC3 | $C$ | 97.97%, 79, Lin | 96.23%, 56, Lin | 78.80%, 79, Lin |

$E$ | 98.50%, 78, Lin | 97.81%, 69, Pol | 86.15%, 82, Lin | |

$F$ | 97.45%, 79, Lin | 93.60%, 36, Lin | 80.15%, 61, Lin | |

$V$ | 97.75%, 75, Lin | 95.09%, 45, Lin | 78.12%, 54, Lin |

## a

P indicates the CT parameters used for SVM C,E,F, or V.

## b

The classification accuracy A of incident polarization p,s, and 45 deg with parameter numbers Np and the kernel function used in SVM.

## Table 8

The SVM classification accuracy of experimental PCS and PC3 cells with GLCM parameters.

Cell model | (A, Np, kernel)-pa | (A, Np, kernel)-sa | (A, Np, kernel)-45 dega |
---|---|---|---|

PCS versus PC3 | 100%, 40, Lin | 98.68%, 19, Lin | 88.68%, 29, Pol |

## a

## 4.

## Discussion

The actual variation of mitochondria volume change by erosion or dilation was obtained by replacing the mitochondria RIs of eroded regions with cytoplasm RIs or setting the dilated regions with mitochondria RIs instead of cytoplasm RIs. Thus, volume change essentially reflects local RI change. Visually, it is clear that a change in mitochondria volume or RI greatly influences the p-SIs. The diffraction images have larger spots with larger RIs or more mitochondria as shown in Fig. 5. Therefore, changes in cell RIs can be studied by analyzing corresponding p-SIs. In this report, two algorithms, CT and GLCM, were applied to analyze the simulated p-SIs, which have different mitochondria volumes or different mitochondria RIs, and experimental cell images. The results showed that both CT and GLCM algorithms could precisely classify the same simulated and experimental p-SI sets with high accuracy, demonstrating that both parameters extracted by these two methods can be used to characterize and analyze p-SIs.

CT yielded more parameters than GLCM for a single p-SI and GLCM achieved larger classification accuracy A with a single parameter as compared with the CT method, indicating that the parameters extracted by GLCM contain more information and are more efficient to describe both the simulated and experimental p-SI as shown in Figs. 6 and 7. CT has disintegrated the original p-SI into 41 subimages, which generate only very few features of the original p-SI in one subimage. Therefore, even though CT extracts many subimages and parameters, this process did not significantly improve the performance of p-SI analysis. A previous study also showed that the CT subimages carry more information at coarse scales $\gamma \ge 3$ of p-SI patterns than the fine scales $\gamma =1$ or 2, and the former perform better than the latter in SVM classification. Meanwhile, a long period of time is needed for the calculation of these subimages and parameters.^{14} CT consumed about 10 times computational cost than GLCM for both feature calculation and SVM classification. Therefore, the GLCM algorithm should be more suitable and efficient for analysis of cell p-SIs. GLCM coupled with cell-scattered images establishes a new rapid method that will benefit cell basic research including cell assay and cell structure studies.

Although diffraction images with different mitochondria RIs or volumes can be classified by the SVM algorithm, it is only suitable for binary situations. By using SVM multiple times and establishing a database with various types of cells, diffraction images provide opportunities for the development of fast and label-free cell assay methods in the future.

Since cell diffraction images mainly contain relatively low frequency information, and many subimages and parameters obtained by CT reflect relative high frequency information of the analyzed images, CT should be more appropriate for analyzing images with rich frequency information.

## 5.

## Summary

This paper used virtual cell models reconstructed and interpolated from confocal image stacks stained by two fluorescence dyes to simulate polarization-scattered images with different mitochondria volumes modified by erosion and dilation or different mitochondria RIs. These simulated images were then analyzed with two algorithms, CT and GLCM. The results show that GLCM can carry more information than CT with the same number of parameters. Analysis of experimental images yields similar results as the simulated situation. GLCM is a more suitable and efficient method for cell-scattered image analysis than CT.

## Acknowledgments

The authors acknowledge financial support from the National Natural Science Foundation of China (Grant Nos. 81171342 and 81201148).

## References

## Biography

**Jun Zhang** received his BS degree in biomedical engineering from Tianjin University in 2010 and he is a PhD candidate of biomedical engineering at Tianjin University. His current research interests include scattered image analysis, cell morphology, and image processing.

**Gang Wang** received his BS degree in biomedical engineering from Tianjin University in 2013 and he is a graduate student of biomedical engineering at Tianjin University. His current research interests include image processing and analysis.