Automated classification of cell morphology by coherence-controlled holographic microscopy

In the last few years, classification of cells by machine learning has become frequently used in biology. However, most of the approaches are based on morphometric (MO) features, which are not quantitative in terms of cell mass. This may result in poor classification accuracy. Here, we study the potential contribution of coherence-controlled holographic microscopy enabling quantitative phase imaging for the classification of cell morphologies. We compare our approach with the commonly used method based on MO features. We tested both classification approaches in an experiment with nutritionally deprived cancer tissue cells, while employing several supervised machine learning algorithms. Most of the classifiers provided higher performance when quantitative phase features were employed. Based on the results, it can be concluded that the quantitative phase features played an important role in improving the performance of the classification. The methodology could be valuable help in refining the monitoring of live cells in an automated fashion. We believe that coherencecontrolled holographic microscopy, as a tool for quantitative phase imaging, offers all preconditions for the accurate automated analysis of live cell behavior while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase sensitivity. © The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI. [DOI: 10 .1117/1.JBO.22.8.086008]


Introduction
In many fields of biology, such as cancer research, 1 drug discovery, 2,3 cell death, 4 phenotypic screening, 5 study of pathological processes, 6,7 or interactions of cells with biomaterials, 8 the microscopy study of cell morphology belongs to essential research methods. Manual observation and evaluation of cell morphology in microscopy images require a trained biologist who performs inspection on every image. Nowadays, the increasing prevalence of automated image acquisition systems is enabling microscopy experiments that generate large image datasets. As such, manual image analysis becomes rather time-consuming and requires considerable effort and concentration of the investigator. Moreover, the analysis provided by one person has a tendency to be biased by subjective observation. The analysis results, therefore, largely depend on personal skills, decisions, and preferences. Consequently, these aspects impose significant constraints on the speed and reliability of cell morphology evaluation from microscopy images.
One of the approaches to address these limitations is supervised machine learning. 9 Finally, the algorithms of supervised machine learning are increasingly being applied to classification of microscopy data. Such a solution provides an objective unbiased method of scoring the content of microscopic images in contrast to subjective manual interpretation, thus potentially being more sensitive, consistent, and accurate.
When applying the supervised machine learning to the classification of cell morphologies, a computer is trained based on example images of cells belonging to predefined morphological classes. Once the cells are segmented, the common step is to represent each of them by a set of features, for the purpose of dimensionality reduction. The features are summarized into a feature vector, which serves as an input to the classifier. After the classifier is trained on the user-labeled training examples, it is able to distinguish between a defined set of cell classes.
In the last few years, a lot of work has been done in automated image analysis by machine learning. A machine learning approach for classification of erythrocytes in anemia based on morphological changes was presented in Ref. 6. Automated classification of myeloma cells in microscopic images was proposed in Ref. 10. In both studies, the input images for classification were gained by bright-field imaging of stained cells. The staining ensures the accurate segmentation of cells, yet there is a demand for sample preparation and the possibility that the stain would influence the natural cell behavior.
Another study 11 presents application of machine learning techniques to analysis of cell morphology in phase-contrast microscopy images. However, the images gained by phasecontrast microscopy demonstrate halo artifacts, which makes the boundaries of the cells appear brighter and makes the segmentation challenging and inaccurate. This may result in poor accuracy of the machine learning classifier.
Several publications have focused on cell classification of images gained by fluorescent microscopy. Automated scoring of diverse cell morphologies by means of machine learning was described in Ref. 12. Several automated image analysis methods for high-content screening of fluorescent images were summarized in Ref. 13. However, the drawback of these techniques is the necessity of sample preparation by fluorescent staining of the cells before imaging. Moreover, the fluorescent stain is likely to influence the cell behavior as well as the cell morphology, which could possibly affect the experiment and classification.
In the mentioned approaches, the features extracted from the images mostly represent the cellular shape or the intensity values depending on the stain concentration, but they are not quantitative in terms of cell mass.
In recent years, digital holographic microscopy (DHM) has proven to be a very versatile noninvasive tool for the observation of live cells, [14][15][16][17] while overcoming the limitations of the previously mentioned approaches. DHM provides quantitative phase images (QPIs) with high intrinsic contrast without labeling and since the images contain quantitative information about cell mass, it may potentially improve the performance of the classification.
Several publications studied cell behavior by monitoring cell features extracted from the QPI. Cell life cycle characterization by monitoring of morphometric (MO) and quantitative phase features was proposed in Ref. 18. Assessment of wound healing by monitoring the cellular volume, dry mass, and refractive index was presented in Ref. 19. The study of cancer cell growth and drug response by monitoring cell dry mass is described in Ref. 20. In the mentioned publications, the authors extract quantitative phase features and monitor their changes, but do not apply machine learning algorithms for the automated assessment of cell behavior.
Only limited work has been published toward the application of machine learning classification algorithms to QPI. Morphology-based classification of red blood cells using DHM was presented in Ref. 7. Automated detection and classification of living organisms in drinking water resources using DHM was performed in Ref. 21. The automated diagnosis of breast and prostate cancer from tissue biopsies was described in Refs. 22 and 23, respectively. However, to our present knowledge, none of the publications studied the potential of QPI for the classification of morphology of live adherent eukaryotic cells.
In this work, we demonstrate the methodology for classification of cell morphologies based on QPI in the experiment with deprived cancer tissue cells. In the experiment, the cells are exposed to phosphate-buffered saline (PBS). PBS deprives cells of nutrients and causes morphological changes. The deprived cells exhibit different types of morphologies, which represent the categories for classification. The cells were imaged with a coherence-controlled holographic microscope (CCHM). 24,25 The imaging in CCHM is based on the interference of the object and the reference light beams, which enables it to detect the phase delay induced by the specimen (SP). It has been demonstrated in several publications that the measured phase in the image corresponds to the dry mass density distribution within the cell. 26,27 Since the dry mass density distribution within the viable and deprived cells differs markedly, the features extracted from the QPIs play an important role in the classification. The goal of the experiment was to evaluate the contribution of QPI for the classification of tissue culture cell morphologies and compare the results with a commonly used approach based on MO features.

Methods
Cell culture and deprivation of cancer tissue cells are described in section "cell culture techniques." The imaging process is explained in section "image acquisition." Finally, the image preprocessing and classification methods are presented in section "classification."

Cell Culture Techniques
In the experiment, LW13K2 cells (spontaneously transformed rat embryonic fibroblasts) were exposed to conditions that induce nutritional deprivation. The cells were first grown attached to a solid surface and maintained in Eagle's minimal essential medium (Sigma-Aldrich, Czech Republic) supplemented with 10% fetal bovine serum (Sigma-Aldrich, Czech Republic) and gentamicin (Sigma-Aldrich, Czech Republic) in an incubator at 37°C and humid 3.5% CO 2 atmosphere. The cells were harvested by trypsinization and transferred into five sterilized observation chambers μ-Slide I (Ibidi GmbH, Germany). The seeding densities were 20 cells∕mm 2 in order to achieve sparse coverage for the purposes of segmentation of individual cells. The observation chambers were kept in the incubator under the same conditions.
The culture medium was replaced by PBS after two days. For the experiment, standard PBS (NaCl 8 g∕l, KCl 0.2 g∕l, KH 2 PO 4 0.24 g∕l, Na 2 HPO 4 1.44 g∕l, pH 7.4) was used. PBS deprives cells of nutrients and causes changes in cell morphology. The cells were imaged immediately after PBS application. The same procedure was repeated for all five observation chambers.

Image Acquisition
CCHM, 24,25,28,29 now also available as Q-Phase (by TESCAN Orsay Holding, a.s., Brno, Czech Republic), was employed for the quantitative phase imaging of cells. The optical setup of the microscope is based on Mach-Zehnder type interferometer modified for incoherent off-axis holographic microscopy ( Fig. 1). The illumination is formed by a low coherence source (halogen lamp), interference filters, collector lens, and the beamsplitter (BS), which splits the beam into two separated optical arms-reference and object arms. Matching condensers, objectives, and tube lenses (TL) are contained in both arms. The two arms are nearly identical except for the diffraction grating (DG) placed in the reference arm used for the spatial separation of the light of different wavelengths. Beams from the reference (only the + first-order beam is transmitted) and object arm interfere in the output plane (OP), where the interference pattern (hologram) is formed. The hologram recorded by the CCD camera is further reconstructed using a Q-Phase SW (TESCAN Orsay Holding, a. s., Brno, Czech Republic). The numerical reconstruction of the hologram is based on fast Fourier transform methods 30 and phase unwrapping. 31 Since the reconstructed QPIs are affected by the phase aberrations, they are compensated by the algorithm described in Ref. 32. The phase in the reconstructed quantitative image is proportional to the optical path difference of the two arms according to E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 3 2 6 ; 1 2 7 where λ is the illumination wavelength, d is the thickness of the cell, and n c is the mean axially integrated refractive index of the cell immersed in the culture medium of refractive index n m . 33 It has been shown that the measured phase in the image corresponds to the dry mass density distribution within the cell. 26,27 In contrast to existing DHM techniques, 4,7,18,19 the use of incoherent illumination enables high-quality quantitative phase imaging with strong suppression of coherent noise and parasitic interferences while providing high temporal stability and spatial uniformity of the phase measurement. 25 Using the approach described in Ref. 34, the temporal and spatial phase sensitivities were determined as 0.0081 and 0.0094 rad, respectively. The lateral resolution is comparable with the lateral resolution of conventional wide-field optical microscopes, thus twice better than in typical DHM techniques with a coherent source of illumination. Moreover, the low illumination power of the incoherent source (0.2 μW∕cm 2 ) is not likely to influence the physiological functions of the imaged cells, which is very convenient for live cell imaging.
During the experiment, the samples were illuminated with a halogen lamp through the interference filter (λ ¼ 650, 10 nm FWHM). Microscope objectives (Nikon Plan Fluor 20 × ∕0.5) were utilized for the imaging. At least 100 images were acquired from each sample in pursuit of collecting enough data for the classification. Images were obtained by scanning in a random manner across each sample. All images were gathered in the database, which was used for the classification.

Classification
Classification, as a category of supervised machine learning, aims to build a model that makes predictions based on a self-learning procedure on known labeled data. In the case of cell classification, the algorithm identifies patterns in the input images and trains a model based on labels which were assigned to the cells in the images by an expert biologist. Such a trained model is able to classify cells in new so-far unseen images. The essential precondition for the successful classification is a sufficiently large database of labeled cell images on which the classifier is trained. The overview of the classification process based on QPI is shown in Fig. 2. The whole procedure was performed in MATLAB (MathWorks, Inc.).

Image preprocessing and feature extraction
Before the classification itself, the cells in the images were first segmented from the background by a marker-controlled watershed segmentation approach 35 and identified as separate regions of interest (ROIs). Each ROI was represented by a set of cell features-a procedure referred to as feature extraction. Two types of features were extracted: MO and QPI features.
where n is the number of pixels in the image and A is the pixel area. ii. Perimeter of the footprint area (PFA) is defined as the sum of pixels in the inner boundary of the region. When multiplied by the pixel size, the resulting value of PFA is in units of length. iii. Convex area (CA) is calculated as the sum of pixels of the convex cell region, multiplied by the pixel area. The boundaries of the convex cell region are defined by the smallest convex polygon that contains the region of the cell. iv. Perimeter of the convex area is calculated as a sum of pixels in the inner boundary of the region and multiplied by the pixel size. v. Solidity (S) specifies the proportion of the pixels belonging to the cell FA to those which are contained in the CA.  vi. Roundness (R) determines the deviation of the cell region from the circular shape. Roundness depends on the FA and its perimeter according to E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 3 ; 6 3 ; 7 vii. Indentation (I) evaluates the level of cell boundary indentation. Indentation can be calculated as the ratio of perimeter of the CA and perimeter of the FA as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 4 ; 6 3 ; 6 3 2 I ¼ viii. Eccentricity (EC) specifies the eccentricity of the ellipse that has the same second-moments as the cell region. The eccentricity is calculated as the ratio of the major axis and minor axis lengths.
b. QPI features. The features are extracted from the phase values of the cell in QPI, and therefore contain quantitative information about the dry mass density distribution within the cell.
i. Total phase of the cell (φ total ) is calculated as the sum of phase values (in radians) in the pixels belonging to the region of the cell. φ total is calculated as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 5 ; 6 3 ; 4 4 3 φ total ¼ where k is the number of pixels of the cell region and φ is the phase value in the i'th pixel belonging to the region of the cell. ii. Average phase (μ φ ) specifies the average phase value in the cell region. The average phase value is defined as the total phase over the FA of the cell. iii. Median describes the median of the phase values belonging to the cell region. iv. Variance (Var φ ) and standard deviation of the phase (σ φ ) determine the variation of the phase values, and the dry mass distribution within the cell. The variance and standard deviation of the phase are calculated as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 6 ; 6 3 ; 2 3 5 Var φ ¼ and E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 7 ; 6 3 ; 1 8 2 σ φ ¼ ffiffiffiffiffiffiffiffiffiffi Var φ p : v. Skewness (Skew φ ) is calculated from the histogram of the phase values and describes its shape. Skewness measures the symmetry of distribution of the phase values from the mean value. The parameter is determined by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 8 ; 6 3 ; 9 7Skew The values of skewness close to zero report the symmetrical distribution of phase values, which is characteristic for spread and well-adhered cells. vi. Kurtosis (Kurt φ ) is also derived from the histogram of the phase values and quantifies the extent to which the shape of the data distribution matches the normal distribution. Kurtosis is described as follows: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 9 ; 3 2 6 ; 6 6 0 All extracted features are summarized into feature vectors, each feature vector representing one cell. Each cell feature vector is then assigned one of the class labels determined by the expert biologist. The class labels are later used for training of the classification algorithm.
In the next step, all extracted features undergo normalization in order to scale the feature values to a fixed range from 0 to 1. The normalization speeds up the training of the classifier, prevents the classifier from getting stuck in local optima, and is an essential step in the classification for some algorithms. The normalization of the features is done via E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 0 ; 3 2 6 ; 4 8 5 where X norm is the normalized value of the feature, X is the original value of the feature, X min is the minimum value of the feature, and X max is the maximum value of the feature.

Classification algorithms
It is well known that the performance of the classification is highly dependent on the selection of the classification algorithm, 36 thus we employed several machine learning algorithms to correctly compare the performance of the classification with two different sets of features. Moreover, each algorithm can be adjusted by setting its parameters, therefore, each algorithm was tested by several possible variations.
To verify the assumption that QPI features improve the classification performance over the commonly used MO features, two sets of feature vectors have been used for the classification. In the first case, the feature vector consisted of MO features only. In the latter case, QPI features were also added. A short description of the used algorithms is presented below: Decision trees. In this method, a tree structure is built with root node and leaf nodes. The leaf nodes represent the classes of the features. Every interior node in the tree consists of a decision criterion. The features are partitioned based on homogeneity until a leaf node is assigned to a particular class. Three types of decision tree classifiers were used: complex, medium, and simple tree, with a defined maximum number of splits: 100, 20, and 4, respectively.
Discriminant analysis. Discriminant analysis assumes that different classes generate data based on different Gaussian distributions. To train a classifier, the fitting function estimates the parameters of a Gaussian distribution for each class. We used both linear and quadratic discriminant analyses. Support vector machines. Support vector machine (SVM) classifier classifies data by finding the best hyperplane that separates objects with different class memberships. The optimal hyperplane gives the largest margin between the two classes. Margin is the maximal width of the slab parallel to the hyperplane that has no data points. The closest data points to the margin are called support vectors. The performance of SVM largely depends on the kernel function mapping the training examples into a higher dimensional feature space. Here, we used linear, quadratic, cubic, and Gaussian kernels.
K-nearest neighbor classifier. This method uses k-nearest neighbor (KNN) samples to determine the class of an object. The k samples are chosen based on a similarity measure. A commonly used similarity measure is the distance function. A class is assigned to an object based on the majority vote of its k neighbors. Here, we used fine KNN (k ¼ 1, Euclidean distance), medium KNN (k ¼ 10, Euclidean distance), cosine KNN (k ¼ 10, cosine distance), cubic KNN (k ¼ 10, cubic distance), and weighted KNN (k ¼ 10, weighted by the inverse square of the Euclidean distance).
Ensemble classifiers. Ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Here, we used the following ensemble classifiers: bagged trees, boosted trees, subspace discriminant, and subspace KNN.
Artificial neural network. The artificial neural network (ANN) approach is inspired by the human learning process. ANN consists of several layers of neurons which are joined by weighted connections. The accuracy and performance of ANN highly depend on the network structure. The training of the ANN is computationally demanding, but the classification accuracy is usually high. Here, we used a feed-forward backpropagation neural network with 1 hidden layer containing 10 hidden neurons.

Performance evaluation
K-fold cross validation was used to evaluate the performance of the classification algorithms. The data were partitioned into k randomly chosen subsets of roughly equal size. One subset (testing set) was used to validate the classifier, which had been trained on the remaining subsets (training set). This process was repeated k times, such that each subset was used for the validation (we used k ¼ 5). Since cross validation does not use all of the data for training, it is a commonly used method to avoid overfitting.
To compare the performance of the classification with two different feature vectors, several performance parameters were calculated from the confusion matrix for each classification algorithm: accuracy, precision, recall, and F-score.
The accuracy expresses the ratio of correctly classified examples by the classifier and is calculated as follows: F-score can be interpreted as a harmonic mean of precision and recall, calculated as follows: For comparison of the two classification approaches, the performance results were compared by statistical hypothesis testing. The Wilcoxon signed rank test 37 was used as a paired nonparametric statistical hypothesis test which can reveal the existence of significant differences between two distributions. The null hypothesis is that the median difference between pairs of observations is zero. P-value 0.05 was considered to be statistically significant.

Results and Discussion
The proposed methodology was applied to the QPIs gained by monitoring the nutritionally deprived cells by CCHM. Morphological changes of cells appeared on the order of minutes after the application of PBS. Most of the cells became slightly deprived after 5 min. The majority of the cells were seriously deprived after 20 min. The images of cells were divided by the expert biologist into three categories based on their morphology: viable, semideprived, and deprived cells. Viable cells did not exhibit any changes in morphology; cells in the semideprived category were influenced by PBS and started to shrink while their boundaries became indented. The deprived cells, which were influenced the most, adopted a rounded morphology. All images of cells were gathered in the database consisting of 1400 cells. According to the labels assigned by the expert biologist, the database contained the following distribution of class labels based on their morphology: viable (540), semideprived (470), and deprived cells (390). The cells with uncertain class membership were excluded from the database. Three distinct types of cell morphologies are shown in Fig. 3.
The segmentation of cells from the background by the marker-controlled watershed approach has proven to be a crucial step, which affects the extraction of both MO and QPI features and hence, the performance of the classification. Therefore, we did not consider highly overlapping cells where the segmentation was not clear. The cells located on the border of the image were excluded as well, as shown in Fig. 4. For the purpose of more accurate cell segmentation, we used sparse seeding densities to obtain subconfluently grown cells. The segmentation results in the case of more confluent cell layers were satisfying as well, as shown in Fig. 4. However, in the case of confluent cell layers, there is a higher chance of less accurate segmentation results, which may lead to poorer performance of the classification.
From the images of cells, two sets of features were extracted: MO and QPI features. Two feature vectors were composed, while the first one included only MO feature set and the second one both sets. The two feature vectors formed input for the classification algorithms. Performance measures (accuracy, precision, recall, and F-score) of each classification algorithm were determined as a mean of the values obtained by fivefold cross validation and can be found in Table 1. The overall performance of the classification for each of the two feature vectors was determined as the mean of performance parameter values reached by all classification algorithms.
The overall accuracy of the classification using only MO features was 0.888 AE 0.015, which is comparable to values mentioned in previous studies on cell morphology classification. 6,11,38 The overall precision, recall, and F-score were 0.859 AE 0.022, 0.815 AE 0.058, and 0.836 AE 0.039, respectively. The classification using both sets of features led to higher performance of the classifier, with the overall accuracy of the classification reaching 0.956 AE 0.011. In this case, the overall precision, recall, and F-score were 0.942 AE 0.011, 0.937 AE 0.012, and 0.939 AE 0.011, respectively.
The performance results of both approaches are shown in the form of box-whiskers plots in Fig. 5. The results were compared by Wilcoxon signed rank test, which revealed significant differences between the two classification approaches (p < 0.001) in terms of all performance parameters (accuracy, precision, recall, and F-score).
In order to study the impact of the cell sample preparation and other experimental conditions on classification performance, we tested the approach on data gained from another independent experiment. The experiment was identically designed, however, the cell preparation was performed by a different person and the classification algorithms were trained on the images of cells from the first experiment. The performance of the classification is summarized in Table 2 together with the results from the first experiment. The performance of the classification on data obtained in two independent experiments was compared by Wilcoxon rank sum test, 37 which revealed no significant differences between the classification performances in the two experiments. According to the results, we assume that cell sample preparation and other experimental conditions do not significantly influence the performance of the classification.
Based on the overall results, it can be concluded that the quantitative phase information gained by CCHM increases the performance of the classification of cell morphologies in contrast to commonly used methods based on MO features. The study shows that CCHM offers preconditions for an accurate classification of cell morphologies, while the main asset of the technique lies in the accurate cell segmentation and the quantitative nature of the images it provides.
Even though we have presented the application of quantitative phase imaging together with the supervised classification only for distinguishing different morphologies of deprived cells, the approach might also contribute to higher performances when it comes to different classification tasks. The methodology can develop into a tool for the monitoring of cell live cycles, cell death, reaction of cells to treatment, interaction of cells with material (biocompatibility testing), detection of different experimental conditions, or distinguishing different cell lines. In general, the approach could improve the monitoring of live cell behavior in an automated fashion.

Conclusion
In this paper, we have applied several classification algorithms on the QPIs of deprived cells obtained by CCHM, while three types of cell morphologies were being distinguished. Our goal was to compare the performance of the commonly used classification approach based on MO features and our method also employing quantitative phase features. After the preprocessing steps, two sets of features were extracted from the QPIs and their relevance for the classification was tested. A comparative analysis between these two approaches was performed while using several classification algorithms. The results showed (in terms of overall accuracy, precision, recall, and F-score) that the classification also employing quantitative phase information outperforms the commonly used method-based solely on the MO features. Based on the results, we assume that CCHM, as a tool for quantitative phase imaging, could be a valuable microscopy technique for automated analysis of live cell behavior, while enabling noninvasive label-free imaging with sufficient contrast and high-spatiotemporal phase measurement sensitivity. However, the robustness of the approach can be further improved by enlarging the image database and introducing more features. Therefore, our future work will be aimed at database and feature extension and application of the proposed approach to different classification tasks.

Disclosures
R. Chmelik reports patents for the CCHM optical system licensed to commercial entities (TESCAN Orsay Holding a.s). This research was completed without their participation, knowledge, or financial support, and data were acquired and processed by coauthors unaffiliated with any commercial entity.