Cervical cancer is the second most common cancer among women worldwide, and it is generally more common in developing countries.1 Each year, new cases are reported and approximately woman died due to this kind of cancer.2
The main causative agent is the sexually transmitted human papillomavirus (HPV).3 Cervical malignancy is usually preceded by cervical intraepithelial neoplasia (CIN) of grades I, II, and III before becoming invasive.3 It is widely known that a large number of deaths from this disease (estimated as ) could be avoided with early diagnosis.3 This underscores the need for effective screening methods for cervical neoplasia.
The available screening and diagnostic methods (mainly Pap smear and colposcopy) have shortcomings that make fast, effective, and efficient surveillance of cancer in the general population impossible, mainly in developing countries. These limitations arise because all of these methods are based on subjective interpretations of morphological abnormalities. The cytological analysis of the cervix, which is the elementary principle of a Pap smear and colposcopy, for example, has a high false-negative rate of .4 This is inherent to the subjective histological grading of this pathology. It is common for atypical cells to be associated with inflammatory infiltrates and for the slide to be misinterpreted by the pathologist as simple cervical inflammation (cervicitis) instead of a malignancy or neoplasia, or vice versa. This high false-negative rate limits the accuracy of screening at the very initial stages of cervical cancer, which prevents adequate treatment at this stage. As a consequence, a large number of patients need a surgical procedure to remove a cancer that could otherwise have been detected and treated at the beginning.
Optical biopsy techniques, such as fluorescence spectroscopy,5 polarized light scattering spectroscopy,6 optical coherence tomography,7 confocal reflectance microscopy,8 and Raman spectroscopy9, 10, 11, 12, 13 have been extensively used to characterize cervical cancers. In particular, Raman spectroscopy has been able to probe several biochemical alterations caused by the development of the pathology, such as changes in DNA, glycogen, phospholipids, or noncollagenous proteins. 9, 10, 11, 12, 13, 14, 15
All the above-cited studies claimed that the optical biopsy methods were able to discriminate normal from malignant tissues. However, to the best of our knowledge, no study in the literature has addressed the influence of cervicitis on the optical diagnosis of cervical cancer. The objective of the present work is to fill this gap by presenting a systematic study of normal, CIN I, and cervicitis cervical tissue by Fourier transform (FT)-Raman spectroscopy.
This research was carried out following the ethical principles established by the Brazilian Health Ministry and was approved by the local research ethics committee (certificate 168/2004CEP–UniVap). Patients were informed about the subject of the research and gave their permission for the collection of tissue samples.
Cervix biopsies were collected from 63 patients. Thirty-three biopsies were collected from malignant sites (as indicated by Pap smear and colposcopy) and 30 from normal ones. Immediately after the procedure, the samples were identified, snap frozen and stored in liquid nitrogen in cryogenic vials prior to FT-Raman spectra recording.
An FT-Raman spectrometer (Bruker RFS 100/S; Bruker Optics GmbH, Ettlingen, Germany) was used with an Nd:YAG laser at as the excitation light source. Laser power at the sample was maintained at , while the resolution was set to . The spectra were recorded using 300 scans (nearly of acquisition time) with the laser excitation directed at the epithelial side of the tissues. For FT-Raman data collection, the samples were brought to room temperature and kept moist in 0.9% physiological solution to preserve their structural characteristics, then placed in a windowless aluminum holder for Raman spectra collection. The chemical species in the physiological solution ( , , , , water) presented no measurable Raman signal, and their presence did not affect the tissue spectral signal. The typical size of samples was .
Typically, three to five Raman spectra were recorded on each sample, resulting in 230 spectra. Soon after the Raman measurements, the samples were fixed in 10% formaldehyde solution for further histopathological analysis. Each measured sample was histopathologically assessed by two pathologists. Details of all samples used in the study are shown in Table 1.
|No. of samples||No. of spectra||Diagnosis|
All spectra were baseline corrected and vector normalized. The typical baseline was a straight line in all cases. The algorithm calculated this straight line from the first and last experimental data points at 800 and , respectively. The spectral differences were analyzed using multivariate principal components analysis (PCA). PCA was performed over the range by computing the covariance matrix. The underlying data structure was summarized by clustering (i) the whole spectral data and (ii) the PC2, PC3, and PC4 scores, both at the 95% level of similarity using the correlation distance measurement. The results were presented as a dendrogram. The set of PCs that provided the best classification after visual inspection of the scatter plots was fed into the logistic regression (LR) algorithm16 to determine the parameter equation that best differentiated the pathologic states. LR provides a method for modeling a binary (0 or 1) response variable and is based on the linear dependence between the logit function of the probability of response 1 and the parameters of diagnosis. In our case, these parameters are the PCs. Thus, the LR model equation is $\ln[p/(1-p)] = \alpha + \sum_i \beta_i\,\mathrm{PC}_i$, where $p$ is the probability of obtaining response 1 and $\alpha$ and $\beta_i$ are the model parameters.
For the LR modeling, the data set was randomly divided into two independent portions (test and validation). The spectra to be put into each set were chosen with the help of a random-number-generator algorithm. The model's predictive ability was estimated by measuring the association between the response variable and the predicted probabilities. The fit quality was tested by the deviance goodness-of-fit Pearson $\chi^2$ and Somers' D parameters. All these steps were performed with the help of the statistical software Minitab, version 14.20 (Minitab Inc., State College, Pennsylvania, USA) and Mathematica 5.2 software (Wolfram Research, Champaign, IL, USA).
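The baseline correction and vector normalization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the toy spectrum is invented, and "vector normalized" is assumed here to mean scaling to unit Euclidean norm (some spectroscopy software also subtracts the mean intensity first).

```python
import math

def subtract_linear_baseline(wavenumbers, intensities):
    """Subtract the straight line joining the first and last data points."""
    x0, x1 = wavenumbers[0], wavenumbers[-1]
    y0, y1 = intensities[0], intensities[-1]
    slope = (y1 - y0) / (x1 - x0)
    return [y - (y0 + slope * (x - x0))
            for x, y in zip(wavenumbers, intensities)]

def vector_normalize(intensities):
    """Scale a spectrum to unit Euclidean (L2) norm (assumed definition)."""
    norm = math.sqrt(sum(y * y for y in intensities))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [y / norm for y in intensities]

# Toy spectrum with invented values (not data from the study).
x = [800.0, 900.0, 1000.0, 1100.0, 1200.0]
y = [1.0, 2.5, 2.0, 4.5, 3.0]

corrected = subtract_linear_baseline(x, y)   # end points become zero
normalized = vector_normalize(corrected)     # sum of squares becomes 1
```

Because the baseline is anchored at the two end points, both ends of every corrected spectrum are zero by construction, so any residual curvature in the real background is left in the data.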
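A random division of this kind can be sketched as below. The source does not state the proportion assigned to each set or the generator used, so the 50/50 split, the seed, and the function name are assumptions made only for illustration.

```python
import random

def split_spectra(spectrum_ids, test_fraction=0.5, seed=0):
    """Randomly split spectrum identifiers into two disjoint sets.

    Following the text's terminology, the first set ("test") is used to
    fit the model and the second ("validation") to check it.
    test_fraction and seed are illustrative assumptions.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    shuffled = list(spectrum_ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[:n_test], shuffled[n_test:]

# 230 spectra were recorded in total; they are identified here by index.
test_set, validation_set = split_spectra(range(230))
```

Fixing the seed makes the partition reproducible, which matters when the fitted parameters are later compared across runs.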
Results and Discussion
Figure 1 shows some typical histological stains representing normal [Fig. 1a], cervicitis [Fig. 1b], and CIN I [Fig. 1c] cervical tissues. The normal cervical tissues presented only discrete differences compatible with the expected biological variation. The cervicitis tissues were characterized as presenting only inflammatory infiltrates without atypical cells while the CIN I presented both inflammatory characteristics and atypical cells.
The box plot of the spectral data is displayed in Fig. 2 . The main spectral changes observed (when considering the interquartile region) were at (CCH deformation aromatic) (C-C stretching), (CN stretch, NH bending of Amide III), ( bending), and ( stretching) vibrational bands.
The 857 and bands were slightly more prominent on normal samples. These peaks are related to glycogen, as shown by Lyng et al.10 As discussed in several works (e.g., Refs. 2, 10, 14, 15), the decrease in these glycogen bands on malignant samples is expected and has been observed. Glycogen is known to be linked to cellular maturation and disappears with loss of differentiation during neoplasia.10
The Amide III band appeared slightly more prominent ( more intense) and better defined on normal samples. This is normally found with macro-Raman (laser spot ) measurement setups.9, 11, 12, 13 The micro-Raman measurements of Lyng et al.10 showed this spectral band to be more intense in malignant cervical epithelial cells than in normal ones. This opposite result could be explained by the Raman signal of the connective tissue, which is collagen-rich. The micro-Raman setup is able to spatially discriminate connective, basal, and epithelial tissue cells. In fact, the spectra recorded from connective tissue of the cervix (Fig. 4(a) of Ref. 10) presented a more intense Amide III band than the basal or epithelial cells. The macro measurement does not enable this kind of discrimination. However, because this signal is present in both normal and altered tissues, it will not influence the spectral discrimination.
The other two bands at 1370 and presented a slight intensity increase in the altered tissue when compared to the normal one. These bands are related to nucleic acids, and similar intensity variation was also reported in the literature. 9, 10, 11, 12, 13, 14, 15
Figure 3a presents the clustering of the spectral data using the whole spectra. The data are grouped into three clusters. The composition of each cluster is shown in Table 2. Cluster 1 is composed of samples diagnosed as normal (50%) and cervicitis (50%). Cluster 2 contains almost all CIN I samples (20 samples from a total of 24, or 83% of them), but it also includes normal and cervicitis ones. Cluster 3 is predominantly composed of cervicitis samples (72%). Thus, the clustering indicated spectral mixing among normal, cervicitis, and CIN I tissues. To better assess the similarity between samples and to try to obtain a better discrimination, PCA was performed and the PC2 to PC4 principal components were clustered and shown in the dendrogram of Fig. 3b. Unfortunately, the clustering was essentially identical to the previous case [Fig. 3a].
Number of spectra per cluster.
|Diagnosis||Cluster 1||Cluster 2||Cluster 3||Total|
The scatter plot of PC2–PC4 is shown in Fig. 4. There was some discrimination between the normal and CIN I data. However, the cervicitis data points appeared to lie midway between the normal and CIN I groups.
To quantify these findings, a nominal LR model was built considering the three groups of samples. Cervicitis was compared with normal and CIN I with normal, keeping the normal group as the reference. After the model was fitted to a test set of spectra, it was applied to the validation set. The Pearson $\chi^2$ parameter for this model was found to be 0.633, indicating a poor goodness of fit. The Somers' D parameter was 0.79, implying a relatively low predictive ability of the model.
LR was also applied to a second situation considering two sets of data: normal and altered (cervicitis and CIN I). The best fit to the test set was found to be a combination of PC3 and PC4 (Fig. 5).
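For readers unfamiliar with Somers' D, the sketch below shows how it is commonly computed for a binary response: the number of concordant event/non-event pairs minus the number of discordant pairs, divided by the total number of such pairs. The function name and the toy data are hypothetical, and the binary form is shown only for simplicity; it is not claimed to reproduce the values reported here.

```python
def somers_d(responses, probabilities):
    """Somers' D for a binary (0/1) response and predicted probabilities.

    A pair (event, non-event) is concordant when the event has the higher
    predicted probability, discordant when it has the lower one, and tied
    otherwise. D = (concordant - discordant) / total pairs.
    """
    events = [p for r, p in zip(responses, probabilities) if r == 1]
    non_events = [p for r, p in zip(responses, probabilities) if r == 0]
    total_pairs = len(events) * len(non_events)
    if total_pairs == 0:
        raise ValueError("need at least one event and one non-event")
    concordant = discordant = 0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1
            elif p_event < p_non_event:
                discordant += 1
    return (concordant - discordant) / total_pairs

# Toy data with invented values (not data from the study).
toy_responses = [0, 0, 0, 1, 1, 1]
toy_probabilities = [0.1, 0.4, 0.6, 0.5, 0.8, 0.9]
toy_d = somers_d(toy_responses, toy_probabilities)  # 8 concordant, 1 discordant, 9 pairs
```

D ranges from -1 to 1, and values closer to 1 mean that the predicted probabilities rank events above non-events more consistently.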
After validation, the Pearson $\chi^2$ parameter for this model was found to be exactly 1, indicating a very good fit. The Somers' D parameter was 0.92, which means a high predictive ability of the model. The sensitivity was found to be 93% and the specificity 85%. In spite of the high quality of the diagnosis model, this indicated a misclassification of cervicitis samples, which were intrinsically identified as CIN I neoplasia. In some sense, this is similar to the situation with the conventional screening methods, where inflammatory infiltrates present in both cervicitis and CIN induce a wrong diagnosis. This implies that, at this stage, diagnosis by Raman spectroscopy will not provide a substantial improvement in the screening power for cervical diseases because inflammatory cells and some kind of cervicitis will be present in all kinds of malignancy. In the worst case, a nonmalignant cervicitis state (normal tissue with some inflammatory infiltrates) would be misclassified as CIN I, which is a very undesirable result.
The results presented in this work indicated that full discrimination between normal and neoplastic CIN I tissues of the cervix by Raman optical biopsy was seriously compromised by the presence of inflammatory infiltrates. In fact, the crude biochemical analysis obtained by direct spectral comparison among normal, cervicitis, and CIN I samples, the clustering procedure, and the LR diagnosis model results all indicated that the cervicitis samples were always misclassified as CIN I. This increases the false-positive rate of a Raman-based diagnosis. This is especially relevant because cervical inflammation is a very common (noncancerous) disease of the cervix. Thus, the results suggest that, for a safe and useful optical diagnosis of the cervix, it is necessary to find a cervicitis marker signal [e.g., by coupling an auxiliary technique (fluorescence, dichroism, etc.) to the Raman-based one].
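Sensitivity and specificity figures of this kind follow from the counts of correct and incorrect classifications in the validation set. The sketch below assumes the coding 1 = altered (cervicitis or CIN I) and 0 = normal. The function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding: 1 = altered tissue, 0 = normal tissue.
    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 1 and predicted == 0:
            fn += 1
        elif truth == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels with invented values (not data from the study).
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_predicted = [1, 1, 1, 0, 0, 0, 0, 1]
toy_sensitivity, toy_specificity = sensitivity_specificity(toy_truth, toy_predicted)
```

With cervicitis pooled into the altered class, a cervicitis spectrum classified as altered counts as a true positive, so these two figures alone cannot reveal whether cervicitis is being confused with CIN I.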
H.S.M. thanks the Brazilian agency (Process No. 301018/2006-5) for the financial support. A.A.M. thanks the FAPESP (Grant No. 2001/14384-8) and (Grant No. 302393/2003-0) Brazilian agencies for their financial support.