Micro-Raman spectroscopy study of cancerous and normal nasopharyngeal tissues

Abstract. The capability of micro-Raman spectroscopy to differentiate normal and malignant nasopharyngeal tissues was evaluated. Raman scattering signals were acquired from 22 normal and 52 malignant nasopharyngeal tissue samples. Distinctive differences between the Raman spectra of normal and malignant nasopharyngeal tissues were found, particularly at 853, 937, 1094, 1209, 1268, 1290 to 1340, 1579, and 1660 cm−1, bands that primarily contain signals related to proteins, DNA, and lipids. Compared to normal tissues, the band intensities at 853 and 937 cm−1 were significantly lower for cancerous tissues (p<0.05), while the band intensities at 1094, 1209, 1268, and 1579 cm−1 were significantly higher (p<0.05). The band intensities at 1290 to 1340 and 1660 cm−1 were also higher for cancerous tissues, but the differences were not statistically significant (p>0.05). Principal component analysis (PCA) and linear discriminant analysis (LDA) were employed to generate diagnostic algorithms for classifying the Raman spectra of the two nasopharyngeal tissue types. The PCA-LDA algorithms, together with leave-one-out cross-validation, yielded a diagnostic sensitivity of 92% and a specificity of 82%. This work demonstrates that Raman spectroscopy combined with PCA-LDA diagnostic algorithms has potential for improving the diagnosis of nasopharyngeal cancers.


Introduction
Nasopharyngeal malignancies remain one of the major causes of cancer-associated death and have high incidence rates in East Asia, particularly in South China. 1 Presently, routine white-light endoscopy and histological examination are the primary methods for clinical identification of nasopharyngeal carcinoma (NPC). Because of the anatomical position of NPC, this lesion is not easy to diagnose early. Only about 10% and 20% of the diagnosed lesions are stage I and stage II disease, respectively, and almost two thirds of the diagnosed NPCs are at an advanced stage. 2,3 Because the therapeutic outcome is highly related to the stage of the disease, early identification of malignant NPC is crucial for improving the survival rate of patients. 4 Recent developments in spectral techniques may significantly expand our ability to diagnose this tumor rapidly and accurately. Optical spectroscopic techniques such as fluorescence spectroscopy, [5][6][7] light scattering spectroscopy, and Raman spectroscopy 8 have been investigated for the evaluation of malignancies in tissues. Although fluorescence spectroscopy has shown promising diagnostic potential for in vivo detection of preneoplastic and early neoplastic lesions with high detection sensitivities, it still suffers from moderate diagnostic specificities owing to inter-observer dependence and its inability to reveal specific biomolecular information about the tissue. Among the optical approaches currently under investigation for in vivo endoscopic applications, Raman spectroscopy is a very promising technique.
Raman spectroscopy is a nondestructive, inelastic light scattering technique in which the scattered photon is shifted to another wavelength with respect to the incident excitation light, depending on the specific vibrational modes of molecules in tissue and cells. Thus, Raman spectroscopy reveals specific biochemical information and biomolecular structures of tissue, providing a unique opportunity to distinguish between different pathological tissue types at the molecular level. Over the last two decades, studies using Raman spectroscopy have been conducted on samples from a variety of organs, including brain, breast, colon, cervix, gastric tissue, heart, liver, lung, lymph system, prostate, skin, and thyroid. [9][10][11][12][13][14][15][16] These studies demonstrated that normal and malignant tissues can be differentiated with accuracies on the order of 80% to 100% using various statistical analyses.
Very few reports have focused on Raman spectroscopy of NPC. In 2003, we reported a preliminary study of excised NPC and normal tissues from 6 patients using a fiber optic Raman system. 8 At that time our system was not optimized: it covered only a narrow spectral window of 950 to 1650 cm−1, and the signal-to-noise ratio (S/N) of the spectra was not very good. To the best of our knowledge, there are no other publications on Raman spectroscopy of nasopharyngeal tissues. This motivated us to conduct a more systematic study of NPC and normal nasopharyngeal tissues. This time we used a commercial micro-Raman system covering a broad spectral window, with better spectral resolution and S/N. Furthermore, because of its confocal arrangement and the micron-size tissue volume being measured, the micro-Raman system captures more of the intrinsic Raman spectral properties of the tissue, independent of the modifications that tissue optics impose on the spectra. The previous fiber optic based Raman system 8 interrogated a millimeter-size tissue volume. Because tissue scatters light strongly, the Raman spectra from such a volume are largely affected by tissue optics. 17 The fractional contributions of different tissue components to the total spectra vary due to tissue optics modifications. 17 This will result in significant spectral shape differences between the spectra obtained from the fiber optic system and those obtained from the micro-Raman system. In this study, intrinsic (micro-) Raman spectral results from 74 excised tissue samples (52 NPC and 22 normal) will be discussed.

Subjects and Protocol
Seventy-four patients from the Fujian Provincial Tumor Hospital were recruited for this study with institutional ethical approval and informed consent. Spectra were obtained from 22 normal and 52 cancerous nasopharyngeal biopsies (50 nonkeratinizing undifferentiated and two differentiated carcinomas). The ages (mean ± standard deviation) were 43 ± 13 years for the normal group and 48 ± 13 years for the cancer group. A recent study on the Raman spectral properties of healthy and diseased oral mucosa tissues demonstrated that "the aging related changes do not seem to have any bearing on classification of normal from abnormal conditions" (Ref. 18). We expect that the slight mismatch in age (a five-year difference in mean values) between the normal and cancer groups in our study will not affect the diagnostic performance. Of the cancer samples, 41 were from male and 11 from female patients; of the normal samples, 14 were from male and 8 from female patients. Fresh tissue samples were stored in a −80°C refrigerator until spectral measurements. The samples were placed on a pure aluminum plate and thawed at room temperature immediately before experimental measurements.

Raman System
The targets were placed directly on a pure aluminum plate, and all the Raman spectra of the tissue samples were recorded using a Renishaw inVia micro-Raman system with a 50× objective. A 785 nm diode laser was used for excitation, and Raman spectra were recorded from 800 to 1800 cm−1 with an integration time of 30 s. The excitation laser power focused on the targets was lower than 30 mW. There was no notable tissue damage during the measurements.
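As a worked illustration of the acquisition window, the sketch below converts Raman shifts into absolute wavelengths for 785 nm excitation. It assumes Stokes scattering (the scattered photon loses energy), which the text does not state explicitly, and the function name is hypothetical.

```python
def raman_shift_to_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Absolute wavelength (nm) of Stokes-scattered light for a given Raman shift.

    Assumption: Stokes scattering, so the scattered wavenumber is the
    excitation wavenumber minus the Raman shift.
    """
    excitation_wavenumber = 1e7 / excitation_nm  # cm^-1, since 1 cm = 1e7 nm
    scattered_wavenumber = excitation_wavenumber - shift_cm1
    return 1e7 / scattered_wavenumber


# Edges of the recorded window (800 to 1800 cm^-1) for the 785 nm laser.
low_edge_nm = raman_shift_to_wavelength_nm(785.0, 800.0)
high_edge_nm = raman_shift_to_wavelength_nm(785.0, 1800.0)
print(f"{low_edge_nm:.1f} nm to {high_edge_nm:.1f} nm")
```

Under this assumption the recorded window corresponds to detected light at roughly 838 to 914 nm, that is, in the near infrared.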

Data Processing and Analysis
A fifth-order polynomial was fitted to the background tissue autofluorescence for each set of raw data. 19 The polynomial was then subtracted from the measured spectrum to obtain the Raman signal. Each Raman spectrum was then normalized to the integrated area under the curve to correct for variations in absolute spectral intensity and to enable comparison of spectral shapes. To test the capability of tissue Raman spectra for differentiating cancer from normal tissue, principal component analysis (PCA) combined with linear discriminant analysis (LDA) was performed on the measured spectra. PCA is a statistical technique for simplifying complex data sets and determining the key variables in a multidimensional data set that best explain the differences in the observations. To reduce the dimension of the spectral data, PCA is usually employed to extract a set of orthogonal principal components (PCs) that account for the maximum variance in the dataset for further diagnosis and characterization. In this study, the SPSS software package (SPSS Inc., Chicago) was used for PCA. Subsequently, the PC scores were used as input for LDA to generate diagnostic algorithms. The performance of the PCA-LDA diagnostic algorithm was validated in an unbiased manner using leave-one-out cross-validation. In the validation procedure, one tissue sample was left out and the PCA-LDA model was redeveloped using the remaining Raman spectra. The redeveloped PCA-LDA diagnostic algorithm was then used to classify the withheld Raman spectrum. This process was repeated iteratively until every spectrum had been withheld and classified once.
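The preprocessing steps described above (polynomial background removal followed by area normalization) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text states only that a fifth-order polynomial was fitted (Ref. 19), so the iterative clipping used here to keep the fit beneath the Raman peaks is an assumption, as are the NumPy implementation, the function names, and the synthetic spectrum.

```python
import numpy as np


def fit_background(shift, raw, order=5, n_iter=100):
    """Iteratively fit a polynomial that settles beneath the Raman peaks.

    Assumption: after each fit, points lying above the fit are clipped
    down to it, so peaks stop pulling the background estimate upward.
    """
    # Rescale the axis to [-1, 1] so the high-order fit is well conditioned.
    x = 2.0 * (shift - shift.min()) / (shift.max() - shift.min()) - 1.0
    work = raw.copy()
    for _ in range(n_iter):
        fit = np.polyval(np.polyfit(x, work, order), x)
        clipped = np.minimum(work, fit)
        if np.allclose(clipped, work):
            break
        work = clipped
    return fit


def preprocess(shift, raw, order=5):
    """Subtract the polynomial background, then normalize to unit area."""
    shift = np.asarray(shift, dtype=float)
    raw = np.asarray(raw, dtype=float)
    raman = raw - fit_background(shift, raw, order)
    # Trapezoidal integration of the background-free spectrum.
    area = np.sum(0.5 * (raman[1:] + raman[:-1]) * np.diff(shift))
    return raman / area


# Synthetic example: one Gaussian band on a smooth fluorescence background.
shift = np.linspace(800.0, 1800.0, 501)
band = 50.0 * np.exp(-0.5 * ((shift - 1450.0) / 8.0) ** 2)
fluorescence = 1000.0 - 0.3 * (shift - 800.0) + 1e-4 * (shift - 800.0) ** 2
spectrum = preprocess(shift, fluorescence + band)
```

Normalizing to unit area makes spectra comparable in shape regardless of absolute signal level, which is the purpose stated in the text.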

Mean Spectra of Normal and Cancerous Nasopharyngeal Tissues
Figure 1(b) shows the mean Raman intensity for the two different tissue groups along with corresponding standard deviations at the major Raman bands mentioned above. The six bands (853, 937, 1094, 1209, 1268, and 1579 cm−1) with significant (p < 0.05) mean intensity differences between the cancer group and the normal group are marked with ellipses in Fig. 1(b). The band intensities at 853 and 937 cm−1 were significantly lower for cancerous tissues (p < 0.05), while the band intensities at 1094, 1209, 1268, and 1579 cm−1 were significantly higher (p < 0.05). These intensity differences indicate significant increases or decreases in the percentages of distinctive biomolecules relative to the total Raman-active constituents in the different tissue types, suggesting the diagnostic potential of Raman spectroscopy for identification of malignant lesions in the nasopharynx.

PCA and LDA
To test the capability of tissue Raman spectra for differentiating cancer from normal tissue, PCA combined with LDA was performed on the measured Raman spectra. An independent-sample t-test on all the PC scores comparing the normal and cancerous groups showed that three PCs (PC1, PC2, and PC4) were the most diagnostically significant for discriminating the two groups. To illustrate the use of PC scores for diagnostic classification, a three-dimensional (3-D) scatter plot using the three PC scores (PC1, PC2, and PC4) as axes is presented in Fig. 2 for direct comparison between the normal (n = 22) and cancer (n = 52) groups. The data points are clustered into two distinct groups, which further confirms that substantial changes in spectral profiles occur in the process of malignant transformation from normal to carcinoma. In order to incorporate all significant Raman spectral features, LDA was used to generate diagnostic algorithms using the PC scores for the three most significant PCs (PC1, PC2, and PC4). To prevent over-training, the leave-one-out cross-validation procedure was used. Figure 3 shows the posterior probability of each spectrum belonging to the normal and nasopharyngeal cancer groups as calculated from the LDA model. Using a discrimination threshold of 0.5, the diagnostic sensitivity for detecting nasopharyngeal cancer was 92% and the corresponding diagnostic specificity was 82%.
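The PCA-LDA classification with leave-one-out cross-validation can be sketched as below. This is an illustrative NumPy version under several assumptions, not the SPSS analysis used in the study: the PC indices are fixed to (0, 1, 3), mirroring PC1, PC2, and PC4, instead of being selected by a t-test; LDA priors are taken from class frequencies; the spectra are synthetic; and all function names are hypothetical.

```python
import numpy as np


def pca_fit(X, pc_indices):
    """Return the training mean and the selected principal axes."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[list(pc_indices)]


def lda_posterior_cancer(scores, labels, query):
    """Posterior probability of class 1 from two-class LDA (pooled covariance)."""
    s0, s1 = scores[labels == 0], scores[labels == 1]
    m0, m1 = s0.mean(axis=0), s1.mean(axis=0)
    pooled = ((s0 - m0).T @ (s0 - m0) + (s1 - m1).T @ (s1 - m1)) / (len(scores) - 2)
    w = np.linalg.solve(pooled, m1 - m0)
    # Assumption: priors equal to the class frequencies in the training set.
    log_odds = w @ (query - 0.5 * (m0 + m1)) + np.log(len(s1) / len(s0))
    return 1.0 / (1.0 + np.exp(-log_odds))


def leave_one_out_posteriors(X, y, pc_indices=(0, 1, 3)):
    """Refit PCA and LDA without sample i, then score sample i."""
    posteriors = np.empty(len(y))
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        mean, axes = pca_fit(X[train], pc_indices)
        train_scores = (X[train] - mean) @ axes.T
        query_score = (X[i] - mean) @ axes.T
        posteriors[i] = lda_posterior_cancer(train_scores, y[train], query_score)
    return posteriors


def sensitivity_specificity(posteriors, y, threshold=0.5):
    """Call a spectrum cancerous when its posterior reaches the threshold."""
    predicted = posteriors >= threshold
    return np.mean(predicted[y == 1]), np.mean(~predicted[y == 0])


# Synthetic data with the same group sizes as the study (22 normal, 52 cancer).
rng = np.random.default_rng(0)
y = np.array([0] * 22 + [1] * 52)
template = np.sin(np.linspace(0.0, 3.0 * np.pi, 200))
X = rng.normal(0.0, 1.0, size=(len(y), 200))
X[y == 1] += template  # hypothetical class-dependent spectral change
post = leave_one_out_posteriors(X, y)
sens, spec = sensitivity_specificity(post, y)
```

Refitting both the PCA and the LDA inside the loop keeps the withheld spectrum out of every modeling step, which is what makes the estimate unbiased in the sense described above.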
To further evaluate the performance of the PCA-LDA-based diagnostic algorithm for nasopharyngeal cancer diagnosis, the receiver operating characteristic (ROC) curve was generated from the posterior probability plot in Fig. 3 by varying the threshold level. The results are shown in Fig. 4. The integrated area under the ROC curve is 0.968. This further demonstrates that PCA-LDA-based diagnostic algorithms can be used for nasopharyngeal cancer detection with good performance.
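The construction of an ROC curve by varying the threshold on the posterior probability can be sketched as follows. The posterior values and labels below are made-up toy numbers, not data from this study, and the function names are hypothetical.

```python
def roc_curve_points(posteriors, labels):
    """Sweep the decision threshold and return (FPR, TPR) points."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every posterior: nothing called positive
    for t in sorted(set(posteriors), reverse=True):
        tp = sum(1 for p, lab in zip(posteriors, labels) if lab == 1 and p >= t)
        fp = sum(1 for p, lab in zip(posteriors, labels) if lab == 0 and p >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal integration of TPR over FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: label 1 = cancer, label 0 = normal.
toy_posteriors = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_points = roc_curve_points(toy_posteriors, toy_labels)
toy_auc = area_under_curve(toy_points)
```

In this toy case 8 of the 9 cancer-normal pairs are ranked correctly, so the area is 8/9. An area of 1 would mean every cancer spectrum receives a higher posterior than every normal spectrum.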

Discussions
Over the last two decades, studies using Raman spectroscopy have been conducted on samples from a variety of organs, and the results in general suggest that Raman spectroscopy has great potential for cancer detection based on quantitative information about the biochemical differences between normal and neoplastic tissues in terms of proteins, DNA, and lipids. In this work, we investigated the ex vivo NIR micro-Raman spectral properties of normal and malignant lesions in the nasopharynx. Raman spectra of nasopharyngeal tissue are dominated by many vibrational modes of various biomolecules, such as proteins, lipids, and nucleic acids, which may be altered in quantity or conformation in association with nasopharyngeal cancer. To better understand the molecular basis, Table 1 lists tentative assignments for the observed Raman bands. 20-23 (Note to Table 1: ν, stretching mode; νs, symmetric stretching; δ, bending mode.) Distinctive spectral features and relative intensity differences were observed between cancer and normal groups, which reflect molecular and cellular changes associated with malignant transformation. For example, the Raman peak at 937 cm−1, due to the ν(C─C) in α-helix conformation of proline and valine, appeared to be more intense for normal tissue than for NPC tissue. The peak at 1094 cm−1, which is probably characteristic of C─N stretching of DNA, was higher for tumor tissue than for normal tissue, indicating that the cancerous tissue may be associated with an increase in the relative amounts of nucleic acids. The band at 1450 cm−1, which corresponds to the CH2 bending mode of collagen, has previously been recognized as being of diagnostic significance. The band at 1579 cm−1 was attributed to the C═C bending mode of phenylalanine, and its percentage signal was considerably increased in cancer tissue, indicating an increase in the percentage of phenylalanine content relative to the total components in nasopharyngeal cancer tissue. Huang et al. 20 also observed an increase of phenylalanine in malignant lung tissue by Raman spectroscopy.
Recently, our group has also carried out surface-enhanced Raman spectroscopy (SERS) studies of blood plasma samples from NPC patients and healthy volunteers. 23 Interestingly, the Raman bands at 1330 and 1579 cm−1 in the SERS spectra showed the same variation trends as observed in this study. This suggests that tissue biochemical changes in the process of malignant transformation from normal to carcinoma seem to be reflected in the circulating blood. This, from a different perspective, supports the feasibility of using blood plasma SERS spectroscopy for noninvasive cancer detection.
Irrespective of the identification and quantification of the additional molecular species present in malignant nasopharyngeal spectra, the substantial changes in spectral profiles from normal to malignant can be effectively used for discrimination of the two classes of samples. In order to use the full information contained in Raman spectra for the classification, an accurate discriminant algorithm is required. PCA was performed to reduce the large amount of data contained in the measured Raman spectra into a few important principal components. The three most diagnostically significant PCs (PC1, PC2, and PC4) were extracted after an independent-sample t-test on all the PC scores. Figure 5 shows the plots of these three PC loadings. The spectral shape of none of the three plots is dominated by any single molecule in the tissue. Instead, each contains many of the 13 Raman peaks listed in Table 1. For example, PC1 contains 12 of the 13 peaks. PC1 also contains other Raman peaks that are not obvious from the measured tissue spectra and thus are not listed in Table 1. Similarly, PC2 contains 10 and PC4 contains 8 of the 13 peaks listed in Table 1. PC2 and PC4 also contain other nonlisted peaks. These results suggest multiple chemical origins of the PC loadings. The LDA algorithm based on these three PCs shows that malignant nasopharyngeal tissue can be differentiated from normal tissue with a diagnostic sensitivity of 92.3% and a specificity of 81.8%, demonstrating good separation between normal nasopharyngeal tissue and diseased tissue.
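As a check on the reported figures, the sensitivity and specificity can be written in terms of classification counts. The counts of 48 correctly classified cancer samples and 18 correctly classified normal samples are not stated in the text; they are inferred here as the only whole-number counts consistent with the reported percentages for groups of 52 and 22 samples.

```latex
\[
\text{sensitivity} = \frac{TP}{TP + FN} = \frac{48}{52} \approx 92.3\%,
\qquad
\text{specificity} = \frac{TN}{TN + FP} = \frac{18}{22} \approx 81.8\%.
\]
```

Here TP and FN are the cancer samples classified as cancer and as normal, and TN and FP are the normal samples classified as normal and as cancer. Under this inference, 4 cancer samples and 4 normal samples would have been misclassified.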

Conclusions
Micro-Raman spectroscopy was performed on 22 normal and 52 malignant nasopharyngeal tissue samples. Distinctive spectral differences between normal and malignant nasopharyngeal tissue were revealed in bands that primarily contain signals related to proteins, DNA, and lipids. The band intensities at 853 and 937 cm−1 were significantly lower for cancerous tissues (p < 0.05), while the band intensities at 1094, 1209, 1268, and 1579 cm−1 were significantly higher (p < 0.05). An algorithm derived from PCA-LDA analysis of the spectral data yielded a diagnostic sensitivity of 92.3% and a specificity of 81.8%. This suggests great potential for using Raman spectroscopy to improve the diagnosis of nasopharyngeal cancers and supports the development of endoscopic Raman instrumentation for in vivo clinical applications.