Lung cancer remains the most common cause of cancer-related death globally, with an estimated 1.3 million new cases diagnosed annually.1 Of these cases, 80% are histologically classified as nonsmall cell lung cancer, for which surgery is the main curative treatment. Despite this, 50% of these patients suffer recurrence within . A number of prognostic markers that predict postoperative recurrence have been suggested, but none has been clinically validated. Several optical techniques are being developed for the diagnosis of lung cancer, the most prominent being autofluorescence.2, 3
Raman spectroscopy is a well-established analytical technique in which the structure and binding of molecules are studied by examining their light-scattering properties.4, 5, 6 Recently, the sensitivity of Raman spectrometers has improved as a result of several technical advances, allowing the acquisition of high-quality Raman spectra from tissue sections and cells.5 Because no staining or tissue preparation is required, and Raman studies can be conducted under nondestructive conditions, the technique can be applied in vivo.6, 7 The method has been used in a number of biomedical applications, and several solid organ cancers have been successfully analyzed using Raman spectroscopy.8, 9, 10, 11, 12 Two studies have recently demonstrated that Raman spectroscopy can differentiate malignant from normal lung tissue.13, 14 Huang et al. used an excitation wavelength of to distinguish tumor from normal bronchial tissue, with a sensitivity and specificity of 94 and 92%, respectively, using only the ratio of two distinctive features in the Raman spectrum . Yamazaki et al. used a higher wavelength of to distinguish tumor from normal parenchymal tissue, and obtained a sensitivity of 91% and a specificity of 97%.
However, the conventional Raman spectroscopy approach employed in these studies analyzed contributions from heterogeneous tissue involving a complicated combination of cell types and stromal tissues, all with different biochemical profiles. This issue has been addressed in a study examining normal bronchial tissue sections using Raman microscopy, in which conventional Raman spectroscopy is combined with optical microscopy to achieve spatial as well as spectral resolution.15 Several chemical differences were identified between the individual layers, such as the epithelium and the supporting stromal layers.
Our aims in the present study were to: 1. determine whether Raman microscopy can distinguish lung cancer cells from normal bronchial epithelium, 2. explore its potential for prognosis in patients undergoing lung cancer surgery, and 3. investigate biochemical differences between normal and malignant bronchial cells.
Materials and Methods
Forty-three patients undergoing lung cancer resection were recruited for this study. A sample of tumor tissue and a sample of normal lung tissue, distant from the tumor, were obtained from each patient. Of the 43 normal lung samples, 28 contained identifiable normal bronchial epithelium. Of the 43 tumor specimens, 34 were suitable for analysis: four were not nonsmall cell cancer, two tumors were too small for residual tissue to be collected, and three tumor tissues suffered burn damage in the analyzing laser beam. Using a cryotome stage, the tissue was mounted in Tissue Tek®, and frozen tissue sections measuring in thickness were prepared. The first section was placed on a glass microscope slide and stained with hematoxylin and eosin. The next consecutive section was placed onto a quartz microscope slide and then immersed for in 99% ethanol to preserve the tissue. This second, unstained section was then analyzed by Raman microscopy using the first section as a “stained map.” These maps were also assessed by a pathologist and given a diagnosis of normal or nonsmall cell lung carcinoma. All the patients gave informed consent, and the study had the approval of the Queen’s University Belfast Research Ethics Committee.
Confocal Raman Microscopy
Raman spectra were recorded using a Horiba Jobin Yvon LabRam HR800 Raman microscope (Horiba, Limited, Kyoto, Japan). Light from a diode laser was focused on the sample with a objective (numerical aperture is 0.9) and an excitation power of at the sample. A notch filter was used to reject elastic scattering from the incident light and a diffraction grating was used to provide a spectral resolution of over the spectral range from on a Peltier-cooled charge-coupled device (CCD) detector (Andor Technology, Belfast, Ireland, model DU420). Spectra were acquired and processed using Labspec software (Jobin-Yvon, Villeneuve d’Ascq, France).
Using the stained section of normal bronchial tissue for guidance, two normal areas of bronchial epithelium were identified on the unstained section and scanned with the Raman microscope. These areas measured on average , included five to ten cells, and were sampled over 16 points using a square grid. The integration time for each point was four minutes. Mean spectra were calculated from these areas; therefore for each patient, two spectra were obtained for normal tissue and two for malignant tissue.
Spectral Preprocessing and Analysis
Residual cosmic spikes were removed from each spectrum and a mean spectrum was calculated using the 16 spectra from the grid. The position of the phenylalanine peak was used to check for any x-axis shift in recorded spectra from sample to sample. Raman scattering from a clean quartz slide was measured in a similar fashion to the tissue and was subtracted from each of the mean spectra to reduce the influence of quartz in the spectra. Background noise and fluorescence were removed by multiple linear baseline subtraction: nine points over regions known to have no major Raman activity were manually chosen to fit the background, and this baseline was subtracted from the spectra. The spectra were then normalized by dividing by the total area under the curve.
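The preprocessing chain described above (grid averaging, quartz subtraction, piecewise-linear baseline removal, and area normalization) can be sketched in a few lines. This is only an illustration under stated assumptions, not the Labspec procedure used in the study: the function names, the toy spectrum, and the four anchor indices are hypothetical (the study used nine manually chosen points), and the baseline is simply held flat outside the outermost anchors.

```python
def mean_spectrum(spectra):
    """Point-by-point mean of equal-length spectra (e.g. the 16 grid points)."""
    return [sum(col) / len(spectra) for col in zip(*spectra)]


def subtract(spectrum, reference):
    """Subtract a reference trace (quartz or baseline) from a spectrum."""
    return [s - r for s, r in zip(spectrum, reference)]


def piecewise_linear_baseline(x, y, anchors):
    """Straight-line segments through the spectrum at the anchor indices."""
    anchors = sorted(anchors)
    baseline = []
    for i in range(len(x)):
        if i <= anchors[0]:
            baseline.append(y[anchors[0]])      # flat before the first anchor
        elif i >= anchors[-1]:
            baseline.append(y[anchors[-1]])     # flat after the last anchor
        else:
            for lo, hi in zip(anchors, anchors[1:]):
                if lo <= i <= hi:
                    frac = (x[i] - x[lo]) / (x[hi] - x[lo])
                    baseline.append(y[lo] + frac * (y[hi] - y[lo]))
                    break
    return baseline


def trapezoid_area(x, y):
    """Area under the curve by the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def preprocess(x, grid_spectra, quartz, anchors):
    y = subtract(mean_spectrum(grid_spectra), quartz)
    y = subtract(y, piecewise_linear_baseline(x, y, anchors))
    total = trapezoid_area(x, y)
    return [v / total for v in y]


# Toy data (made up): one triangular peak on a sloping background.
x = [float(i) for i in range(11)]
peak = [0, 0, 0, 0, 5, 10, 5, 0, 0, 0, 0]
raw = [2 + 0.5 * xi + p for xi, p in zip(x, peak)]
grid = [[v + 1 for v in raw], [v - 1 for v in raw]]   # two "grid points"
quartz = [1.0] * len(x)
normalized = preprocess(x, grid, quartz, anchors=[0, 2, 8, 10])
```

After normalization the area under each spectrum is one, so spectra recorded with different overall intensities become directly comparable.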
Mean normal and malignant spectra were compared using a Student’s t-test. The ability of Raman microscopy to correctly classify normal and malignant lung tissue was tested using principal component analysis (PCA)16 with leave-one-out cross-validation (LOOCV), and by random forest classification17 with training and test sets. The ability of Raman microscopy to predict postoperative recurrence in 34 patients was also analyzed by principal component analysis with leave-one-out cross-validation. A survival analysis for these patients was performed using the scores from the recurrence analysis and pathological parameters (log-rank test). These statistical analyses were performed using Unscrambler (Camo, Oslo, Norway), SPSS (SPSS Incorporated, Chicago, Illinois), and the R software (R Foundation for Statistical Computing, Vienna, Austria).
To assist with spectral identification of the loadings from the PCA, spectra from several reference materials were recorded with the Raman microscope using excitation. Horse heart cytochrome C (C-7150) was purchased from Sigma (Saint Louis, Missouri). Human DNA was isolated from the whole blood of a control subject. Genomic DNA was purified using a “salting-out” method with a Gentra Puregene purification kit.
Results
Patient characteristics for the 34 patients in the prognostic analysis are shown in Table 1. There were 15 (44%) cases of early postoperative recurrence, defined on the basis of radiological or pathological evidence of local or distant recurrence within twelve months of surgical resection. Recurrence only in the mediastinal nodes was noted in three patients. The remaining recurrences were metastatic.
|Mean age(range)||63 (49 to 79)||66 (51 to 78)|
Light microscopy images of two sequential normal tissue sections are shown in Figs. 1 (stained with hematoxylin and eosin) and 1 (not stained). The box within Fig. 1 illustrates the area analyzed by the Raman microscope. Figures 1 and 1 show a malignant tissue section at a lower magnification. Again, these are consecutive sections and only Fig. 1 has been stained. The histopathological diagnosis was confirmed on the stained section, which was morphologically indistinguishable from the section used for Raman analysis. These figures illustrate that the stained sections can be used as a guidance map for the unstained sections that were analyzed by Raman microscopy. The figure also illustrates how a small group of individual cells can be targeted with the microscope.
There are many subtle differences in intensity between the two mean spectra, but the main difference identified was in the intensity and width of the amide I band at (Fig. 2), which is consistent with reports in previous papers on both lung cancer and other solid organ tumors.8, 9, 10, 11, 12, 13 However, it should be pointed out that there was no significant difference between the mean intensities at when compared using an independent samples t-test .
In the principal component analysis of the 124 normal and malignant spectra recorded in the present study, 60% of the variation in the spectra was described by the first two principal components. These two components also most efficiently classified normal and tumor spectra, for which a scatter plot is shown in Fig. 3. The dashed, diagonal line separates the scores into normal and malignant groups. Thus, it was found that Raman microscopy can differentiate malignant from normal lung tissue with a sensitivity of 84% and a specificity of 61% (positive predictive value 72%, negative predictive value 76%).
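The four figures quoted above all follow from a single two-by-two confusion table. The sketch below shows the arithmetic. The counts are an assumption: they are not reported in the text but were back-calculated from the 124 spectra (two per sample, giving 68 tumor and 56 normal spectra) so that they reproduce the stated percentages, and the function name is illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),   # tumor spectra called tumor
        "specificity": tn / (tn + fp),   # normal spectra called normal
        "ppv": tp / (tp + fp),           # "tumor" calls that are tumor
        "npv": tn / (tn + fn),           # "normal" calls that are normal
    }


# Assumed counts (inferred, not reported): 68 tumor and 56 normal spectra.
metrics = diagnostic_metrics(tp=57, fn=11, tn=34, fp=22)
percent = {name: round(100 * value) for name, value in metrics.items()}
# percent == {"sensitivity": 84, "specificity": 61, "ppv": 72, "npv": 76}
```

Because the predictive values depend on the proportion of tumor spectra in the data set, they would change if the same classifier were applied to a population with a different prevalence.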
Principal component loadings are a graphical representation of the mathematical function that PCA uses to explain the separation between normal and tumor spectra. Their interpretation is useful because peaks in the loading correspond with Raman peak positions, recorded in wavenumbers , for chemical species that are more abundant in the tumor samples, and troughs correspond with entities more abundant in the normal samples. To provide some pointers as to the possible origin of the differences between malignant and normal tissue, Fig. 4 displays the loading from principal component 1, overlaid by the spectrum recorded for cytochrome C. While a number of the troughs in the loading coincide with some major cytochrome C peaks, including features at 1585, 1352, and , there are also some notable absences, particularly a prominent feature at . The favorable correlation analysis in Fig. 4 (Spearman rho correlation , ) may suggest that excess porphyrin within the normal samples could explain the majority of the variation between the normal and malignant samples. Nevertheless, the peak discrepancies noted preclude this as a definitive conclusion.
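For readers unfamiliar with the rank correlation used to compare a loading with a reference spectrum, a minimal sketch of Spearman's rho follows: both traces are converted to ranks (ties share the average rank) and the Pearson correlation of the ranks is taken. The five-point traces are made-up values, not data from this study, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_rho(a, b):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


# Toy traces sampled at the same wavenumbers (made-up values).
reference = [0.1, 0.9, 0.2, 0.7, 0.3]       # peaks in a reference spectrum
loading = [-0.05, -0.6, -0.1, -0.4, -0.2]   # matching troughs in a loading
rho = spearman_rho(reference, loading)
```

In this toy case the troughs mirror the peaks exactly, so the rank correlation is strongly negative; whether a loading is inverted before comparison is a choice of sign convention.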
The second principal component was analyzed in a similar fashion to PC 1. The loading from the second principal component, trace (a), is shown in Fig. 4 and compared with a reference Raman spectrum of DNA, trace (b). While some of the major bands in the DNA again exhibit a satisfactory correlation with peaks in the loading [Fig. 4, Spearman rho , ], some DNA features are absent in the loading plot (see Sec. 4).18
Being a linear multivariate technique, principal component analysis is limited to a linear separation of the data, and for this reason the use of a nonlinear classification technique was explored. Random forest classification17 is an augmented form of decision tree analysis, in which many decision tree models are constructed by randomly selecting subgroups of samples from a training set and building a model for each subgroup. Each decision tree then classifies each sample from an independent test set, and the final classification for this sample is decided by a weighted “majority vote” from all these decision trees. A randomly chosen training set was used to train the software to classify normal and malignant spectra. Forty independent spectra were randomly chosen to test this classification model. A random forest classifier, with 28 variables tried at each split, was constructed on the training set and applied to the test set, resulting in a test set sensitivity of 90% ( tumor samples misclassified) and specificity of 75% ( normal samples misclassified). The accuracy was unaffected by the number of variables tried (for values between 20 and 60).
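The bootstrap-and-vote idea behind random forest classification can be illustrated with a deliberately reduced sketch. It is not the R implementation used in the study: each "tree" here is a one-split decision stump, the vote is unweighted, and the two-feature data set, the `mtry` parameter name, and all function names are assumptions made for illustration.

```python
import random


def fit_stump(X, y, features):
    """Best single-feature threshold rule among the candidate features."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for sign in (1, -1):
                pred = [1 if sign * (row[f] - t) > 0 else 0 for row in X]
                acc = sum(p == label for p, label in zip(pred, y))
                if best is None or acc > best[0]:
                    best = (acc, f, t, sign)
    return best[1:]


def predict_stump(stump, row):
    f, t, sign = stump
    return 1 if sign * (row[f] - t) > 0 else 0


def fit_forest(X, y, n_trees=25, mtry=1, seed=0):
    """Train stumps on bootstrap samples, each with a random feature subset."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        feats = rng.sample(range(p), mtry)           # variables tried at the split
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Unweighted majority vote over all stumps (1 = tumor, 0 = normal)."""
    votes = sum(predict_stump(stump, row) for stump in forest)
    return 1 if 2 * votes > len(forest) else 0


# Toy training set (made up): two spectral features per sample.
X_train = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.25, 0.15],
           [0.8, 0.9], [0.9, 0.7], [0.7, 0.8], [0.85, 0.75]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X_train, y_train)
```

A call such as `predict_forest(forest, [0.95, 0.95])` then returns the class favored by most stumps. Full decision trees differ in that a fresh random subset of variables is tried at every split of every tree, which is what the "variables tried at each split" setting controls.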
Because Raman microscopy provides a unique chemical signature of tumor tissue, it provides a window into the chemical make-up of that tissue and may offer an insight into how aggressive a tumor is. Raman spectra from the tumor samples of 34 patients, for which there was at least postoperative follow-up, were analyzed using principal component analysis with leave-one-out cross-validation. The patients were divided into two groups depending on whether their cancer recurred within twelve months postoperatively. From this PCA, the first and third principal component scores separated the two groups. A scatter plot of these principal components is shown in Fig. 5. The diagonal line illustrates how these principal component scores can detect recurrence with a sensitivity of 73% and a specificity of 74% (positive predictive value 69% and negative predictive value 78%).
The tumor spectra were separated into two groups depending on whether the score from principal component 3 in the recurrence analysis was above or below its median. The overall median follow-up was . Using the log-rank test, there was no significant difference in survival between these two groups, although there was a strong trend .
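The median split followed by a log-rank comparison can be sketched as below. This is a minimal two-group log-rank test written for illustration, not the SPSS procedure used in the study; the scores, follow-up times, and event indicators are made-up toy values, and the function and variable names are hypothetical. The p-value uses the chi-square distribution with one degree of freedom.

```python
import math
import statistics


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)   # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)   # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d = sum(1 for ti, e, _ in data if ti == t and e)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative event times")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))   # chi-square survival, 1 df
    return chi2, p_value


# Toy data (made up): component score, follow-up in months, death indicator.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
months = [30, 28, 26, 24, 6, 8, 10, 12]
died = [0, 0, 1, 0, 1, 1, 1, 0]

cutoff = statistics.median(scores)
low = [i for i, s in enumerate(scores) if s <= cutoff]
high = [i for i, s in enumerate(scores) if s > cutoff]
chi2, p_value = logrank_test(
    [months[i] for i in low], [died[i] for i in low],
    [months[i] for i in high], [died[i] for i in high],
)
```

Patients who are still alive at last follow-up are treated as censored: they count toward the number at risk up to their follow-up time but contribute no event.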
Discussion
We have shown that Raman microscopy of normal and malignant tissue sections obtained from lung cancer resections can distinguish malignant and normal bronchial tissue and predict early postoperative recurrence in this group of patients.
Rather than relying on the use of individual spectral peaks to distinguish groups of tissue, multivariate data reduction techniques have been employed to include information from the whole spectrum to classify spectra. Principal component analysis (PCA) is a data reduction technique that summarizes the variation between the spectra by reducing it into a small number of principal component scores, while retaining the majority of information contained within the spectra. These components describe consistent differences between the spectra, allowing separation of the spectra into groups based on similarities or disparities in the spectral characteristics. This classification can be tested using a leave-one-out cross-validation, where the software develops a model based on all the spectra except one, and then tests the model on that spectrum, and repeats this process on all the spectra.
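The leave-one-out procedure described above can be sketched as a short loop. To keep the example self-contained it makes two simplifying assumptions that differ from the study: the principal component scores are taken as given (in a strict cross-validation the PCA itself would be refitted on each training fold), and a nearest-centroid rule stands in for the dividing line drawn on the score plot. The scores, labels, and function names are made up.

```python
def centroid(rows):
    """Mean position of a group of points."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose training centroid is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        c = centroid([row for row, lab in zip(train_X, train_y) if lab == label])
        dist = sum((a - b) ** 2 for a, b in zip(x, c))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the guess."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy scores on two principal components (made-up values).
scores = [[-2.0, 0.1], [-1.5, -0.3], [-1.8, 0.4],
          [1.6, 0.2], [2.1, -0.1], [1.9, 0.3]]
labels = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
accuracy = loocv_accuracy(scores, labels)
```

Because every sample is classified by a model that never saw it, the resulting accuracy is a less optimistic estimate than one computed on the training data itself.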
Principal component analysis of normal and malignant spectra demonstrated that Raman microscopy can distinguish normal and malignant lung tissue with a sensitivity of 84% and a specificity of 61%. Although the use of a nonlinear random forest classification technique improved the accuracy, the specificity was still suboptimal at 75%. This specificity, which reflects the ability of a test procedure to correctly classify normal samples, can be explained by the heterogeneous nature of the normal bronchial epithelium that was examined. Although all the normal samples were confirmed as “normal” by a pathologist, there were significant differences between these samples. It is likely that this morphological heterogeneity is also reflected in the cellular biochemistry and thus in the Raman spectra. The normal samples used in this study were taken from the same lung as the resected tumor and have certainly undergone the same inhaled insults as the tumor. They may also have undergone, to a varying degree, malignancy-associated changes.19, 20 It is also well established that tobacco smoking has a “field” effect, making the chemical analysis of this heterogeneous field challenging, although clinically relevant. As pointed out earlier, our classification accuracy is lower than that reported in the two previous studies of lung cancer, which used Raman spectroscopy without the higher spatial resolution possible with a microscope.13, 14 Some Raman scattering in the normal samples from these two previous studies will certainly have originated from stromal and supporting tissues, which may have unduly influenced the discrimination in these two studies. In our present study, only Raman scattering from normal and malignant cells was used, which can make the discrimination more challenging. It should be noted that in one of the earlier studies referred to, only two Raman spectral peaks (1450 and ) were used to classify the samples. In our study, t-test analysis showed that there was no significant difference between the normal and tumor peak intensities at , and multivariate analysis was required to exploit the subtle differences between the spectra, so as to classify the samples.
Two previous studies have shown that Raman spectroscopy can identify and grade prostatic adenocarcinoma tissue sections using the Gleason score as a gold standard, and identify chemoresistance in cell lines.9, 21 However, this is the first study to suggest that Raman spectroscopy can provide prognostic information about lung cancer patients using clinical samples. A number of other methods have been used to predict prognosis in lung cancer, and a vast array of prognostic markers has been suggested, possibly reflecting the molecular heterogeneity of lung tumors.22 Recently, a number of studies have used gene expression signatures to divide patients undergoing surgical resection into high- and low-risk groups, depending on survival, and to predict postoperative recurrence.23, 24, 25, 26, 27 Potti et al. applied a lung metagene model to independent cohorts from two multicenter studies and predicted recurrence with an accuracy of 72 and 79%, respectively. Larsen et al.28 described a 54-gene signature that predicted recurrence in two independent cohorts with an accuracy of 72 and 67%. Our overall accuracy of 73% compares favorably with these studies. More recently, two groups have used a mass spectrometry proteomic approach to predict response to targeted chemotherapy and postoperative recurrence.29 Raman spectroscopy inherently has two main advantages over other current technologies. First, since many molecular and macromolecular species, including DNA, proteins, and lipids, contribute to the Raman spectrum, the technique has the potential to tease out the molecular heterogeneity of lung tumors. Second, because Raman spectroscopy can be carried out under nondestructive conditions, there is the potential to obtain Raman spectra from lung tumors in vivo via bronchoscopy and thus obtain a diagnosis and prognosis simultaneously and noninvasively. Although this technology is still in development, in vivo fiber optic probes have been studied in lung cancer and other organ systems.6, 7, 30
The analysis of the loading from the first PCA suggested that the main chemical difference between normal and malignant lung tissue may be attributable to a contribution of porphyrin to the spectrum. Although cytochrome C was used as the reference material, its Raman spectrum is virtually identical to that of the other cytochromes, because Raman scattering originates from the porphyrin ring structure of the heme component of the molecule. There are a number of intracellular cytochromes that this may represent, including cytochrome C, which is a key component in apoptosis.31, 32 Evasion of this programmed cell death is characteristic of carcinogenesis. The tentative conclusion reached earlier (Sec. 3)—that cytochrome C may be more abundant in the normal bronchial epithelium than in the malignant lung cells—would be consistent with this.33 A number of other cytochromes have been found in higher levels in bronchial epithelial cells when compared with lung cancer cells, including cytochrome oxidase, cytochrome P450, and CYP1A1.34, 35, 36, 37 However, if there is indeed an excess of cytochrome (a cytoplasmic protein) in the normal cells, this could simply represent an increase in the cytoplasm:nucleus ratio seen in normal tissue compared to malignant tissue. This is clearly an area that will require further investigation.
The analysis of the loading from principal component 2 in the first PCA suggested that this component may be attributable to DNA, although some DNA peaks were absent in the loading trace. This conclusion, though tentative, would be consistent with the fact that DNA is more abundant in malignant tissue than in normal tissue because of the more active mitosis and cellular turnover.10, 14, 22
There is clearly scope for further refinement of the methodology in future studies, such as increasing the number of spots for laser analysis in each sample map. Additionally, the use of fresh or thawed samples might have improved the accuracy of the discrimination, but this was logistically impossible in this current study.
In conclusion, Raman microscopy can differentiate malignant lung tissue from normal bronchial epithelium, and has the potential to predict early recurrence in patients undergoing lung cancer resection. However, this should be tested in a larger independent test set.
Acknowledgments
This work was funded by the Research and Development Office for Health and Personal Social Services in Northern Ireland (EAT/2538/03) and the Biotechnology and Biological Sciences Research Council (18471). Neither funding agency has had a role in the design of the study; collection, analysis, or interpretation of the data; the decision to submit the manuscript for publication; or the writing of the manuscript.