Breast cancer is the most common form of malignant tumor found in women in the United States, with an estimated 240 000 new cases diagnosed in 2007.1 The implementation of regular screening programs and increased use of x-ray mammography for early detection have had a significant impact on breast cancer mortality. Mammograms, however, still suffer from insufficient sensitivity and high rates of “false positives,” which result in unnecessary follow-ups and biopsies that lead to patient trauma, time delay, and high medical costs.2 In fact, 60% to 90% of the suspicious lesions detected by mammography are benign upon biopsy,3 which can range from the fine needle aspiration of single cells to the surgical removal of the entire suspicious mass. A technique that holds considerable promise to overcome these limitations is optical spectroscopy, thanks to its ability to provide biochemical and morphological information about a tissue in a near-real-time, minimally invasive or noninvasive manner.4, 5, 6, 7 Another potential application of optical spectroscopy that has recently drawn significant interest is therapeutic guidance,8 especially the evaluation of surgical margins in real time to guide tumor resection during breast conservative therapy (partial mastectomy).
Optical spectroscopic techniques that have been investigated for breast cancer detection to date include autofluorescence, diffuse reflectance, and Raman spectroscopies.4, 5, 6, 7, 8, 9 Alfano and co-workers10, 11, 12 were the first to apply autofluorescence spectroscopy to the problem of identifying breast malignancy. A series of studies carried out by them10, 11, 12 showed that significant differences exist in the fluorescence signatures of malignant and nonmalignant human breast tissues. In particular, they showed that using discrimination indices based on ratios of intensities from emission spectra at excitation and excitation spectra at emission, they could discriminate malignant from fibrous breast tissues with 93% sensitivity and 95% specificity, but results were worse for discriminating normal fatty and malignant tissues. Gupta et al.13 and Majumder et al.14 measured emission spectra at 337- and excitation wavelengths from normal, benign, and malignant breast tissue samples, and using the integrated emission intensity from the excitation, they separated malignant tissues in a binary fashion from both benign and normal with sensitivity and specificity of up to 99.6%.13, 14 More recently, Palmer et al.15 used the fluorescence emission spectra from multiple excitation wavelengths ranging from to separate malignant from nonmalignant samples. Principal component analysis (PCA), followed by a Wilcoxon rank-sum test to identify significant components that were then entered into a support vector machine (SVM), resulted in 70% sensitivity and 92% specificity for discriminating malignant from normal or benign tissues.15 Using a probe with three different delivery-to-collection fiber distances and similar data analysis procedures, the same group found that analyzing integrated fluorescence emission intensities from a single excitation wavelength at all three separations could provide results comparable to those of the previous study, but with a simpler experimental setup.16
A few groups have investigated the utility of UV-visible diffuse reflectance (or elastic scattering) spectroscopy, either alone or in conjunction with fluorescence, for breast cancer diagnosis. Of these, Bigio et al.8 used in vivo measurements in an attempt to both make a diagnosis and help guide resection. Using diagnostic algorithms based on artificial neural networks and hierarchical cluster analysis, they were able to distinguish malignant from normal breast tissue with sensitivities up to 69% and specificities up to 93%. Palmer et al.15 investigated the utility of combining diffuse reflectance measurements with fluorescence measurements, but they found that diagnostic performance for breast cancer was not significantly improved by doing so.
Perhaps the most widespread application of Raman spectroscopy in cancer research has been for breast cancer detection.17, 18, 19, 20, 21, 22, 23, 24 Alfano et al.17 were the first to investigate the ability of Raman spectroscopy to distinguish normal from malignant breast tissue, using a Fourier transform Raman spectrometer at excitation. Later, Frank et al.18, 19 and Redd et al.20 employed Raman spectroscopy using visible and near IR excitation to study excised human breast tissues. More recently, Feld and colleagues21, 22, 23, 24 have done extensive work on using Raman spectroscopy for breast cancer diagnosis. An early study21 similar to that of Redd et al.20 showed comparable spectra, but multivariate statistical algorithms improved diagnosis. Over the past several years, this group has developed a system that classifies breast Raman spectra according to the modeled contributions of individual component spectra from materials such as fat, collagen, and DNA.22 They have used this system on tissue samples to discriminate invasive carcinoma from normal fatty, fibroadenoma, and fibrocystic change tissues with an overall accuracy of 86%,23 and in a small pilot in vivo trial for guiding resection, normal, fibrocystic change, and malignant tissues were classified with an overall accuracy rate of 93%.24
Although the studies described use autofluorescence, diffuse reflectance, combined autofluorescence and diffuse reflectance, or Raman spectroscopy for breast tissue discrimination, there is no published report of a direct, side-by-side comparison of the efficacy of these modalities for specific types of discrimination. The goal of this paper, then, is to report a comparative evaluation of the relative capabilities of fluorescence, diffuse reflectance, combined fluorescence and diffuse reflectance, and Raman spectroscopy for discriminating the different histopathologic categories of human breast tissues. Fluorescence, diffuse reflectance, and Raman spectra were acquired ex vivo from human breast tissue samples belonging to four histopathologic categories: invasive ductal carcinoma (IDC), ductal carcinoma in situ (DCIS), fibroadenoma (FA), and normal. A probability-based multivariate statistical algorithm capable of direct multiclass classification25 was developed to analyze the diagnostic content of these different sets of optical spectra measured sequentially from the same set of breast tissue sites. The results showed that although Raman spectroscopy allows for the most accurate tissue classification as may be useful for clinical diagnosis of the pathological state of a tumor, combined fluorescence and diffuse reflectance holds promise for use in clinical applications such as evaluating margin status during breast surgery in which imaging an entire tissue surface will be required for delineating normal from nonnormal breast tissue.
Materials and Methods
Breast Tissue Samples
The human breast tissue samples were obtained under a protocol approved by the Vanderbilt University Institutional Review Board. The freshly frozen samples were obtained either from the tissue bank at Vanderbilt Clinic or from the National Cancer Institute’s (NCI) Cooperative Human Tissue Network. A total of 74 tissue samples were obtained for this study. Normal tissue samples were obtained from either reduction mammoplasty or uninvolved areas from radical mastectomy procedures, and tumor samples were partial sections of surgically removed breast lesions. All samples were stored at until spectroscopic study, at which point they were thawed to room temperature in buffered saline.
In vitro fluorescence and diffuse reflectance spectra of breast tissue samples were measured using a portable spectroscopic system as illustrated in Fig. 1 . A high-pressure nitrogen laser (Spectra Physics, Mountain View, CA) is used as the excitation source for fluorescence measurements, and a tungsten-halogen lamp (Ocean Optics, Dunedin, Florida) emitting broadband white light from is used for diffuse reflectance measurements. Light delivery to and collection from the sample is achieved with a fiber-optic probe consisting of seven core diameter fibers arranged in a six-around-one configuration (Romack, Williamsburg, Virginia). Two of the surrounding fibers deliver laser and white light consecutively to the tissue sample while the remaining fibers collect fluorescence and diffuse reflectance from the tissue sample. Diffuse reflectance and fluorescence emissions collected by the fiber-optic probe are serially dispersed and detected with a chip-based spectrometer (model number S-2000, Ocean Optics). For this study, the output power of the white light was at the tissue surface, and the nitrogen laser was operated at a repetition rate, pulse width, and average pulse energy of at the tissue surface. An integration time of was used for each spectral measurement. The overall spectral resolution of the system was .
Raman spectra of the breast tissue samples were measured with a portable Raman spectroscopy system shown in Fig. 2 . The system consists of a diode laser (Process Instruments, Inc., Salt Lake City, Utah), a seven-around-one beam-steered fiber optic probe (Visionex Inc., Atlanta, Georgia), an imaging spectrograph (Kaiser Optical Systems, Inc., Ann Arbor, Michigan), and a back-illuminated, deep-depletion, thermoelectrically cooled charge-coupled device (CCD) (Princeton Instruments, Princeton, New Jersey), all controlled with a laptop computer. The beam-steered fiber optic probe delivers the light, which is band pass filtered at the probe tip, onto the tissue and collects the Raman scattered light, which is then filtered with an inline notch filter within the probe itself. The light is then fed into the spectrograph where it is filtered again with a holographic notch filter and dispersed onto the CCD, where the computer records the signal. For this study, the fiber optic probe delivered onto the tissue and collected the scattered light for .
A standard protocol was followed for the spectral measurements and maintained for all the samples in this study. Prior to spectral acquisition, each sample was thawed to room temperature in phosphate-buffered saline. For recording spectra, the tissue sample was kept on a sheet of aluminum foil, and the tips of the fiber-optic probes were placed normally in gentle contact with the target tissue. From each tissue sample, spectra were recorded from 2 to 6 sites depending on the size of the sample. From each individual site, autofluorescence, diffuse reflectance, and Raman spectra were measured sequentially. In all cases, the overhead fluorescent lights were turned off during the measurements. Following spectral acquisition from each sample, the investigated sites were inked to record their locations, and then fixed in formalin for standard hematoxylin and eosin staining and examination by an experienced pathologist (F.B.) blinded to the results of the optical spectra. The histopathology report of each site was considered to be the gold standard. All spectra were categorized according to their histological identities and grouped into IDC, DCIS, FA, or normal breast tissue.
After acquisition of autofluorescence and diffuse reflectance spectra, a set of reference spectra from a fluorescence and a reflectance standard were recorded to correct for intersample variability due to variations in laser-pulse energy and white light power. The fluorescence standard is a low-concentration Rhodamine 6G solution contained in a quartz cuvette, and the reflectance standard is a 20% reflectance plate (Labsphere, North Sutton, New Hampshire) placed in a black box. All in vitro raw fluorescence and diffuse reflectance spectra were processed to remove instrumentation-induced variations and to yield calibrated spectra, the details of which are described elsewhere.26 The resultant fluorescence and diffuse reflectance spectra were further corrected for the nonuniform spectral response of the detection system. Fluorescence spectra were recorded from , and diffuse reflectance spectra were recorded from .
Prior to Raman spectral measurements, the wave number axis was calibrated with a neon-argon lamp, acetaminophen, and naphthalene standards. For each Raman spectrum measured, the signal from the CCD was binned along the vertical axis to create a single spectrum per measurement site. Prior to any signal processing, the spectrum was truncated to only include the region from about to eliminate the large Raman peaks due to the silica present in the fiber-optic probe that obscure any tissue Raman peaks, as well as the noise present at the very end of the spectral region. The spectrum was then binned along the wave number axis in intervals and filtered with a second-order Savitzky-Golay filter for noise smoothing. Fluorescence subtraction was accomplished using an automated, modified polynomial fitting method in which a fifth-order polynomial is fit to the fluorescence baseline.27
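The binning and smoothing steps described above can be sketched as follows. This is a minimal illustration only, not the processing code used in the study: the number of points per bin, the 5-point window, and the function names `bin_spectrum` and `savgol5_quadratic` are assumptions introduced here, and the fluorescence-subtraction step is not shown.

```python
# Hypothetical helpers; the bin size and the 5-point window are assumptions.

def bin_spectrum(intensities, points_per_bin):
    """Average consecutive groups of points along the wave number axis.

    A trailing group with fewer than points_per_bin points is dropped.
    """
    n_bins = len(intensities) // points_per_bin
    return [
        sum(intensities[k * points_per_bin:(k + 1) * points_per_bin]) / points_per_bin
        for k in range(n_bins)
    ]


# Standard 5-point, second-order Savitzky-Golay smoothing weights (divided by 35).
SG5_WEIGHTS = (-3, 12, 17, 12, -3)


def savgol5_quadratic(intensities):
    """Smooth interior points with the 5-point second-order filter.

    The two points at each end are copied unchanged (a simplification).
    """
    smoothed = list(intensities)
    for i in range(2, len(intensities) - 2):
        window = intensities[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / 35
    return smoothed


# Toy data, invented for illustration.
binned = bin_spectrum([1.0, 3.0, 5.0, 7.0, 9.0], 2)
quadratic = [float(x * x) for x in range(7)]
smoothed = savgol5_quadratic(quadratic)
```

A second-order filter leaves a purely quadratic signal unchanged at interior points, which is a convenient sanity check on the weights.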
Following data processing, a method of normalization was adopted to remove the absolute intensity information from the spectra that might be affected by many unavoidable experimental factors. In the case of fluorescence and diffuse reflectance, the spectrum from each site of a sample was normalized with respect to the integrated intensity from that site. In the case of Raman, each spectrum was normalized to its mean spectral intensity across all Raman bands.
The set of spectral data normalized in the manner just discussed was used for subsequent data analysis. For combined fluorescence and diffuse reflectance, the respective area-normalized spectra from each tissue site were concatenated end-to-end to form a single column vector. These column vectors were then further concatenated in rows to form the input data matrix (for all the tissue samples investigated).
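As a rough illustration of the normalization and concatenation steps, the sketch below assumes uniformly sampled spectra, so that a plain sum stands in for the integrated intensity, and stores one tissue site per row. The function names and the toy intensities are hypothetical and are not taken from the study.

```python
def area_normalize(spectrum):
    """Divide each intensity by the integrated intensity of the spectrum.

    Assumption: uniform sampling, so the sum approximates the integral.
    """
    total = sum(spectrum)
    if total == 0:
        raise ValueError("spectrum has zero integrated intensity")
    return [v / total for v in spectrum]


def mean_normalize(spectrum):
    """Divide each intensity by the mean intensity (the Raman case)."""
    mean = sum(spectrum) / len(spectrum)
    if mean == 0:
        raise ValueError("spectrum has zero mean intensity")
    return [v / mean for v in spectrum]


def combined_vector(fluorescence, reflectance):
    """Concatenate the two area-normalized spectra of one site end to end."""
    return area_normalize(fluorescence) + area_normalize(reflectance)


def build_data_matrix(sites):
    """Stack one combined vector per tissue site (one site per row here)."""
    return [combined_vector(f, r) for f, r in sites]


# Toy (fluorescence, reflectance) pairs, invented for illustration.
sites = [
    ([1.0, 3.0, 4.0], [2.0, 2.0]),
    ([2.0, 2.0, 4.0], [1.0, 3.0]),
]
matrix = build_data_matrix(sites)
raman = mean_normalize([2.0, 4.0, 6.0])
```

Normalizing each modality before concatenation keeps one spectrum from dominating the combined vector purely because of its absolute intensity scale.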
Figure 3 shows a flow chart of the diagnostic algorithm that was developed to analyze the breast tissue fluorescence, diffuse reflectance, and Raman spectra. The algorithm development was described previously25 and consisted of two steps: (i) extraction of diagnostic features from the spectra using nonlinear maximum representation and discrimination feature (MRDF)28 and (ii) development of a probabilistic scheme of classification based on sparse multinomial logistic regression (SMLR)29 for classifying the nonlinear features into corresponding tissue categories. Each step is described in detail in the following.
Given a set of input data comprising samples from different classes with a given dimensionality, nonlinear MRDF28 aims to find a set of nonlinear transformations of the input data that optimally discriminate between the different classes in a reduced dimensionality space. It invokes nonlinear transforms, in this case restricted order polynomial mappings of the input data,28 in two successive stages. In the case of the present spectral data, the aim of nonlinear MRDF is to compute a set of nonlinear transformation vectors from the spectra of breast tissue sites, whose dimensionality equals the number of wavelengths over which spectra were recorded, such that the projections of the input data from the different tissue categories onto these vectors are statistically well separated from each other. In the first stage, the input spectral data (normalized intensities at the wavelengths of the spectra) from each tissue type are raised to a chosen power to produce the associated nonlinear input vectors, which are then subjected to a transform that yields the first-stage output features in a nonlinear feature space of reduced dimension. In the second stage, these reduced-dimension output features for each tissue type are raised to a second power to produce higher-order features, and a second transform is computed so as to yield the final output features in the final nonlinear feature space. Because the nonlinearities introduced in the two stages are different, this is expected to produce more general nonlinear transforms on the input spectral data, leading to improved separation of the final nonlinear features for the tissue categories in the new feature space. Thus MRDF automatically finds a closed form solution for the best set of nonlinear transforms.
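The two-stage structure can be pictured with the forward mapping alone. The sketch below is a simplified illustration under stated assumptions: the powers `p1` and `p2` and the matrices `transform1` and `transform2` are hypothetical placeholders, and the closed-form computation of the transforms that nonlinear MRDF performs is not shown.

```python
def elementwise_power(vector, p):
    """Raise every component of the vector to the power p."""
    return [v ** p for v in vector]


def project(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def two_stage_features(spectrum, p1, transform1, p2, transform2):
    """Apply power p1 and transform1, then power p2 and transform2."""
    stage1 = project(transform1, elementwise_power(spectrum, p1))
    return project(transform2, elementwise_power(stage1, p2))


# Placeholder values, invented for illustration (integer powers assumed).
features = two_stage_features(
    spectrum=[1.0, 2.0],
    p1=2,
    transform1=[[1.0, 0.5]],  # maps 2 inputs to 1 first-stage feature
    p2=3,
    transform2=[[2.0]],       # maps 1 first-stage feature to 1 final feature
)
```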
SMLR29 is a probabilistic multiclass classification model based on the sparse Bayesian machine-learning framework of statistical pattern recognition. The central idea of SMLR is to separate a set of labeled input data into its constituent classes by predicting the posterior probabilities of their class membership. It computes the posterior probabilities using a multinomial logistic regression model and constructs a decision boundary that separates the data into its constituent classes based on the computed posterior probabilities following Bayes’ rule. Classification of a given set of input data is based on the vector of posterior probability estimates yielded by the SMLR algorithm: each input is assigned to the class for which its posterior probability is the highest.
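The decision rule can be illustrated with a plain multinomial logistic (softmax) model. This is a sketch only: the weight vectors and feature values below are invented, and the sparse training procedure that gives SMLR its name is not shown.

```python
import math

CLASSES = ["IDC", "DCIS", "FA", "normal"]


def posterior_probabilities(features, weights):
    """Multinomial logistic posteriors, one per class.

    weights holds one weight vector per class, each as long as features.
    """
    scores = [sum(w * x for w, x in zip(wc, features)) for wc in weights]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights):
    """Return the class with the highest posterior and all posteriors."""
    probs = posterior_probabilities(features, weights)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


# Hypothetical weights and features, for illustration only.
weights = [[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
label, probs = classify([1.0, 0.5], weights)
```

The posterior of the winning class doubles as a measure of how certain the assignment is, which is how the posterior probabilities are used later in the results.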
An important task following development of the diagnostic algorithm was to evaluate its classification ability in an unbiased way through cross-validation. Because we had a limited number of spectra in each diagnostic category from a limited number of samples, the cross-validation of the algorithm was performed using the leave-one-sample-out method. In this method, the algorithm was trained using $N - 1$ samples (where $N = 74$ samples), and the test was carried out using the excluded sample.30 This was repeated $N$ times, each time excluding a different sample. Thus, training was achieved using, in a sense, all samples, and at the same time independence between the training and test sets was maintained.
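A minimal sketch of leave-one-sample-out splitting is given below, assuming each spectrum is tagged with the identifier of the tissue sample it came from. The identifiers and the function name are hypothetical. All sites from the held-out sample go to the test set together, so no sample contributes to both training and testing in the same fold.

```python
def leave_one_sample_out(sample_ids):
    """Yield (held_out, train_indices, test_indices), one fold per sample."""
    for held_out in sorted(set(sample_ids)):
        train = [i for i, s in enumerate(sample_ids) if s != held_out]
        test = [i for i, s in enumerate(sample_ids) if s == held_out]
        yield held_out, train, test


# Toy example: six measurement sites drawn from three samples.
sample_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_sample_out(sample_ids))
```

Splitting by sample rather than by site matters here because several sites come from the same piece of tissue and are unlikely to be statistically independent.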
Multiclass receiver operating characteristic analysis
To quantitatively compare the relative performance of the diagnostic algorithms developed for fluorescence, diffuse reflectance, combined fluorescence and diffuse reflectance, and Raman spectral data sets, a multiclass receiver-operating characteristic (ROC) analysis was carried out on the classification results yielded by the corresponding algorithms. The formulation developed by Hand and Till31 was followed for this purpose. The formulation extends the two-class ROC analysis in a straightforward way to the multiclass case and computes a generalized metric indicative of the overall performance of a given multiclass diagnostic algorithm. Given $c$ classes, the Hand and Till measure (HTM) is the average of the pairwise areas under the ROC curves between pairs of classes:
$$\mathrm{HTM} = \frac{2}{c(c-1)} \sum_{i<j} A(i,j),$$
where $A(i,j)$ is the area under the two-class ROC curve involving classes $i$ and $j$. The summation is calculated over all pairs of distinct classes, irrespective of order. Similar to the two-class case, the closer the HTM is to 1, the more accurate the corresponding diagnostic algorithm is.
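The Hand and Till measure can be sketched in a few lines. In the sketch below, the pairwise area for two classes is taken as the mean of the two one-directional areas, each computed from the rank-based (Mann-Whitney) estimate, following the Hand and Till formulation. The function names, class labels, and posterior values are invented for illustration and are not data from this study.

```python
from itertools import combinations


def auc(pos_scores, neg_scores):
    """Probability that a positive outranks a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hand_till_measure(labels, posteriors, classes):
    """Average pairwise AUC over all unordered pairs of distinct classes."""
    pair_aucs = []
    for i, j in combinations(classes, 2):
        # Rank sites of classes i and j by the posterior for class i ...
        a_ij = auc([p[i] for l, p in zip(labels, posteriors) if l == i],
                   [p[i] for l, p in zip(labels, posteriors) if l == j])
        # ... and by the posterior for class j, then average the two areas.
        a_ji = auc([p[j] for l, p in zip(labels, posteriors) if l == j],
                   [p[j] for l, p in zip(labels, posteriors) if l == i])
        pair_aucs.append((a_ij + a_ji) / 2.0)
    return sum(pair_aucs) / len(pair_aucs)


# Toy three-class example with perfectly separated posteriors.
labels = ["a", "a", "b", "b", "c", "c"]
posteriors = [
    {"a": 0.80, "b": 0.10, "c": 0.10},
    {"a": 0.70, "b": 0.20, "c": 0.10},
    {"a": 0.20, "b": 0.70, "c": 0.10},
    {"a": 0.10, "b": 0.60, "c": 0.30},
    {"a": 0.10, "b": 0.20, "c": 0.70},
    {"a": 0.05, "b": 0.15, "c": 0.80},
]
measure = hand_till_measure(labels, posteriors, ["a", "b", "c"])
```

Averaging over the number of pairs is equivalent to the usual normalization, since there are c(c-1)/2 unordered pairs among c classes.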
In vitro fluorescence, diffuse reflectance, and Raman spectroscopic measurements were carried out on a total of 74 different tissue samples from 74 different patients. Optical spectra were acquired from 293 unique tissue sites on these samples. The details of the histopathological distribution of the tissue sites are summarized in Table 1 .
Histological distribution of the tissues.
|Category||No. of Spectra||No. of Tissue Samples|
|Invasive ductal carcinoma (IDC)||86||25|
|Ductal carcinoma in situ (DCIS)||18||6|
|Fibroadenoma (FA)||55||11|
|Normal (adipose, glandular)||134||32|
Figures 4, 5, 6 show the average normalized fluorescence, diffuse reflectance, and Raman spectra for IDC (86), DCIS (18), FA (55), and normal breast tissues (134), with the error bars representing the spectral standard deviations. From the figures, it is evident that the variation in the measured spectral intensity is comparable for all the tissue types in all three sets of optical spectra. The percentage variation in the spectral intensities from the different measurement sites was observed to lie in the range of over the respective number of tissue sites included in the four histopathological categories for all the three sets of spectra. Here, is the mean intensity value from different measurement sites of one category and is one standard deviation.
For comparison of spectral differences among the different tissue types, the average fluorescence, diffuse reflectance, and Raman spectra are plotted without error bars in Figs. 7, 8, 9 . One can see that although fluorescence from FA tissue is visibly different than that from the rest of the tissue types throughout the spectral region, the differences among IDC, DCIS, and normal breast tissues are fairly subtle except in the band region where the band intensities show prominent differences. Similarly, although the average reflectance spectra show visible differences among all the tissue types in general, almost no difference is seen between IDC and normal breast tissue in the wavelength region. In contrast, the differences in the average Raman spectra appear to be somewhat more pronounced among all the tissue types across all the major Raman bands present over the entire wave number region.
Tables 2, 3, 4, 5 show the diagnostic results in the form of confusion matrices comparing the pathological diagnosis with that of the MRDF–SMLR–based spectroscopic diagnostic algorithms. In all instances, the classification results were obtained based on leave-one-sample-out cross-validation of the entire data set. One can see that diffuse reflectance data alone achieved an overall classification accuracy of 72% (211 out of 293). Its best performance was in classifying FA tissue (85% accuracy), though it fared worse in classifying other tissue types, and errors were spread among the various classes. Fluorescence data alone provided the worst overall accuracy, correctly classifying only 209 measurement sites (71%). It proved most adept at classifying normal breast tissues, though accuracy in that diagnosis was still only 85%. When fluorescence and diffuse reflectance spectra were combined, the diagnostic accuracy improved substantially, with 247 out of 293 (84%) sites classified correctly. Normal tissues were correctly classified in 86% of cases, and DCIS and FA were classified correctly at 89% and 98% of the sites, respectively. The algorithm still struggled somewhat to classify IDC, but, in general, misclassifications appeared less random than with either modality alone. With Raman spectra as the input data, the algorithm discriminated all four classes very well, correctly classifying 290 out of 293 (99%) sites. Normal tissues, which include both fatty and glandular, were classified correctly at every site, and only one site each of IDC, DCIS, and FA was misclassified.
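The overall accuracies quoted above follow directly from the counts of correctly classified sites out of 293. The short check below reproduces that arithmetic; the variable names are illustrative only.

```python
TOTAL_SITES = 293

# Correctly classified sites per modality, as reported in the text.
correct_sites = {
    "diffuse reflectance": 211,
    "fluorescence": 209,
    "combined fluorescence and diffuse reflectance": 247,
    "Raman": 290,
}

# Overall accuracy as a percentage, rounded to the nearest whole number.
accuracy_percent = {
    name: round(100 * count / TOTAL_SITES)
    for name, count in correct_sites.items()
}
```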
Confusion matrix displaying classification of breast tissues using MRDF-SMLR–based algorithm with fluorescence spectra.
|Pathology Diagnosis (no. of sites)||Fluorescence Diagnosis|
Confusion matrix displaying classification of breast tissues using MRDF-SMLR–based algorithm with diffuse reflectance spectra.
|Pathology Diagnosis (no. of sites)||Diffuse Reflectance Diagnosis|
Confusion matrix displaying classification of breast tissues using MRDF-SMLR–based algorithm with combined fluorescence and diffuse reflectance spectra.
|Pathology Diagnosis (no. of sites)||Combined Fluorescence and Diffuse Reflectance Diagnosis|
Confusion matrix displaying classification of breast tissues using MRDF-SMLR–based algorithm with Raman spectra.
|Pathology Diagnosis (no. of sites)||Raman Diagnosis|
In addition to assigning class labels, all the diagnostic algorithms also yielded posterior probabilities of the measured tissue sites belonging to each breast tissue type. Figures 10a to 10d illustrate these posterior probabilities computed by the four different algorithms. The posterior probabilities are indicative of the certainty of classification, and they are plotted for all the different tissue sites included in each type of tissue. It is apparent from the figures that although more than 95% of the correctly classified tissue sites in each type have a posterior probability with the algorithm based on Raman spectra of breast tissues, a significant fraction (up to ) of these sites classified correctly using the algorithm based on combined fluorescence and diffuse reflectance spectra are seen to have a posterior probability . Similarly, although the accuracy obtained in correctly classifying FA and normal tissue sites is with the algorithm based on either fluorescence or diffuse reflectance spectra alone, only of these tissue sites are seen to have a posterior probability .
The multiclass ROC analyses of the classification results provided a quantitative evaluation of the overall performance of the diagnostic algorithms. Table 6 lists the HTM values obtained for the algorithms based on fluorescence, diffuse reflectance, combined fluorescence and diffuse reflectance, and Raman spectra. The estimated HTM value of the algorithm based on fluorescence spectra alone is 0.83, and those for the algorithms based on diffuse reflectance, combined fluorescence and diffuse reflectance, and Raman spectra are 0.88, 0.95, and 0.99, respectively. It is important to mention here that the HTM value is a quantitative measure of the gross performance of an algorithm and that the HTM for an ideal diagnostic algorithm has a value of 1.
HTM values corresponding to the diagnostic algorithms based on autofluorescence, diffuse reflectance, combined autofluorescence and diffuse reflectance, and Raman spectra of breast tissues.
|Fluorescence||Diffuse Reflectance||Combined Autofluorescence and Diffuse Reflectance||Raman|
|0.83||0.88||0.95||0.99|
Most of the studies reported to date on the application of optical spectroscopy for breast cancer detection 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 have used one of three spectroscopic techniques: fluorescence, diffuse reflectance, or Raman scattering. However, a comprehensive, side-by-side evaluation of the relative efficacies of these different methods has not been addressed in the literature. The goal of the present study is to evaluate and compare the relative capabilities of fluorescence, diffuse reflectance, combined fluorescence and diffuse reflectance, and Raman spectroscopy for simultaneously discriminating the different histopathologic categories of human breast tissues. Such an evaluation is important because it may help choose the optimal modality for a given diagnostic problem. In the present study, fluorescence, diffuse reflectance, and Raman spectra were acquired ex vivo from human breast tissue samples belonging to four distinct histopathologic categories: IDC, DCIS, FA, and normal. A probability-based multivariate statistical algorithm capable of direct multiclass classification was developed to analyze the diagnostic content of these different sets of optical spectra measured sequentially from the same set of breast tissue sites.
The primary basis for optical detection using spectroscopic techniques is an array of biochemical changes that take place as tissue undergoes neoplastic transformations. For example, IDC, FA, and normal breast tissues are known to show variable amounts of collagen, and elastin, which is reflected in the fluorescence intensity of the band characteristic of these connective tissue proteins.4, 5, 6 Similarly, the differences in the concentrations and oxidation states of coenzymes such as nicotinamide adenine dinucleotide (NADH) and flavin adenine dinucleotide due to differences in metabolic activities in normal and neoplastic breast tissues contribute to the changes in the fluorescence intensity of the broad band believed primarily to be due to these fluorophores.4, 5, 6 Some of the changes found in the fluorescence spectra of normal and abnormal breast tissues are also the result of changes in the wavelength-dependent absorption and scattering properties of tissues.4, 5, 6 However, these changes can be seen to be more prominent in the diffuse reflectance spectra of the corresponding tissue types, as diffuse reflectance provides a direct measurement of tissue absorption as well as scattering.8, 15 For example, the several dips found in the diffuse reflectance spectra of breast tissues represent the signatures of absorption by oxygenated and deoxygenated hemoglobin,8, 15 which are major absorbers present in blood and have structured absorption bands spanning nearly the entire visible and near-infrared region.4, 5, 6, 8 The differences in absorption properties between normal and abnormal breast tissues, known to be caused primarily by hemoglobin, are clearly seen in the measured diffuse reflectance spectra that show significant variation in spectral line shapes for the corresponding tissue types. 
On the other hand, Raman spectroscopy probes the vibrational energy levels of molecules, and specific peaks in the Raman spectrum correspond to particular chemical bonds or bond groups.7 Because of Raman’s chemical specificity, it has the ability to discern the slight biochemical changes associated with neoplastic transformation.7 For example, the spectral variations between the different breast tissue pathologies observed at 1000 to 1150, 1170, 1200 to 1345, 1440, and correspond to biochemical differences inherent in the different breast tissues, notably connective tissue proteins, and fatty acids.7, 17 The normal breast tissue spectra are noticeably different as compared with all other categories and are dominated by Raman bands characteristic of fatty acids (1650, 1440, and ), whereas the intensities of these lipid-specific bands are much reduced in the Raman spectra of IDC, DCIS, and FA, implying a relative increase in protein content in these tissue types. In contrast, differences among IDC, DCIS, and FA are found to be subtler: the ratio of the bands between (tryptophan, phenylalanine, amide III: proteins) and ( bending: proteins, lipids) varies with tissue types, and this variation is different between cancerous and noncancerous breast tissues. Similarly, the ratio of peaks at (amide I: proteins, lipids) and the small peak near (tryptophan) is different in the IDC and FA than in DCIS. Other variations include changes in the band patterns and intensities between 1000 and (tyrosine, proline, phenylalanine, proteins) evident as a function of pathology.
It is relevant to note here that although all the spectroscopic techniques are seen to lead to many observable spectral differences between different breast pathologies, it is more important to see the significance of these variations toward pathological classification. A gross comparison of the classification results (see Tables 2, 3, 4, 5) yielded by the four diagnostic algorithms clearly indicates that fluorescence alone is the least capable of accurately discriminating among the four histopathologic categories of breast tissues based on their measured spectra at excitation, and that the performance of diffuse reflectance alone is only slightly better. Although the results are improved to a large extent when fluorescence and diffuse reflectance spectra are combined, the performance of Raman spectroscopy alone is seen to be superior to that of all the other techniques employed. The large improvement in classification performance of the Raman-based algorithm likely originates from the greater diagnostic content of the tissue Raman spectra in comparison with that from either fluorescence or diffuse reflectance spectra. Although diffuse reflectance primarily probes the absorption and scattering properties of tissue,8, 15, 16 fluorescence, in principle, has the additional advantage of biochemical specificity that arises from the fact that tissue contains several intrinsic fluorophores that have their characteristic fluorescence emission.4, 5, 6 However, the significantly broad and overlapping emission profiles of these fluorophores make the resulting tissue fluorescence spectra (which are a superposition of the spectra of the constituent fluorophores modulated by tissue optical properties) mostly flat and featureless,4, 5, 6 thus making it difficult, in practice, to fully exploit this advantage.
In contrast, the Raman spectrum of a tissue consists of relatively narrow bands characteristic of specific molecular vibrations of a much larger number of Raman-active biochemicals present in tissue,7 thus allowing one to obtain molecular information about a tissue in greater detail than with fluorescence or diffuse reflectance.
A critical evaluation of the diagnostic results listed in Tables 2, 3, 4, 5 reveals some points worth noting. Both fluorescence and diffuse reflectance consistently misclassify 30% to 45% of IDC tissue sites as normal, and the situation does not improve much even with the combined approach, which still leads to a 25% misclassification rate. In contrast, the classification accuracy of Raman alone is 99% (85 out of 86 IDC tissue sites). The likely reason for the inferior classification performance of fluorescence and diffuse reflectance becomes apparent only when one critically examines the physical appearance of all the tissue samples investigated. Most of the malignant tissue sites that were misclassified as normal belonged to IDC samples that had thin layers of fat on their surfaces. It is known that both fluorescence and diffuse reflectance with UV and visible light excitation can probe only the superficial tissue layer,4, 5, 6 whereas Raman with near-IR excitation can probe to a much greater depth inside tissue.7 For such samples, the fluorescence and diffuse reflectance signals would therefore be expected to originate largely from the surface fat rather than from the underlying tumor. On a similar note, a total of of the normal tissue sites are consistently misclassified as IDC, FA, or DCIS by either fluorescence or diffuse reflectance, and the combined approach brings no apparent improvement in the discrimination of normal tissue sites. Although the majority of the normal breast tissue samples investigated were predominantly fatty, a few of them were glandular or fibrous as well. The normal breast tissue sites that were consistently misclassified belonged to the samples that had more glandular or fibrous than fatty tissue. On the other hand, the DCIS tissue sites, which show very poor classification accuracy with either fluorescence or diffuse reflectance alone, have a much improved accuracy of 89% (16 out of 18) with the combined approach. 
However, in this case, the performance of Raman is only marginally better, with 17 out of 18 classified correctly (94% accuracy).
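The percentages quoted above follow directly from the per-class site counts. As a minimal arithmetic check, the sketch below recomputes them from the counts given in the text; the function name and dictionary labels are illustrative and do not come from the study.

```python
def accuracy_percent(correct, total):
    """Return per-class classification accuracy as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# (correctly classified sites, total sites) as quoted in the text.
counts = {
    "IDC, Raman": (85, 86),
    "DCIS, combined fluorescence and diffuse reflectance": (16, 18),
    "DCIS, Raman": (17, 18),
}

# Round to the nearest whole percent, as the text does.
rounded = {
    label: round(accuracy_percent(correct, total))
    for label, (correct, total) in counts.items()
}

for label, percent in rounded.items():
    print(f"{label}: {percent}%")
```

With only 18 DCIS sites, a single site changes the accuracy by more than five percentage points, which is why the difference between 16 and 17 correct classifications is described as marginal.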
Recently, Palmer et al.15 compared diffuse reflectance and fluorescence spectroscopy for ex vivo characterization of different breast pathologies. They used PCA to reduce the dimensionality of the spectral data and a linear SVM to classify the resulting diagnostically relevant principal components. Even though their PCA-SVM algorithm discriminated malignant from nonmalignant breast tissues with a sensitivity and specificity of 70% and 92%, respectively, based on autofluorescence spectra alone, it provided a sensitivity of only 30% and a specificity of 78% when only the diffuse reflectance spectra of the same set of breast tissue samples were used for discrimination analysis. Using fluorescence and diffuse reflectance spectra in combination, however, made the algorithm perform much better than with diffuse reflectance alone, though the sensitivity and specificity values were the same as those obtained using autofluorescence alone. Although a direct comparison is not possible, mainly because of differences in experimental and data analysis methods, the observations of Palmer et al. are broadly consistent with ours. For example, in our case, the maximum sensitivity and specificity achieved by the algorithm based on combined autofluorescence and diffuse reflectance in discriminating malignant from nonmalignant breast tissue sites were 72% and 89%, respectively, whereas the algorithm based on fluorescence spectra alone provided a poorer sensitivity and specificity of 58% and 77%, respectively. It is pertinent to note here that Palmer et al.15 used fluorescence spectra at nine excitation wavelengths ( in increments) as input to their diagnostic algorithm, whereas we used fluorescence spectra at only excitation for our algorithm development. 
Use of multiple excitation wavelengths makes it possible to probe most of the tissue fluorophores that have characteristic emission over the complete UV-visible wavelength region, as compared with the fewer fluorophores (primarily NADH and flavins) that can be probed with the wavelength. The possibility of incorporating a much larger number of diagnostically relevant spectral features in multiexcitation than in single-excitation fluorescence is most likely the reason for the superior classification performance of their multiexcitation fluorescence-based algorithm compared with ours, which is based on single excitation.
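The dimensionality-reduction step mentioned above can be illustrated with a minimal PCA sketch. This is not the implementation used in either study: the data are synthetic random numbers standing in for spectra, the array sizes are hypothetical, and the subsequent significance testing and SVM classification are omitted.

```python
import numpy as np


def pca_scores(spectra, n_components):
    """Project mean-centered spectra onto their leading principal components.

    spectra: array of shape (n_sites, n_wavelengths), one spectrum per row.
    Returns the component scores and the fraction of variance each explains.
    """
    centered = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


# Synthetic stand-in data; sizes are hypothetical, not those of the study.
rng = np.random.default_rng(0)
n_sites, n_wavelengths = 40, 200
spectra = rng.normal(size=(n_sites, n_wavelengths))

scores, explained = pca_scores(spectra, n_components=5)
print(scores.shape)  # one row of 5 component scores per tissue site
```

In a multiexcitation setting, the spectra from all excitation wavelengths could be concatenated along the wavelength axis before this step, which enlarges the pool of features from which the components are built.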
Although earlier studies by several research groups17, 18, 19, 20 demonstrated the applicability of Raman spectroscopy for breast cancer diagnosis, a series of systematic studies carried out recently by the Feld group21, 22, 23, 24 has provided strong evidence that dispersive Raman spectroscopy holds promise as a diagnostic tool for real-time assessment of various breast pathologies. The diagnostic algorithm they developed,22 based on chemical and morphological models, shows excellent accuracy in classifying ex vivo breast tissue samples according to their specific pathologic diagnoses, attaining 94% sensitivity and 96% specificity for distinguishing cancerous breast tissues from normal and benign tissues.23 One may note here that although there are intrinsic differences between the MRDF-SMLR–based diagnostic algorithm25 employed by us and the spectroscopic model–based algorithm22 used by them for the discrimination analysis of the breast tissue Raman spectra, the discrimination results in both cases are comparable, with classification accuracy consistently in the range of 90% to 100%. This is perhaps not unexpected given the wealth of information embedded in the Raman spectra of breast tissues that can serve as a basis for diagnosis.7
It is pertinent to mention here that in addition to investigating the capability of fluorescence, diffuse reflectance, combined fluorescence and diffuse reflectance, and Raman, we also investigated a scheme in which all three sets of spectra were combined to develop a diagnostic algorithm. This was done to explore whether this kind of trimodal spectroscopy can yield better classification. This algorithm did not, however, improve on the classification results already obtained using the Raman-based diagnostic algorithm alone. This perhaps shows that blindly adding together the spectral data of all three spectroscopic techniques is not the way to improve accuracy for a given diagnostic problem.
Another important point worth considering is that when it comes to distinguishing only normal from nonnormal tissues, both Raman and combined fluorescence and diffuse reflectance have the potential to provide such discrimination reasonably accurately. For example, the diagnostic results in Table 4 show that combined fluorescence and diffuse reflectance spectroscopy can discriminate tumor from normal breast tissues with sensitivity and specificity of 83% and 85%, respectively. If the objective is only to delineate tumor from normal breast tissues, as may be required for certain clinical procedures, the combined fluorescence and diffuse reflectance approach can serve as a method of choice. This technique can also be modified, for example by adding polarization optics to provide depth-dependent information,32 to perhaps overcome the limitations of fluorescence and diffuse reflectance discussed previously. A further incentive for using this approach is that, given currently available instruments, combined fluorescence and diffuse reflectance is a significantly stronger candidate than Raman for imaging techniques. In many applications, such as evaluating margin status during breast conservative therapy, it is highly desirable to move from point spectroscopy to an imaging-based system that gathers information from an entire tissue surface much more quickly. At this point, one can perhaps think of exploiting the advantages of all three spectroscopic techniques in one setting by pairing imaging based on combined fluorescence and diffuse reflectance with Raman-based point spectroscopy if the goal is to cover a larger area while accurately distinguishing normal from nonnormal tissue.
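To make the normal versus nonnormal reading of a four-category result concrete, the sketch below collapses a four-class confusion matrix into a binary sensitivity and specificity. The counts are hypothetical and chosen only for illustration; they are not the values in Table 4, and the function name and row/column convention are assumptions.

```python
CLASSES = ["normal", "FA", "DCIS", "IDC"]


def normal_vs_nonnormal(confusion):
    """Collapse a 4-class confusion matrix into binary sensitivity/specificity.

    confusion[i][j] is the number of sites of true class CLASSES[i]
    that the algorithm assigned to class CLASSES[j].
    A nonnormal site counts as detected if it is assigned to any
    nonnormal class, even if the specific pathology is wrong.
    """
    normal = CLASSES.index("normal")
    true_neg = confusion[normal][normal]
    false_pos = sum(confusion[normal]) - true_neg
    nonnormal_rows = [row for i, row in enumerate(confusion) if i != normal]
    false_neg = sum(row[normal] for row in nonnormal_rows)
    true_pos = sum(sum(row) for row in nonnormal_rows) - false_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT the data from Table 4.
example = [
    [40, 5, 2, 3],   # true normal
    [4, 30, 3, 3],   # true FA
    [1, 2, 14, 3],   # true DCIS
    [10, 3, 2, 35],  # true IDC
]

sensitivity, specificity = normal_vs_nonnormal(example)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Under this convention, confusions among FA, DCIS, and IDC do not lower the binary figures, which is one reason a technique can delineate tumor from normal tissue reasonably well while still performing poorly on the full four-category problem.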
The authors would like to thank Ms. Evelyn Okediji for help with preparing the tissue slides for pathology. The authors acknowledge the financial support of the NCI SPORE in Breast Cancer Pilot Project (5P50 CA098131-03).