Skin cancers are the most prevalent form of human cancer; their incidence is nearly equal to that of all other cancers combined.1 Nonmelanoma skin cancers, including basal cell carcinomas (BCC) and squamous cell carcinomas (SCC), present more than one million cases each year in the United States alone.1 Melanomas contribute another cases each year.1, 2 As in most cancers, early diagnosis is crucial for a favorable prognosis.
Recognition of skin cancers typically relies on visual inspection and patient history. Although skin presents an easy site for inspection, diagnosis of skin cancers is difficult, as many benign lesions visually resemble malignancies. Thus, accurate clinical diagnosis relies on biopsy and subsequent histopathologic examination. Clinicians are presented with the challenge of deciding which and how many skin lesions to biopsy, usually relying on visual inspection and palpation. Furthermore, as treatment depends on the pathology of the lesion and the extent of its margins, time-consuming histopathologic confirmation of lesion malignancy is required for proper treatment. Despite many efforts, there is a continued need for an automated, noninvasive diagnostic method of characterizing skin lesions in situ that can direct biopsies and, ultimately, circumvent the need for histology. Such a technique could streamline and combine diagnostic and therapeutic procedures in a single visit, saving time and expense for both the clinician and the patient.
Optical methods present a viable approach for providing automated, noninvasive, real-time diagnosis of skin lesions, as well as guidance of therapy. There has been much interest in recent years in developing “optical biopsy” methods for tissue diagnosis. Several optical techniques have been applied in a variety of organ systems, with varying degrees of success.
A number of optical imaging modalities have been employed to study skin structures and malignancy, including optical coherence tomography (OCT), laser scanning microscopy, and polarized light imaging. Welzel et al. obtained OCT images from various skin lesions and were able to differentiate skin layers and morphological changes.3 Knüttel utilized OCT to measure refractive index and scattering differences in skin structures.4 However, OCT typically provides axial sections that are based on morphological structures alone and requires expert interpretation. Confocal reflectance imaging, on the other hand, typically provides depth-resolved transverse sections of tissue that are again based on tissue morphology. This method has been used to study a number of skin conditions and diseases.5, 6, 7, 8 In particular, Rajadhyaksha constructed a portable confocal microscope for in vivo skin imaging with positional accuracy within .9 Thus, confocal imaging and OCT both provide in vivo morphological images that still require expert interpretation. Both imaging methods provide little quantitative information and no biochemical information with respect to the disease state.
Pellacani and Seidenari compared polarization microscopy with epiluminescence microscopy of pigmented skin lesions and found that the polarization revealed excellent contrast between morphological structures while obscuring the observed pigment colors in the lesions.10 Yaroslavsky used polarization imaging to accurately detect locations and shapes of dye-enhanced samples with BCC and SCC in vitro.11 Olivier found polarization imaging to be a superior alternative to clinical examination for determining necrotic regions of skin flaps.12 While these results show that polarization microscopy provides semiquantitative information about tissue state, it provides little biochemical information and lacks the capability for automated, graded tissue diagnosis.
Spectroscopic techniques, in contrast, are based on quantitative measurement of specific native tissue chromophores. Clinical application of these techniques can be performed by unskilled personnel, with automated, statistical algorithms providing objective correlation of measured data with pathological state. In the case of skin cancer diagnosis, perhaps the most often applied spectroscopic techniques have been fluorescence spectroscopy and Raman spectroscopy.
Fluorescence spectroscopy has been used to differentiate melanomas13 as well as nonmelanoma skin cancers (BCC and SCC) from normal skin.14 Fluorescence spectra could be correlated to the disease state, and some of these differences were attributed to tryptophan moieties related to the carcinomas. Doukas found a linear relationship between the fluorescence intensity at and epidermal keratinocyte proliferation.15 In another study, Gillies found this same relationship in psoriatic lesions and observed a clear distinction between the fluorescence signals of epidermis and dermis.16 These studies indicate that the biochemical constituents detected by fluorescence spectroscopy are capable of predicting morphological and pathologic changes. However, because there are relatively few autofluorescent biological markers, the fluorescence spectra provide a limited capability to perform graded diagnosis of skin lesions. Moreover, Lauridsen found pigmentation to be the greatest factor in the variability of fluorescence spectra from normal skin measured in vivo.17 Sandby-Møller also found that skin redness due to sun exposure affected the fluorescence spectra, while epidermal thickness had no effect.18 Intrinsic tissue parameters such as skin color can therefore modulate the measured fluorescence spectra, affecting the performance of fluorescence spectroscopy for skin cancer diagnosis.
Raman spectroscopy is an optical technique that probes the vibrational energy levels of molecules. In the Raman spectrum, specific peaks correspond to particular chemical bonds or bond groups. Because of its chemical specificity, Raman spectroscopy can discern the slight biochemical changes associated with malignant transformation, thereby aiding in differential diagnosis. Furthermore, when Raman spectroscopy is used in a confocal configuration, spectra can be acquired from various depths and locations within the tissue. Several groups have used Raman spectroscopy to study the molecular composition and biochemistry of normal skin. One in vivo study of normal skin showed that Fourier-transform (FT) Raman spectra vary as a function of hydration state and related collagen structure, while pigmentation was found to contribute very little to spectral differences.19 Caspers correlated confocal Raman spectra measured in vitro and in vivo with molecular composition and hydration of skin layers.20, 21, 22, 23 FT-Raman spectra have also been measured from normal and psoriatic lesions in vitro, and distinct differences due to lipid content and related keratin content were found.24 Hata found a relationship between Raman spectral differences and carotenoid concentrations and, therefore, cancerous and precancerous skin lesions.25 For skin cancer detection, Gniadecka used FT-Raman spectroscopy to characterize skin structures26 and to differentiate BCC from normal skin in vitro.27 Sigurdsson utilized surface-level FT-Raman spectra to discriminate BCC and melanoma from actinic keratosis, pigmented nevi, and normal skin tissue with sensitivity and specificity.28 Nuclear and tissue environmental changes between BCC and normal epidermal cells were found using confocal Raman microscopy by Short et al.29 Nijssen et al. used a confocal system to generate Raman maps of 15 skin sections and were able to diagnose BCCs with 100% sensitivity and 93% specificity.30 Together, these studies demonstrate the ability of Raman spectroscopy to detect subtle molecular transformations associated with skin and skin malignancy. Furthermore, a confocal approach has been shown to be essential in order to maximize Raman signal collection from relevant structures while avoiding interference from irrelevant sources. However, none of the published reports explores the capability of Raman spectroscopy to differentiate both melanomas and nonmelanomas (BCC and SCC) from normal skin toward clinical detection, or compares the diagnostic performance of depth-resolved Raman measurements with non-depth-resolved measurements such as those obtained from a simple fiber probe.
While skin is easily accessible for optical diagnostic techniques, its stratified composition, inherent variability between and within patients, and presentation of malignancy at various depths within the skin provide a unique challenge. Any optical diagnostic technique, then, must not only be able to discern normal skin from various pathological conditions, but also be spatially resolved in order to minimize the effect of extraneous contributors in the superficial layers. Due to its molecular specificity, clinically viable acquisition times ( seconds), and spatial resolution, confocal Raman spectroscopy was selected for differential skin cancer diagnosis. The goal of this study was to characterize the Raman signatures of normal and malignant skin and to develop a strategy for clinical implementation. Using an in-house built confocal Raman microspectrometer, spectra were acquired from various normal and malignant skin tissues in vitro at various depths, and the spectra were characterized using simple and multivariate statistical techniques. Logistic discrimination was performed to predict pathological classification of the samples, and preliminary analyses indicate the ability of confocal Raman spectroscopy to successfully differentiate the various types of skin lesions and suggest that further clinical studies are warranted.
Materials and Methods
All skin samples were obtained under a protocol approved by the Vanderbilt University Institutional Review Board (IRB). Each sample was obtained fresh-frozen from the Vanderbilt University Medical Center immediately after resection. Thirty-nine samples were obtained for this study, of which 17 were normal, 8 basal cell carcinoma (BCC), 7 squamous cell carcinoma (SCC), and 7 melanoma. Normal samples were obtained from either breast reduction or amputation procedures. Malignant samples were partial sections of lesions that had been surgically removed. All samples were maintained at until time of spectral study, at which point they were thawed in buffered saline. Because of their small size and harvest from within the resected tumor boundaries, gross tissue pathology from the parent lesion was used as the gold-standard for classification.
Confocal Raman Microscope
The confocal Raman microscope (CRM) used in this study is an in-house built system designed specifically for tissue interrogation and is shown in Fig. 1. The excitation source of the microscope is an external cavity diode laser (ECDL) built in-house, which includes a 150-mW diode at (DL8032, Sanyo Electric, Japan) and an 1800-line/mm, gold-coated, high-modulation holographic grating (ThermoRGL, Rochester, New York). The ECDL output beam is shaped by a weak cylindrical lens to minimize astigmatism, and an anamorphic prism pair transforms the elliptical cross section into a circular beam with Gaussian energy distribution. A longpass dichroic beamsplitter at (Omega Optical, Brattleboro, Vermont) separates the excitation beam path from the emitted Raman scatter. A second dichroic (hot) mirror (Edmund Industrial Optics, Barrington, New Jersey) separates both the excitation beam and the Raman scatter from the shortpass-filtered white light illumination source to allow concurrent Raman acquisition and white light viewing of the sample via a color video camera. An achromatic near-infrared-optimized microscope objective ( , Nachet, France) serves both to focus the excitation light on the sample and to collect the Raman scatter. A longpass edge filter at (Omega Optical, Brattleboro, Vermont) eliminates residual Rayleigh scatter. The Raman signal from the sample is focused into a core diameter, multimode optical fiber, which serves as the confocal aperture. The fiber is connected to a holographic imaging spectrograph (HoloSpec f/1.8i, Kaiser Optical Systems, Ann Arbor, Michigan) and liquid-nitrogen-cooled, back-illuminated, deep-depletion CCD (EEV1024EB, Roper Scientific, Trenton, New Jersey) for detection. In addition, a motorized microscope stage (4400RP, Conix Research, Springfield, Oregon) allows automatic positioning of the sample with mechanical resolution better than .
Axial and lateral optical resolution of the CRM was determined by the average full-width-at-half-maximum (FWHM) of the intensity profile of a small polystyrene bead as it was repeatedly scanned through the beam focus in the three respective dimensions. This produced an axial resolution of and lateral resolutions of 2.4 and in the and directions, respectively. The axial resolution was deliberately increased over true diffraction-limited resolution (theoretically to with objective) in order to guarantee tissue-level measurement volumes ( to 3 cells) while maintaining volume resolution and increasing overall collection efficiency. The discrepancy between the lateral resolutions can be attributed to residual astigmatism in the excitation beam. The spectral resolution of the detection system was calculated to be , based on the holographic grating dispersion, slit width, and CCD pixel size.
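The FWHM-based resolution estimate described above can be illustrated with a minimal sketch. It assumes each bead scan is a single-peaked intensity profile sampled at known stage positions and uses linear interpolation at the half-maximum crossings; the function names `fwhm` and `mean_fwhm` are hypothetical and are not taken from the study.

```python
def fwhm(positions, intensities):
    """Full width at half maximum of a single-peaked profile (assumed shape)."""
    baseline = min(intensities)
    peak = max(intensities)
    half = baseline + (peak - baseline) / 2.0
    i_peak = intensities.index(peak)

    def crossing(indices):
        # Walk away from the peak until the profile drops below half maximum,
        # then interpolate linearly between the two bracketing samples.
        prev = i_peak
        for i in indices:
            if intensities[i] < half:
                x0, x1 = positions[i], positions[prev]
                y0, y1 = intensities[i], intensities[prev]
                return x0 + (half - y0) * (x1 - x0) / (y1 - y0)
            prev = i
        raise ValueError("profile does not fall below half maximum")

    left = crossing(range(i_peak - 1, -1, -1))
    right = crossing(range(i_peak + 1, len(intensities)))
    return right - left


def mean_fwhm(scans):
    """Average FWHM over repeated (positions, intensities) scans along one axis."""
    widths = [fwhm(pos, inten) for pos, inten in scans]
    return sum(widths) / len(widths)
```

Calling `mean_fwhm` once per scan axis would give one axial and two lateral estimates, in the same units as the stage positions.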
Data Acquisition and Processing
Raman spectra were recorded from the surface of each tissue specimen (determined by focusing the microscope on the tissue surface using a video image) and at increments below the surface, down to a depth of at least (determined by translation of the motorized stage). These depth measurements were made at various locations (2 to 5, depending on sample size) within each tissue specimen to ensure repeatability of the spectral data. All spectra were recorded using a 30-s integration time, with an excitation power of at the sample. The spectral dispersion of the system was characterized using the atomic emission lines of a lamp; wave number calibration was performed using naphthalene and acetamidophenol as standards. Spectra were binned to half the spectral resolution of the system for direct comparison. High-frequency readout noise and shot noise were removed from the spectra by a second-order Savitzky-Golay filter.31 Tissue autofluorescence was subtracted using an automated modified polynomial fitting method.32 To account for inherent variation in intra- and intersample absolute signal intensity, the spectra were normalized to their respective mean intensity across all wave numbers.
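Two of the processing steps above, smoothing and mean normalization, can be sketched as follows. This is an illustration only: the text specifies a second-order Savitzky-Golay filter but not its window length, so the 5-point window (and the choice to leave the two edge points on each side unsmoothed) is an assumption, as are the function names. The fluorescence subtraction step is omitted.

```python
# Standard 5-point quadratic Savitzky-Golay smoothing coefficients
# (window length is an assumption; the source gives only the filter order).
SG5_QUADRATIC = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol_5pt(spectrum):
    """Smooth a spectrum with a 5-point, second-order Savitzky-Golay filter."""
    smoothed = list(spectrum)  # edge points are left unchanged
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG5_QUADRATIC, window)) / SG5_NORM
    return smoothed


def normalize_to_mean(spectrum):
    """Divide each intensity by the mean intensity across all wave numbers."""
    mean = sum(spectrum) / len(spectrum)
    return [v / mean for v in spectrum]
```

A second-order filter reproduces any locally quadratic signal exactly, which is why it can suppress high-frequency noise while largely preserving Raman band shapes.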
For each sample, spectra from all measurement locations were averaged, and a mean spectrum was calculated for each depth. Outlier spectra were determined as those residing outside three standard deviations of the respective mean spectrum (i.e., from same sample, same depth, different locations), and mean spectra were recalculated with outliers removed. All further data analysis used only the mean spectral values from each sample, at each measured depth.
Integrated Raman spectral surrogates were created from the depth-resolved spectra to approximate spectra that would be obtained from a traditional, nonconfocal Raman probe. These integrated spectra were created by summing the intensity values of the depth-resolved spectra (processed as earlier but without any normalization) at each respective wave number across all measured depths and locations within each sample. While depth-related optical absorption was not mathematically factored into this integration, the depth resolution of the Raman microscope and the associated decrease in Raman intensity with increasing measurement depth allows a general approximation of non-depth-resolved spectra. Outliers, identified earlier, were excluded from integration. Each resultant spectrum (one for each tissue specimen) was then normalized to its mean intensity across all wave numbers to eliminate inherent intersample absolute intensity variation.
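The construction of an integrated spectrum can be sketched in a few lines. The sketch assumes the unnormalized, outlier-free spectra for one sample share a common wave number axis and are stored as equal-length lists; `integrated_spectrum` is a hypothetical name.

```python
def integrated_spectrum(depth_spectra):
    """Sum unnormalized spectra over all depths and locations of one sample,
    then normalize the sum to its mean intensity across all wave numbers."""
    n_points = len(depth_spectra[0])
    summed = [sum(s[k] for s in depth_spectra) for k in range(n_points)]
    mean = sum(summed) / n_points
    return [v / mean for v in summed]
```

Because the summation is done before normalization, depths with stronger absolute signal (typically the shallower ones) dominate the result, which is consistent with the approximation of a nonconfocal probe described above.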
Each measurement depth (and the integrated spectra) was evaluated independently to assess the effect of measurement depth on the pathological predictive capabilities of the Raman spectra, using leave-one-out cross-validation. The analysis technique employed has been described previously33, 34, 35 and consists of two steps: (1) extraction of diagnostic features from the spectra using the nonlinear maximum representation and discrimination feature (MRDF), and (2) development of a probabilistic scheme of classification based on linear sparse multinomial logistic regression (SMLR) for classifying the nonlinear features into corresponding tissue categories.
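The leave-one-out procedure can be sketched independently of the classifier. In the illustration below, a nearest-centroid rule stands in for the MRDF/SMLR pipeline purely to keep the example self-contained; it is not the method used in the study, and all names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier: assign x to the class with the closest mean vector."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, v in enumerate(features):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)


def leave_one_out(xs, ys, predict):
    """Hold out each sample once, train on the rest, and predict the held-out one."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        predictions.append(predict(train_x, train_y, xs[i]))
    return predictions
```

For the estimate to be unbiased, any feature extraction step would need to be refit inside each fold on the training samples only, so that the held-out spectrum never influences the transform applied to it.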
Given a set of input data comprising samples from different classes with a given dimensionality, nonlinear MRDF aims to find a set of nonlinear transformations on the input data that optimally discriminate between the different classes in a reduced dimensionality space. It uses nonlinear transforms that are polynomial mappings of the input data. In the case of spectral data, the aim of nonlinear MRDF is to compute nonlinear transformation vectors, , from -dimensional (where is the number of wave numbers over which spectra were recorded) spectra of skin tissues, such that the projections of the input data on from the different tissue categories are statistically well separated from each other. Since the dimensionality is much larger than the size of the data, a two-stage MRDF with restricted polynomial transform at each stage was used. In the first stage, the input data (intensities corresponding to wave numbers of the spectra) from each tissue type were raised to the power and subject to a transform to produce the first stage output features in the feature space of reduced dimension . In the second stage, the reduced -dimensional output features for each tissue type were further raised to the power and subject to a second transform to yield the final output features in the nonlinear feature space of dimension . Since the nonlinearities introduced in the two stages were different ( in the first stage, and in the second stage), this is expected to produce more general nonlinear transforms on the input spectral data leading to improved separation of the final nonlinear features for the tissue types in the new feature space.
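The structure of the two-stage mapping, a power followed by a linear transform at each stage, can be sketched as a forward pass. The powers and transform matrices below are placeholders supplied by the caller; in MRDF the transforms are computed from the training data to maximize class separation, and that optimization is not shown here. The function names are hypothetical.

```python
def project(matrix, vector):
    """Apply a linear transform given as a list of row vectors."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def two_stage_features(spectrum, power1, transform1, power2, transform2):
    """Two-stage restricted polynomial mapping (forward pass only).

    Stage 1: raise each intensity to power1, then project to a reduced space.
    Stage 2: raise each stage-1 feature to power2, then project again.
    """
    stage1_input = [v ** power1 for v in spectrum]
    stage1_output = project(transform1, stage1_input)
    stage2_input = [v ** power2 for v in stage1_output]
    return project(transform2, stage2_input)
```

Using different powers in the two stages makes the composite mapping a higher-order polynomial of the original intensities than either stage alone, while each stage only has to estimate a transform of manageable size.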
Classification with SMLR is a probabilistic multiclass model based on a sparse Bayesian machine-learning framework of statistical pattern recognition. The central idea of SMLR is to separate a set of labeled input data into its constituent classes by predicting the posterior probabilities of their class membership. It computes the posterior probabilities using a multinomial logistic regression model and constructs a decision boundary that separates the data into its constituent classes based on the computed posterior probabilities following Bayes’s rule. Classification of a given set of input data is based on the vector of posterior probability estimates yielded by the SMLR algorithm: each dataset (a transformation of the original spectrum) is assigned to the class for which its posterior probability is highest.
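The posterior computation and class assignment can be illustrated with a minimal multinomial logistic (softmax) sketch. It assumes one weight vector per class has already been learned; the sparse training procedure itself is not shown, and the names `posteriors` and `classify` and the example weights are hypothetical.

```python
import math


def posteriors(weights, features):
    """Multinomial logistic posteriors: P(class | x) proportional to exp(w_class . x)."""
    scores = {
        label: sum(w * f for w, f in zip(ws, features))
        for label, ws in weights.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def classify(weights, features):
    """Assign the class with the highest posterior probability."""
    probs = posteriors(weights, features)
    return max(probs, key=probs.get), probs
```

For example, `classify({"normal": [1.0, 0.0], "BCC": [0.0, 1.0]}, [2.0, 0.0])` returns the label `"normal"` together with the full posterior vector, which is the quantity plotted per sample in a posterior probability plot.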
Figure 2 shows typical Raman spectra (normalized to their respective mean intensity) obtained from each layer of a normal skin tissue specimen, alongside a hematoxylin and eosin (H&E) stained histological section of representative normal skin. Spectral variations between the different layers can be observed at 890 to 1030, 1170, 1230 to 1345, 1440, and , corresponding to biochemical differences inherent in the major skin strata, notably collagen, elastin, keratin, and lipids.23, 24, 36
The mean spectra of the various skin malignancies studied at each measured depth are presented in Fig. 3a. Numerous differences between the tissue pathologies can be observed at many depths. The melanoma spectra are noticeably different from those of the other categories at all depths. Differences between normal, BCC, and SCC are subtler: the ratio of the bands between 1200 to 1300 (tryptophan, phenylalanine, amide III: proteins) and ( bending: proteins, lipids) varies with depth, and this variation is different between normal and malignant tissues. The peak at (amide I: proteins, lipids) as well as the small peak near (tryptophan) are higher in the BCC and SCC than in the normal tissue. Other variations include changes in the band patterns between 850 and (tyrosine, proline, glucose, glycogen, phenylalanine, proteins) evident as a function of pathology and depth. Figure 3b shows the mean integrated spectra of the various malignant tissues. As expected, the spectral changes evident between the pathologies at each depth are not as easily visible in the integrated spectra. Figure 4 shows the mean normal, BCC, and SCC Raman spectra taken at a measurement depth of excluding the melanoma spectrum, allowing the subtle variations between these pathologies to be more evident. While there are many observable spectral differences, it is more relevant to explore the significance of these variations toward pathological classification.
In order to quantify the spectral variations of the different pathologies at each depth, both data reduction (MRDF) and classification (SMLR) algorithms were applied. The MRDF algorithms reduced the dimensionality of the spectra such that the features most important to differentiation are retained (as opposed to techniques such as principal components analysis, which compress the data with regard to common variance) through higher-order nonlinear transformation. The SMLR algorithms received as input the compressed (via MRDF) datasets and provided probabilistic class memberships. Figure 5 shows the posterior probability plots at each depth, grouped by tissue pathology (as determined by histopathological diagnosis of parent lesions). These plots show both the surface and the -depth measurements as those with the highest overall probabilities of class membership.
The confusion matrices for each measurement depth and the integrated spectra are shown in Fig. 6, with each value corresponding to the percentage of correct classifications (i.e., agreement between SMLR classification of Raman spectra and histopathologic diagnosis). It can be seen that the surface spectra provide 100% sensitivity and specificity of disease versus normal and misclassify only a single spectrum as an SCC rather than a BCC. Increasing the measurement depth causes further classification error, although not necessarily in a linear fashion. All measurement depths from the surface down to provide reasonable classification accuracy, while the classification at depth incurs much more error. The integrated spectra are seen to provide slightly lower classification accuracy than the surface and -depth measurements, although higher than some of the deeper measurements.
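The relationship between a multiclass confusion matrix and disease-versus-normal sensitivity and specificity can be made explicit with a short sketch. The counts in the usage note are invented for illustration and are not the values from Fig. 6; the function name and the nested-dictionary layout (true label, then predicted label) are assumptions.

```python
def disease_vs_normal(confusion, normal_label="normal"):
    """Collapse a multiclass confusion matrix into disease-vs-normal metrics.

    confusion[true_label][predicted_label] holds a count of samples.
    Returns (sensitivity, specificity).
    """
    tp = fn = tn = fp = 0
    for true_label, row in confusion.items():
        for predicted_label, count in row.items():
            truly_diseased = true_label != normal_label
            called_diseased = predicted_label != normal_label
            if truly_diseased and called_diseased:
                tp += count  # any malignant class called malignant
            elif truly_diseased:
                fn += count  # malignancy missed
            elif called_diseased:
                fp += count  # normal tissue called malignant
            else:
                tn += count
    return tp / (tp + fn), tn / (tn + fp)
```

With the made-up matrix `{"normal": {"normal": 9, "BCC": 1}, "BCC": {"BCC": 4, "SCC": 1}, "SCC": {"SCC": 5}}`, the BCC sample labeled as SCC still counts as a detected malignancy, so a confusion between two cancer types lowers per-class accuracy without lowering disease-versus-normal sensitivity.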
These results show that a spatially resolved Raman measurement technique provides a moderate increase in diagnostic performance as compared to integrative measurements. Skin cancers manifest in specific skin strata, and non-spatially-resolved measurement techniques can attenuate the slight spectral differences caused by slight biochemical changes in localized regions. However, this technique is not without limitations. Spatial resolution of a confocal system depends not only on the system optics, but also on sample stability. While this is not a problem in vitro, clinical application will require stabilizing the tissue for the duration of the measurements. Short measurement times will aid in this sample stabilization procedure and also minimize the impact of the Raman measurements on the patient and clinician. Furthermore, accurate histopathologic correlation of the Raman measurement sites will allow accurate correlation of the Raman spectra with specific morphological structures.
The spectral differences in the melanoma spectra are seen to be much more significant, and at different wave number ranges than the BCC and SCC spectra, while the BCC and SCC spectra show significant differences in similar wave number ranges. This is likely due to differences in the cellular origins of the cancers, as both BCC and SCC involve malignancy of keratinized epidermal cells and melanomas result from malignancy of melanocytes. 37, 38, 39, 40 While melanomas were included in this study for completeness, the direct comparison between BCC or SCC and melanoma is much less relevant clinically. Melanomas, being pigmented lesions, generally require diagnosis to differentiate them from other pigmented lesions such as nevi, while BCC and SCC are often mistaken for unpigmented lesions such as actinic keratoses or inflamed scar tissue. As such, future studies are planned to include the aforementioned benign lesions in a larger clinical study. This obvious disparity between the melanoma spectra and the nonmelanoma cancer and normal tissues is, thus, not unexpected, and shows the Raman technique to be more sensitive to pigment-related variations than the subtle biochemical differences leading to skin disease. This observation strengthens the argument for a depth-resolved measurement technique, in order to minimize the effect of inherent pigment variations between subjects on optical-based diagnosis.
The data analysis methods used in this study were selected for their ability both to compress the large amount of data obtained with each Raman spectrum and to retain only the diagnostically relevant portions of the spectra in this compression. Principal components analysis (PCA), in contrast, creates only a single model for a dataset and compresses the data in decreasing degrees of shared variance. Since the diagnostically relevant features in the Raman spectra are very small in comparison with the shared spectral content, it is imperative that these features be retained. However, while MRDF is demonstrably adept at exploiting diagnostically relevant features for maximum discrimination, its nonlinear mathematical transformation of the original data space does not allow retracing of the specific wave number regions responsible for each transform. Unlike with the component loadings generated by PCA, it is therefore not possible to identify which particular Raman band differences are ultimately responsible for the MRDF/SMLR discrimination. Further comparisons between the MRDF/SMLR technique and conventional data reduction/discrimination techniques can be found in other literature.33, 34
As seen in the depth-specific classification of the Raman spectra, there is a clear variation in classification accuracy with increasing measurement depth, particularly between 80 and . This is likely due to the anatomical structure of the skin and the thickness of the epidermis. The epidermis is the site of manifestation for most skin tumors and is composed of keratinized cells superficial to the basement membrane (stratum basale). Although the depth of the basement membrane varies considerably, it is likely that this heterogeneity was accounted for within the depth, whereas the depth sampled larger portions of the underlying dermis, which is composed primarily of structural collagen. Histological analysis of each measurement site could reveal this heterogeneity but unfortunately was not available for these samples.
These results also show that an optical diagnostic tool such as Raman spectroscopy would benefit from a tandem optical imaging technique (e.g., OCT or laser confocal imaging), which would allow rapid, noninvasive visualization of the skin strata in addition to the quantitative spectral measurements. The tandem imaging technique could easily coexist with the Raman sampling optics and allow visual identification of lesion depth and size prior to targeting the depth-specific spectral acquisition. These capabilities are being planned in future revisions of the CRM.
The accessibility of most skin lesions makes them ideally suited for the application of optical diagnostic techniques, yet the stratified architecture of the skin and its malignancies presents an obstacle for optical techniques that cannot discern these strata. Raman microspectroscopy provides the ability to obtain optical signals only from specified volumes within the tissue, and therefore can circumvent the extraneous optical signals of non-disease-predictive regions. Using multivariate analysis methods, we were able to accurately predict the pathology of several skin samples in vitro. These results indicate the potential of Raman spectroscopy to accurately diagnose skin lesions clinically. Based on this success, the development of a handheld confocal Raman system has been initiated, and clinical studies are planned for the near future.
This project was funded in part by the Vanderbilt In-Vivo Imaging Center, Grant No. NIH CA86283.