Reflectance and fluorescence spectroscopy have been widely investigated as noninvasive tools for detecting oral cancer.1, 2, 3, 4, 5 Although numerous studies have reported differences in spectral parameters when comparing diseased and normal tissues, in many cases, multiple anatomic sites (e.g., the gingiva and buccal mucosa) are combined without taking into account the normal spectral contrasts (differences in the spectral properties) among them.5, 6, 7, 8 As a result, spectral contrasts due to anatomy may be incorrectly attributed to malignancy. A critical step in applying these techniques to the detection of oral malignancy is to first develop an understanding of the spectral contrasts among anatomic sites in the oral cavity.
Only a few studies have investigated the relationship between spectral contrasts and anatomic differences. In a study by Kolli, fluorescence spectra were collected from the buccal mucosa and dorsal surface of the tongue of 21 healthy volunteers, in addition to patients.9 Analysis of data collected from healthy volunteers revealed that the ratio of fluorescence emission at when excited at 335 and , respectively, was statistically different between the buccal mucosa and dorsal surface of the tongue. Differences in the fluorescence peak intensities at 337, 365, and excitation were also noted by Gillenwater in a study of nine sites in the oral cavity of eight healthy volunteers.10 Another small study of 10 healthy volunteers by Dhingra noted differences in the intensity of the fluorescence emission among various sites.11 Although these three studies provide support for the fact that spectral contrasts due to anatomy exist, they lack a detailed description of the differences and only examine a few sites.
In a larger study by de Veld, 9295 fluorescence spectra were collected from 97 healthy volunteers to determine whether differences in the mucosa at various sites in the oral cavity resulted in changes in the fluorescence intensity and lineshape.12 In this study, both classification and clustering methods were used to detect spectral contrasts among the 13 anatomic sites evaluated. The authors found that the dorsal surface of the tongue and vermilion border of the lip each displayed unique spectral properties, and the other anatomic sites were similar enough to be combined into a single group; however, significant differences in total fluorescence intensity between almost all the different sites were noted at multiple wavelengths. In this large study, the authors tested the class overlap between each of the 11 anatomic sites (the dorsal surface of the tongue and vermilion border of the lip were excluded) against the remaining 10 sites. Because the larger group consisted of multiple sets of sites with small differences in their means and standard deviations, the resultant distribution has a large spread and is more likely to overlap with the single site against which it is compared. This would diminish the ability to observe differences. There may be added value in separating certain sites, because small changes with cancer would be more easily distinguished for narrow distributions (i.e., increased power of a diagnostic test).
None of the four studies cited above used a physical model of tissue fluorescence, in which the spectral features are modeled by parameters relating to specific tissue components. Therefore, it is difficult to directly interpret or provide an explanation for the observed differences in fluorescence emission. Additionally, to the best of our knowledge, no studies have been performed that examine differences in the reflectance properties of oral tissue due to anatomy.
In the present study, we use a physical model to extract parameters from both diffuse reflectance and intrinsic fluorescence spectra that are related to the morphological and biochemical properties of the tissue.13 We compare the extracted parameter distributions for various anatomic sites, and also relate the physical spectral parameters to the anatomic features of the tissue sites. We perform k-means cluster analysis in order to identify and classify groups of sites that share similar or distinct spectral properties.
Materials and Methods
In Vivo Data Collection
Healthy volunteers (HVs) were recruited at Boston Medical Center (BMC) and at the Massachusetts Institute of Technology (MIT). The protocol for in vivo data collection was approved by the institutional review board at BMC and the Committee on the Use of Humans as Experimental Subjects at MIT. Written informed consent was obtained from all subjects to indicate their willingness to participate in the study. Relevant background information was obtained from each subject, such as their smoking history, alcohol consumption, and history of lesions in the upper aerodigestive tract. Study subjects without a prior history of a lesion of the oral cavity, larynx, or esophagus (benign or malignant) were considered healthy volunteers, regardless of their smoking status. Smokers were included as HVs because in our patient study we have found that those presenting with oral lesions represent a mixture of smokers and nonsmokers. Less than 1% of the subjects had a history of alcohol abuse.
Reflectance and fluorescence spectra were collected from nine different anatomic sites within the oral cavity using the Fast Excitation Emission Matrix (FastEEM) instrument, which has been previously described.14 The calibration procedures are also described in Ref 14. Briefly, a xenon arc flash lamp ( pulse) and a XeCl excimer laser ( pulse) serve as the excitation light sources for the reflectance and fluorescence measurements, respectively. Fluorescence excitation wavelengths between 308 and were generated by using the excimer laser to sequentially pump a series of dyes. A diameter optical fiber probe, consisting of a central delivery fiber ( core) and six collection fibers ( core), was used to deliver the excitation light and collect the fluorescence emission and diffuse reflectance. A quartz shield at the tip of the probe ensured a fixed geometry when the probe was placed in contact with the tissue during data collection. Emission light was collected over the range of . The optical fiber probe was disinfected with CIDEX OPA (Advanced Sterilization Products, Irvine, CA) before each session, according to the manufacturer’s specifications. In a single data set, five measurements each of the reflectance and fluorescence spectra were collected in , and the average of each of the five spectra was recorded. The standard deviation for the five reflectance and fluorescence spectra, respectively, was also recorded. Approximately three to five data sets were acquired for each tissue sample examined. Data sets exhibiting large standard deviations were excluded from the analysis. Additionally, data sets that did not overlap in lineshape or intensity were excluded, and the remaining data sets were averaged. If all the measurements were inconsistent, then the data for the tissue sample were excluded from the analysis.
The measured diffuse reflectance was fit using the model of Zonios et al.15 The inputs to the model are the reduced scattering coefficient, $\mu_s'$, and the absorption coefficient, $\mu_a$. Reflectance spectra were fit over the range of using a constrained nonlinear least squares fitting algorithm. A double power law equation was used to represent the wavelength dependence of $\mu_s'$:

$$\mu_s'(\lambda) = A\left(\frac{\lambda}{\lambda_0}\right)^{-B} + C\left(\frac{\lambda}{\lambda_0}\right)^{-4},$$

where $\lambda$ is expressed in units of microns and $\lambda_0$ is equal to 1 μm. A power law description of $\mu_s'$ has been widely used to model data collected from both cells and tissue.16, 17, 18, 19 We add the second power law term, $C(\lambda/\lambda_0)^{-4}$, because it was found to be particularly important for fitting the spectra at wavelengths , where there is significant scattering from smaller particles (Rayleigh scattering). From modeling the scattering, we extract three parameters: A, B, and C. Parameter A is a scaling parameter related to the overall magnitude of scattering. In a study of skin, the collagen fibers of the dermis were shown to be the major source of light scattering, contributing mostly to the scattering from larger particles (Mie scattering) and, to a lesser extent, Rayleigh scattering.20 Parameter B reflects the size of the scattering particles.16 Parameter C represents the magnitude of scattering by small scatterers.
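As an illustration of the double power law description of scattering, the following minimal sketch evaluates a two-term model for a few wavelengths. The Rayleigh exponent of 4, the reference wavelength of 1 μm, the function name `reduced_scattering`, and all numerical values are assumptions chosen for illustration; they are not fitted results from this study.

```python
def reduced_scattering(wavelength_um, A, B, C, lambda0_um=1.0):
    """Two-term power law for the reduced scattering coefficient.

    Assumed form: A * (lam/lam0)**(-B) + C * (lam/lam0)**(-4).
    The first term describes scattering from larger particles, the
    second term Rayleigh-type scattering from small particles.
    """
    x = wavelength_um / lambda0_um
    large_particle_term = A * x ** (-B)
    rayleigh_term = C * x ** (-4.0)
    return large_particle_term + rayleigh_term


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    for nm in (400, 500, 600, 700):
        mu = reduced_scattering(nm / 1000.0, A=1.0, B=0.6, C=0.02)
        print(f"{nm} nm: {mu:.3f}")
```

In this form the second term grows much faster than the first toward short wavelengths, which is why parameter C mainly affects the blue end of the fitted spectrum.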
The absorption coefficient was modeled as the sum of the contributions from two absorbers, hemoglobin and β-carotene.21, 22, 23 In our analysis, we adopt the formulation of the vessel packaging model developed by Svaasand et al.,23 as presented by van Veen et al.21 Vessel packaging accounts for alterations in the shape and intensity of the hemoglobin absorption peaks as a result of the strong absorption of light at 420 nm by hemoglobin in blood vessels. Using this model, the absorption coefficient of hemoglobin is represented by the product of the absorption coefficient of hemoglobin in whole blood, $\mu_a^{\mathrm{blood}}(\lambda)$, a wavelength-dependent correction factor that alters the shape of the absorption spectrum, $C_{\mathrm{diff}}(\lambda)$, and the volume fraction of blood sampled, $\nu$. The absorption coefficient for hemoglobin was described as follows:

$$\mu_a^{\mathrm{Hb}}(\lambda) = \nu\, C_{\mathrm{diff}}(\lambda)\, \mu_a^{\mathrm{blood}}(\lambda),$$

where $C_{\mathrm{diff}}(\lambda)$ is given by

$$C_{\mathrm{diff}}(\lambda) = \frac{1 - \exp\left[-2\,\mu_a^{\mathrm{blood}}(\lambda)\, r_{\mathrm{vess}}\right]}{2\,\mu_a^{\mathrm{blood}}(\lambda)\, r_{\mathrm{vess}}},$$

in which $r_{\mathrm{vess}}$ is the effective vessel radius (in millimeters) and $\mu_a^{\mathrm{blood}}(\lambda)$ is the absorption coefficient of blood (in inverse millimeters). The volume fraction, $\nu$, was calculated using the following ratio: . The absorption coefficient of blood, $\mu_a^{\mathrm{blood}}(\lambda)$, was expressed in terms of the oxygen saturation and the extinction coefficients (in ) of oxyhemoglobin (HbO2) and deoxyhemoglobin (Hb).24 We further impose a lower limit on the extracted effective vessel radius of , because the minimum diameter of a capillary is on the order of .25 The absorption coefficient for β-carotene was modeled in terms of the concentration of β-carotene (in milligrams per milliliter) and the extinction coefficient of β-carotene (in ).26 The total absorption coefficient was calculated as the sum of the contributions of each absorber. The parameters extracted from the absorption model are the hemoglobin concentration (cHb), oxygen saturation, effective vessel radius, and β-carotene concentration.
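The effect of vessel packaging can be illustrated with a short sketch. It assumes the commonly cited form of the correction factor, (1 − exp(−2·μ·r)) / (2·μ·r), where μ is the absorption coefficient of whole blood and r the effective vessel radius. The function names and the numerical values below are hypothetical and are not taken from this study.

```python
import math


def vessel_correction(mu_a_blood, radius_mm):
    """Correction factor for blood confined to vessels (assumed form).

    mu_a_blood: absorption coefficient of whole blood, in 1/mm.
    radius_mm:  effective vessel radius, in mm.
    Tends to 1 for very small vessels and decreases as the product
    mu_a_blood * radius_mm grows.
    """
    x = 2.0 * mu_a_blood * radius_mm
    if x == 0.0:
        return 1.0
    # -expm1(-x) equals 1 - exp(-x) and is accurate for small x.
    return -math.expm1(-x) / x


def hemoglobin_absorption(mu_a_blood, radius_mm, volume_fraction):
    """Product of volume fraction, correction factor, and blood absorption."""
    return volume_fraction * vessel_correction(mu_a_blood, radius_mm) * mu_a_blood


if __name__ == "__main__":
    # Hypothetical values: strong absorption (Soret band) versus weaker absorption.
    for mu in (30.0, 3.0):
        for r in (0.005, 0.05):
            print(mu, r, round(vessel_correction(mu, r), 3))
```

Because the factor falls fastest where blood absorbs most strongly, a larger effective vessel radius flattens the strongest hemoglobin peak relative to the weaker ones.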
Using a model that has been previously described, the distortions in the measured fluorescence spectra due to scattering and absorption were removed, in order to obtain what we refer to as the intrinsic fluorescence.13, 27 Unlike the measured fluorescence, the intrinsic fluorescence can be modeled as a linear combination of the component fluorophores present within the tissue. The spectra of the component fluorophores (basis spectra) for each excitation wavelength were extracted from the intrinsic fluorescence using multivariate curve resolution (MCR) and, in some cases, a spectrum collected from the pure chemical.
Parameters were extracted from the intrinsic fluorescence spectra in the subsequent data analysis in one of two ways, which we refer to as non-normalized fluorescence data or area-normalized fluorescence data. In the case of the non-normalized fluorescence data, the basis spectrum of each fluorophore at a given excitation wavelength was normalized by dividing the intensity at each wavelength point in the spectrum by the intensity at the peak. Following this step, a linear combination of the basis spectra was used to fit the intrinsic fluorescence spectrum for the sample. For area-normalized fluorescence data, the intensity at each wavelength in the basis spectrum was divided by the total area under the curve. This same process was applied to the intrinsic fluorescence spectrum for the sample. The resultant intrinsic fluorescence spectrum for the sample was then fit with a linear combination of the basis spectra. In a final step, the extracted contribution of each fluorophore was divided by the total sum of the contributions for all the fluorophores at the specific excitation wavelength. The extracted contribution for each fluorophore in the area-normalized fluorescence data represents the fraction of the total area under the emission curve contributed by the component fluorophore.
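The area-normalized procedure can be sketched as follows. This is a minimal illustration that assumes an unconstrained least squares fit and uses synthetic Gaussian basis spectra; the fitting algorithm actually used is not specified here, and all function names and numerical values are hypothetical.

```python
import math


def trapezoid_area(wavelengths, intensities):
    """Area under a sampled curve by the trapezoid rule."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1]) * (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(wavelengths) - 1)
    )


def area_normalize(wavelengths, intensities):
    """Divide every point by the total area under the curve."""
    area = trapezoid_area(wavelengths, intensities)
    return [v / area for v in intensities]


def fit_linear_combination(target, bases):
    """Unconstrained least squares via the normal equations (assumption)."""
    n = len(bases)
    gram = [[sum(a * b for a, b in zip(bases[i], bases[j])) for j in range(n)] for i in range(n)]
    rhs = [sum(a * y for a, y in zip(bases[i], target)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(gram[r][col]))
        gram[col], gram[pivot] = gram[pivot], gram[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = gram[row][col] / gram[col][col]
            for k in range(col, n):
                gram[row][k] -= factor * gram[col][k]
            rhs[row] -= factor * rhs[col]
    coeffs = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(gram[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (rhs[row] - tail) / gram[row][row]
    return coeffs


def fractional_contributions(coeffs):
    """Divide each contribution by the sum of all contributions."""
    total = sum(coeffs)
    return [c / total for c in coeffs]


def gaussian(wavelengths, center, width):
    """Synthetic stand-in for a fluorophore basis spectrum."""
    return [math.exp(-0.5 * ((w - center) / width) ** 2) for w in wavelengths]


if __name__ == "__main__":
    wl = list(range(360, 601, 5))
    bases = [area_normalize(wl, gaussian(wl, 401, 30)),
             area_normalize(wl, gaussian(wl, 460, 40))]
    sample = [3.0 * a + 7.0 * b for a, b in zip(*bases)]  # hypothetical mixture
    coeffs = fit_linear_combination(area_normalize(wl, sample), bases)
    print(fractional_contributions(coeffs))  # approximately [0.3, 0.7]
```

Because both the basis spectra and the sample spectrum have unit area, the fitted coefficients can be read directly as fractions of the total emission area.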
The fluorescence excited at 308 nm was modeled by a linear combination of tryptophan, collagen, and the reduced form of nicotinamide adenine dinucleotide (NADH) over the range of . The fluorescence excited at 340 nm was modeled by a linear combination of NADH and collagen over the range of .
Outlier Removal and k-Means Cluster Analysis
Outliers for each of the parameters were identified for each site separately based on the interquartile range (IQR) exclusion criteria.28 Applying these criteria, parameter values were excluded if they exceeded the upper quartile by more than 1.5 times the IQR, or fell below the lower quartile by more than 1.5 times the IQR. Because each parameter was evaluated separately, the number of exclusions varied for each parameter. Following this procedure, we performed k-means cluster analysis to determine if there were specific sites with unique spectral properties, without assuming a priori that the nine anatomic sites were well-defined groups (unsupervised classification). The analysis was performed using MATLAB (Mathworks, Natick, MA). Outliers in the data were removed in order to prevent skewing of the cluster centroid and to maximize the accuracy of the results. Analysis was performed for each parameter separately for k = 2, 3, and 4 clusters using a random initial clustering assignment and the city-block distance measure. Each separation entailed 20 replicate runs of the clustering algorithm. For each replicate, the distance from each point to its assigned centroid was calculated, and the final result of the function was the replicate that produced the minimum sum of all the distances. For each k-means separation (i.e., k = 2, 3, or 4), the percentage of each site within each of the clusters was identified after subtracting the probability of a site being assigned to a cluster purely due to chance (i.e., 25% for k = 4). We define Δ as the difference between the maximum percentage of a given site assigned to a cluster and the percentage that would be expected to be assigned due to chance. All sites for which the value for Δ exceeded 30.0% for all three separations, or for at least two separations, were identified for each parameter. These selection criteria for Δ were used to identify sites displaying significant clustering. For each parameter, sites meeting these criteria were either clustered together or separately based on whether they shared the same or unique cluster assignments.
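The outlier exclusion, the clustering of a single parameter, and the calculation of Δ can be sketched as follows. This is an illustrative Python stand-in for the MATLAB analysis, not the code used in the study. It assumes that the city-block centroid in one dimension is the cluster median, that an empty cluster is reseeded with a random data point, and that quartiles follow the "inclusive" convention of Python's `statistics.quantiles`, which may differ slightly from other software. All function names are hypothetical.

```python
import random
import statistics


def iqr_filter(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def kmeans_cityblock_1d(values, k, replicates=20, max_iter=100, seed=0):
    """1-D k-means with city-block distance; returns the best of several replicates."""
    rng = random.Random(seed)
    best_labels, best_cost = None, float("inf")
    for _ in range(replicates):
        labels = [rng.randrange(k) for _ in values]  # random initial assignment
        for _ in range(max_iter):
            centroids = []
            for c in range(k):
                members = [v for v, lab in zip(values, labels) if lab == c]
                # Median minimizes the summed absolute distance; reseed if empty.
                centroids.append(statistics.median(members) if members else rng.choice(values))
            new_labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
            if new_labels == labels:
                break
            labels = new_labels
        cost = sum(abs(v - centroids[lab]) for v, lab in zip(values, labels))
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


def delta_for_site(site_labels, k):
    """Largest percentage of a site's samples in one cluster minus chance (100/k)."""
    counts = [site_labels.count(c) for c in range(k)]
    return 100.0 * max(counts) / len(site_labels) - 100.0 / k


if __name__ == "__main__":
    # Hypothetical values of one parameter for two sites.
    site_a = iqr_filter([0.9, 1.0, 1.1, 1.2, 9.0])
    site_b = iqr_filter([2.9, 3.0, 3.1, 3.2])
    labels, _ = kmeans_cityblock_1d(site_a + site_b, k=2)
    print(delta_for_site(labels[:len(site_a)], 2), delta_for_site(labels[len(site_a):], 2))
```

With k = 2 the chance level is 50%, so Δ can be at most 50%; with k = 4 the maximum is 75%. The 30.0% threshold is therefore easier to meet, in absolute terms, at larger k.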
In order to evaluate the degree of spectral contrast, a logistic regression model was developed to differentiate specific sites. A receiver operator characteristic (ROC) curve was then generated after performing leave-one-out cross-validation. The accuracy of the separation was evaluated by determining the sensitivity, specificity, and area under the ROC curve (AUC). The sensitivities and specificities were selected based on the Youden index, which represents the point in the ROC curve with the maximum vertical distance from the 45 degree line.29
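Selection of the operating point by the Youden index can be sketched as follows. The scores stand in for cross-validated predicted probabilities from a logistic regression model; the values shown, and the function names, are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each candidate threshold.

    labels: 1 for the class treated as positive, 0 otherwise.
    A sample is called positive when its score is >= the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        points.append((t, true_pos / n_pos, true_neg / n_neg))
    return points


def youden_optimum(scores, labels):
    """Threshold maximizing J = sensitivity + specificity - 1.

    J is the vertical distance between the ROC curve and the chance diagonal.
    """
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1.0)


if __name__ == "__main__":
    scores = [0.15, 0.30, 0.35, 0.55, 0.60, 0.80, 0.90]  # hypothetical
    labels = [0, 0, 1, 0, 1, 1, 1]
    threshold, sensitivity, specificity = youden_optimum(scores, labels)
    print(threshold, sensitivity, specificity)
```

When several thresholds give the same value of J, this sketch returns the lowest of them; other tie-breaking rules are equally defensible.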
General Description of the Data
The complete set of HV data consisted of 781 spectra. Because of instrumentation error, evidence of movement (large variations among the three to five data sets collected from each tissue sample), or too few spectra collected from a given anatomic site, 9.1% of the spectra were excluded. The final set of healthy volunteer data included 710 spectra from 79 subjects. The average age ( deviation) of the subjects was . Data were collected from the following nine sites in the oral cavity: buccal mucosa (BM), dorsal surface of the tongue (DT), floor of the mouth (FM), gingiva (GI), hard palate (HP), lateral surface of the tongue (LT), retromolar trigone (RT), soft palate (SP), and ventral surface of the tongue (VT). Table 1 summarizes the number of spectra collected from each site within the oral cavity for the entire data set. Fewer spectra were collected from sites such as the HP, RT, and SP because the architecture of the oral cavity in some subjects made these areas less accessible to the probe.
Summary of the number of spectra collected from each of the nine sites in the oral cavity for the calibration, validation, and total data sets. BM: buccal mucosa, DT: dorsal surface of the tongue, FM: floor of the mouth, GI: gingiva, HP: hard palate, LT: lateral surface of the tongue, RT: retromolar trigone, SP: soft palate, and VT: ventral surface of the tongue.
|Site||Calibration data||Validation data||Total data|
|Number of spectra|
Reflectance and Fluorescence Modeling
Figure 1 shows representative examples of the excellent quality of the fits to the reflectance and intrinsic fluorescence spectra. Examples of how the quality of the modeled fit can deteriorate when either β-carotene or vessel packaging is not accounted for are shown in Figs. 1a and 1b, respectively. β-carotene has a prominent absorption peak at and a secondary peak at . The latter absorption peak is especially prominent in the spectrum in Fig. 1a. The vessel packaging model predicts that the differences in the depth of the hemoglobin peak as compared to the 540- and peaks will diminish as the size of the effective vessel radius increases. The uncorrected hemoglobin absorption spectrum underfits the 540- and peaks. In Figs. 1c and 1d, the fluorescence emission at 308- and 340-nm excitation, respectively, is shown. The fluorescence excited at 308 nm was modeled by a linear combination of tryptophan, collagen, and NADH [Fig. 1c]. The fluorescence excited at 340 nm was modeled by a linear combination of three fluorophores: NADH, and two additional fluorophores with peaks at approximately 401 and 427 nm, respectively [Fig. 1d]. A preliminary examination of the data revealed that by using only NADH and a single additional fluorophore (collagen), we could not reliably model the intrinsic fluorescence signal at 340-nm excitation for the entire data set. Figure 2a shows a plot of NADH and the two additional spectral components used to fit the spectra at 340-nm excitation. We refer to the two extracted collagen components as Coll401 (340 nm) and Coll427 (340 nm), where the number represents the wavelength at which the peak emission occurs. To determine if the two fluorophores we extracted could be due to different types of collagen, we collected the fluorescence emission for dry collagen IV (Sigma-Aldrich C9879) and collagen I (Sigma-Aldrich C5533), the major forms present in the basement membrane and stroma, respectively.30, 31 Figure 2b shows the fluorescence emission excited at . The peaks for collagen I and collagen IV occur at 409 and , respectively, which is similar to what we obtained from the tissue data.
The following six fluorescence ratios were examined based on the analysis of the non-normalized fluorescence data: NADH/collagen (308 nm), NADH/tryptophan (308 nm), NADH/Coll401 (340 nm), NADH/Coll427 (340 nm), NADH/total collagen (340 nm), and Coll401/Coll427 (340 nm). The number in parentheses indicates the excitation wavelength. Total collagen refers to the sum of Coll401 and Coll427. Analysis of the area-normalized fluorescence data yielded six additional parameters: tryptophan (308 nm), collagen (308 nm), NADH (308 nm), NADH (340 nm), Coll401 (340 nm), and Coll427 (340 nm). The 308- and 340-nm excitations were specifically chosen for our analysis because data were not consistently available for all samples at the higher wavelengths due to low signal and operator error.
Model-Based Analysis Parameter Distributions
Table 2 summarizes the mean, standard deviation (σ), and relative standard deviation (RSD, the ratio of σ to the mean) for each parameter for each individual site, as well as when all nine sites were combined into a single group. In each case, the statistical measures were calculated after applying the IQR exclusion criteria. The RSD varied widely across sites for a given parameter, even among sites with similar means. For 16 of the 19 parameters, the RSD for the combined group of sites ranked among the three highest values.
Summary of the mean, standard deviation (σ), and relative standard deviation (RSD) for each parameter for each of the nine sites, as well as the combined group of sites.
|Vessel radius [mm]|
For parameter A, most of the sites showed very similar mean values. The exceptions were the BM and RT, whose means were significantly higher than those of all other sites based on a multiple comparison test, but did not differ significantly from each other. The HP and DT demonstrated slightly lower mean values that were significantly different from all other sites, except the GI.
The GI and HP both exhibited values of B of 0.6, which was significantly higher than for other sites. Three groups were easily distinguished for parameter C. The FM, LT, and VT formed one group, demonstrating the highest values of C. The BM, DT, RT, and SP formed another group with intermediate values of parameter C. At the other extreme, the keratinized GI and HP were easily distinguished by a small C parameter.
In examining the absorption parameters, we found that the GI, HP, and SP demonstrated low values of cHb compared to other sites. The mean oxygen saturations were comparable for all nine sites, and the combined set of sites had a mean value of 60%. The mean effective vessel radius for the nine sites ranged from approximately .
For the NADH/tryptophan and NADH/collagen fluorescence ratios at 308-nm excitation, the FM was statistically different from all other sites. The extracted contribution of Coll427 (340 nm) was slightly higher than that of Coll401 (340 nm) for every site except the LT and RT. The ratios of NADH to Coll401, Coll427, and total collagen were highest for the GI and HP, largely due to the low collagen emission for these sites. The GI and HP demonstrated the lowest mean Coll401/Coll427 ratio of all the sites. Examining the individual contributions of Coll401 and Coll427 extracted from the non-normalized fluorescence, both values for the GI and HP were found to be significantly lower than those of all other sites.
As shown in Table 2, the RSD values for the area-normalized fluorescence parameters were considerably smaller than those of the non-normalized fluorescence data, particularly for individual sites. Similar to the results for the non-normalized fluorescence data, the mean NADH and tryptophan parameters for the FM were statistically different from all other sites. However, this was not the case for the mean collagen value. The GI and HP displayed the two lowest mean collagen values. Similar to the results for the Coll401/Coll427 ratio, the mean Coll427 value exceeded the mean Coll401 value for each site except for the LT and RT. Although the GI and HP displayed low mean Coll401 values compared to other sites, as was observed for the non-normalized fluorescence, they demonstrated mean Coll427 values, which were comparable to other sites.
k-Means Cluster Analysis
k-Means clustering was used as an objective and quantitative means of identifying and determining the magnitude of differences and similarities among sites, based on the spectral data summarized in Table 2. The results are shown in Table 3. For each parameter, we list the site(s) meeting the criteria for Δ described earlier and the range of values for Δ for each cluster. For those sites meeting these criteria, each row represents a separate clustering assignment; therefore, sites that were assigned to the same cluster are listed together, while those assigned to separate clusters are listed in different rows. For all but one parameter, oxygen saturation, there is at least one site or group of sites that forms a well-defined cluster. For 15 of the 19 parameters studied, the HP was identified as being distinct from the majority of the other sites, either alone or in combination with another site(s). The GI also frequently demonstrated significant clustering (10 parameters). For 8 of the 19 parameters, the HP was assigned to the same cluster as the GI. Only in the case of one parameter [NADH/tryptophan (308 nm)] did the DT group with these sites. The FM displayed unique properties for all of the non-normalized fluorescence parameters at 308-nm excitation, and two of the three area-normalized fluorescence parameters. The sites least likely to show significant clustering were the BM and DT. Each of these sites met the criteria for Δ for only a single parameter.
Summary of the results from the k -means cluster analysis. For each parameter, sites in which Δ exceeded 30.0% for all three separations ( k=2 , k=3 , and k=4 ) or at least two separations are listed. For sites meeting these criteria, those assigned to the same cluster are listed in the same row, while sites assigned to different clusters are listed in separate rows. In the last column, the range of values for Δ observed for a specific cluster is listed.
|Parameter||Sites||Range for Δ [%]|
|No distinct sites|
On the basis of the clustering results summarized in Table 3, the BM and FM would be expected to show some degree of separation based on any of the following parameters: A, cHb, vessel radius, NADH/collagen (308 nm), NADH/tryptophan (308 nm), tryptophan (308 nm), and NADH . In each case, either the BM or FM met the criteria for Δ, and if both met the criteria, they were assigned to separate clusters. Figure 3a shows a binary scatter plot demonstrating a clear separation for these sites based on A and NADH/tryptophan (308 nm). The diagnostic decision line is also shown. The sensitivity, specificity, and AUC for separating the two sites based on these parameters were 81%, 83%, and 0.89, respectively. Similar results were obtained using either tryptophan (308 nm) and A (sensitivity, specificity, and AUC of 87%, 79%, and 0.84, respectively) or NADH and A (sensitivity, specificity, and AUC of 83%, 78%, and 0.87, respectively). As shown in Fig. 3b, two different aspects of the tongue, the DT and VT, can be differentiated based on NADH/tryptophan (308 nm) and with a sensitivity, specificity, and AUC of 92%, 78%, and 0.90, respectively. A sensitivity, specificity, and AUC of 92%, 81%, and 0.91, respectively, were obtained using and tryptophan (308 nm). A sensitivity, specificity, and AUC of 74%, 94%, and 0.89, respectively, were obtained using and NADH . Figure 3c shows a similar binary scatter plot for the GI and HP of cHb versus B, two parameters for which these sites were assigned to the same cluster. As the plot shows, the two sites overlap considerably.
Figures 4a, 4b, 4c show pictures of the normal tissue architecture for the GI, HP, and BM. As can be seen for the former two sites, their strong spectroscopic similarity reflects a shared morphology. In contrast, the BM [Fig. 4c] displays a markedly different organization, in particular, the absence of overlying keratin.
Validation of Normal Tissue Discrimination
We next evaluated how accurately the results in Table 2 captured the parameter distributions for normal, healthy tissue ("normal limits") for each site and can be used to characterize potentially malignant, but clinically normal sites. The complete set of tissue data was divided into a set of calibration data and a set of validation data. The calibration data consisted of 584 spectra collected from 65 subjects and was used to derive the normal limits for each parameter for each site based on the 95% confidence interval (CI). The IQR exclusion criteria were applied to the calibration data in the same way as for analysis of the complete set of tissue data. The validation data consisted of 126 spectra collected from the remaining 14 subjects. These data were used to test the reliability of the calibration data to correctly categorize normal tissue. We used the calibration data to determine the parameter distributions and 95% CIs for each parameter for all nine sites. Then, for each parameter extracted from a sample in the validation data, we tested whether it lay within the 95% CI for that specific parameter. The site from which the sample was measured was also taken into account. Based on the results of this test we calculated the percentage of the validation data samples that were identified as normal, healthy tissue (within the 95% CI) for each parameter. The average age ( deviation) for subjects in the calibration and validation data was and , respectively. Table 1 provides a summary of the number of spectra collected from each site for both sets of data. The percentage of all the samples in the validation data falling within the normal limits developed from the calibration data is shown in Table 4 .
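The validation test can be sketched as follows for a single parameter at a single site. The sketch assumes that the normal limits are computed as the mean ± 1.96 standard deviations of the calibration data; the exact method used to obtain the 95% CI is not stated above, so this choice, the function names, and the numerical values are assumptions.

```python
import statistics


def normal_limits(calibration_values, z=1.96):
    """Assumed normal limits: mean +/- z * sample standard deviation."""
    mean = statistics.mean(calibration_values)
    sigma = statistics.stdev(calibration_values)
    return mean - z * sigma, mean + z * sigma


def percent_within(validation_values, limits):
    """Percentage of validation samples that fall inside the limits."""
    low, high = limits
    inside = sum(1 for v in validation_values if low <= v <= high)
    return 100.0 * inside / len(validation_values)


if __name__ == "__main__":
    # Hypothetical values of one parameter at one site.
    calibration = [0.52, 0.58, 0.61, 0.60, 0.55, 0.63, 0.57, 0.59]
    validation = [0.56, 0.62, 0.75, 0.58]
    limits = normal_limits(calibration)
    print(limits, percent_within(validation, limits))
```

In practice the limits would be stored per site and per parameter, for example in a dictionary keyed by the pair (site, parameter), so that each validation sample is tested against the limits for the site from which it was measured.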
Summary of the percentage of the 126 spectra in the validation data collected from healthy volunteers in which the listed parameters fell within the 95% CI (“normal limits”) for a specific parameter for a given site. The normal limits were developed based on the results of the calibration data.
|Extracted parameter||Percentage of data within 95% CI|
The oral cavity comprises a number of different anatomic sites and exhibits a wide variety of architectural properties. One notable example is that while most of the mucosa of the oral cavity is nonkeratinized, the GI, HP, and some regions of the DT are keratinized. Other key differences include the thickness of the epithelium (i.e., ), the composition of the submucosa (e.g., bone, muscle), the density and size of the vessels, as well as the presence of other unique features, such as glands (e.g., BM, HP, SP) and papillae (DT). We expect these variations to also be reflected in the spectral information we collect from the tissue.
Reflectance and Fluorescence Modeling
To capture the major spectral features necessary for accurately modeling the reflectance and fluorescence spectra of healthy oral tissue, we incorporated the absorber β-carotene. Although β-carotene has not been previously noted in reflectance spectra collected from the oral cavity, there are numerous reports in the literature of its presence in oral cells and tissue, as well as in plasma.32, 33, 34 This absorber is found at much lower concentrations than hemoglobin, but its extinction coefficient is times that of hemoglobin at .26 Distortions of the hemoglobin spectrum due to vessel packaging effects were also taken into account in the reflectance model. This correction has been used in other investigations studying tissue reflectance.18, 21
We model the fluorescence emission produced at 340-nm excitation by a linear combination of three fluorophores: NADH and two additional spectra, both of which we attribute to collagen(s), likely in combination with elastin. To determine if keratin contributed to the measured fluorescence, we separated the keratinized sites (GI, HP, DT) from the nonkeratinized sites and performed MCR analysis separately on these two groups. The basis spectra extracted for both groups were identical. Furthermore, the keratinized sites did not demonstrate larger contributions of either of the two collagen components, as compared to nonkeratinized sites. The necessity for two distinct collagen spectra may also result from differences in the types of collagen being excited, the layered tissue architecture, depth-related effects (which can shift the proportions of light penetrating to the basement membrane or stromal layers), or a combination thereof. Spectra collected from pure collagen show, for example, that collagen IV (major type in the basement membrane) has a different emission profile from collagen I (major type in the stroma).35 This supports the existence of variations in the collagen signal, although the emission spectra of the extracted spectral components and of the pure chemicals do not appear to be equivalent. A study of freshly excised ex vivo cervical tissue by Chang et al. found that the stromal fluorescence differed greatly among various specimens and that data from their large patient data set were best modeled using a linear combination of three separate collagen fluorescence spectra.36 The authors attributed two of the components to different types of collagen cross-links, the structural components responsible for collagen fluorescence. 
Differences between the spectra of the pure chemicals and the results from MCR are likely due to the inability of the dry chemical to recapitulate the actual protein in vivo, where the fluorescence emission may be influenced by the tissue microenvironment, the contribution of other types of collagens, and the presence of other fluorophores that strongly overlap in emission, such as elastin and possibly keratin.
Model-Based Parameter Distributions
The detailed summary of the parameters provided in Table 2 reveals a number of findings about the relationship between the extracted parameters and the structural characteristics of the sites. The GI, DT, and HP are sites that are normally keratinized; however, only the GI and HP display unique features as compared to the nonkeratinized sites. The keratinized sites are distinguished by high values for B, tryptophan , and NADH , and low values for C, cHb, effective vessel radius, , Coll401/Coll427, NADH/total collagen , collagen , and NADH . The high B value for the GI and HP may reflect strong scattering by the superficial keratin layer, resulting in a rapid fall in the scattering as a function of wavelength. In their depth-resolved studies of fluorescence from epithelial tissue, Wu and Qu observed a similar trend in the fluorescence intensity.37 They found that the presence of keratin caused the fluorescence intensity to rapidly decay with increasing sampling depth, which was not observed for nonkeratinized tissues. The DT, which is only partially keratinized, exhibited a B value comparable to the nonkeratinized sites.38 The presence of keratin appears to diminish signals from small, more superficial cellular scatterers, resulting in low C values for the HP and GI.
The GI, HP, and SP demonstrated very low concentrations of hemoglobin as compared to other sites. In the case of the former two sites, these results may indicate that keratin prevents light from reaching the underlying stroma where the blood vessels are located. During data collection from the GI, visible blanching of the mucosa was often observed when the probe contacted the tissue, which may also contribute to the lower cHb values obtained for this site. The FM, LT, and VT, sites with relatively thin epithelia as compared to other sites, demonstrated the highest cHb. The oxygen saturation values we obtained are comparable to those measured in studies of the oral cavity and other tissues.17, 21 The mean effective vessel radius was largest for areas on the tongue (DT, VT, LT) and the FM. Blood vessels are easily observable at the macroscopic level on the VT and FM.
Because β-carotene is present in adipose tissue and in blood, we would expect the sites that demonstrated a high cHb and contain submucosal fat to exhibit the highest concentrations of this absorber. The highest concentration was found in the SP, which meets both these criteria. The FM, LT, VT, and BM were tied for the next highest value. The former three sites were those that displayed the highest cHb and the BM and FM both contain adipose tissue in the submucosa. Similar to the results for cHb, the GI and HP exhibit the lowest concentration of the absorber.
At excitation, light is readily absorbed by a number of tissue components. As a result, the light does not penetrate deeply into the tissue, and the emission signal is heavily influenced by superficial components. The FM displayed a mean value significantly higher than all the other sites for the NADH/tryptophan and NADH/collagen ratios, as well as the NADH and tryptophan parameters. The FM is distinguished by a very thin epithelium , as compared to for the BM, and for the HP.30 At excitation, the site with the highest values for the ratio of NADH to Coll401, Coll427, and the total collagen contribution, respectively, was the HP. The GI had the second highest value for the former two ratios, but was tied for second with the DT and SP for the latter ratio. These ratios are affected by a complex combination of epithelial and stromal properties. The GI and HP demonstrated the lowest mean Coll401/Coll427 ratio of all the sites. In a study by Collier examining cervical tissue by confocal microscopy, the authors found that keratin (which can be present both in normal and precancerous tissue) introduces a significant source of variability in the depth penetration of the incident light and, therefore, the extent to which scattering from different layers contributes to the observed signal.39 Although there is a significant amount of collagen in the deeper stroma (composed mostly of collagen I), this is offset by the higher probability of light reaching and returning from more shallow structures, such as the basement membrane (composed mostly of collagen IV). Most sites display Coll401/Coll427 ratios of [also Coll401 ], indicating that emission from Coll427 (340 nm), which is similar in profile to collagen IV, is preferentially collected over Coll401 (340 nm), which most closely resembles collagen I. The presence of keratin on the GI and HP likely further reduces penetration of light to the deeper stroma, thus resulting in the low Coll401 and Coll401/Coll427 values.
For most parameters, the HP and, to a lesser extent, the SP were also unique in that the RSDs of the parameter distributions for these sites were significantly smaller than for most other sites. We attribute this to the fact that these sites are relatively hard and noncompressible and, therefore, less prone to error from variations in probe pressure. In addition to the natural variation among various sites as a result of the architecture of the mucosa, regional variations in the morphology within a given site, poor accessibility with the probe (e.g., the RT, some locations along the palate), the intrinsic movement of the tissue (e.g., the tongue), and the highly compressible nature of the softer tissue sites also contribute to the width of the parameter distributions. Analysis of data collected at different points along the palate and tongue, respectively, (data not shown) demonstrated not only the variation within a site, but also that areas of transition between sites can produce either gradual or marked shifts in a parameter, depending on its sensitivity to the specific structural differences between the two sites. These transitions can be appreciated based on gross visual inspection of oral tissue, as well as microscopic examination.30 In future studies focused on disease detection, the area-normalization procedure should be performed for the analysis of the fluorescence data because this resulted in smaller spreads in the parameter distributions compared to non-normalized fluorescence data. The sources of the variations in the data will be discussed in further detail in the subsequent discussion of results for the validation of normal tissue discrimination.
k-Means Cluster Analysis
The degree of clustering based on anatomy can be very significant, as shown by the range of values for (Table 3). The keratinized sites, the GI and HP, frequently clustered together, while the partially keratinized DT displayed significant clustering with these sites for only a single parameter [NADH/tryptophan ]. In addition to differences in the extent of keratinization, the DT is distinct from the GI and HP because it contains muscle in the submucosa rather than bone.
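To make the clustering idea concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to a single scalar parameter. The per-site values are invented purely for illustration and are not taken from Table 2 or Table 3; the function name, the deterministic initialization, and the choice of two clusters are likewise assumptions and may differ from the procedure actually used.

```python
def kmeans_1d(values, k=2, max_iter=100):
    """Lloyd's algorithm on scalar values (requires k >= 2).

    Returns (labels, centers).
    """
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Hypothetical site means for one parameter (illustrative numbers only).
site_means = {
    "GI": 1.10, "HP": 1.05, "DT": 0.62, "BM": 0.58, "FM": 0.55,
    "LT": 0.60, "VT": 0.57, "SP": 0.65, "RT": 0.61,
}
labels, centers = kmeans_1d(list(site_means.values()), k=2)
cluster_of = dict(zip(site_means, labels))
```

With these invented numbers the two sites given high values (GI and HP) fall into one cluster and the remaining sites into the other, which mirrors the qualitative pattern described above but says nothing about the actual data.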
Despite the large RSD values and apparent overlap of the parameter distributions, clear spectral contrasts between sites exist based solely on anatomy [Figs. 3a and 3b]. The sensitivity and specificity values ranged between 78 and 92%, and for both distinctions, an AUC of was achieved. The plots show that excellent results can be obtained when only two parameters are used for the distinction. However, by combining several parameters, the distinction may be further enhanced. By combining C with A and tryptophan , a sensitivity, specificity, and AUC of 94%, 80%, and 0.88, respectively, can be achieved in discriminating between the BM and FM (sensitivity, specificity, and AUC of 87%, 79%, and 0.84, respectively, with only the latter two parameters). As another example, the FM and VT are poorly distinguished (sensitivity, specificity, and AUC of 80%, 47%, and 0.68, respectively) based on parameter A and tryptophan . However, with the addition of NADH , a sensitivity, specificity, and AUC of 72%, 72%, and 0.73, respectively, can be achieved.
The GI and HP could not be distinguished based on cHb and B, two parameters for which they are assigned to the same cluster [Fig. 3c]. Figure 4 shows the similarity of the GI and HP histology [Figs. 4a and 4b], and their contrast to the thicker, nonkeratinized epithelium of the BM [Fig. 4c]. The BM and GI were statistically different for 15/19 parameters, and the BM and HP were statistically different for 16/19 parameters. With only two parameters (cHb and A), the BM and HP could be distinguished with a sensitivity, specificity, and AUC of 89%, 95%, and 0.93, respectively (data not shown).
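The figures of merit quoted above (sensitivity, specificity, and AUC) can be computed from any scalar score that separates two sites. The sketch below is a generic illustration under stated assumptions: the scores are invented, the threshold is arbitrary, and the way several parameters were combined into a single score in this work is not reproduced here. The AUC is computed as the fraction of (site 1, site 2) pairs ranked correctly, with ties counted as one half.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Scores at or above the threshold are called 'positive'."""
    sens = sum(p >= threshold for p in pos_scores) / len(pos_scores)
    spec = sum(n < threshold for n in neg_scores) / len(neg_scores)
    return sens, spec


# Hypothetical scores for two sites (illustrative numbers only).
site_1 = [0.9, 0.8, 0.7, 0.6, 0.4]   # treated as the "positive" class
site_2 = [0.5, 0.3, 0.3, 0.2, 0.1]   # treated as the "negative" class

area = auc(site_1, site_2)
sens, spec = sensitivity_specificity(site_1, site_2, threshold=0.45)
```

Moving the threshold trades sensitivity against specificity, whereas the AUC summarizes the separation over all thresholds.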
Validation of Normal Tissue Discrimination
Using the normal limits derived from the parameter distributions for the calibration data, we tested our ability to correctly classify parameters extracted from healthy tissue in a second set of validation data. We obtained excellent results, with few false positives (abnormal results), indicating that we have successfully captured the distributions for each parameter (Table 4). All of the fluorescence parameters except NADH/Coll427 demonstrate excellent accuracy in classifying normal tissue. The concentrations of hemoglobin and β-carotene are somewhat less accurate for identifying normal tissue. This is not surprising in that, within any given site (particularly those in which vessels may be located nonuniformly throughout the site), the probe may sample more or less hemoglobin or β-carotene depending on the proximity and density of the blood vessels.
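The validation step described above can be sketched as follows. How the normal limits were actually derived is not restated in this section, so the sketch assumes, purely for illustration, limits of the calibration mean plus or minus two standard deviations; the data values and function names are hypothetical.

```python
import statistics


def normal_limits(calibration, n_sd=2.0):
    """Assumed definition: mean +/- n_sd sample standard deviations."""
    mean = statistics.mean(calibration)
    sd = statistics.stdev(calibration)
    return mean - n_sd * sd, mean + n_sd * sd


def fraction_within(values, limits):
    """Fraction of values classified as normal (inside the limits)."""
    low, high = limits
    return sum(low <= v <= high for v in values) / len(values)


# Hypothetical values of one parameter at one site (illustrative only).
calibration = [0.52, 0.61, 0.58, 0.49, 0.66, 0.57, 0.60, 0.54, 0.63, 0.55]
validation = [0.56, 0.59, 0.50, 0.64, 0.95]

limits = normal_limits(calibration)
accuracy = fraction_within(validation, limits)  # share of healthy data called normal
```

Any validation value falling outside the limits corresponds to a false positive in the sense used above, since all validation data come from healthy tissue.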
There are a number of factors that contribute to the spread observed for a given site for the extracted parameters. These include variations among and within individuals, variations in the anatomy within a site, as well as probe-related factors. Because the data analyzed in the calibration and validation sets were collected from a large number of subjects, the effect of probe pressure can contribute to the observed spread for each site. We performed two tests evaluating the effect of variable probe pressure and repeat measurements in the same location, respectively, and found that the spread produced in these tests was smaller than that observed in the HV data.
Understanding the normal spectral contrasts associated with healthy oral tissue is essential for guiding the analysis and interpretation of data collected from lesions, and the development of spectroscopic-based tools for cancer diagnosis. Because the oral cavity is complex, in that it includes several different anatomic sites, the goal of this work was to characterize the normal spectral contrasts. There were significant differences in the means and spreads of the parameter distributions among the nine anatomic sites evaluated. Our findings also demonstrated the strong correlation between the extracted spectral parameters and the physical properties of the tissue. Keratinized sites (HP, GI) were distinct from nonkeratinized sites for a number of parameters. We have identified which parameters are affected by keratin and how the parameters are affected. Even among nonkeratinized sites, certain sites could be distinguished with sensitivity and specificity using only two parameters. The results of the k-means cluster analysis also demonstrated the magnitude of the contrasts among the various anatomic sites. Finally, we have developed thresholds for the range of normal values for each parameter for each site.
These results provide an important foundation for our ongoing patient studies. Our understanding of the impact of keratin on the spectral parameters will be valuable in our assessment of patient data, as many lesions (benign, dysplastic, or malignant) exhibit hyperkeratosis (leukoplakia). Using the thresholds for the range of normal values for each parameter for each site, we have the potential to determine whether we can detect malignancy in the absence of visible findings.
The findings in this work also highlight a number of major points to be considered in the application of spectroscopy to the detection of oral cancer. First, when comparing lesion categories, differences in the distribution of white (hyperkeratotic) lesions in the two categories could produce spectral contrast. As a result, the apparent discrimination between the categories could mistakenly be attributed to the disease process. This was shown by the frequent distinction of the HP and GI from the remaining nonkeratinized sites. The impact of this would be particularly pronounced when data collected from healthy volunteers are directly compared to data collected from patients with lesions, since oral lesions are often characterized by hyperkeratosis. When small numbers of benign samples are grouped with large numbers of samples collected from healthy volunteers without visible lesions, this effect is also likely to occur. Second, this work also emphasizes that by combining several anatomic sites, differences in anatomy can skew distributions, such that in comparing malignant tissue samples to benign lesion (or healthy) tissue samples, each of which comprises different proportions of each site, a separation that is completely unrelated to malignancy may arise. Finally, combining sites that have statistically different mean values can result in a larger spread in each parameter, and thus make it more difficult for a change in a spectral parameter associated with the presence of cancer to become significant. The combination of a site with a large parameter spread (large RSD) with a site with a smaller parameter spread (small RSD) would also make it more difficult to appreciate changes with disease for the site with the smaller parameter spread. These effects are clearly demonstrated by the group statistics for the combination of nine sites as compared to a single site.
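The last point can be illustrated numerically. In the sketch below, two hypothetical sites have the same absolute spread but different means; pooling them inflates the RSD, so a fixed disease-related shift that is large relative to the spread of a single site becomes small relative to the pooled spread. All numbers, including the size of the shift, are invented for illustration.

```python
import statistics


def rsd(values):
    """Relative standard deviation: sample SD divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical parameter values for two sites with different means.
site_a = [1.00, 1.10, 0.90, 1.05, 0.95]
site_b = [2.00, 2.10, 1.90, 2.05, 1.95]
pooled = site_a + site_b

rsd_a, rsd_b, rsd_pooled = rsd(site_a), rsd(site_b), rsd(pooled)

# A hypothetical shift with disease, expressed in units of standard deviation.
shift = 0.3
z_single = shift / statistics.stdev(site_a)   # judged against site A alone
z_pooled = shift / statistics.stdev(pooled)   # judged against the pooled sites
```

Here the same shift is several standard deviations when judged against one site but well under one standard deviation when judged against the pooled distribution, because the between-site difference in means dominates the pooled variance.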
In future studies of lesions, we may be able to combine certain sites with similar spectral properties. The number of biopsy cases demonstrating dysplasia or cancer is usually limited, and combining sites would increase the number of samples from which a diagnostic algorithm could be developed, and also extend the number of sites to which it could be applied. The disadvantage is that this may incur a decrease in sensitivity and specificity for cancer due to an increase in the overall spread for the parameter distributions by combining sites. This is being explored in our ongoing diagnostic study of patients with oral lesions.
The results presented in this study provide strong evidence that a robust and accurate spectroscopic-based diagnostic algorithm for oral cancer will need to be applied in a site-specific manner, to ensure accurate evaluation of malignancy and to avoid the confounding effects of anatomy.