Colorectal cancer is the third most common malignancy in the United States and a major cause of morbidity and mortality, with approximately 149,900 new cases and 50,000 deaths each year.1 Screening for this disease is commonly done with endoscopy, in which the mucosal surface of the colon is examined with reflected white light to look for raised lesions called polyps. Such polyps can be either benign (hyperplastic) or premalignant (dysplastic, termed adenomas), but a definitive pathological diagnosis cannot be made unless a biopsy is performed. Hyperplastic polyps have no risk of evolving into malignancy, while dysplastic lesions have a predictable risk that increases with the size of the lesion.2 Dysplastic mucosa can persist in a latent phase for approximately before transforming; thus, a window of opportunity exists to prevent cancer if adequate techniques for early detection can be developed. In addition, patients with inflammatory bowel disease can have numerous benign inflammatory polyps that cannot be distinguished endoscopically from dysplastic polyps.3 Polyp removal during endoscopy incurs an increased risk of bleeding or perforation. Excising these lesions also takes time, which prolongs the duration of sedation, and histopathological examination of the tissue incurs additional cost. Thus, a technique that can identify dysplasia during endoscopy could have substantial diagnostic value.
The absorption of infrared light by the interatomic bonds in tissue macromolecules can be measured by Fourier transform infrared (FTIR) spectroscopy. This method has demonstrated promise for detection of cancer in a number of tissues including skin,4 cervix,5, 6, 7, 8 esophagus,9, 10 stomach,11, 12 lung,13 ovary,14 prostate,15 and colon.16, 17, 18, 19 FTIR spectra can reveal the molecular composition of the tissues without use of exogenous contrast agents.20, 21 Instead, a number of endogenous biomolecules provide a characteristic spectrum of infrared absorption peaks that constitutes a molecular “fingerprint” of the biochemical composition of the tissue. Using FTIR to detect the presence and measure the concentration of these specific molecules can provide important diagnostic information. Previously, we have measured FTIR spectra with attenuated total reflectance (ATR) by placing the tissue specimen in contact with a zinc selenide (ZnSe) crystal.10 In this mode, the infrared beam strikes the interface between the crystal and the tissue at an angle exceeding the critical angle and undergoes total internal reflection, generating an evanescent wave that penetrates 2 to 3 microns into the tissue. This approach was used to determine the relative concentrations of the key endogenous biomolecules.
Recently, flexible optical fibers made of polycrystalline silver halide materials have been developed for collecting FTIR spectra. These fibers may be configured to be endoscope-compatible and can enable interrogation of the molecular composition of tissues during endoscopic procedures.22, 23, 24, 25, 26, 27, 28, 29, 30, 31 Fibers that transmit infrared light can be made with a variety of materials; however, silver halides, such as , have advantages for in vivo applications that include broad spectral transmission, good mechanical flexibility, low optical attenuation, long-term stability, nonhygroscopicity, and nontoxicity.24, 26, 28 Silver halide transmits infrared radiation from , depending on the fiber composition, with low attenuation ( at ),24 and can be fabricated into optical fibers that are several meters in length. Here, we demonstrate the use of an endoscope-compatible silver halide optical fiber to collect FTIR spectra from freshly excised specimens of colonic mucosa, including normal, hyperplastic, and dysplastic mucosa. Because dysplasia is one of the early steps in the transformation process toward cancer, the molecular changes at this stage are expected to produce only subtle spectral differences compared with normal colonic mucosa and hyperplasia. Thus, sophisticated spectral processing algorithms are needed to classify these specimens accurately. The mid-infrared (MIR) spectral range is evaluated because this regime contains numerous absorption maxima of fundamental rotational and vibrational modes of biochemical bonds.22
Recruitment and Specimen Processing
A total of 37 subjects undergoing routine screening colonoscopy with a mean age of (range ) were recruited for this study. The protocol was approved by the human subjects committee (IRB) at Stanford University and the Veterans Affairs Palo Alto Health Care System (VAPAHCS). After informed consent, routine screening was performed using a standard video colonoscope in the Endoscopy Unit at the VAPAHCS Hospital. When a polyp was identified, a pinch biopsy with jumbo forceps was taken from the lesion and from an adjacent site of normal-appearing mucosa. The endoscopic appearance and anatomic location of each biopsied lesion were recorded. Immediately after resection, half of the biopsy specimen was placed on moist gauze over ice in a container for transport to the FTIR spectrometer. The other half of each specimen was sent for routine histopathological evaluation. Water was removed from the mucosal surface of the research specimen by blowing cool air for . FTIR spectra were then obtained, as described in the following. Following the collection of spectra, each piece of tissue was rehydrated with normal saline (0.9%) and placed in 10% buffered formalin. The tissues were processed and cut in sections, stained with hematoxylin and eosin (H&E), and analyzed by two gastrointestinal pathologists (MRA, JMC). Only specimens whose morphology was consistent with the endoscopic findings (polypoid versus normal appearing) were included in the analysis. This resulted in spectra from of the original 37 subjects, including , 17, and 24 specimens of normal, hyperplasia, and dysplasia, respectively.
FTIR Spectra Collection from Colon Specimens
An FTIR spectrometer (Nexus 470, Thermo Electron, Madison, Wisconsin) was used to measure the spectra and was continuously purged with air to remove water vapor and . A Multi-Loop-MIR silver halide optical fiber (Harrick Scientific, Pleasantville, New York) was coupled to a FiberMate module via SMA connectors. This module contains a set of elliptical (EM) and flat mirrors (FM), as shown in Fig. 1, that define the optical path, and was inserted into the signal arm of the Michelson interferometer used by the spectrometer. The input and output fibers are each one meter long and have a diameter of . The distal ends of the fibers terminate in a handpiece that has an overall diameter of with a -diam replaceable fiber tip. The diameter of the handpiece is determined by the minimum bend radius of the silver halide fiber. In the attenuated total reflectance (ATR) configuration, the fiber tip is formed into a loop so that there is no physical interruption of the light path between the light delivery (input) and collection (output) fibers. The specimen was placed on a linear translational stage, which was adjusted until the distal tip of the fiber was in contact with the tissue for spectral collection. A mercury cadmium telluride (MCT) detector cooled with liquid nitrogen and set at an autogain of 4 was used to record the spectra with a resolution of using 36 co-added scans for both the background and tissue. The extent of hydration was determined using the ratio between the Amide I and Amide II bands at 1650 and , respectively. This ratio was used to adjust the drying time to provide a consistent peak at .
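The hydration check described above amounts to a ratio of two band absorbances. The following is a minimal sketch of that calculation, not the authors' implementation: the function names, the nearest-point lookup, and the toy spectrum are all assumptions made for illustration, and the Amide II position of 1545 cm−1 is taken from Table 1.

```python
def absorbance_at(wavenumbers, absorbances, target):
    """Return the absorbance at the sampled wave number closest to target."""
    idx = min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - target))
    return absorbances[idx]


def amide_ratio(wavenumbers, absorbances, amide_i=1650.0, amide_ii=1545.0):
    """Ratio of Amide I to Amide II absorbance (band positions are assumed defaults)."""
    a1 = absorbance_at(wavenumbers, absorbances, amide_i)
    a2 = absorbance_at(wavenumbers, absorbances, amide_ii)
    if a2 == 0:
        raise ValueError("Amide II absorbance is zero; ratio is undefined")
    return a1 / a2


# Synthetic five-point spectrum, for illustration only (not measured data).
wn = [1500.0, 1545.0, 1600.0, 1650.0, 1700.0]
ab = [0.10, 0.25, 0.20, 0.50, 0.05]
ratio = amide_ratio(wn, ab)
```

In practice a band area or a local maximum search might be preferred over a single-point lookup; the source does not say which was used.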
The FTIR spectra were evaluated using partial least squares (PLS) discriminant analysis,32, 33, 34 and the models were fit with the SIMPLS algorithm as implemented in PLS_Toolbox, a commercially available software package (Eigenvector Research, Inc., Wenatchee, Washington).35 The spectra were first preprocessed to remove variance that is not relevant for predicting class membership. The simplest preprocessing method is interval PLS, where a subset of wavelengths (wave numbers) is identified that provides a superior predictive value in comparison to use of all the wavelengths collected.35 Interval PLS discriminant analysis is performed by conducting an exhaustive search for the best combination of wavelengths. In this study, the fingerprint regime was divided into six subranges, and all possible combinations of subranges were tested.
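The exhaustive search over subranges can be sketched as follows. This is an illustration under stated assumptions rather than the PLS_Toolbox procedure: `subrange_combinations`, `best_combination`, and the toy scoring function are hypothetical names, and in a real analysis the score would be a cross-validated classification accuracy.

```python
from itertools import combinations


def subrange_combinations(n_subranges=6):
    """List every non-empty combination of subrange indices 1..n_subranges."""
    indices = range(1, n_subranges + 1)
    combos = []
    for k in range(1, n_subranges + 1):
        combos.extend(combinations(indices, k))
    return combos


def best_combination(score, n_subranges=6):
    """Exhaustive search: return the combination with the highest score."""
    return max(subrange_combinations(n_subranges), key=score)


all_combos = subrange_combinations()


def toy_score(combo):
    """Hypothetical stand-in for cross-validated accuracy of a model."""
    return len(set(combo) & {2, 4, 6}) - 0.1 * len(combo)


winner = best_combination(toy_score)
```

With six subranges there are 2^6 − 1 = 63 non-empty combinations, so an exhaustive search over subranges alone is inexpensive; the model count grows once each subrange can also take different preprocessing options.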
The two other basic categories of preprocessing are sample-wise and variable-wise methods. Sample-wise methods act on each spectrum one at a time to remove unwanted variance. The sample-wise preprocessing methods evaluated in this analysis include along-spectrum differentiation, baseline subtraction, and normalization. Differentiation was performed by applying the Savitzky-Golay algorithm,36 where a second-order polynomial was fit to a window around each value of the spectrum, and the value was then replaced with the second-order coefficient of the polynomial. Multiple window widths were tested. When differentiation was not applied, a baseline spectrum, calculated as a low-order polynomial fit to the end regions of each spectrum, was subtracted. The relative values of spectral measurements often have greater predictive value than the absolute measurements. In this case, normalization is useful and helps all spectra to have an equal impact in the model. Three normalization metrics were evaluated: the area under the curve, the vector length, and the maximum value of a spectrum.
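The windowed quadratic fit described above has a closed form when the samples are evenly spaced. The sketch below is an assumption-laden illustration, not the authors' code: it assumes unit sample spacing, drops the points at each end where the window does not fit, and uses the hypothetical name `quadratic_coefficient_filter`. The second derivative of the fitted polynomial is twice the coefficient returned here.

```python
def quadratic_coefficient_filter(spectrum, half_width=2):
    """Replace each interior point with the second-order coefficient of a
    least-squares quadratic fit to the surrounding window.

    The window spans 2 * half_width + 1 evenly spaced points; the first and
    last half_width points are dropped rather than extrapolated.
    """
    m = half_width
    offsets = range(-m, m + 1)
    n = 2 * m + 1
    s2 = sum(x * x for x in offsets)      # sum of x^2 over the window
    s4 = sum(x ** 4 for x in offsets)     # sum of x^4 over the window
    denom = n * s4 - s2 * s2
    # Closed-form least-squares weights for the x^2 coefficient.
    weights = [(n * x * x - s2) / denom for x in offsets]
    out = []
    for i in range(m, len(spectrum) - m):
        window = spectrum[i - m:i + m + 1]
        out.append(sum(w * y for w, y in zip(weights, window)))
    return out


parabola = [float(x * x) for x in range(10)]    # coefficient should be 1
line = [3.0 * x + 2.0 for x in range(10)]       # coefficient should be 0
coef_parabola = quadratic_coefficient_filter(parabola, half_width=2)
coef_line = quadratic_coefficient_filter(line, half_width=3)
```

Changing `half_width` corresponds to testing different window widths: wider windows suppress more noise but also broaden narrow peaks.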
Variable-wise preprocessing methods act on each spectral wavelength independently but require multiple samples for parameter estimation before they are implemented. The variable-wise preprocessing method tested in this study was unit-variance scaling. In this method, the value of a spectrum (or preprocessed spectrum, as unit-variance scaling always occurs as the last preprocessing step) at each wavelength was centered by subtracting the mean value across all samples and then was scaled to unit variance by dividing by the standard deviation. In practice, each subrange was first independently normalized to have the same area under the curve, and the normalized values were then scaled to unit variance in this way. Normalizing and scaling to unit variance eliminates the mixture of units that results from combining baseline subtraction in some subranges with second-derivative preprocessing in others.
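A minimal sketch of these two steps follows, under stated assumptions: the area is approximated by the sum of absolute values at uniform spacing (absolute values are an assumption made so that derivative spectra, which change sign, do not yield a near-zero area), the sample standard deviation is used, and the function names and toy numbers are hypothetical.

```python
from statistics import mean, stdev


def normalize_area(spectrum):
    """Scale one spectrum (or subrange) to unit area, assuming uniform spacing."""
    area = sum(abs(v) for v in spectrum)
    if area == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / area for v in spectrum]


def unit_variance_scale(spectra):
    """Center and scale each wavelength (column) across samples.

    Returns the scaled spectra plus the per-wavelength means and standard
    deviations, which must be stored to apply the same scaling to new spectra.
    """
    columns = list(zip(*spectra))
    means = [mean(c) for c in columns]
    sds = [stdev(c) or 1.0 for c in columns]  # leave constant columns unscaled
    scaled = [[(v - mu) / sd for v, mu, sd in zip(row, means, sds)]
              for row in spectra]
    return scaled, means, sds


# Toy data: three samples by three wavelengths (illustrative values only).
raw = [[1.0, 2.0, 3.0], [2.0, 2.0, 4.0], [3.0, 5.0, 2.0]]
normalized = [normalize_area(row) for row in raw]
scaled, col_means, col_sds = unit_variance_scale(normalized)
```

The sample-wise step needs only the spectrum itself, whereas the variable-wise step depends on statistics estimated from the whole training set.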
Classification Algorithm Design
We sought to identify diagnostic information from the subtle differences observed in the mid-infrared absorbance spectra collected from normal, hyperplastic, and dysplastic colonic mucosa. We developed a double binary detection algorithm that consists of one model to discriminate hyperplasia from dysplasia and another to distinguish normal from dysplasia. These distinctions are clinically relevant to patients who are found to have a very large number of polyps , where time limitations become a factor for lesion removal. The hyperplasia-versus-dysplasia and normal-versus-dysplasia models use spectra from the respective tissue specimens for training. The total number of PLS discriminant analysis models tested for each is determined by the number of unique preprocessing combinations and exceeded . The accuracy of each component of the double binary model was assessed using leave-one-subject-out cross-validation to emulate prospective analysis. The 30 most accurate models of each type (hyperplasia-versus-dysplasia and normal-versus-dysplasia) were then selected. All 900 pairings of these models were then tested in the double binary algorithm, again using leave-one-subject-out cross-validation, in which the spectra from each subject are held out in turn. After completing the rotation, each spectrum has classification values for both hyperplasia-versus-dysplasia and normal-versus-dysplasia regardless of the true tissue class. Classification thresholds were varied in two dimensions, and the double binary algorithm was scored using histopathology as the “gold standard.” The final algorithm reported consists of the best hyperplasia-versus-dysplasia and normal-versus-dysplasia combination in terms of sensitivity and specificity for dysplasia.
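The cross-validation rotation and the two-threshold decision can be sketched as below. Two points are assumptions rather than statements from the source: a spectrum is called dysplastic only when both binary models score it above their thresholds, and higher scores are taken to indicate dysplasia. All names and the toy subject labels are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    All spectra from one subject are held out together, so no subject
    contributes to both the training and the test set of a fold.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def double_binary_call(score_hd, score_nd, thr_hd, thr_nd):
    """Assumed combination rule: call dysplasia only if both the
    hyperplasia-versus-dysplasia and normal-versus-dysplasia scores exceed
    their thresholds."""
    return score_hd > thr_hd and score_nd > thr_nd


# Toy example: six spectra from three subjects (labels are illustrative).
subjects = ["a", "a", "b", "c", "c", "c"]
folds = list(leave_one_subject_out(subjects))

# Pairing the 30 best models of each type gives 30 * 30 candidate algorithms.
n_pairings = 30 * 30
```

Holding out by subject rather than by spectrum matters because a lesion and the adjacent normal biopsy from the same patient are not independent samples.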
The average FTIR spectra collected with the silver halide optical fiber from specimens of colonic mucosa, including normal, hyperplasia, and dysplasia, are shown in Fig. 2. An average ratio of (SEM) was found between the Amide I and Amide II peaks from the spectra and is used as an internal control. The spectra reveal subtle differences among the three histological classes in terms of location and height of the absorbance peaks. The spectra are divided into the optimal six subranges found from the algorithm described in Sec. 2, indicated by the dashed vertical lines. These subranges correspond to the groups of vibrational modes listed in Table 1 (spectral ranges 1 to 6). Representative histology (H&E) for the specimens of colonic mucosa studied is shown in Fig. 3. Although some desiccation artifact was observed, it did not prevent reliable classification of the mucosal samples into normal, hyperplastic, or dysplastic categories. Because of the possibility that the two halves of each biopsy specimen might differ histologically, the results reported herein are based on interpretation of the half-specimen from which the FTIR spectra were obtained.
Table 1. Primary mid-infrared absorption band assignments to tissue biomolecules in the 950 to 1800 cm−1 spectral regime. The references cited are within 5 cm−1 of those found in this study.10, 16, 17, 18, 37, 38, 39
| Wave number (cm−1) | Biochemical | Reference |
| Spectral range 1 |
| 1056 | RNA, DNA, lipids | 37 |
| 1085 | Phosphates (RNA, DNA, phospholipids) | 37 |
| Spectral range 2 |
| 1225 | Phosphates (RNA, DNA, phospholipids) | 38 |
| 1240 | Protein (Amide III) | 10 |
| 1241 | Phosphates (RNA, DNA, phospholipids) | 18 |
| 1260 | Protein (Amide III) | 38 |
| 1280 | Protein (Amide III) | 38 |
| 1301 | Protein (Amide III) | 10 |
| Spectral range 3 |
| Spectral range 4 |
| 1545 | Protein (Amide II) | 17 |
| Spectral range 5 |
| 1650 | Protein (Amide I) | 39 |
| Spectral range 6 |
The classification of each specimen using the best double binary algorithm is shown in the scatter plot in Fig. 4a, with the optimal thresholds for distinguishing hyperplasia-versus-dysplasia and normal-versus-dysplasia indicated by the solid vertical and dashed horizontal lines, respectively. Dysplasia was classified with sensitivity and specificity of 96% and 92%, respectively, and the accuracy and positive predictive value were 93% and 82%, respectively. In Fig. 4a, the four data points for normal (black crosses) in the lower-left corner represent false positives on the hyperplasia-versus-dysplasia decision line, and the data point for dysplasia (red circle) in the lower-right corner represents a false negative on the same decision line. The samples in the bottom-right quadrant of Fig. 4a are classified as nondysplasia by the hyperplasia-versus-dysplasia model. The specificity for the hyperplasia specimens alone was 93%. The receiver operating characteristic (ROC) curve for the double binary algorithm applied to all samples is shown in Fig. 4b, and reveals an optimal trade-off between sensitivity and specificity of 96% and 92%, respectively, for dysplasia and nondysplasia, shown by the black dot in the upper-left corner. The area under the curve was 0.953.
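For reference, the four reported figures of merit follow from the counts of true and false calls against histopathology. The sketch below uses a hypothetical function name and made-up labels purely to show the arithmetic; it does not reproduce the study's data.

```python
def dysplasia_metrics(truth, predicted):
    """Compute sensitivity, specificity, accuracy, and positive predictive
    value, where True means dysplasia and histopathology supplies truth."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": tp / (tp + fp),
    }


# Made-up labels for eight specimens (illustration only).
truth = [True, True, True, False, False, False, False, False]
calls = [True, True, False, True, False, False, False, False]
metrics = dysplasia_metrics(truth, calls)
```

Sweeping the two thresholds and recomputing sensitivity and specificity at each setting is what traces out an ROC curve such as the one in Fig. 4b.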
The spectral data after wavelength selection and preprocessing for input to the PLS discriminant analysis regression are shown by class average in Fig. 5a for the normal-versus-dysplasia model and in Fig. 5b for the hyperplasia-versus-dysplasia model. Note that the optimal model for normal-versus-dysplasia includes subrange 5, where Amide I and water are the predominant contributors to the absorbance peaks. This model used subranges 4 and 5 with baseline removal only, while Savitzky-Golay second-derivative preprocessing was applied to subranges 2 and 6. The hyperplasia-versus-dysplasia model excluded only subrange 5. Subrange 4 was used with baseline removal, while Savitzky-Golay second-derivative preprocessing was applied to subranges 1, 2, 3, and 6. The hyperplasia-versus-dysplasia model used the same normalization scheme as the normal-versus-dysplasia model and was also scaled to unit variance. Both models employed only two latent variables, suggesting that the scaling process removes extraneous variance that would otherwise be captured by additional latent variables.
The variable importance for projection (VIP) curves for the normal-versus-dysplasia and hyperplasia-versus-dysplasia models are shown in Fig. 5c. This score summarizes the importance of each wave number for modeling both the variance of the predictors (spectra) and that of the response (classes).40 By definition, the squared VIP scores average to one, so variables with values much greater than one are the most important for the model, while those with values much less than one make a negligible contribution. The VIP results show that the glycogen peaks for hyperplasia at 1026 and (Table 1) in subrange 1 are important for distinguishing hyperplasia from dysplasia and that the Amide I and lipid peaks of subranges 5 and 6 are relatively unimportant. While these spectral differences can be clearly discerned in the average spectra of Fig. 2, the differences in protein, lipid, and phosphate peaks of subranges 2, 3, and 4 are much more subtle in the average spectra. However, after preprocessing, these differences are amplified, and these peaks are clearly valuable for classification.
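One common definition of the VIP score weights each variable's squared, normalized PLS weight by the response variance that each latent variable explains. The sketch below follows that textbook definition and may differ in detail from the PLS_Toolbox calculation used in the study; the function name, argument layout, and numbers are hypothetical.

```python
from math import sqrt


def vip_scores(weights, explained_ss):
    """VIP score per variable.

    weights[a][j] is the PLS weight of variable j in latent variable a, and
    explained_ss[a] is the response sum of squares explained by latent
    variable a.
    """
    p = len(weights[0])
    total = sum(explained_ss)
    norms_sq = [sum(w * w for w in wa) for wa in weights]
    scores = []
    for j in range(p):
        acc = sum(ss * weights[a][j] ** 2 / norms_sq[a]
                  for a, ss in enumerate(explained_ss))
        scores.append(sqrt(p * acc / total))
    return scores


# Two latent variables and four variables (illustrative values only).
toy_weights = [[0.8, 0.1, 0.1, 0.0],
               [0.2, 0.7, 0.1, 0.3]]
toy_ss = [3.0, 1.0]
vip = vip_scores(toy_weights, toy_ss)
mean_squared_vip = sum(v * v for v in vip) / len(vip)
```

Under this definition the mean of the squared scores is exactly one, which is why one serves as the usual cutoff between influential and negligible variables.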
Here, we demonstrate the use of an optical fiber to collect FTIR spectra remotely from freshly excised colonic mucosa and find that adequate signal could be obtained with use of a silver halide fiber that is transparent in the mid-infrared regime, a sensitive liquid-nitrogen-cooled MCT detector, and an efficient FTIR spectrometer. While the fiber used in this study is not sufficiently long or thin to pass through a standard medical colonoscope, there are specialty instruments available, including therapeutic sigmoidoscopes, that have comparable lengths and channel diameters. The fiber is sealed within a protective jacket, and the exposed loop tip was replaced after several measurements to prevent degradation of signal. Because the specimens were studied immediately after excision, we expect that these results will be representative of those found in viable colonic mucosa. We observed subtle differences in the average location and magnitude of the absorbance peaks among normal, hyperplasia, and dysplasia in the fingerprint region between 950 and . Regression of these datasets is challenging because they are underdetermined and collinear. That is, there are more spectral peaks than there are spectra in each class, and the absorbance maxima of distinct biomolecules are highly correlated. Nonetheless, we were able to tease out these subtle spectral differences by performing an exhaustive search for the optimal subrange intervals to be used in the PLS discriminant analysis with rigorous spectral preprocessing techniques. In the end, we were able to develop a double binary algorithm that could distinguish normal from dysplasia and hyperplasia from dysplasia with high sensitivity and specificity.
We were able to simplify this complex spectral dataset into a handful of latent variables that could then be evaluated by PLS discriminant analysis. The underlying assumption of this analysis is that the observations are generated by a process driven by a small number of latent (not directly observed) variables, and the data measured can be projected onto these latent variables by maximizing the covariance of predictor variables (spectral absorbances) and responses (class membership). The number of latent variables was limited to four in this study to avoid overfitting the spectra. Residual water can seriously interfere with FTIR spectra. However, with attenuated total reflectance (ATR), we need to remove water only within the most superficial 2- to 3-micron layer of the tissue. This depth is much less than the dimension of a single cell, so the removal of water over this layer is feasible. As a consequence, we were able to address the spectral peak associated with water, a major source of absorbance in tissue. Our model for distinguishing hyperplasia from dysplasia could completely eliminate this effect by avoiding use of subrange 5 [Fig. 5b]. In addition, we can provide a physiological explanation for the spectral changes observed among the three classes of colonic mucosa with use of Table 1. Our spectra provide a unique “snapshot” of the biochemical composition of the mucosa at different steps along the cancer transformation process.20 The absorbance peak for RNA rises relative to that of DNA. Some of these spectral differences are apparent in Fig. 2, while others are not visually obvious. We also found that in addition to the well-known glycogen, nucleic acid, and amide protein peaks, there is useful diagnostic information in the lipids and weaker protein vibrations.
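The covariance-maximizing projection can be illustrated for the first latent variable, whose weight vector is proportional to the cross-product of the centered spectra with the centered class labels. This is a sketch of that single step under stated assumptions, not the SIMPLS algorithm used in the study; later latent variables require a deflation step that is omitted, and the function name and toy data are hypothetical.

```python
from math import sqrt


def first_latent_variable(X, y):
    """Return the unit weight vector w maximizing cov(Xw, y) and the scores Xw.

    X is a list of spectra (rows), y a list of numeric class labels; both are
    mean-centered before the weights are computed.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    scores = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    return w, scores


# Four toy spectra with three wavelengths; labels 0 and 1 are illustrative.
X_toy = [[0.2, 1.0, 0.5], [0.4, 0.9, 0.1], [0.9, 0.2, 0.4], [1.0, 0.1, 0.6]]
y_toy = [0.0, 0.0, 1.0, 1.0]
w_toy, t_toy = first_latent_variable(X_toy, y_toy)
```

Because the weights are driven by covariance with the class labels rather than by spectral variance alone, a few latent variables can capture class-relevant structure even when there are more wavelengths than spectra.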
In particular, it is notable that subranges 1 (glycogen and nucleic acids) and 3 (proteins and lipids) are not included in the normal-versus-dysplasia model at all, while they are important components of the hyperplasia-versus-dysplasia model, suggesting that these biomolecules have an important role in cellular proliferation.
For future development of this technology, the use of a classification algorithm that minimizes the amount of information required for calibration transfer between the optical fiber instruments is desirable. With the use of unit-variance scaling, a mean and standard deviation at each wavelength must be included as part of the calibration transfer, in addition to the regression coefficients. However, unit-variance scaling was found to provide clearly better performance. If the additional burden upon calibration transfer presented by unit-variance scaling is not significant even in the production of a large number of optical fiber instruments, then unit-variance scaling should be employed as a preprocessing component in the algorithm utilized by this technique.
These results demonstrate that collecting FTIR spectra with an endoscope-compatible optical fiber is feasible for distinguishing premalignant colonic mucosa and that the relatively simple spectral biomarkers used to identify dysplasia can potentially be measured in vivo with continued development of this technology. These findings also have implications for evaluating histopathology in situ as a novel approach for early detection of cancer in the colon, an approach that may be generalized to other hollow organs.
This work was funded in part by grants from the National Institutes of Health, including K08 DK67618, P50 CA93990, and U54 CA136429 (TDW), and the Department of Defense USAMRCMC W81XWH-05-C-0086 (RW), by the Stanford Infrared Science and FEL Center as a grant from the U.S. Air Force Office of Sponsored Research (F9550-04-1-0075; CHC) and U54 CA105296 (JMC and CHC).