Renal tumors are the seventh leading cause of death by cancer and account for approximately 2% of all new primary cancer cases.1 The incidence has increased in both sexes, and it is estimated that 72,000 cases of renal cancer will be diagnosed in Europe each year.2 The main treatment modality for renal cancer is radical or partial nephrectomy. However, at the time of surgery it is not always possible to distinguish malignant tumor cells from the surrounding healthy tissue. Only precise histological examination of the removed tissue can definitively confirm radicality of the operation, but this usually requires several days. Intraoperative frozen tissue section pathology examination also takes time and is not very reliable. If it reveals that the tumor has not been completely removed, a second surgery is needed. Another problem is the detection of tumor cell infiltration into normal tissue.3 Ideally, a rapid and objective method would indicate tumor cells in normal tissue during the resection.
One approach that might play a role in the identification of tumor cells is Fourier transform infrared (FTIR) spectroscopic imaging.4, 5 Recent reports have been largely dominated by studies that use FTIR spectroscopic imaging to characterize skin tumors,6 tumor cells in brain tissue,5, 7 breast tissue,8 the colon,9 and many others. The high sensitivity of the method makes it possible to identify tumors even at a very early stage of the disease. Raman spectroscopy, a complementary technique to FTIR spectroscopy, also provides molecular information from tissue.10 The advantage of Raman spectroscopy is that in vivo measurements are possible, since water gives only a very weak Raman signal. In a recent study it was demonstrated that Raman spectroscopy can accurately differentiate normal and tumor renal tissue and even classify the tumor cells as low- or high-grade.11 The advantage of FTIR spectroscopic imaging is that the acquisition is fast (up to a few minutes per image) and the spectra provide a higher signal-to-noise ratio than Raman spectra. FTIR spectroscopic imaging is preferred when a larger area of tissue must be investigated. However, the evaluation of FTIR spectroscopic data is difficult because of the large amount of information. An individual FTIR spectroscopic image contains more than 4000 FTIR spectra, each with up to several hundred wavenumber points. An FTIR spectrum of biological tissue is complex, and the biochemical changes in tumor cells are usually very different.12 Therefore, FTIR spectroscopic imaging is commonly used in conjunction with multivariate statistical methods such as cluster analysis,13 principal component analysis,14 linear discriminant analysis,15 or support vector machines.16
In this work, we investigate the capability of FTIR spectroscopic imaging, in combination with fuzzy k-means cluster analysis and linear discriminant analysis, to detect infiltration of tumor cells into adjacent normal tissue.
Renal tissue samples were obtained from Vilnius University Hospital, Santariskiu Klinikos, during surgical removal of cancerous tissue. Cryosections were prepared from an area around the borderline between tumor and normal tissue. A thin section of 10 μm in thickness was transferred onto a calcium fluoride window. After FTIR spectroscopic imaging the tissue sample was stained with hematoxylin and eosin (H&E) and examined by optical microscopy. The study protocol was approved by the Vilnius regional bioethics committee (approval no. 158200-12-131-056LP6, 05 05 2009).
FTIR Spectroscopic Imaging
FTIR spectroscopic images were collected in transmission mode using a Vertex 70 FTIR spectrometer coupled to a Hyperion 3000 infrared microscope (both from Bruker Optik GmbH, Ettlingen, Germany). The imaging detector was a Santa Barbara focal plane MCT 64×64 array detector. The 15-fold Cassegrainian objective with a numerical aperture of 0.4 imaged a sample area of ∼270×270 μm2. A composition image of 20×20 individual infrared images was captured from a selected area of the tissue section. Pixel binning of 16×16 was applied to reduce the number of spectra. The pixel binning reduces the spatial resolution to ∼170 μm. In agreement with the histopathology, this size allows the identification of areas of tumor cell infiltration. The sample area of the composition image has a dimension of ∼5.4×5.4 mm2 encompassing 80×80 (6400) individual infrared spectra. A reference spectroscopic image was recorded from the pure calcium fluoride window. A total of six interferograms (scans) were co-added. The interferograms were Fourier transformed applying Happ–Genzel apodization and a zero filling factor of 1. Spectra of the sample image, at a resolution of 8 cm−1, were ratioed against the spectra of the reference image and converted to absorbance values. This spectral resolution was chosen in order to improve the signal-to-noise ratio, to reduce the size of the spectral data set, and to ensure that all prominent bands, even those with medium intensity, appear clearly in the spectrum. The frame rate of the focal plane detector was 3773 Hz, yielding a total measurement time of approximately 20 s for each individual spectroscopic image.
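The bookkeeping behind these numbers can be checked with a few lines of Python. The following is only an illustrative sketch, not the acquisition software: the function names are hypothetical, and the absorbance conversion assumes the usual definition A = −log10(I_sample / I_reference) applied point by point to single-beam intensities.

```python
import math

def mosaic_geometry(detector_px=64, binning=16, tiles_per_side=20, tile_edge_um=270.0):
    """Return (spectra per tile side, spectra per mosaic side,
    total number of spectra, mosaic edge length in mm)."""
    per_tile_side = detector_px // binning             # 64 / 16 = 4 binned pixels
    per_mosaic_side = per_tile_side * tiles_per_side   # 4 * 20 = 80
    total_spectra = per_mosaic_side ** 2               # 80 * 80 = 6400
    mosaic_edge_mm = tiles_per_side * tile_edge_um / 1000.0
    return per_tile_side, per_mosaic_side, total_spectra, mosaic_edge_mm

def absorbance(sample, reference):
    """Ratio a single-beam sample spectrum against the reference spectrum
    and convert to absorbance (assumed definition: A = -log10(I / I0))."""
    return [-math.log10(s / r) for s, r in zip(sample, reference)]

if __name__ == "__main__":
    print(mosaic_geometry())
    print(absorbance([0.5, 0.1], [1.0, 1.0]))
```

With the default arguments the geometry function reproduces the 80×80 grid of 6400 spectra and the edge length of about 5.4 mm quoted above.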
Evaluation of spectral data was performed using the MATLAB package (Version 7, MathWorks Inc., Natick, Massachusetts). The main part of the data analysis is based on in-house written programs, in particular for data preprocessing and image processing. The flow chart of the data processing is sketched in Fig. 1. Data preprocessing involves removal of outliers, a linear two-point baseline correction, and normalization of each absorbance value of a spectrum to the integral absorbance. Outliers are spectra that are obviously not associated with tissue, or spectra with a maximum absorbance value larger than 1.8 or smaller than 0.08. Principal component analysis (PCA) calculations were performed using the eig function of the MATLAB package. Fuzzy k-means cluster analysis was performed with an in-house written algorithm. In accordance with the elbow criterion,17 a total of 10 clusters was chosen. Spectra classification was performed with an algorithm described elsewhere.5, 18 The training set was generated from the results of the cluster analysis: spectra were taken from clusters that clearly represent tumor or normal tissue. Both tissue types were used to train the classification algorithm to find the discriminatory spectral patterns in the data set. The calculated classification model was verified by the leave-one-out method. Information about the algorithm is given at: http://www.bfsk.ff.vu.lt/statist_sp_analysis.htm.
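The three preprocessing steps can be illustrated with a short Python sketch. It is not the in-house MATLAB code. It assumes that the two baseline anchor points are the first and last points of the evaluated range (the text does not state which points were used) and that the integral absorbance is a trapezoidal integral. All function names are hypothetical.

```python
def is_outlier(spectrum, low=0.08, high=1.8):
    """Flag spectra whose maximum absorbance lies outside the stated limits."""
    peak = max(spectrum)
    return peak > high or peak < low

def baseline_correct(wavenumbers, spectrum):
    """Subtract the straight line through the first and last data points
    (assumed anchor points for the two-point baseline)."""
    x0, x1 = wavenumbers[0], wavenumbers[-1]
    y0, y1 = spectrum[0], spectrum[-1]
    slope = (y1 - y0) / (x1 - x0)
    return [y - (y0 + slope * (x - x0)) for x, y in zip(wavenumbers, spectrum)]

def area_normalize(wavenumbers, spectrum):
    """Divide every absorbance value by the trapezoidal integral of the spectrum.
    Assumes a nonzero integral; a flat spectrum would need separate handling."""
    area = sum(
        0.5 * (spectrum[i] + spectrum[i + 1]) * abs(wavenumbers[i + 1] - wavenumbers[i])
        for i in range(len(spectrum) - 1)
    )
    return [y / area for y in spectrum]

def preprocess(wavenumbers, spectra):
    """Drop outliers, then baseline-correct and area-normalize the rest."""
    kept = [s for s in spectra if not is_outlier(s)]
    return [area_normalize(wavenumbers, baseline_correct(wavenumbers, s)) for s in kept]
```

The criterion "obviously not associated with tissue" is a visual judgment and is not covered by the sketch; only the two absorbance limits are checked.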
Afterwards, all spectra of the composition image were classified. The classify function of the MATLAB package returns a matrix containing estimates of the posterior probabilities that the spectrum belongs to the class “tumor” or to the class “normal.”
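How a posterior probability arises from a linear discriminant model can be shown in a deliberately reduced form. The sketch below is a one-dimensional simplification written in Python. It assumes that each spectrum has already been condensed to a single scalar feature (for example a band integral), that both classes share one pooled variance, and that the priors are equal by default. It is not the MATLAB classify function and not the authors' algorithm, and all names are hypothetical.

```python
import math

def fit_lda_1d(tumor_values, normal_values):
    """Estimate the class means and the pooled variance of one scalar feature."""
    mu_t = sum(tumor_values) / len(tumor_values)
    mu_n = sum(normal_values) / len(normal_values)
    ss = sum((v - mu_t) ** 2 for v in tumor_values)
    ss += sum((v - mu_n) ** 2 for v in normal_values)
    pooled_var = ss / (len(tumor_values) + len(normal_values) - 2)
    return mu_t, mu_n, pooled_var

def posterior_tumor(x, model, prior_tumor=0.5):
    """Posterior probability of the class 'tumor' for feature value x.
    The posterior for 'normal' is one minus this value."""
    mu_t, mu_n, var = model
    # Linear discriminant scores of the two classes
    d_t = x * mu_t / var - mu_t ** 2 / (2.0 * var) + math.log(prior_tumor)
    d_n = x * mu_n / var - mu_n ** 2 / (2.0 * var) + math.log(1.0 - prior_tumor)
    # Clamp the score difference so that exp() cannot overflow
    z = max(min(d_n - d_t, 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(z))

if __name__ == "__main__":
    model = fit_lda_1d([2.0, 2.2, 1.8], [1.0, 1.2, 0.8])
    for x in (0.9, 1.5, 2.1):
        print(x, round(posterior_tumor(x, model), 3))
```

A feature value midway between the two class means receives a posterior of 0.5, and values closer to the tumor mean approach 1. In the multivariate case the pooled variance is replaced by a pooled covariance matrix.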
Results and Discussion
Figure 2a shows the microscopic image of the tissue section. The overlaid grid indicates the measurement matrix of the FTIR spectroscopic images, where tumor tissue is located in the upper part. Tumor tissue appears to be more homogeneous than the surrounding normal tissue. Although the border of the tumor tissue appears well defined, some parts of the surrounding tissue were also suspected of containing tumor. As can be expected, the color-coded bright field infrared image of the marked area [see Fig. 2b] is not informative enough for tumor detection. Variations of colors in the image are related mainly to thickness variations of the tissue. The representative spectra of normal and tumor tissue are shown in Fig. 2c. Both spectra appear quite similar. A detailed view of the spectra is given in Fig. 2d, which shows the fingerprint region (950 to 1750 cm−1) used for the data analysis.
The FTIR spectroscopic bright field image of the tissue section is presented in Fig. 2b. For every pixel the integral intensity across the spectral range 950 to 1750 cm−1 is mapped onto a rainbow scale. Dark blue pixels indicate low absorbance values, whereas red and orange pixels indicate high absorbance. Red and yellow pixels in the upper right corner indicate an artifact of the thin section. The borderline between tumor and normal tissue is slightly visible. However, the question of whether the tumor has infiltrated into the suspected tissue cannot be answered from the bright field image.
The spectrum exhibits the characteristic bands of tissue, which emerge mainly from vibrations of proteins, lipids, and nucleic acids. The spectrum is dominated by the amide I band at 1650 cm−1 and the amide II band at 1550 cm−1, which arise from the C=O stretching and N–H bending vibrations, respectively, of the amide groups comprising the peptide linkages of proteins. Weaker bands that appear around 1453, 1469, and 1344 cm−1 are assigned to various C–H vibrations of lipids. The spectral range between 1000 and 1250 cm−1 is mainly composed of absorption bands of C–O and PO2− groups of nucleic acids, phospholipids, and carbohydrates. Table 1 summarizes the vibrational modes and their molecular assignment.
Table 1. Assignment of the absorption bands of the spectrum in Fig. 2d (Refs. 19, 20, 21).
|Wavenumber (cm−1)|Assignment|
|---|---|
|1048|C–O–P stretching, lipids, ribose - C–O–C stretching|
|1080|Nucleic acids - symmetric PO2− stretching|
|1230|Lipids, nucleic acids - asymmetric PO2− stretching|
|1344|CH2 wagging, C–O stretching|
|1396|Lipids - CH2 bending, amino acids - COO− stretching|
|1453|Lipids - CH2 bending|
|1540|Proteins - amide II|
|1646|Proteins - amide I|
The bright field image in Fig. 2b does not reveal the biochemical composition. Multivariate chemometric methods have to be utilized to identify characteristic bands and to classify the spectra. At first, PCA was applied to investigate spectral differences between tumor and normal tissue. To compensate for variations in the thickness of the tissue section, spectra were baseline corrected by a two-point linear baseline correction and area normalized. Figure 3 depicts the top four principal components (PCs), which cover more than 98% of the total variance of the spectroscopic image. Score maps reveal the lateral distribution of the principal components. Significant features of the PCs are displayed in the loading plots. The first PC comprises by far the largest variance across the investigated area and represents the average spectrum of all spectra. The loading plot of the second PC shows variations in the regions of the amide II and amide I bands, as well as around 1250 cm−1. We assign the second PC to variations in the protein profile and changes in the lipid content. The loading plot of the third PC again exhibits variations in the amide II and amide I bands. In addition, lower absorption signals occur in the spectral region between 1000 and 1200 cm−1. This region is mainly associated with the absorption of nucleic acids and glycolipids. The tumor region in the corresponding score map exhibits dark blue pixels (low intensity). Therefore, the interpretation of the third PC leads to the conclusion that the tumor tissue region exhibits an enhanced concentration of nucleic acids and glycolipids. In the case of the fourth PC, the features of the loading plot are too difficult to assign. Nevertheless, the tumor tissue region (mainly yellow pixels) can be clearly distinguished from the normal tissue. Figure 4 shows a scatter plot of the score values where the two types of tissue are separated.
Despite this obviously good separation between the two types of tissue, it is not possible to determine whether tumor cells have spread into normal tissue.
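The relation between loadings, scores, and explained variance discussed above can be made concrete with a small Python sketch. The analysis in this work used the MATLAB eig function; the sketch swaps in plain power iteration, which recovers only the first principal component, and is meant purely as an illustration. The function name and the center switch are hypothetical. With center=False the data are not mean-centered, and the first loading then tends to resemble the average spectrum.

```python
import math

def first_principal_component(spectra, iterations=200, center=True):
    """Return (loading vector, scores, explained variance fraction) of the
    first principal component, found by power iteration (not by eig)."""
    n, p = len(spectra), len(spectra[0])
    if center:
        mean = [sum(s[j] for s in spectra) / n for j in range(p)]
    else:
        mean = [0.0] * p
    data = [[s[j] - mean[j] for j in range(p)] for s in spectra]
    # Covariance (or, without centering, scatter) matrix of the variables
    cov = [[sum(row[i] * row[j] for row in data) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Non-uniform start vector; power iteration fails only if the start vector
    # happens to be exactly orthogonal to the dominant eigenvector
    v = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    total = sum(cov[i][i] for i in range(p))
    explained = eigenvalue / total if total > 0.0 else 0.0
    scores = [sum(row[j] * v[j] for j in range(p)) for row in data]
    return v, scores, explained
```

The scores, one per spectrum, are what a score map displays at each pixel position; the loading vector, one value per wavenumber, is what a loading plot displays.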
Cluster analysis is one of the most frequently used multivariate statistical methods that examine the similarity between spectra.21 In a previous study, we have demonstrated that the tumor region can be discriminated from the adjacent normal tissue by using fuzzy k-means cluster analysis.22 However, the detection of renal tumor cells that have spread into normal tissue requires a more sophisticated approach. The cluster algorithm requires a preselection of the number of clusters. This is often a critical decision since different numbers of clusters may lead to different results. Unfortunately, the correct number of clusters is often unknown in advance. One approach for finding and optimizing the number of clusters is the elbow criterion. In spectroscopic data sets, however, overlapping of clusters as well as noise and outliers are common. For this reason the number of clusters was set to 10, i.e., two more than estimated by the elbow criterion. Figure 5 shows the result of fuzzy k-means cluster analysis of the spectral data set. The cluster assignment is represented in Fig. 5a. Figure 5b shows the corresponding centroid spectra.
The algorithm sorts the clusters according to their similarity. Although all centroid spectra are quite similar, the tumor region (see Fig. 2) is mainly represented by red, orange, and yellow pixels. The remaining clusters are predominantly located in the adjacent normal tissue. However, there are also a considerable number of yellow to red pixels in this region. The question addressed here is whether these pixels indicate tumor tissue or are a result of misclustering. One approach to improve the classification accuracy is the application of supervised methods.23 Supervised classification makes it possible to assign the questionable pixels to classes, based on the spectral characteristics of tumor and normal tissue. According to the image, the red and orange clusters (#9 and #10) indicate tumor tissue. Since the violet and blue clusters in Fig. 5 exhibit the lowest similarity to the clusters representing tumor tissue (red and orange), we assigned spectra of the clusters #1 to #4 to normal tissue. These selected spectra are used to create a training set for the supervised classification.
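For readers unfamiliar with fuzzy k-means (also called fuzzy c-means), the core iteration can be sketched in Python as follows. This is a textbook version under stated assumptions, not the in-house algorithm used in this work: the fuzzifier m = 2, the Euclidean distance, and the deterministic initialization from evenly spaced input points are all choices made for the illustration, and the function name is hypothetical.

```python
import math

def fuzzy_c_means(points, n_clusters, m=2.0, iterations=100, tol=1e-6):
    """Return (centroids, memberships). memberships[i][k] is the degree to
    which point i belongs to cluster k; each row sums to 1."""
    n, dim = len(points), len(points[0])
    # Deterministic start: take evenly spaced input points as first centroids
    centroids = [list(points[(k * n) // n_clusters]) for k in range(n_clusters)]
    memberships = []
    for _ in range(iterations):
        # Membership update from the current centroids
        memberships = []
        for x in points:
            d = [math.dist(x, c) for c in centroids]
            if any(di == 0.0 for di in d):
                # Point coincides with one or more centroids: share it among them
                hits = [1.0 if di == 0.0 else 0.0 for di in d]
                u = [h / sum(hits) for h in hits]
            else:
                inv = [di ** (-2.0 / (m - 1.0)) for di in d]
                u = [v / sum(inv) for v in inv]
            memberships.append(u)
        # Centroid update: mean of all points weighted by membership ** m
        new_centroids = []
        for k in range(n_clusters):
            w = [u[k] ** m for u in memberships]
            wsum = sum(w)
            if wsum == 0.0:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
                continue
            new_centroids.append(
                [sum(wi * x[j] for wi, x in zip(w, points)) / wsum for j in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, memberships
```

A hard cluster map such as the one in Fig. 5a corresponds to assigning each pixel to the cluster with the largest membership value. The returned memberships are those computed just before the final centroid update, which makes no practical difference once the iteration has converged.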
The tumor tissue (clusters #9 to #10) encompasses 2784 spectra and the normal tissue (clusters #1 to #4) 1847 spectra. In the following step, 400 spectra from each tissue type were chosen by the algorithm and used as a training set. The spectral procedure for developing the classification model employs two algorithms in tandem. The program takes as input both the spectra in the training set and their histological assignment. Figure 6a shows the classification result. The probability of the class assignment is mapped onto a blue-dark gray-yellow color scale. Yellow pixels indicate tumor tissue, whereas blue pixels represent normal tissue. Spectra that could not be assigned to one of the classes are represented as blue-gray or yellow-gray pixels. Gray pixels indicate spectra used for the training set. The tumor tissue is clearly distinguishable from the surrounding normal tissue. However, there are also a number of yellow pixels in the expected normal tissue. These spots may indicate tumor cells that have spread into the normal tissue and support the assumption that the tumor has already infiltrated into the surrounding tissue. It is worth noting that for normal, but also for tumor tissue, not all spectra are assigned with a probability of 1 to one class. This may simply reflect errors of the spectroscopy-based method, but it may also be due to various malignancy grades appearing within a single sample. The H&E stained tissue section in Fig. 6b clearly shows the tumor tissue. The tumor cells are clear or slightly eosinophilic, with distinct cell membranes, arranged in a compact structure. There is also an infiltration of tumor cells into the adjacent normal tissue. At least three hot spots of tumor cells are visible in the H&E stained tissue section, indicated by the arrows in Fig. 6b. Figure 6c is calculated from the spectral classification and the H&E image.
Pixels with a tumor probability of more than 60% are colored in yellow and merged with the H&E stained image. The image fusion in Fig. 6c clearly reveals some small, distinct areas that show the spectral characteristics of tumor cells, indicating an infiltration of renal tumor cells into the normal tissue.
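The thresholding and fusion step can be sketched in a few lines of Python. The sketch assumes that the posterior map and the H&E image have already been brought onto the same pixel grid (the registration itself is not shown) and that the H&E image is available as rows of RGB tuples. The function names and the yellow color value are hypothetical choices for illustration.

```python
def tumor_mask(posterior_map, threshold=0.6):
    """Mark pixels whose tumor posterior exceeds the threshold (60% by default)."""
    return [[p > threshold for p in row] for row in posterior_map]

def overlay_tumor(he_image, mask, color=(255, 255, 0)):
    """Return a copy of the H&E image (rows of RGB tuples) in which the masked
    pixels are painted in the given color; the inputs are left unchanged."""
    return [
        [color if flagged else pixel for pixel, flagged in zip(img_row, mask_row)]
        for img_row, mask_row in zip(he_image, mask)
    ]

if __name__ == "__main__":
    posteriors = [[0.9, 0.6], [0.2, 0.61]]
    he = [[(200, 120, 160), (210, 130, 170)], [(220, 140, 180), (190, 110, 150)]]
    print(overlay_tumor(he, tumor_mask(posteriors)))
```

Because the comparison is strict, a pixel with a posterior of exactly 0.6 is not marked, in line with the wording "more than 60%".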
The averaged spectra of normal tissue and tumor tissue are shown in Fig. 7. Although the spectral profiles may appear at first glance to be similar to each other, the superficial resemblance is attributable only to the dominant contributions of protein constituents that overwhelm the spectra of all soft tissue samples. Closer inspection reveals three regions of the spectrum that stand out clearly as being different for the two tissue types. The absorption profile between 1020 and 1130 cm−1 appears slightly stronger for the tumor tissue. It is well known that tumor cells exhibit a higher proliferation rate, which gives rise to stronger bands of the nucleic acids in this spectral region. The band at 984 cm−1 is assigned to the vibrations involving the chain mode of the ribose-phosphate diester linkage.20, 21, 22, 23, 24 The bands at 1380, 1400, and 1446 cm−1 correspond to symmetric and asymmetric CH2 deformation. The absorptions arise mainly from the lipid constituents (e.g., phospholipids and possibly others). Figure 7b shows the difference spectrum (tumor minus normal) of the mean spectra and the standard deviations. The strongest differences appear in the range of the amide bands. However, the standard deviation is also very high, so this region was not selected by the algorithm for spectral classification. The difference spectrum reveals that tumor tissue exhibits higher absorbance in the range between 1000 and 1100 cm−1 and around 1250 cm−1. These regions are mainly associated with phosphate spectral bands of nucleic acids. Lower absorbance values were found for tumor tissue between 1380 and 1450 cm−1. This finding is in accordance with studies suggesting that malignancy is accompanied by a change in the lipid profile and by an accumulation of glycogen and lipids.25, 26
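A difference spectrum with per-wavenumber standard deviations, as plotted in Fig. 7b, can be computed along the following lines. This Python sketch only illustrates the arithmetic: it assumes that all spectra share the same wavenumber axis and that each class contains at least two spectra, and the function names are hypothetical.

```python
from statistics import mean, stdev

def class_mean_and_std(spectra):
    """Mean and sample standard deviation at every wavenumber position."""
    columns = list(zip(*spectra))  # one tuple of absorbance values per wavenumber
    return [mean(c) for c in columns], [stdev(c) for c in columns]

def difference_spectrum(tumor_spectra, normal_spectra):
    """Return (tumor mean minus normal mean, tumor std, normal std)."""
    t_mean, t_std = class_mean_and_std(tumor_spectra)
    n_mean, n_std = class_mean_and_std(normal_spectra)
    diff = [t - n for t, n in zip(t_mean, n_mean)]
    return diff, t_std, n_std
```

Comparing the magnitude of the difference with the standard deviations at the same position indicates which spectral regions separate the classes reliably. This is the reasoning applied above to the amide bands, where the difference is large but so is the scatter.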
These diagnostic models can then be applied to an intraoperative spectroscopy approach to allow surgeons to arrive at a diagnosis in real time. This will especially affect the prognosis of patients with cancers, as more complete resection of the suspected tissue decreases the rate of recurrence and increases patient life expectancy. There is presently a high demand for intraoperative surgical guides, and vibrational spectroscopy has been shown to provide chemical contrast between tumor and normal tissue.27
The results obtained in this study show that FTIR spectroscopic imaging in conjunction with a supervised classification is a very powerful tool to identify renal tumor tissue. The spectroscopic image reveals the tumor infiltration into adjacent normal tissue. In this technique, differences in nucleic acid concentration and in the lipid profile between normal and tumor tissue are very sensitive markers for the classification. Tumor cells exhibit a higher proliferation rate, which leads to stronger bands of the nucleic acids. It is known that renal tumor cells accumulate glycogen and lipids, resulting in changes of the lipid absorption bands. One of the strengths of this approach is that small areas of tumor infiltration can be detected very quickly without any staining. Since FTIR spectroscopy relies on the intrinsic biochemical differences between cancerous and normal tissues for contrast, these methods are poised to aid surgeons by quickly and accurately providing a diagnosis. Patient prognosis relies upon rapid diagnosis and treatment; in renal cancer, the rate of survival improves with a complete resection of the tumor tissue. Traditional methods for diagnosis rely upon time-consuming techniques in which resected tissue must be fixed and histochemically stained. FTIR spectroscopic imaging can be used to gain a large amount of information to train algorithms for fresh samples and tissue sections of the disease in question.
The studies are part of the project “Diagnostic studies of chronic noninfectious diseases by means of infrared spectroscopical microscopy” financed by the Lithuanian Science Council, Project No. MIP-111/2010. We thank the Bruker Optik GmbH (Ettlingen and Leipzig, Germany) for their technical support.