Stokes shift spectroscopic analysis of multifluorophores for human cancer detection in breast and prostate tissues

Abstract. Stokes shift spectroscopy (S3) offers a novel and simpler way to rapidly recognize the spectral fingerprints of multiple fluorophores in complex media such as tissue, and can be used as an effective approach to detect cancer in tissue. Differences between the measured S3 spectra of cancerous and normal tissues were observed in human breast and prostate samples. To obtain the optimal Stokes shift interval, Δλi, for breast/prostate cancer detection using S3, the S3 spectra of a mixed aqueous solution of tryptophan, nicotinamide adenine dinucleotide, and flavin were measured with different Δλi values. The experimental results, analyzed using the nonnegative least squares method, show a reduced contribution from collagen and an increased contribution from tryptophan to the S3 signal of cancerous tissue as compared with normal tissue. This study indicates that the changes in the relative contents of tryptophan and collagen revealed by the S3 spectra may serve as potential native biomarkers for breast and prostate cancer detection. S3 has the potential to become a new tool in the medical armamentarium.


Introduction
Optical spectroscopy, in particular Stokes shift spectroscopy (S3), has the potential to serve as a noninvasive or less invasive alternative to conventional diagnostic methods for cancer detection. Additional advantages of S3 include shorter measurement time, reproducibility, and minimal invasiveness, since no tissue needs to be removed. 1 S3 falls within the field of "optical biopsy," in which spectroscopic techniques are used to diagnose disease without removing a tissue sample, and it points toward new advanced tools for the medical armamentarium.
Human tissue is mainly composed of epithelial cells, proteins, fat, water, and an extracellular matrix of collagen fibers. A number of key fluorophores, namely tryptophan, collagen, elastin, reduced nicotinamide adenine dinucleotide (NADH), and flavin, have been investigated as fingerprint molecules in biomedical optics. 2,3 Tryptophan is an amino acid required by all forms of life for protein synthesis and other important metabolic functions, 4 and it is considered the major contributor to protein fluorescence. NADH and flavin adenine dinucleotide (FAD) are involved in the oxidation of fuel molecules and can be used to probe changes in cellular metabolism. 5 Spectral analysis holds great promise as a tool for diagnosing early-stage carcinomas. 1-3,6,7 Stokes shift spectroscopy offers a novel and simpler way to rapidly recognize the spectral fingerprints of fluorophores in complex mixtures such as tissue. 3,8 Because an S3 measurement captures information on several key fluorophores in one spectrum, a single scan can provide the most critical fingerprints of the main fluorophores that are valuable for cancer detection. 8 Most recently, the S3 technique has received increasing interest for the diagnosis of different types of cancer in human tissues. 8,9

The aim of this paper is to show the usefulness of the S3 technique for distinguishing malignant from normal tissue in human breast and prostate tissue samples. The optimal Stokes shift interval, Δλi, for S3 measurements was investigated and determined. The S3 spectra of the key fluorophores in human prostate and breast tissues were measured and analyzed using the nonnegative least squares (NNLS) method. 10 The underlying physical and biological basis for the S3 approach is discussed. Linear discriminant analysis (LDA) was used to convert the alterations observed in the S3 spectra into diagnostically valuable information 11 for identifying cancerous tissue. Subsequently, the receiver operating characteristic (ROC) curve 11 was generated to further evaluate the performance of the NNLS algorithm combined with LDA for the diagnosis of human breast/prostate cancer. The results show that S3 spectra, by monitoring the changes of the main fluorophores in tissue, can be used as an alternative tool for the detection of breast/prostate cancer.

Methods and Samples
For typical organic molecules, the peaks of absorption and emission occur at different wavelengths. The difference between the emission and absorption peaks is known as the Stokes shift, which depends on the emitting molecule and on the polarity of the surrounding host environment. 6 A spectroscopic method combining fluorescence and absorption was proposed in which spectra are acquired by synchronously scanning the excitation and detection wavelengths with a constant shift interval, Δλi, between them. 6,7 This method was named Stokes shift spectroscopy. 6 Conventional fluorescence spectroscopy uses either a fixed excitation wavelength, λex, to produce an emission spectrum or a fixed emission wavelength, λem, to record an excitation spectrum. In contrast, the S3 signal is recorded while both λem and λex are scanned simultaneously in the synchronous scanning mode of a spectrometer such as the Perkin-Elmer LS 50. In our study, excitation light with a 5 nm spectral width was focused on the samples with a spot size of ∼3 × 1 mm. The power of the incident light was ∼0.5 μW. The scan speed was 250 nm per minute. The fluorescence was collected with a spectral resolution of 0.5 nm.
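To make the synchronous scan concrete, the following minimal sketch extracts an S3-like profile from an excitation-emission grid by reading, for each excitation wavelength, the emission at that wavelength plus a constant offset. The function name `synchronous_spectrum`, the toy grid, and its values are hypothetical and are not taken from the instrument or data used in this study.

```python
def synchronous_spectrum(eem, ex_wavelengths, em_wavelengths, delta):
    """Return (excitation wavelength, intensity) pairs with emission = excitation + delta.

    eem[i][j] is the intensity at excitation ex_wavelengths[i] and
    emission em_wavelengths[j]. Wavelengths are integers in nm.
    """
    em_index = {wl: j for j, wl in enumerate(em_wavelengths)}
    spectrum = []
    for i, ex in enumerate(ex_wavelengths):
        j = em_index.get(ex + delta)
        if j is not None:  # skip points whose shifted emission is off the grid
            spectrum.append((ex, eem[i][j]))
    return spectrum


# Toy 3 x 3 grid (illustrative numbers only).
ex = [280, 300, 320]
em = [320, 340, 360]
eem = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

s3_40 = synchronous_spectrum(eem, ex, em, 40)
s3_20 = synchronous_spectrum(eem, ex, em, 20)
```

With an offset of 40 nm the sketch reads the diagonal of the toy grid; with 20 nm it reads a shifted diagonal and drops the first excitation point, whose shifted emission wavelength is not on the grid.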
To investigate the optimal wavelength shift interval Δλi for cancer detection, the S3 spectra of a mixture of three key fluorophores (tryptophan, NADH, and flavin) in solution were measured with various Δλi values from 20 to 120 nm. The S3 spectra of individual tryptophan, NADH, and flavin in solution, and of collagen in solid suspension, were also measured with Δλi = 40 nm and used for the NNLS analysis of the S3 spectra of the breast and prostate tissues. All fluorophores were obtained commercially from Mallinckrodt Baker, Inc.
Human breast and prostate tissue specimens were obtained from the Cooperative Human Tissue Network (CHTN) and the National Disease Research Interchange (NDRI) under Institutional Review Board (IRB) approval at the City College of New York. CHTN and NDRI provided pathology reports for the tissue specimens. The samples were received on dry ice and were neither chemically treated nor frozen before spectroscopic measurements. The time elapsed between tissue resection and the S3 measurements varied from sample to sample because of the different sample sources, but did not exceed 30 h. Before any S3 measurements were taken, the cancerous tissue samples were carefully examined to identify the hard parts and thereby locate the cancerous regions. Assessment by hardness is acknowledged to be a simple way to locate regions of malignancy. 12

Experimental Results and Analysis
The S3 spectra of five pairs of cancerous (solid line) and normal (dashed line) prostate tissues were measured with Δλi = 40 nm, and their averaged profiles are shown in Fig. 1(a). Figure 1(b) shows the averaged S3 spectra of 19 pairs of cancerous (solid line) and normal (dashed line) breast tissues acquired with Δλi = 40 nm. Each spectral profile was normalized to unity (i.e., the sum of squares of the elements in each S3 spectrum was set to 1) before averaging and further calculation. The salient differences between the S3 spectra of cancerous and normal tissues displayed in Fig. 1(a) and 1(b) are that I_c > I_n at ∼294 or ∼295 nm while I_c < I_n at ∼340 nm, where I_c and I_n are the S3 spectral intensities of the cancerous and normal tissues, respectively. Another obvious difference between cancerous and normal prostate tissues revealed by the S3 spectra in Fig. 1(a) is that the peak intensity I_294 is higher than I_340 for the cancerous prostate tissue, while the order is reversed in the normal prostate tissue. In addition, there is a small but noticeable spectral difference between the cancerous and normal prostate tissues: a higher intensity at the shoulder peak of ∼394 nm was observed in cancerous prostate tissues than in normal prostate tissues. This tiny shoulder peak can also be seen at 385 nm for the cancerous breast tissue by enlarging the spectral profile in the range from 365 to 420 nm, as shown in the inset of Fig. 1(b). The S3 spectral differences may reflect changes of tissue structure during the development of cancer. To understand which biochemical components mainly contribute to these changes, the S3 spectra of the main fluorophores in breast tissue, e.g., tryptophan, collagen, NADH, and flavin, were measured. Figure 2(a) shows the measured S3 spectrum of each individual fluorophore.
The spectrum of collagen was acquired in a solid suspension with Δλi = 40 nm because collagen is not soluble in water. The collagen sample was shaken evenly before the measurements. The S3 spectra of individual tryptophan, NADH, and flavin were measured in aqueous solution. The S3 spectra in Fig. 2(a) can be used to determine which biochemical components mainly contribute to the S3 spectra of the tissue samples. Comparing Fig. 2(a) with Fig. 1(a) and 1(b), it can be seen that the main peak at ∼290 nm in the tissue spectra arises from tryptophan. The small peak at ∼340 nm corresponds to collagen, and the very small peak at ∼380 or ∼390 nm corresponds to NADH. No obvious peak for flavin was observed. Therefore, it may be concluded that the S3 spectra of the tissue samples obtained with Δλi = 40 nm arise mainly from tryptophan, collagen, and NADH.
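The unit normalization applied to each spectrum before averaging (sum of squares equal to 1) can be sketched as follows. This is a minimal illustration with made-up two-point spectra; the helper names `normalize_unit_energy` and `average_spectra` are hypothetical.

```python
import math


def normalize_unit_energy(spectrum):
    """Scale a spectrum so that the sum of squares of its elements equals 1."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in spectrum]


def average_spectra(spectra):
    """Point-by-point mean of spectra sampled on the same wavelength grid."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


# Two toy spectra on a two-point grid (illustrative numbers only).
raw = [[3.0, 4.0], [0.0, 2.0]]
normalized = [normalize_unit_energy(s) for s in raw]
mean_profile = average_spectra(normalized)
```

Normalizing before averaging keeps a single bright sample from dominating the mean profile, so the averaged curves compare spectral shape rather than absolute intensity.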
To investigate the effect of Δλi on the S3 spectra, a mixed aqueous solution of tryptophan, NADH, and flavin with a concentration of ∼0.4 mg/cm³ was measured with different Δλi values. The S3 spectra of the mixture obtained with Δλi = 20 and 40 nm are displayed as solid and dashed lines, respectively, in Fig. 2(b). Figure 2(c) shows the S3 spectra of the same mixture obtained with Δλi = 60, 80, and 100 nm as solid, dashed, and dotted lines, respectively. The spectra obtained with Δλi = 120 and 140 nm are shown as solid and dashed lines, respectively, in Fig. 2(d). All curves were acquired under the same experimental conditions except for the value of Δλi. One salient feature shown in Fig. 2 is that, as Δλi increases, the peak intensities contributed by the three fluorophores first rise and then fall, with the turning point occurring at a different value of Δλi for each fluorophore. Another notable feature shown in Fig. 2(b) to 2(d) is that the full width at half maximum (FWHM) of the S3 spectral profiles of all three fluorophores increases monotonically with Δλi. These two features are valuable for choosing the optimal wavelength shift interval Δλi for an S3 spectral study of the changes of multiple fluorophores in tissue. The FWHM of the S3 spectrum of an individual fluorophore indicates the resolution of the spectral signal: a smaller FWHM corresponds to a better spectral resolution.
In fact, a FWHM-based spectral-resolving power (FSRP) can be defined as:

$$\mathrm{FSRP} = \frac{\lambda_{\mathrm{range}}}{\Delta\lambda_{\mathrm{FWHM}}}, \tag{1}$$

where λrange is the whole wavelength range of the measurements with effective signal, which can be taken as 500 nm in our study, and ΔλFWHM is the FWHM of the S3 spectrum of each fluorophore, which can be obtained from Fig. 2(b) to 2(d). Using Eq. (1) and the values of the FWHM and peak intensities shown in Fig. 2(b) to 2(d), the FSRP and the spectral peak intensities were calculated as functions of Δλi for the three fluorophores of interest, and the results are shown in Fig. 3(a) and 3(b), respectively. Using Fig. 3, one can straightforwardly choose an optimal wavelength shift constant Δλi for acquiring the S3 spectra of interest. In signal processing, the two most important parameters that determine the quality of a signal are its resolution and its magnitude. Figure 3(a) shows that a smaller wavelength shift constant Δλi corresponds to a higher FSRP of the spectral signal. Figure 3(b) shows that (1) when Δλi = 20 nm is used, all three fluorophores have approximately the same, weak peak intensity; and (2) as Δλi increases, the peak intensities of the three fluorophores first rise and then fall at different critical values of Δλi. The curve for tryptophan (squares with a solid line) falls at Δλi = 80 nm, that for flavin (hexagons with a dotted line) drops at Δλi = 60 nm, and that for NADH (circles with a dashed line) decreases at Δλi = 120 nm. These trends may reveal the underlying physical and biological basis of Stokes shift spectroscopy. The S3 spectra record a fluorescence signal, and the fluorescence intensity is closely related to the quantum yield (QY). The optimal value of Δλi in the S3 measurements for each biomolecule is determined by the difference between its peak wavelengths of absorption and emission.
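As an illustration of how an FSRP value could be obtained from a measured profile, the sketch below estimates the FWHM of a single-peaked spectrum by linear interpolation at the half-maximum crossings and then divides the wavelength range by it. It assumes the FSRP is the ratio of λrange to ΔλFWHM, consistent with the definitions given for Eq. (1); the function names and the triangular test profile are hypothetical.

```python
def _crossing(x0, y0, x1, y1, level):
    """Linearly interpolate the x position where the segment crosses `level`."""
    return x0 + (level - y0) * (x1 - x0) / (y1 - y0)


def fwhm(wavelengths, intensities):
    """Full width at half maximum of a single-peaked profile."""
    peak = max(intensities)
    if peak <= 0:
        raise ValueError("profile has no positive peak")
    half = peak / 2.0
    k = intensities.index(peak)

    left = k
    while left > 0 and intensities[left] > half:
        left -= 1
    right = k
    while right < len(intensities) - 1 and intensities[right] > half:
        right += 1
    if intensities[left] > half or intensities[right] > half:
        raise ValueError("profile does not fall to half maximum on both sides")

    lo = _crossing(wavelengths[left], intensities[left],
                   wavelengths[left + 1], intensities[left + 1], half)
    hi = _crossing(wavelengths[right - 1], intensities[right - 1],
                   wavelengths[right], intensities[right], half)
    return hi - lo


def fsrp(lambda_range, fwhm_value):
    """FWHM-based spectral-resolving power, assumed to be lambda_range / FWHM."""
    return lambda_range / fwhm_value


# Triangular toy profile (illustrative numbers only).
wl = [280, 290, 300, 310, 320]
inten = [0.0, 1.0, 2.0, 1.0, 0.0]
width = fwhm(wl, inten)
power = fsrp(500.0, width)
```

For the toy profile the half-maximum crossings fall at 290 and 310 nm, so the width is 20 nm and the resolving power over a 500 nm range is 25.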
When the wavelength shift constant Δλi used in the S3 measurements approaches the Stokes shift interval, Δλss, defined as the difference between the peak wavelengths of absorption, λabs, and emission, λem, the magnitude of the S3 spectral signal approaches its maximum. When Δλi moves away from Δλss, the intensity of the S3 spectra decreases again. To help interpret Fig. 3(b), the S3-related parameters λabs, λem, Δλss, and QY for the main molecules of interest in tissue samples are listed in Table 1.
When the S3 spectra were acquired with Δλi = 20 nm, the shift interval was far from the Stokes shift interval, Δλss, of all three fluorophores, resulting in the smallest signal intensity. As Δλi increases, the signal from tryptophan is boosted drastically because of its higher quantum yield in comparison with NADH and flavin. Since the Δλss values of flavin and tryptophan are ∼70 nm and ∼70 to 80 nm, respectively, their S3 spectral intensities drop at Δλi ≈ 60 to 80 nm, once Δλi exceeds Δλss. For the same reason, the peak intensity of NADH falls at Δλi = 120 nm.
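The dependence of the S3 peak intensity on Δλi can be illustrated with a simple toy model, in which the synchronous signal at excitation wavelength λ is taken to be proportional to the product of an absorption band evaluated at λ and an emission band evaluated at λ + Δλi. The Gaussian band shapes, the band widths, and the 280/350 nm peak positions used below are assumptions chosen for illustration; they are not fitted to the data of this study.

```python
import math


def gaussian(x, center, sigma):
    """Unit-height Gaussian band."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def s3_peak_intensity(delta, lam_abs, lam_em, sigma_abs=15.0, sigma_em=25.0):
    """Maximum over excitation wavelength of absorption(lam) * emission(lam + delta)."""
    grid = range(200, 501)  # excitation wavelengths in 1 nm steps
    return max(
        gaussian(lam, lam_abs, sigma_abs) * gaussian(lam + delta, lam_em, sigma_em)
        for lam in grid
    )


# Hypothetical fluorophore: absorption peak 280 nm, emission peak 350 nm,
# i.e. a Stokes shift of 70 nm.
deltas = list(range(20, 141, 10))
peaks = {d: s3_peak_intensity(d, 280.0, 350.0) for d in deltas}
best_delta = max(peaks, key=peaks.get)
```

In this model the peak intensity is largest when the scan offset equals the Stokes shift of the band pair and falls off on either side, which mirrors the rise and fall of the curves described for Fig. 3(b). The model does not reproduce the broadening of the FWHM with Δλi, since the product of two Gaussians has a width independent of the offset.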
Tryptophan, collagen, NADH, and flavin are key molecules in cancer diagnosis using spectroscopy. 2-4 For breast cancer, the most common grading system used in the United States is the Scarff-Bloom-Richardson (SBR) system, 17 which examines the cells and tissue structures of tumors to determine how aggressive and invasive the cancer is on the basis of three features: (1) the percentage of the tumor in which the tissue structure is altered (in cancer, the tissue structures usually become less orderly); (2) the number of mitotic figures (dividing cells) observed in a microscope field at a given magnification (one of the hallmarks of cancer is that cells divide uncontrollably); and (3) the nonuniformity of the cell nuclei (cancerous cells have larger, more irregular, and darker nuclei than normal breast duct epithelial cells). 17 Each of these features is assigned a score ranging from 1 to 3. The lowest score, 3 (1 + 1 + 1), is given to well-differentiated tumors with the best prognosis, while the highest score, 9 (3 + 3 + 3), is given to poorly differentiated tumors with the worst prognosis. 17 According to the features described by the SBR system, higher cell density, uncontrolled cell division, and larger, nonuniform nuclei are characteristic of cancerous breast cells; changes in the fluorescence of the main fluorophores inside cells (e.g., tryptophan, NADH, and flavin) should therefore be expected. The primary fluorophore in the extracellular matrix of breast tissue is type I collagen. 18 For invasion and subsequent metastasis, tumor cells degrade the surrounding extracellular matrix, which is composed mainly of type I collagen. 18 Understanding these changes during the evolution of breast cancer is critical for revealing the contributions of the biochemical components of tissue to the spectroscopic features.

It can be seen from Fig. 3(a) and 3(b) that if spectral resolution is the only consideration, the optimal Δλi should be chosen as small as possible. On the other hand, to enhance the signal-to-noise ratio (SNR), the optimal Δλi should be chosen as close as possible to Δλss so as to acquire the highest signal magnitude from the fluorophore of interest. In breast cancer detection, the alteration of biochemical components caused by the evolution of the tumor makes it possible to highlight the difference between cancerous and normal tissues in a simple way, using S3 with a suitably selected Δλi. In spectral analysis for cancer detection, the variation in the emission of a fluorophore among tissue specimens and patients makes it unreliable to compare the absolute emission intensities of a given biomarker across samples. Although the fluorescence intensity is proportional to the number of fluorophores, so that the fluorophore contents can in principle be calculated from the measured intensity, calculating the absolute contents requires a complicated calibration algorithm and information about the tissue environment such as pH, polarity, temperature, and viscosity. These requirements make it impossible to compute the absolute contents of the fluorophores by the spectral method. An easy alternative is to find an unchanged component and use it as a reference against which to measure the changes of the fluorophores of interest. For breast cancer detection it is better to use the changes in the relative contents of tryptophan and collagen, because of the evidence for an increase of tryptophan and a decrease of collagen in cancerous tissue. 17,18 Since Δλss = 40 to 50 nm for collagen and Δλss = 70 to 80 nm for tryptophan, Δλi = 40 nm should be chosen as the optimal scan wavelength interval, which balances resolution against SNR.
Furthermore, the S3 spectra acquired with Δλi = 40 nm can be used to investigate the signals arising from tryptophan and collagen in tissue, which vary in opposite directions between cancerous and normal tissues. Therefore, S3 with the selected Δλi = 40 nm can highlight the difference between cancerous and normal tissues through the inverse spectral behavior shown in Fig. 1.
To investigate changes in the relative contents of fluorophores in tissue from the S3 spectra, the NNLS method 10 was applied to extract the relative contents of the fluorophores tryptophan, collagen, and NADH, using their individual S3 spectra obtained with Δλi = 40 nm [Fig. 2(a)] and the S3 spectra of the cancerous and normal breast tissues [Fig. 1(b)]. Figure 4(a) shows the NNLS-extracted relative contents of tryptophan versus collagen, Fig. 4(b) shows the extracted relative contents of collagen versus NADH, and Fig. 4(c) shows the extracted relative contents of tryptophan versus NADH for the cancerous (squares) and normal (circles) breast tissues. The most salient feature of Fig. 4(a) is that all data points for the normal tissues lie to the upper left of those for the cancerous tissues, indicating that the relative contents of collagen are higher, and the relative contents of tryptophan lower, in normal tissues than in cancerous tissues. Figure 4(b) shows that the relative contents of collagen are lower, and the relative contents of NADH higher, in cancerous tissues than in normal tissues. Figure 4(c) provides consistent evidence of an increase in the relative contents of tryptophan and NADH in cancerous breast tissues. Taken together, Fig. 4(a) to 4(c) show that the relative content of collagen is lower in cancerous than in normal breast tissue, whereas the relative contents of tryptophan and NADH are higher. This observation is in good agreement with studies by other groups on cancerous and normal breast cells and tissues. 3,4,17-21 In a diagnostic test for cancer, the test outcome can be positive (cancer) or negative (healthy).
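The NNLS extraction of relative contents described above can be sketched as a nonnegative fit of a tissue spectrum to a linear combination of the component spectra. The solver below uses cyclic coordinate descent on the normal equations, which is a simple stand-in and not necessarily the algorithm of Ref. 10 (active-set solvers of the Lawson-Hanson type are more commonly used). The four-point component spectra and the mixing weights are invented for illustration.

```python
def nnls_coordinate_descent(basis, target, sweeps=2000):
    """Minimize ||sum_j x[j] * basis[j] - target||^2 subject to x[j] >= 0.

    basis  : list of component spectra (lists of equal length)
    target : measured spectrum on the same wavelength grid
    """
    k = len(basis)
    gram = [[sum(a * b for a, b in zip(basis[i], basis[j])) for j in range(k)]
            for i in range(k)]
    corr = [sum(a * b for a, b in zip(basis[i], target)) for i in range(k)]

    x = [0.0] * k
    for _ in range(sweeps):
        for j in range(k):
            if gram[j][j] == 0.0:
                continue  # an all-zero component cannot contribute
            gradient = sum(gram[j][i] * x[i] for i in range(k)) - corr[j]
            # Exact minimization along coordinate j, clipped at zero.
            x[j] = max(0.0, x[j] - gradient / gram[j][j])
    return x


# Hypothetical component spectra on a 4-point grid.
tryptophan = [1.0, 0.2, 0.0, 0.0]
collagen = [0.1, 1.0, 0.3, 0.0]
nadh = [0.0, 0.1, 1.0, 0.4]
components = [tryptophan, collagen, nadh]

# Synthetic "tissue" spectrum built from known weights.
true_weights = [0.6, 0.3, 0.1]
tissue = [sum(w * c[i] for w, c in zip(true_weights, components))
          for i in range(4)]

weights = nnls_coordinate_descent(components, tissue)
total = sum(weights)
relative_contents = [w / total for w in weights]
```

Dividing the fitted weights by their sum gives relative contents, which is the quantity plotted pairwise in the scatter plots; absolute contents are not recovered by this procedure.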
LDA is a method for finding a linear combination of features that separates two or more classes of objects or events, 11 which is very useful for two-group classification. To evaluate the potential of a diagnostic method, the following statistical terms are usually used: (1) true positive: a cancerous sample correctly diagnosed as malignant; (2) false positive: a healthy sample incorrectly identified as malignant; (3) true negative: a healthy sample correctly identified as healthy; and (4) false negative: a cancerous sample incorrectly identified as healthy.
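The four outcome categories listed above can be tallied with a few lines of code. In this minimal sketch, `True` stands for cancerous (positive) and `False` for healthy (negative); the function name and the example labels are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for a two-group test."""
    tp = fp = tn = fn = 0
    for is_cancer, called_cancer in zip(actual, predicted):
        if is_cancer and called_cancer:
            tp += 1  # cancerous sample correctly diagnosed as malignant
        elif not is_cancer and called_cancer:
            fp += 1  # healthy sample incorrectly identified as malignant
        elif not is_cancer and not called_cancer:
            tn += 1  # healthy sample correctly identified as healthy
        else:
            fn += 1  # cancerous sample incorrectly identified as healthy
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Illustrative labels only.
actual = [True, True, True, False, False, False]
predicted = [True, True, False, False, False, True]
counts = confusion_counts(actual, predicted)
```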
The sensitivity and specificity can then be calculated by:

$$\mathrm{sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{specificity} = \frac{TN}{TN + FP},$$

where TP, FP, TN, and FN are the numbers of true positives, false positives, true negatives, and false negatives, respectively. The criteria for assigning samples to the true or false positive and negative groups in our study were determined by the LDA model. The separating lines on the scatter plots shown in Fig. 4(a) to 4(c) were determined by the LDA model for the three diagnostically significant fluorophores, namely tryptophan, collagen, and NADH. To evaluate the S3 spectral analysis combined with LDA as a criterion for effective diagnostic algorithms for breast tissue classification, the sensitivity and specificity of our diagnostic results were calculated and are summarized in Table 2.
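A separating line of the kind drawn on the scatter plots, together with the resulting sensitivity and specificity, can be sketched with a two-class Fisher discriminant in two dimensions. The sketch assumes equal priors and places the threshold at the midpoint of the projected class means; whether this matches the exact LDA variant used in this study is an assumption. The data points, given as (tryptophan, collagen) relative contents, are invented for illustration and are not the measured values.

```python
def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, mu):
    """Within-class scatter terms (sxx, sxy, syy) about the class mean."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_lda(cancer, normal):
    """Return (w, threshold); a point x is called cancerous if w.x > threshold."""
    mc, mn = mean2(cancer), mean2(normal)
    a1, b1, d1 = scatter2(cancer, mc)
    a2, b2, d2 = scatter2(normal, mn)
    a, b, d = a1 + a2, b1 + b2, d1 + d2
    det = a * d - b * b
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    dx, dy = mc[0] - mn[0], mc[1] - mn[1]
    # w = Sw^-1 (mean_cancer - mean_normal), using the 2 x 2 inverse.
    w = ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)
    mid = ((mc[0] + mn[0]) / 2, (mc[1] + mn[1]) / 2)
    return w, w[0] * mid[0] + w[1] * mid[1]


def is_cancer(point, w, threshold):
    return w[0] * point[0] + w[1] * point[1] > threshold


def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


# Hypothetical (tryptophan, collagen) relative contents.
cancer = [(0.70, 0.20), (0.65, 0.25), (0.75, 0.15), (0.60, 0.22)]
normal = [(0.45, 0.45), (0.50, 0.40), (0.40, 0.50), (0.48, 0.38)]

w, threshold = fit_lda(cancer, normal)
tp = sum(1 for p in cancer if is_cancer(p, w, threshold))
fn = len(cancer) - tp
fp = sum(1 for p in normal if is_cancer(p, w, threshold))
tn = len(normal) - fp
sens, spec = sensitivity(tp, fn), specificity(tn, fp)
```

The toy classes are well separated, so every point falls on the correct side of the line. Evaluating on the same points used for fitting, as done here for brevity, generally gives optimistic values; a held-out or cross-validated estimate would be preferable for real data.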
The performance of a two-group classifier is typically evaluated with an ROC curve, which is a plot of the true positive rate versus the false positive rate. 11 Figures 4(d), 4(e), and 4(f) were generated from the scatter plots shown in Fig. 4(a), 4(b), and 4(c), respectively, to assess the correct and incorrect classification of cancerous and normal breast tissues using the different pairs of diagnostically significant fluorophores. Figure 4(d) shows the ROC curve generated from tryptophan versus collagen, Fig. 4(e) shows the ROC curve generated from collagen versus NADH, and Fig. 4(f) shows the ROC curve generated from tryptophan versus NADH. The area under the curve (AUC) calculated from the ROC curves shown in Fig. 4(d), 4(e), and 4(f) is 0.956, 0.97, and 0.964, respectively, demonstrating the excellent efficacy of the S3 spectra acquired with the selected Δλi = 40 nm, together with NNLS analysis and LDA, as a promising diagnostic tool for breast cancer.
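The construction of an ROC curve and its AUC can be sketched by sweeping a threshold over a set of classifier scores and integrating the resulting curve with the trapezoidal rule. The scores and labels below are invented; in an analysis like the one above, the scores would presumably be the LDA projections of the samples, which is an assumption about the workflow.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) pairs for descending thresholds."""
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, label in zip(scores, labels) if label and s >= threshold)
        fp = sum(1 for s, label in zip(scores, labels) if not label and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Illustrative scores (higher means more likely cancerous) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [True, True, False, True, False, False]

curve = roc_points(scores, labels)
area = auc(curve)
```

For this toy data one negative sample outranks one positive sample, so 8 of the 9 positive-negative pairs are ordered correctly and the area is 8/9.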

Discussion
These initial results demonstrate the efficacy of S3 spectra combined with the NNLS and LDA methods for cancer detection in human breast and prostate tissue samples. It was also found that tryptophan, collagen, and NADH can be used as potential biomarkers for cancer detection using S3 with the selected optimal Δλi = 40 nm.
The conventional spectral approaches applied in tissue optics are absorption or optical density (OD), fluorescence (emission and/or excitation), and excitation-emission matrix (EEM) measurements. In tissue absorption measurements, only a few chromophores reach a detectable level. Fluorescence spectroscopy, whether emission or excitation, can detect low-level fluorophores, but because the pump or detection wavelength is fixed, the strongest emission or excitation signal can be acquired for only one or two fluorophores. In addition, the emission signals of most fluorophores have a very wide FWHM. Table 3 compares the FWHMs of the S3 spectra obtained with Δλi = 40 nm with those of conventional emission spectra for the four main fluorophores of interest: tryptophan, collagen, NADH, and flavin. 3,6,22 It can be seen from Table 3 that conventional emission spectral measurements provide poorer resolution and much less information than S3.
Although EEM can be used to ensure coverage of all endogenous fluorophores, the data acquisition is extremely time consuming and thus not suitable for clinical use. Furthermore, the redundant information in an EEM conceals the alterations of the spectral fingerprints between cancerous and benign breast tissues. In contrast, S3 spectral measurements can acquire sufficient information on multiple key fluorophores at very low content and reach relatively high resolution with a single scan (as compared with EEM). The S3 approach can dramatically reduce the data acquisition time while maintaining reasonably high classification accuracy. The most obvious advantage of S3 over conventional spectral methods is that it gives a simple and efficient way to highlight the differences between normal and diseased tissues, making them more robust, evident, and pronounced.
The S3 spectra for cancer detection have also been investigated for types of tissue other than breast, and consistent results were observed. For example, similar biomarker changes between cancerous and normal prostate tissue were observed, as shown in Fig. 1(a): the S3 peaks at ∼290 nm for tryptophan and ∼390 nm for NADH are higher, and the peak at ∼340 nm for collagen is lower, in the cancerous prostate tissues. This indicates a reduced contribution from the emission of collagen and increased contributions from tryptophan and NADH in cancerous prostate tissues as compared with normal tissue. Similar changes were also observed in lung cancer tissues in a limited number of samples. Further investigation is needed on a larger number of specimens of different types of cancerous tissue.
Histological imaging analysis, carried out in collaboration with pathology experts, is needed to confirm our spectral study for the diagnosis of breast/prostate cancer. Stained histological slides of the normal and cancerous breast/prostate tissues, corresponding to the pairs of cancerous and normal tissues used for the spectral measurements, will be needed for a microscopy study of the presence of the spectral biomarkers tryptophan, collagen, and NADH as a function of the type and stage of cancer. Specifically, this would involve the SBR score of breast cancer, which is used to determine the aggressiveness and invasiveness of the cancer, and the Gleason grade of prostate cancer, which is used to evaluate the prognosis of men with prostate cancer and the aggressiveness of the disease.

Conclusion
The Stokes shift spectroscopy method offers a robust and efficient way to rapidly measure the spectral fingerprints of fluorophores in complex media such as tissue and highlights the difference between cancerous and normal tissues. This paper describes the underlying basis for how and why this spectral method compares favorably with absorption and fluorescence spectroscopic methods. It also specifies how to select the optimal Δλi for breast cancer detection. This study demonstrates that the S3 method can be used to acquire information on different key fluorophores from one S3 spectrum and to investigate the changes in the relative contents of the key fluorophores in breast/prostate tissues during the development of cancer. The great advantage of S3 is that far fewer data points are required while many more fluorophores can be detected. S3 can be a new optical tool in the medical armamentarium.

Table 2. The sensitivity and specificity of the S3, NNLS, and LDA analysis for breast cancer detection.