Visible and near-infrared spectroscopy for distinguishing malignant tumor tissue from benign tumor and normal breast tissues in vitro

Abstract. The high incidence and mortality of breast cancer requires an effective, rapid, and cost-effective method for its diagnosis. Here, visible and near-infrared spectroscopy in the wavelength range of 400 to 2200 nm is utilized for distinguishing the malignant tumor tissue from benign tumor and normal breast tissues. Based on the absorption and scattering spectra of fixed samples, three spectral analysis methods are proposed which include an absorption spectral analysis, a scattering spectral analysis, and a combined spectral analysis of the two. By comparison with the histopathological examination, the sensitivity, specificity, and accuracy of the three analysis methods are calculated. The results showed that the combined spectral analysis method can significantly enhance the effectiveness when compared with the sole absorption or scattering spectral analysis method. The sensitivity, specificity, and accuracy of the combined spectral analysis method are 100%, 87.82%, and 87.50% for the benign tumor tissue and 81.82%, 100%, and 87.5% for malignant tumor tissue, respectively. All of the three values are 100% for normal breast tissue. This study demonstrates that the combined spectral analysis method has better potential for in vitro optical diagnosis for breast lesions.


Introduction
Breast cancer is one of the most common malignant tumors in women. Its incidence rate has grown rapidly during the past 10 years. Almost 1.3 million women suffer from breast cancer each year, and the mortality of breast cancer is up to 40%. 1,2 To reduce the mortality, an accurate and efficient method for early screening of breast cancer is required. Mammography is a common screening tool for breast cancer. However, it is not sensitive to cancerous lesions in radiologically dense breasts, which may lead to a significant number of false-positive reports. 3 Diffuse optical techniques based on frequency-domain, [4][5][6][7][8] time-resolved, 9,10 continuous-wave, 11 or combined 12,13 measurements have been used to measure the concentrations of oxy- (HbO2) and deoxy- (HbR) hemoglobin in order to assess the metabolic state of normal or lesioned breast tissue, and have shown some potential for noninvasive breast cancer screening.
For some patients, surgery is often unavoidable, but as many as 20% to 70% of those undergoing breast-conserving surgery require repeat surgeries due to a positive surgical margin diagnosed postoperatively. 14 Until now, histopathological examination has been regarded as the gold standard for the diagnosis of breast lesions based on surgical operation; it can distinguish malignant tumor from benign tumor and normal breast tissue according to the changes in microscopic morphology and composition of breast tissues revealed by staining methods. 15 However, this method is time consuming. Usually, patients have to wait three to seven days to learn the exact nature of the tumor before the surgeons perform mastectomy or lumpectomy. 16,17 [20][21] Tissue optical properties, i.e., absorption, scattering, and reduced scattering coefficients, can vary with the degree of tissue lesion severity, 22 and therefore have the potential to aid in the assessment of breast pathology in vitro. 23,24 For instance, the absorption coefficient reflects the molecular composition of tissues, and its changes at specific wavelengths could serve as a spectral fingerprint of molecular change for diagnostic purposes. 25 The scattering coefficient or reduced scattering coefficient depends on the size, morphology, and structure of the components in tissues. The components of breast tissue include blood, 5,12 water, 6,7 lipids, 5,6,8,10 and collagen, 9,10 among others. Variations in the components of lesion tissue affect absorption and scattering properties, thus providing a means for characterizing the pathological change of tissue. 11 Previous investigations of tissue optics mainly focused on the visible and near-infrared (VIS-NIR) region between 400 and 1000 nm, [5][6][7][8] and some extended to 1600 nm. 22 It should be noted that the characteristic information of lipids and water is more significant in the wavelength range of 1400 to 2000 nm, 26 which may be more suitable for breast cancer diagnosis.
The purpose of this work is to develop a rapid and effective method for diagnosing breast lesions by VIS-NIR spectral analysis of fixed tissue samples. A commercially available spectrophotometer with an integrating sphere was used to measure the reflectance and transmittance of the samples. Based on the absorption and scattering spectra of normal and abnormal breast tissues in the wavelength range of 400 to 2200 nm, an absorption spectral analysis method and a scattering spectral analysis method were proposed, using a self-defined absorption spectrum variation factor A and a wavelength exponent c of scattering, respectively. Further, a combined spectral analysis of the two parameters was developed. By comparing the results of the spectral analysis methods with histopathological examination, the sensitivity, specificity, and accuracy were calculated to evaluate the effectiveness of the methods.

Experimental System
To obtain the optical properties of tissue, a spectrophotometer (Lambda 950, PerkinElmer, Waltham, Massachusetts) with an integrating sphere was used to measure the transmittance and reflectance spectra of the samples in the range of 400 to 2200 nm at 10-nm intervals.
Here, the diameter of the integrating sphere is 150 mm, and the diameter of the sample port is 25.4 mm. 8][29] Therefore, in this work, a single measurement of the transmittance spectrum and a single measurement of the reflectance spectrum were performed for each sample.

Sample Preparation
All samples were obtained from 28 patients during surgery at the Affiliated Tongji Hospital, Huazhong University of Science and Technology. According to the histopathological report, the samples were divided into three groups: malignant breast tumor tissue (n = 11), benign breast tumor tissue (n = 5), and normal breast tissue (n = 16). Normal breast tissue samples were obtained from the safe margins of tumors without obvious tumor characteristics, which was confirmed by histopathology. After the residual blood on the tissue surface had been removed with saline, the fresh breast tissues were fixed immediately in 10% formaldehyde solution to preserve their nanoarchitecture. 30,31 Fixed tissues were stored in a dark room at 4°C, then warmed to 25°C, and sliced into tissue sections before measurement. The size of each sample was approximately 30 × 30 mm² to ensure that it was larger than the sample port (25.4 × 25.4 mm²) of the integrating sphere. Each sample was sandwiched between two glass slides, each 1 mm thick, and its thickness was measured with a caliper at four different points and averaged. The average thickness of the samples was 1.1 ± 0.34 mm.

Data Analysis
Based on the measured sample thickness and the transmittance and reflectance spectra in the range of 400 to 2200 nm, the corresponding spectra of the absorption coefficient and the reduced scattering coefficient were calculated by the inverse adding-doubling (IAD) method. [29],34,35 To find effective optical markers that characterize different types of breast tissues, further analysis was performed as described in the following sections.

Absorption and scattering spectral analysis methods
Since there are typical absorption peaks at 1450 nm for water and 1720 nm for lipid, 26 a new parameter, namely the absorption spectrum variation factor A, was defined as follows: Here, μa(1450 nm) and μa(1720 nm) are the absorption coefficients of breast tissue at 1450 and 1720 nm, respectively.
In general, the reduced scattering coefficient of tissue includes the total contribution of Rayleigh and Mie scattering, which can be described with the following equation: 26 Here, a_Ray and b_Mie indicate the contributions of the Rayleigh and the Mie scattering, respectively, λ is the wavelength, and c is the wavelength exponent. Equation (2) was used to fit the reduced scattering spectra of tissue in the wavelength range of 400 to 2200 nm, and the optimal parameters were obtained by least-squares regression.
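The least-squares fit described above can be sketched as follows. This is only an illustration, not the authors' fitting code: Eq. (2) is not reproduced in this text, so the sketch assumes the commonly used form μs'(λ) = a_Ray·λ^(-4) + b_Mie·λ^c, which is consistent with the symbol definitions given here. The function names, the grid range for c, the use of micrometers for λ, and the synthetic spectrum are all hypothetical. For a fixed c the model is linear in a_Ray and b_Mie, so the sketch scans c over a grid and solves the two linear coefficients in closed form.

```python
# Sketch: least-squares fit of an ASSUMED model
#   mu_s'(lambda) = a_ray * lambda**-4 + b_mie * lambda**c
# All names, the grid for c, and the synthetic data are hypothetical.

def fit_linear_terms(wavelengths_um, mus_reduced, c):
    """For a fixed exponent c, solve a_ray and b_mie by the 2x2 normal equations."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for lam, y in zip(wavelengths_um, mus_reduced):
        x1 = lam ** -4.0          # Rayleigh basis function
        x2 = lam ** c             # Mie basis function
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * y
        t2 += x2 * y
    det = s11 * s22 - s12 * s12
    a_ray = (t1 * s22 - t2 * s12) / det
    b_mie = (s11 * t2 - s12 * t1) / det
    # Sum of squared residuals for this choice of c
    sse = sum((y - a_ray * lam ** -4.0 - b_mie * lam ** c) ** 2
              for lam, y in zip(wavelengths_um, mus_reduced))
    return a_ray, b_mie, sse

def fit_scattering(wavelengths_um, mus_reduced, c_min=-3.0, c_max=0.0, step=0.005):
    """Scan c over a grid and keep the parameters with the smallest residual."""
    best = None
    n_steps = int(round((c_max - c_min) / step)) + 1
    for k in range(n_steps):
        c = c_min + k * step
        a_ray, b_mie, sse = fit_linear_terms(wavelengths_um, mus_reduced, c)
        if best is None or sse < best[3]:
            best = (a_ray, b_mie, c, sse)
    return best[0], best[1], best[2]

# Synthetic, noise-free spectrum from 0.40 to 2.20 um in 0.01 um (10 nm) steps
wavelengths_um = [0.4 + 0.01 * i for i in range(181)]
true_a, true_b, true_c = 1.0, 10.0, -0.8
spectrum = [true_a * lam ** -4.0 + true_b * lam ** true_c for lam in wavelengths_um]

a_fit, b_fit, c_fit = fit_scattering(wavelengths_um, spectrum)
print(a_fit, b_fit, c_fit)
```

The grid deliberately excludes c = -4, where the two basis functions coincide and the normal equations become singular. The sketch also does not constrain a_Ray and b_Mie to be non-negative, which a production fit would probably want.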
To further improve the effectiveness of the spectral analysis methods for distinguishing malignant tumor tissue from benign tumor and normal human breast tissues, a combined analysis of the above two parameters was performed as follows. First, the absorption spectrum variation factor A was used to distinguish the normal samples from the diseased breast tissues. Second, the wavelength exponent c was applied to classify the diseased samples into two categories, benign and malignant tumors.
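The two-step decision rule can be sketched as a short function. The threshold values (A < 0.058 for normal tissue, c > −0.829 for benign tumor) are the ones reported in the Results of this paper for this particular sample set; the function name, constant names, and example inputs are hypothetical.

```python
# Sketch of the combined (A & c) decision rule.
# Thresholds are those reported in this paper's Results; names are hypothetical.

A_THRESHOLD = 0.058    # below this value of A, the sample is called normal
C_THRESHOLD = -0.829   # above this value of c, a diseased sample is called benign

def classify_sample(a_factor, c_exponent):
    """Return 'normal', 'benign', or 'malignant' for one sample."""
    # Step 1: the absorption factor A separates normal from diseased tissue.
    if a_factor < A_THRESHOLD:
        return "normal"
    # Step 2: the wavelength exponent c separates benign from malignant tumor.
    if c_exponent > C_THRESHOLD:
        return "benign"
    return "malignant"

# Hypothetical inputs, chosen only to exercise each branch
labels = [
    classify_sample(0.02, -0.50),
    classify_sample(0.30, -0.50),
    classify_sample(0.30, -1.20),
]
print(labels)
```

Note that c is only consulted for samples already flagged as diseased, so a normal sample with an unusual scattering exponent is not reclassified by the second step.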

Statistical analysis
A one-factor analysis of variance (ANOVA) was applied in SPSS 16.0 to test for significant differences in the absorption spectrum variation factor A or the wavelength exponent c among the different types of breast tissues. Differences were considered significant if P < 0.05 and extremely significant if P < 0.01.
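For readers unfamiliar with the test, the F statistic behind a one-factor ANOVA can be sketched in a few lines. This is not the SPSS procedure used in the study; the function name and the three small groups of numbers are hypothetical and are not measurements from this work. The P value additionally requires the F distribution, which the sketch does not compute.

```python
# Sketch: F statistic of a one-factor (one-way) ANOVA.
# The groups below are hypothetical numbers, not data from this study.

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times squared offset of group mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared offsets from each group's own mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_between, df_within = one_way_anova_f(groups)
print(f_value, df_between, df_within)
```

A large F means that the group means differ by more than the scatter within the groups would explain.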

Effectiveness analysis
The effectiveness of the VIS-NIR spectral analysis methods is determined by comparing the results with histological examination. Based on these comparisons, three quantitative values (sensitivity, specificity, and accuracy) can be calculated to provide an objective comparison of the various spectral analysis methods. Calculation of the three quantities depends on classifying a particular analysis as a true positive (TP), which represents classification by both techniques as lesion; a false positive (FP), which corresponds to a classification as lesion by spectroscopy and as normal by histology; a false negative (FN), which corresponds to a normal classification by spectroscopy and a lesion classification by histology; or a true negative (TN), which corresponds to classification by both techniques as normal. Once the appropriate classifications have been performed, the sensitivity, specificity, and accuracy for discrimination of benign and malignant tumors are calculated as follows: 15

\[
\begin{cases}
\text{Sensitivity} = \dfrac{TP}{TP + FN} \\[6pt]
\text{Specificity} = \dfrac{TN}{TN + FP} \\[6pt]
\text{Accuracy} = \dfrac{TP + TN}{TP + FN + TN + FP}
\end{cases}
\tag{3}
\]

Results
Table 1 summarizes the mean absorption coefficients at 1190, 1450, 1720, and 1930 nm for the different types of breast tissues and the mean absorption spectrum variation factor A according to Eq. (1). The statistical analysis shows that there are significant differences in the absorption peaks at 1450, 1720, and 1930 nm between normal and diseased breast tissues, but not at 1190 nm, and no significant differences in the peaks at any of the four wavelengths between malignant and benign breast tissues. In contrast, the absorption spectrum variation factor A defined in this work shows an extremely significant difference between normal and diseased breast tissues and a significant difference between benign and malignant breast tissues.

Scattering Spectra of Normal and Diseased Human Breast Tissues
Figure 2(a)-2(c) shows that the scattering spectra of all the samples are very similar for the same type of breast tissue but different for different types of tissues. At visible wavelengths, the reduced scattering coefficient of normal breast tissue samples is much smaller than that of diseased tissues, and the values for malignant tumor tissue are the largest. In addition, there are two scattering peaks at 1450 and 1940 nm for diseased tissues.
The mean ± standard error of the wavelength exponent c for the different types of breast tissues was calculated, and a one-factor ANOVA was performed to test the differences among the three types of breast tissues. The results are summarized in Table 2. Significant differences exist not only between the normal and diseased breast tissues, but also between benign and malignant tumor tissues.

Effectiveness of Spectral Analysis Methods
Figure 3 shows the distribution of absorption spectrum variation factor A and wavelength exponent value c of scattering for all of the samples.It can be found that it is easy to distinguish the normal and diseased breast because the normal samples are mainly localized in areas I and II, but difficult to distinguish the benign and malignant breast tissues because there is overlap in areas III and IV.
The combined analysis method (A & c) of the absorption and scattering spectral analyses was performed to distinguish malignant tumor, benign tumor, and normal human breast samples from each other. First, when the value of A was less than 0.058 (median of limit error of A for normal and benign tumor, 99% confidence interval), the sample was defined as normal; otherwise, the sample was defined as diseased tissue. Second, the wavelength exponent c was applied to classify the diseased samples as benign or malignant tumor. According to the distribution of the wavelength exponent c of scattering, the sample was defined as a benign tumor when the value of c was greater than −0.829 (median of limit error of c for benign and malignant tumor, 99% confidence interval); otherwise, the sample was defined as a malignant tumor. To evaluate the effectiveness of the absorption, scattering, and combined spectral analysis methods, the spectroscopic classification of each sample was assessed as a TP, FN, TN, or FP by comparing the calculated value of the optical parameters (A, c, or A & c) with the result of histopathological examination. The sensitivity, specificity, and accuracy were calculated by using Eq. (3). 15 The results are summarized in Table 3. Among the three spectral analysis methods, the combined spectral analysis method has the highest sensitivity and accuracy for discrimination of benign breast tumor tissue, reaching 100% and 87.50%, respectively. Its specificity is slightly lower than that of the absorption spectral analysis, but higher than that of the scattering spectral analysis. For discrimination of malignant breast tumor tissue, the specificity of the combined analysis method is 100%, which is higher than those of the two other methods; the sensitivity is 81.82%, which equals the higher of the two sole analysis methods; and the accuracy is the same as that of the absorption spectral analysis, which is slightly lower than that of the scattering spectral analysis method.

From the above measurements, it was found that there are differences in the absorption and scattering spectra among the normal, benign, and malignant breast tissues. For instance, there is a characteristic peak at 1720 nm for normal breast tissue, while the peak almost disappears for diseased tissues. This should be due to changes in the stroma of breast tissue. 24 It is well known that a large number of fat cells exist in normal breast tissue, but only a few in benign breast tumor tissue, and even fewer in malignant breast tumor tissue. Instead, there is a large amount of collagenous stroma in lesion breast tissue. 24,36 As the main component of fat, lipid has a characteristic absorption peak at 1720 nm, 26 so the changes in the absorption peak at 1720 nm reflect the change in fat cell content in breast tissue.
From the viewpoint of histopathology, the water content of a fat cell is much lower than that of a fibrocyte of tumor stroma. 37 Therefore, water is regarded as an important indicator of breast lesion. Chung et al. found an increase of water content in diseased breast tissues with diffuse optical spectroscopy. 7 It is reported that in the wavelength range of 1100 to 2000 nm, water has three main absorption peaks at 1190, 1450, and 1930 nm. 26 Compared to the absorption coefficient magnitudes at 1450 and 1930 nm, the value at 1190 nm is too small to show an obvious peak. This explains why there is no significant difference at 1190 nm between normal and diseased breast tissues. The latter two peaks at 1450 and 1930 nm show significant differences between normal and diseased breast tissues, but they were still unable to separate the benign and malignant tumors. Although the absorption peak of breast tissue at 1930 nm is higher than the one at 1450 nm, the absorption coefficient measurement is noisier when the wavelength is longer than 1880 nm. This may be caused by different types of water in tissues. Water usually has two states: free water and bound water. Free water has three absorption peaks at 1892, 1906, and 1924 nm, while bound water has two absorption peaks at 1909 and 1927 nm. 38 In addition, the reflectance of the integrating sphere wall decreases with increasing wavelength, which increases the variation of measurements. Therefore, by considering the absorption characteristics of water at 1450 nm and lipid at 1720 nm, the defined absorption spectrum variation factor A makes it possible to characterize different types of breast tissues. In future work, the complex band centered at 1930 nm should be analyzed more precisely because it contains potential information about free and bound water, 38 which could be additional intrinsic markers of tissue malignancy. 7 Moreover, the content of each component, even for healthy breast tissue, is not identical for different persons or different ages, 11 so the absorption spectral analysis alone is insufficient to accurately characterize the type of breast tissue.
Previous investigations showed that the content of collagen has potential implications for the assessment of breast density and cancer risk, 7 which may be used to describe the changes of nanoarchitecture in the stroma among the three types of breast tissues according to the scattering spectra. The results show that there is an obvious difference in scattering spectra between normal and lesion breast tissues in the visible range, i.e., the reduced scattering coefficient of normal breast tissue is much smaller than that of the diseased tissues, and the value for malignant tumor is the largest, but there is no evident difference in the range of 1000 to 2200 nm. This can be interpreted as follows: the collagenous stroma in benign tumor and the hyperplasia of fibrous connective tissue in malignant tumor increase the Mie scattering of tissue. Lipid, an important source of Mie scattering, disappears with the lesion of breast tissue, which decreases the scattering. The increase of collagenous stroma and the decrease of lipid thus make opposite contributions to the scattering of lesion breast tissues at the longer wavelengths. However, both fibrous connective tissue and collagenous stromal tissue consist of microfibrils with a period of 64 nm, which results in strong Rayleigh scattering. 39,40 Therefore, for visible light, the reduced scattering coefficient of lesion breast tissue is much higher than that of normal breast tissue. Because there is extensive hyperplasia of dense fibrous connective tissue in malignant tumor, its Rayleigh scattering is stronger than that in benign tumor. In this study, the introduced wavelength exponent c, which depends on the size of the scatterers, can be used to describe the scattering characteristics of different types of breast tissues. There are two scattering peaks at 1450 and 1930 nm for the benign and malignant breast tissues, which should be due to cross-talk between the calculated reduced scattering coefficient and the absorption coefficient. Since strong absorption has been shown to raise the value of the reduced scattering coefficient calculated by the IAD algorithm, 26,34,35 the scattering spectral analysis neglects these peaks.
To evaluate the effectiveness of a diagnostic technique, statistical analysis is commonly applied to test for significant differences between normal and lesion tissues based on many samples. For a single test sample, clinicians are more concerned about the sensitivity, specificity, and accuracy of the technique. In this work, not only were the two self-defined parameters tested statistically, but the effectiveness of the various spectral analysis methods was also evaluated. Taking the above results into consideration, the combined analysis of the absorption and scattering spectra is superior to the sole absorption or scattering spectral analysis method.
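As a worked check of the three effectiveness measures, the sketch below evaluates the standard definitions of sensitivity, specificity, and accuracy for one set of counts. The counts (TP = 9, FN = 2, TN = 5, FP = 0) are an assumption: they are inferred as one set consistent with the percentages reported for malignant tumor with the combined method, taking the 16 diseased samples (11 malignant, 5 benign) as the test set, and are not read from Table 3. The function name is hypothetical.

```python
# Sketch: sensitivity, specificity, and accuracy from confusion counts.
# The counts are INFERRED to match the reported malignant-tumor percentages
# (assumption: 11 malignant and 5 benign samples form the test set).

def effectiveness(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)                 # lesions correctly detected
    specificity = tn / (tn + fp)                 # non-lesions correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct calls
    return sensitivity, specificity, accuracy

sens, spec, acc = effectiveness(tp=9, fn=2, tn=5, fp=0)
print(f"sensitivity = {100 * sens:.2f}%")
print(f"specificity = {100 * spec:.2f}%")
print(f"accuracy    = {100 * acc:.2f}%")
```

With so few samples, a single reclassified sample shifts the sensitivity by about nine percentage points, which is worth keeping in mind when comparing the methods.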
With the development of various optical techniques, optical biopsy shows potential in the diagnosis of breast tumor, 41,42 which depends strongly on the accuracy of the tissue optical properties. In vivo measurements provide the most powerful evidence for disease diagnosis, yet still suffer from limited information on tissue optical properties. Fixed tissue differs from the in vivo condition. For instance, the information on blood oxygen saturation is lost with this method, yet some important information, such as the content of water and lipid, is kept. In addition, the morphology of the tissue is preserved, and fixation is also the first step of histology. VIS-NIR spectroscopy of fixed tissues demonstrates some ability to distinguish malignant tumor tissue from benign tumor and normal human breast tissues, which will provide a useful reference for optical biopsy.
Until now, surgeons have had to wait for the results of histopathological diagnosis before deciding to perform a mastectomy. This usually takes several days, because the time-consuming process includes about 50 steps. For instance, tissue fixation takes 24 h and dehydration takes 15 h. The tissue block is then sliced with a microtome, the slices are heated to fix them on microscope slides, and they are then stained with hematoxylin-eosin. If immunohistochemical staining is used, the patient needs to wait even longer. Before microscopic examination, the slice must be sealed. 21 In addition, the microscopic examination depends on the judgment of the pathologist, which is affected by subjective factors and experience. In contrast, the proposed spectral analysis method is based on the integrating sphere technique and the IAD algorithm, which provide objective values of the absorption coefficient and reduced scattering coefficient of fixed tissues. As an in vitro diagnosis technique, this optical method is more time efficient than the histopathological examination.
It should be noted that the current study has some limitations. For the integrating sphere technique, the excised tissue block must be larger than the sample port of the integrating sphere. Fortunately, it is not difficult to obtain sufficiently large breast tissue samples. In addition, the accuracy of the optical properties of a sample depends on the homogeneity of the sample. Furthermore, the number of samples in this work is relatively low, which may influence the ranges of the optical parameters (A, c, and A & c) used for distinguishing malignant tumor tissue from benign tumor and normal breast tissues. Future studies will be necessary to further evaluate the sensitivity, specificity, and accuracy of the developed spectral analysis method by testing large numbers of samples.

Conclusions
Based on measurements of the VIS-NIR spectra of fixed samples in vitro, a new combined spectral analysis method using absorption and scattering parameters was proposed to distinguish malignant tumor tissue from benign tumor and normal human breast tissues. By comparison with histopathological examination, the effectiveness of the method was evaluated by calculating the sensitivity, specificity, and accuracy. The results show that the sensitivity, specificity, and accuracy reach 100% for normal breast tissue; 100%, 87.82%, and 87.50%, respectively, for benign breast tumor; and 81.82%, 100%, and 87.5%, respectively, for malignant breast tumor. The combined spectral analysis method can significantly enhance the effectiveness compared with the sole absorption or scattering spectral analysis method. This method is effective, simple, rapid, and cost effective, and may become a helpful alternative method for in vitro diagnosis of breast lesions.

Figure 1(a)-1(c) shows the absorption spectra of normal breast tissue samples (n = 16), benign tumor tissue samples (n = 5), and malignant tumor tissue samples (n = 11) in the range of 400 to 2200 nm. The absorption spectra are consistent within the same type of breast tissue, but differ between different types of tissues. Figure 1(a) shows that there are four absorption peaks at 1190, 1450, 1720, and 1930 nm for normal breast tissue. The strongest absorption peak occurs at 1930 nm, and the weakest at 1190 nm. Compared with normal tissue, the absorption peaks of both benign and malignant breast tumor tissues at 1450 and 1930 nm are larger [Fig. 1(b) and 1(c)], but the peak of diseased breast tissues at 1720 nm almost disappears.


Fig. 1 Absorption spectra of different samples in the range of 400 to 2200 nm. (a) Normal breast tissue, (b) benign breast tumor tissue, and (c) malignant breast tumor tissue.

Fig. 2 The reduced scattering coefficient spectrum of different samples in the range of 400 to 2200 nm. (a) Normal breast tissue, (b) benign breast tumor tissue, and (c) malignant breast tumor tissue.

Fig. 3 Distribution of absorption spectrum variation factor A and wavelength exponent value c of scattering of different types of breast tissues.

Table 1 Absorption parameters of different types of breast tissues.
Note: A_i is the mean ± standard error of A.

Table 2 The wavelength exponent values of scattering of different types of breast tissues.

Table 3 Effectiveness of spectral analysis for distinguishing different types of breast tissues.
Note: A means the absorption spectral analysis method, c means the scattering spectral analysis method, and A & c means the combined spectral analysis method.