Fine needle aspiration biopsy (FNAB) is often the first line of diagnosis for a palpable mass.1, 2, 3 To perform a FNAB, the mass is manually stabilized, a small diameter needle (typically 23 to 25 gauge) is inserted into the mass, and a small amount of tissue or fluid is aspirated into the needle. The aspirate within the bore of the needle is then expressed onto a slide, smeared, stained, and examined by a pathologist. Due to the small size of the needle, patient discomfort is generally limited to the initial stick of the needle. Complications including hematoma and infection are rare. The simplicity of FNAB significantly reduces the time and cost of obtaining an initial diagnosis compared with core or excisional tissue biopsy and allows rapid feedback to both the clinician and patient. In addition, comparisons of the sensitivities and specificities of core needle biopsy (CNB) and FNAB for palpable masses show them to be high and similar.4, 5 As a result, FNAB has become a frequently used diagnostic tool for the evaluation of many superficial, palpable masses.
Manual palpation of a superficial mass is often the only cue for determining the optimal position of the needle in tissue during biopsy. As a result, FNAB can frequently be nondiagnostic, especially with an inexperienced operator.6, 7, 8, 9, 10 Sample adequacy is graded on a sliding scale based on the degree of epithelial cellularity from which a diagnosis can be made.9, 10 Nondiagnostic samples are completely void of epithelial cells and consist primarily of adipose cells and cyst fluid.10 When not guided by an imaging modality, breast FNABs obtain diagnostic tissue in approximately 65 to 78% of cases.6, 8, 9 This difficulty is particularly problematic when performing FNABs in locations that are rich in adipose tissue, such as the breast and axilla. One method of increasing FNAB yield is concomitant use of noninvasive imaging devices, such as ultrasound, to guide needle placement. Radiologic guidance is almost always employed when FNAB is performed on nonpalpable masses. Although the addition of noninvasive imaging technology has been shown to increase FNAB yield, it is time-consuming, relatively expensive, and often requires additional personnel with specialized expertise.11
Recently, a portable, low-cost device based on low coherence interferometry (LCI) has been developed for fine needle aspiration (FNA) needle guidance.12 LCI is an optical ranging technique that is capable of measuring depth-resolved (axial, ) tissue structure, birefringence, flow (Doppler shift), and spectra at a micrometer-level resolution.13, 14, 15 Other groups have investigated the use of needle-based optical probes for biopsy guidance based on imaging16 or by direct measure of tissue optical properties such as multispectral reflection analysis,17 scattering coefficient,18 and refractive index19 measurements. Miniature LCI needle probes have also been used to correlate brain motion with electrocardiogram waves in a minimally invasive fashion.20
An initial feasibility study performed on excised breast surgical specimens indicated that LCI may have the potential for classifying adipose and fibroglandular breast tissue based on the slope and standard deviation of the axial depth profiles.12 The sample size for this study was small and the accuracy of LCI for breast tissue type diagnosis was therefore not evaluated. Furthermore, these data were analyzed in a semiautomatic fashion that is not suitable for clinical use; the minimum and maximum boundaries over which the data were analyzed were selected manually. Here, we present an automated algorithm for classifying adipose and fibroglandular breast tissues that includes an additional, independent parameter that quantifies LCI signal spatial frequencies. The accuracy of this algorithm was determined prospectively in a blinded fashion on a cohort of 260 biopsy correlated LCI scans from 58 patients. Intrasample variability of the algorithm was also tested. Similar classification parameters were recently introduced to develop methods for computationally driven differentiation of human breast tissue.21 However, to our knowledge, our study represents the first complete study to test the efficacy of such parameters for classification of human breast tissue.
The LCI system and probe have been described previously and are shown schematically in Fig. 1 .12 Briefly, the LCI system consisted of a nonreciprocal fiber optic Michelson interferometer. A broadband super luminescent diode (SLD) centered at with a full width at half maximum bandwidth of (Optiphase, Inc., Van Nuys, California) was used as a light source. The axial resolution was in air, or in tissue . Light from the source was transmitted through the first output port of a circulator and an splitter, which directed approximately to the sample arm. LCI depth scans (A-lines) were obtained at a rate of and the path length in the reference arm was scanned by illuminating a retroreflector mounted on a galvanometer-driven lever arm (Model 6220, Cambridge Technology, Lexington, Massachusetts). Light from the sample and reference arms was recombined and directed toward a polarization beamsplitter and two photodetectors, enabling polarization diverse detection. Shot noise limited detection was achieved with a maximum signal-to-noise ratio (SNR) of .
The optical probe consisted of a single mode optical fiber inserted through the bore of a 23 gauge ( outside diameter) FNA needle. No focusing lens was used. The needle was attached to a regular syringe through a hub (Model 54501, Inrad, Northvale, New Jersey). The syringe was held within a FNA biopsy gun. The fiber probe was designed to be simple and therefore inexpensive. Because the fiber core aperture was always at the needle tip, there was no uncertainty regarding the probe location, and the interrogated tissue was directly in front of the needle location. Although the fiber probe was housed within a FNA needle, no tissue aspirates were collected during the measurements.
Excised surgical specimens were collected and stored in 10% phosphate buffered saline, and data were collected at within of collection. The needle and FNA gun were secured onto a vertical translation stage as shown in Fig. 2. During imaging, samples were placed flat on a piece of corrugated cardboard within a Petri dish and positioned under the needle probe. The needle was lowered onto the sample until the fiber surface came into contact with the sample. Ten consecutive A-lines were collected at each site. Following imaging, the needle was raised, and the needle location was marked with India ink. The samples were then fixed in formalin. Histologic sections were obtained and stained with hematoxylin and eosin.
Histology slides were read by a pathologist who was blinded to the LCI data. Slides were randomly ordered to avoid bias from reading samples from the same patient consecutively. Histology samples were grouped into two critical cases for this application—adipose and fibroglandular tissue types. Fibroglandular tissues included benign fibrous parenchyma, adenocarcinoma, and ductal carcinoma in situ (DCIS) tissue types. Only homogeneous samples classified as pure adipose or fibroglandular tissue were included for parameter extraction and algorithm development/classification. Samples with significant heterogeneity in the image field as defined by the pathologist or samples where no ink was visible on histology were excluded. Heterogeneous samples were defined as tissues where the ratio of major to minor tissue type was approximately less than 3:1 within of the ink mark.
For each sample, 10 consecutive A-lines were acquired. Signal parameters were extracted for each A-line, and the mean value for each parameter was used to represent the sample. Each parameter was calculated using an automated MATLAB script without the need for additional user input other than the sample file. Prior to parameter extraction, the raw LCI interferogram data were converted to depth-dependent reflectivity profiles in the standard fashion.12 The signal was transformed using a discrete Fourier transform (DFT), bandpass filtered, frequency shifted to zero, and inverse transformed. The resulting linear intensity values were then converted to decibel scale by multiplication.
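The demodulation steps described above (DFT, bandpass filter, shift to zero frequency, inverse transform) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB script: the function names, the `carrier_bin` and `half_width` parameters, and the synthetic test signal are all assumptions made for illustration, and the decibel conversion is omitted because its scaling is not recoverable from the text.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform (adequate for a short sketch)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def demodulate(interferogram: Sequence[float], carrier_bin: int, half_width: int) -> List[float]:
    """Return the linear intensity envelope of a real fringe signal.

    carrier_bin and half_width are hypothetical parameters: the DFT bin of the
    fringe carrier and the half-width of the bandpass window, in bins.
    """
    n = len(interferogram)
    spectrum = dft(interferogram)
    baseband = [0j] * n
    # Bandpass filter and shift to zero frequency in one pass: copy only the
    # bins around the positive-frequency carrier, re-centred on bin 0.
    for k in range(carrier_bin - half_width, carrier_bin + half_width + 1):
        baseband[(k - carrier_bin) % n] = spectrum[k % n]
    analytic = dft(baseband, inverse=True)
    # Factor 2 restores the amplitude lost by discarding negative frequencies.
    return [abs(2 * v) ** 2 for v in analytic]


# Synthetic check: a slowly varying envelope on a carrier at bin 16.
n = 64
envelope = [1.0 + 0.5 * math.cos(2 * math.pi * m / n) for m in range(n)]
signal = [a * math.cos(2 * math.pi * 16 * m / n) for m, a in enumerate(envelope)]
intensity = demodulate(signal, carrier_bin=16, half_width=2)
```

In this synthetic case the recovered intensity equals the squared envelope, because the envelope is band-limited to within the chosen window.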
Automatic LCI scan boundary extraction
At the beginning and end of the LCI scan, the signal contains data that are not representative of the tissue sample. As a result, prior to parameter extraction, the data must be automatically parsed to determine the segment of the LCI scan that contains tissue reflectivity information. The location of the fiber-sample interface was automatically determined by the following procedure. The noise floor was determined by averaging the signal within the first of imaging depth, which corresponded to a region proximal to the fiber-sample interface. All signal points below the threshold were set to be equal to the noise floor, using a threshold of . Next, a first-order derivative was computed and the first peak was determined by the first zero crossing of the derivative. To avoid error from specular reflection at the fiber-sample interface, the start index was shifted an additional beyond the first peak. Automatic selection of the beginning location of the LCI scan in this manner allowed the effective start index to always fall within signal values representing the tissue structure. The last of the LCI signal were also skipped because the signal was generally low in this region. Thus, the analyzed data consisted of the region from the effective start index to the end of the LCI scan minus the last . This algorithm was automated and applied to all LCI scans to determine the data range over which to compute the slope, standard deviation, and spatial frequency content parameters. The average depth over which the signal was analyzed was with a range from .
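The boundary-extraction procedure can be sketched roughly as follows. The numeric settings (`noise_samples`, `threshold_db`, `peak_offset`, `tail_skip`) are hypothetical placeholders, since the actual depths and threshold are not given here, and interpreting the threshold as a level above the noise floor is also an assumption.

```python
from typing import Sequence, Tuple


def find_scan_bounds(
    profile_db: Sequence[float],
    noise_samples: int,
    threshold_db: float,
    peak_offset: int,
    tail_skip: int,
) -> Tuple[int, int]:
    """Return (start, end) sample indices of the tissue segment of an A-line."""
    # Noise floor: mean of the samples proximal to the fiber-sample interface.
    noise_floor = sum(profile_db[:noise_samples]) / noise_samples
    cutoff = noise_floor + threshold_db  # assumption: threshold is relative to the floor
    # Points below the threshold are set equal to the noise floor.
    clamped = [v if v >= cutoff else noise_floor for v in profile_db]
    # First-order derivative (forward difference).
    diff = [b - a for a, b in zip(clamped, clamped[1:])]
    # First peak: first place the derivative goes from positive to non-positive.
    first_peak = None
    for i in range(1, len(diff)):
        if diff[i - 1] > 0 and diff[i] <= 0:
            first_peak = i
            break
    if first_peak is None:
        raise ValueError("no peak found above the noise threshold")
    start = first_peak + peak_offset     # skip the specular reflection at the interface
    end = len(profile_db) - tail_skip    # skip the low-signal tail of the scan
    if start >= end:
        raise ValueError("analysis window is empty")
    return start, end


# Toy profile in dB: flat noise, a sharp interface peak, then a decaying signal.
toy_profile = [0, 0, 0, 0, 0.5, 20, 30, 25, 22, 20, 18, 15, 12, 10, 5, 2]
bounds = find_scan_bounds(toy_profile, noise_samples=4, threshold_db=3.0,
                          peak_offset=2, tail_skip=3)
```

For the toy profile the interface peak is at index 6, so the analysis window runs from index 8 up to (but not including) index 13.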
To first order, the LCI reflectivity intensity decreases in accordance with the Beer-Lambert law. At a source wavelength of , tissue optical properties are such that scattering dominates over absorption.22 Therefore, the slope of the logarithmic axial depth profile is related to the scattering coefficient and can be used as a parameter for classifying tissue type. A higher slope indicates more attenuation and a larger scattering coefficient, whereas a lower slope indicates a lower scattering coefficient. The slope was calculated by a first-order polynomial fit over the region of interest.
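As a brief worked step, the link between the log-scale slope and the scattering coefficient can be written out under a simple single-scattering model. The symbols $I_0$, $\mu_t$, and $z$, the round-trip factor of 2, and the use of $10\log_{10}$ for the decibel scale are assumptions introduced here for illustration and are not taken from the original text.

```latex
I(z) = I_0\, e^{-2\mu_t z}
\quad\Longrightarrow\quad
10\log_{10} I(z) = 10\log_{10} I_0 - \left(20\log_{10} e\right)\mu_t z
\approx 10\log_{10} I_0 - 8.69\,\mu_t z
```

Under these assumptions the decibel profile is linear in depth with a slope magnitude proportional to $\mu_t$, and when scattering dominates over absorption, $\mu_t$ is approximately the scattering coefficient.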
The variation of scattering cross sections within a LCI depth scan can be used as another parameter for classifying tissue type. One way to assess the scattering variance is to measure the slope-subtracted standard deviation of the axial depth profile. If the scattering fluctuates significantly, the reflection profile will have peaks interspersed with periods of low signal and the standard deviation will be high. Conversely, if the scattering is relatively homogeneous, the signal will be more continuous and the standard deviation will be low. To remove the effect of the bulk averaged scattering coefficient, the residual of the linear fit was used to compute the standard deviation.
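The slope and slope-subtracted standard deviation parameters can be computed along the following lines. This is a sketch rather than the authors' implementation: the function names are hypothetical, and the sample (N−1) normalization of the standard deviation is an assumption, since the text does not state which normalization was used.

```python
import math
from typing import Sequence, Tuple


def linear_fit(z: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares first-order polynomial fit; returns (slope, intercept)."""
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    return slope, y_mean - slope * z_mean


def slope_and_residual_std(z: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return the fitted slope and the standard deviation of the fit residual."""
    slope, intercept = linear_fit(z, y)
    residual = [yi - (slope * zi + intercept) for zi, yi in zip(z, y)]
    r_mean = sum(residual) / len(residual)
    # Assumption: sample standard deviation (N - 1 in the denominator).
    variance = sum((r - r_mean) ** 2 for r in residual) / (len(residual) - 1)
    return slope, math.sqrt(variance)


# Toy depth profiles in dB: same underlying slope, different fluctuation.
depth = list(range(8))
smooth = [10.0 - 2.0 * d for d in depth]
bumps = [1, -1, -1, 1, 1, -1, -1, 1]  # chosen to be uncorrelated with depth
peaky = [s + b for s, b in zip(smooth, bumps)]

smooth_slope, smooth_std = slope_and_residual_std(depth, smooth)
peaky_slope, peaky_std = slope_and_residual_std(depth, peaky)
```

Both toy profiles yield the same slope, while only the fluctuating profile has a nonzero residual standard deviation, which is the separation the two parameters are meant to provide.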
Spatial frequency content
Scattering center distribution, representing the distance between scatterers, may be evaluated by analyzing the spatial frequency components of the signal. The power spectrum of the signal can be interpreted as the signal energy within spatial frequency windows, and the unique signature from different tissue types was recently described as a method for differentiating human breast tissue.21 The spatial frequency parameter was computed in the following manner. First, as with the computation of the standard deviation, the linear regression was conducted and the residual was used for subsequent processing. Next, the dc component was removed by mean subtraction. Data outside the start and end index were set to zero. The resulting signal was zero mean, with components that fluctuated with varying frequency content depending on tissue type. The DFT was then computed. The spatial frequency parameter was then defined by integrating the magnitude of the spatial frequency content over a particular window band. The window was defined by calculating the average DFTs for the entire training set and observing where the adipose and fibroglandular tissue samples differed. A zoomed portion of the mean DFTs for the training set is shown in Fig. 3. The vertical lines represent the width of the integration window.
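The sequence of steps for the spatial frequency parameter can be sketched as below. The function name and the `band` argument (an inclusive range of DFT bins) are hypothetical; the actual integration window was chosen from the training-set spectra and is not reproduced here.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def spatial_frequency_parameter(
    profile_db: Sequence[float], start: int, end: int, band: Tuple[int, int]
) -> float:
    """Integrate DFT magnitude of the detrended A-line over an inclusive bin band."""
    segment = list(profile_db[start:end])
    n = len(segment)
    z = list(range(n))
    # Step 1: linear regression over the analysis window; keep the residual.
    z_mean = sum(z) / n
    y_mean = sum(segment) / n
    slope = sum((a - z_mean) * (b - y_mean) for a, b in zip(z, segment)) / sum(
        (a - z_mean) ** 2 for a in z
    )
    residual = [b - (y_mean + slope * (a - z_mean)) for a, b in zip(z, segment)]
    # Step 2: remove the dc component by mean subtraction.
    r_mean = sum(residual) / n
    residual = [r - r_mean for r in residual]
    # Step 3: data outside the start and end indices are set to zero.
    total = len(profile_db)
    padded: List[float] = [0.0] * total
    padded[start:end] = residual
    # Step 4: DFT magnitude, summed over the chosen window of bins.
    param = 0.0
    for k in range(band[0], band[1] + 1):
        coeff = sum(
            padded[m] * cmath.exp(-2j * math.pi * k * m / total) for m in range(total)
        )
        param += abs(coeff)
    return param


# Toy profiles: a smooth decay, and the same decay with a 4-cycle ripple.
n_pts = 32
smooth_profile = [20.0 - 0.5 * m for m in range(n_pts)]
ripple_profile = [s + math.cos(2 * math.pi * 4 * m / n_pts) for m, s in enumerate(smooth_profile)]

smooth_param = spatial_frequency_parameter(smooth_profile, 0, n_pts, band=(3, 5))
ripple_param = spatial_frequency_parameter(ripple_profile, 0, n_pts, band=(3, 5))
```

The rippled profile concentrates energy in the band around bin 4, whereas the smooth profile contributes essentially nothing after detrending.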
A multivariate Gaussian model was used for classification. The data set was randomly split into training and validation sets. A pooled estimate of the covariance matrix was used for the training set. The result of the model was an equation for each class that defined the probability that any new set of parameters fell within that class. Prospective analysis was then performed on the validation set. Classification was carried out by extracting parameters for each test sample, calculating the probability of falling within a particular class, and then assigning classification based on the highest probability.
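A multivariate Gaussian classifier with a pooled covariance estimate can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: equal class priors are assumed, in which case choosing the highest class probability reduces to choosing the smallest Mahalanobis distance, and the feature values in the example are invented toy numbers.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Vector]) -> Vector:
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]


def pooled_covariance(groups: Dict[str, List[Vector]]) -> List[Vector]:
    """Pooled within-class covariance, normalized by (N - number of classes)."""
    dim = len(next(iter(groups.values()))[0])
    cov = [[0.0] * dim for _ in range(dim)]
    total = 0
    for rows in groups.values():
        mu = mean_vector(rows)
        total += len(rows)
        for r in rows:
            d = [r[j] - mu[j] for j in range(dim)]
            for a in range(dim):
                for b in range(dim):
                    cov[a][b] += d[a] * d[b]
    dof = total - len(groups)
    return [[c / dof for c in row] for row in cov]


def solve(a: List[Vector], b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def classify(x: Vector, groups: Dict[str, List[Vector]]) -> str:
    """Assign x to the class with the highest Gaussian likelihood (equal priors)."""
    cov = pooled_covariance(groups)
    best_label, best_dist = "", float("inf")
    for label, rows in groups.items():
        mu = mean_vector(rows)
        d = [xi - mi for xi, mi in zip(x, mu)]
        # Squared Mahalanobis distance d^T cov^{-1} d.
        dist = sum(di * si for di, si in zip(d, solve(cov, d)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy training data (invented values): [slope magnitude, standard deviation].
training = {
    "adipose": [[1.0, 5.0], [1.2, 5.5], [0.8, 4.6], [1.1, 5.2]],
    "fibroglandular": [[3.0, 2.0], [3.2, 2.4], [2.8, 1.7], [3.1, 2.1]],
}
label_a = classify([1.0, 5.1], training)
label_f = classify([3.0, 2.2], training)
```

Because the covariance is shared between classes, the Gaussian normalization constants cancel, which is why only the distance term needs to be compared.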
To test the intrasample variability of the device and algorithm, an additional experiment was conducted using another data set. The needle probe was lowered onto each sample and a 10–A-line acquisition was performed. The needle was then raised off the sample using the vertical translation stage and relowered onto the sample for an additional 10–A-line acquisition. This process was repeated 10 times so that each sample had 10 data sets of 10 A-lines each, all from exactly the same location. After the 10th measurement, the needle was raised and the sample was marked with India ink and sent for histology sectioning and staining. Each set of 10 A-lines was processed in the same manner as the earlier experiments so that a single set of extracted parameters characterized each set of A-lines. The samples were then classified using the multivariate Gaussian model. The result was a set of 10 classifications from the same sample at the same location. The intrasample variability was defined as the percentage of misclassified measurements within a particular sample.
The accuracy of LCI for classifying breast tissue type was assessed by comparing the predicted tissue type to the gold standard histopathologic classification. All data processing and parameter extraction were done within MATLAB. Each parameter is listed as mean ± standard deviation. The p-values were calculated using a two-sided unpaired t-test to determine whether the difference in sample means between tissue types was statistically significant for each parameter, and 95% confidence intervals (CI) are also reported.
Typical LCI profiles of adipose and fibroglandular breast tissue were very different (Fig. 4 ). The adipose samples contained multiple reflectivity peaks, presumably representing the lipid core and cell membrane interface (Fig. 4). Human adipocytes range in size from , which makes the location of the reflectivity peaks highly variable.23 The scattering centers in the fibroglandular tissue case are much closer together and most likely come from small changes in the refractive index from within the extracellular matrix. As a result, the LCI signal for fibroglandular tissue was smoother and more continuous.
Data were collected from a total of 260 samples from 58 patients. Of those, 34 were not analyzed due to the absence of a fiducial ink mark in the histopathologic slide, and 54 were excluded owing to the presence of heterogeneous tissue at the LCI measurement site. The set of 158 histopathology correlated LCI data sets included 71 adipose and 87 fibroglandular cases. The fibroglandular data set included 71 benign fibrous parenchyma, 13 adenocarcinoma, and 3 DCIS cases. The data sets were randomly separated into a training set (N=72; 37 adipose, 35 fibroglandular) and a validation set (N=86; 34 adipose, 52 fibroglandular). There were 7 (5 adenocarcinoma, 2 DCIS) and 9 (8 adenocarcinoma, 1 DCIS) tumor cases included in the fibroglandular group for the training and validation sets, respectively. The additional samples were used for intravariability testing (N=14).
The results from the training set are listed in Table 1. As the table demonstrates, each parameter has a significant p-value. The average magnitude of the slope parameter was higher for fibroglandular tissue, which indicates a higher scattering coefficient for fibroglandular breast tissue compared with adipose tissue. The mean standard deviation was higher for adipose tissue as a result of the signal variation resulting from refractive index fluctuations. The spatial frequency parameter had more energy for the adipose samples within the integrated window band. There was no spatial frequency region where the fibroglandular tissue had higher energy as was seen at higher spatial frequencies in Zysk 21 This could be due to differences in axial resolution ( versus ) because the Fourier transform resolution highly depends on spatial sampling frequency.
Table 1. Training set statistics.
| Parameter | Adipose (N=37) | Fibroglandular (N=35) | p |
Another way to represent the training data is through the use of a scatter matrix as shown in Fig. 5 . The scatter matrix plots two-dimensional scatter plots between each set of parameters and can be used to observe correlations between classification parameters. It can be seen that there is little correlation between the slope and standard deviation as well as the slope and spatial frequency parameters. In addition, the slope–standard deviation and slope–spatial frequency scatter plots show that the adipose and fibroglandular data sets fall into separate regions, making classification based on these parameters possible. The scatter plot matrix also shows that the standard deviation and spatial frequency parameters are highly correlated. This is expected as both parameters are related to the scattering strength and scatterer distribution. The correlation is higher for the adipose than for the fibroglandular tissue samples.
The results from the validation set using all three parameters for classification are listed in Table 2. The classification parameters show the same trends as were seen in the training set data. The classification results are listed in Table 3. During a FNA procedure, the collection of adipose tissue is seen as a nondiagnostic result. Therefore, the correct classification of adipose tissue can be viewed as a true negative (TN), and the correct classification of fibroglandular tissue can be viewed as a true positive (TP). In this way, the sensitivity, as defined by $\mathrm{TP}/(\mathrm{TP}+\mathrm{FN})$, where FN is the number of false negatives, is equivalent to the accuracy of detecting fibroglandular tissue. In addition, the specificity, as defined by $\mathrm{TN}/(\mathrm{TN}+\mathrm{FP})$, where FP is the number of false positives, is equivalent to the accuracy of detecting adipose tissue. The sensitivity and specificity of the validation set were 98.1% (95% CI: 89.7 to 99.9) and 82.4% (95% CI: 65.5 to 93.2), respectively. The overall accuracy was defined as the fraction of tissue samples that were correctly classified, regardless of tissue type. With 86 (34 adipose, 52 fibroglandular) samples in the validation set, the overall accuracy was 91.9% (95% CI: 84.0 to 96.7). CIs were calculated using the normal approximation to the binomial distribution.24 The one misclassified sample from the fibroglandular validation set was an adenocarcinoma case. The other 8 of 9 tumor cases were correctly classified as fibroglandular tissues.
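As a quick arithmetic check, the reported validation percentages can be reproduced from confusion-matrix counts. The counts below (51 of 52 fibroglandular and 28 of 34 adipose samples correctly classified) are inferred from the reported percentages and class sizes rather than quoted from a table, and the function name is hypothetical.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, and overall accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),               # fibroglandular detected as fibroglandular
        "specificity": tn / (tn + fp),               # adipose detected as adipose
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Inferred counts for the three-parameter model on the validation set.
metrics = diagnostic_metrics(tp=51, fn=1, tn=28, fp=6)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
# percentages -> {'sensitivity': 98.1, 'specificity': 82.4, 'accuracy': 91.9}
```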
Table 2. Validation set statistics.
| Parameter | Adipose (N=34) | Fibroglandular (N=52) | p |
| Two-Parameter Model (Slope, Std. Dev.) | Three-Parameter Model (Slope, Std. Dev., Spat. Freq.) |
These results use all three classification parameters as previously described. To determine whether or not the three-parameter model was statistically better than simply using the slope and standard deviation parameters,12 it was necessary to look at a truth table describing the differences between the two models. The overall classification results using only the slope and standard deviation parameters are shown in Table 3. The sensitivity and specificity were 80.8% (67.5 to 90.4) and 82.4% (65.5 to 93.2), respectively. Using only the two-parameter model, four tumor cases, all adenocarcinoma, were misclassified as adipose tissue. A truth table to quantify the differences between the two models is shown in Table 4. In Table 4, a +/+ cell indicates that both the two-parameter and the three-parameter models correctly classified the sample. A +/− cell indicates that the two-parameter model correctly classified a sample, whereas the three-parameter model misclassified it. Similarly, a −/+ cell indicates that the three-parameter model classified the sample correctly when the two-parameter model misclassified it, and a −/− cell indicates that both models misclassified the sample. The table shows that there are nine cases where the three-parameter model classified a fibroglandular sample correctly when the two-parameter model misclassified the sample, and no cases with the reverse scenario. The associated p-value is calculated using McNemar’s test for correlated proportions and shows that there is a statistically significant difference between the two- and three-parameter models in terms of fibroglandular tissue classification. No statistical difference was observed for the adipose case.
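The comparison of the two models on paired classifications can be illustrated with an exact (binomial) form of McNemar's test. Which variant of the test the authors used is not stated, so the exact two-sided form below is an assumption, and the function name is hypothetical. Only the discordant counts enter the test: 0 samples in the +/− cell and 9 in the −/+ cell for fibroglandular tissue.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: the models are indistinguishable
    k = min(b, c)
    # Under the null hypothesis, each discordant pair favors either model
    # with probability 1/2, so the smaller count is Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Fibroglandular samples: two-parameter right / three-parameter wrong = 0,
# two-parameter wrong / three-parameter right = 9.
p_fibro = mcnemar_exact(b=0, c=9)
```

With these counts the exact two-sided p-value is 2/512, roughly 0.004, which is consistent with the statement that the difference is statistically significant.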
|3 Parameter Model|
Because the standard deviation parameter is calculated from the slope-subtracted LCI signal, errors in the slope calculation could result in an artificially high standard deviation measurement. To test the effect this would have on our classification, we simulated errors in the slope by randomly modifying the slope parameter by ±5, ±10, and ±20% of its nominal value. The standard deviation and spatial frequency content parameters were then recalculated using the modified slope value. The resulting classification was compared to the nominal value result using McNemar’s test as previously described. There was no significant difference in the classification results for either the adipose or fibroglandular tissue type. For the maximum error of ±20%, the sensitivity was 96.2% (95% CI: 86.8 to 99.5) and the specificity was 76.5% (95% CI: 58.8 to 89.3).
Data to test intrasample variability were collected and analyzed from a separate set of 14 samples from 6 patients (6 adipose, 8 fibroglandular). The average intrasample variability was 18.3% (9.5 to 30.4) for adipose and 1.3% (0.03 to 6.8) for the fibroglandular tissue samples. The overall Cohen’s κ statistic was 0.821 (0.725 to 0.981). The number of errors for each sample (out of 10 measurements) was as follows: adipose [1 0 0 0 7 3] and fibroglandular [0 0 0 1 0 0 0 0]. The one outlier sample within the adipose data set (70% error rate) was due to low signal content, which tended to reduce the spatial frequency content parameter and shift the probability toward the fibroglandular tissue type. If the outlier were removed, the adipose intrasample variability rate would become 8.0% (2.2 to 19.2) and the κ value would become 0.918 (0.847 to 0.988).
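The intrasample variability figures follow directly from the per-sample error counts listed above. The sketch below also computes Cohen's kappa by pooling all measurements into a 2×2 table of true versus predicted tissue type; the text does not say how the statistic was computed, so this pooling is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence


def intrasample_variability(errors: Sequence[int], runs_per_sample: int = 10) -> float:
    """Percentage of misclassified measurements across the given samples."""
    return 100.0 * sum(errors) / (len(errors) * runs_per_sample)


def confusion_from_errors(adipose: Sequence[int], fibro: Sequence[int],
                          runs_per_sample: int = 10) -> List[List[int]]:
    """Rows are true class, columns are predicted class: [adipose, fibroglandular]."""
    a_total = len(adipose) * runs_per_sample
    f_total = len(fibro) * runs_per_sample
    return [[a_total - sum(adipose), sum(adipose)],
            [sum(fibro), f_total - sum(fibro)]]


def cohen_kappa(confusion: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa: agreement between truth and prediction beyond chance."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(size)) / total
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1 - expected)


adipose_errors = [1, 0, 0, 0, 7, 3]
fibro_errors = [0, 0, 0, 1, 0, 0, 0, 0]
adipose_no_outlier = [1, 0, 0, 0, 3]  # sample with 7 of 10 errors removed

var_adipose = intrasample_variability(adipose_errors)
var_fibro = intrasample_variability(fibro_errors)
var_adipose_no_outlier = intrasample_variability(adipose_no_outlier)
kappa_all = cohen_kappa(confusion_from_errors(adipose_errors, fibro_errors))
kappa_no_outlier = cohen_kappa(confusion_from_errors(adipose_no_outlier, fibro_errors))
```

Under this pooling, the computed values agree with the reported variabilities (18.3%, 1.3%, and 8.0%) and kappa values (0.821 and 0.918) to the stated precision.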
We present an automated algorithm for differentiating ex vivo adipose and fibroglandular human breast tissue using LCI that achieves a high sensitivity and specificity. The extracted parameters used for classification are simple and require minimal additional computation time compared with the standard postprocessing of the LCI signals. The goal of this project is to differentiate between nondiagnostic adipose tissue and the fibroglandular tissue more likely to harbor disease. The ability of LCI to differentiate between adipose and fibroglandular tissue indicates that this technology has the potential to be a useful tool for reducing nondiagnostic sampling rates in FNA procedures. More importantly, tumor samples were, with one exception (8 of 9 in the validation set), correctly classified as fibroglandular rather than as adipose tissue, a misclassification that would result in a missed diagnosis.
There remain a few challenges to taking such a system into a FNA clinic. First, in this work only homogeneous samples were used for analysis and classification. This was done to define a clear set of parameters that represent the true nature of adipose and fibroglandular tissue types. In a clinical setting, heterogeneous samples will be encountered that will decrease the accuracy of the model. Future work will focus on further defining boundaries between tissue types to provide a regional diagnosis that will account for heterogeneity. In addition, some clinical applications may require further differentiation of fibroglandular tissues into normal fibrous and tumor tissue types as well as identification of additional categories such as necrotic tissue. The classification of nondiagnostic adipose tissue samples, however, does not require this distinction, and as such, it was outside the scope of this paper. Also, LCI assumes that the signal comes from single-scattering events, but the presence of multiple scattering, especially at larger depths within the sample, can lead to decreased resolution and changes in the signal profile. The slope parameter, with its connection to the Beer-Lambert law, is particularly sensitive to the single-scattering assumption. It may be necessary to define the border between single and multiple scattering to improve the model and include additional tissue types. In addition, because the standard deviation parameter is calculated from the slope-subtracted LCI scan, any error in the slope fit could result in artificially high standard deviation measurements. However, we found that errors of up to ±20% did not significantly affect the classification result.
Insertion of the LCI needle probe within the tissue structure in an in vivo setting, as opposed to surface-only measurements as were done in this study, may introduce additional obstacles that could limit the algorithm’s effectiveness. Issues such as bleeding, tissue or optical fiber compression, and operator motion artifacts will all come into play. We plan on studying these issues through in vivo animal experiments to further define any limitations of the LCI needle probe. We anticipate that higher speed systems will significantly reduce any motion artifacts seen in the current device. Lastly, the ability to collect tissue aspirates directly following a LCI measurement will need to be addressed. Issues such as the collection of sufficient aspirate material as well as the development of a disposable probe are the subject of an ongoing investigation. Additional future work will focus on the development of higher speed systems based on recent advancements in LCI technology.25, 26 These advancements will allow for higher speed imaging, improved SNR, and greater imaging depth.
This research was funded by the Medical Free Electron Laser program (Grant No. FA9550-04-1-0079) and a NIH Ruth L. Kirschstein individual fellowship (Grant No. 1 F31 EB005141–01A2). The authors would like to acknowledge Sven Holder and the Massachusetts General Hospital pathology lab for their generous help in obtaining tissue samples.