Automated algorithm for breast tissue differentiation in optical coherence tomography

Mircea Mujat; Robert Daniel Ferguson; Daniel X. Hammer; Christopher M. Gittins; Nicusor V. Iftimia

doi:10.1117/1.3156821

Published: 1 May 2009

Journal of Biomedical Optics, Vol. 14, Issue 3, 034040 (May 2009). https://doi.org/10.1117/1.3156821

Abstract

An automated algorithm for differentiating breast tissue types based on optical coherence tomography (OCT) data is presented. Eight parameters are derived from the OCT reflectivity profiles and their means and covariance matrices are calculated for each tissue type from a training set (48 samples) selected based on histological examination. A quadratic discrimination score is then used to assess the samples from a validation set. The algorithm results for a set of 89 breast tissue samples were correlated with the histological findings, yielding specificity and sensitivity of 0.88. If further perfected to work in real time and yield even higher sensitivity and specificity, this algorithm would be a valuable tool for biopsy guidance and could significantly increase procedure reliability by reducing both the number of nondiagnostic aspirates and the number of false negatives.

1. Introduction

Breast cancer is the most common cancer found in women in the United States, with more than 180,000 new cases estimated every year. It is also the second leading cause of cancer deaths in women, after lung cancer, and is responsible for the deaths of over 40,000 women every year in the U.S.¹ The most common breast malignancy is invasive ductal carcinoma (IDC), which accounts for around 75 to 80% of invasive breast cancers, while invasive lobular carcinoma (ILC) accounts for most other invasive cases. As with all malignancies, early detection of breast cancer is the only way to effectively manage patients who suffer from this disease. According to the American Cancer Society (ACS),² the 5-year survival rates for persons with breast cancer who are appropriately treated are as follows: 100% for stage 0, 100% for stage I, 92% for stage IIA, 81% for stage IIB, 67% for stage IIIA, 54% for stage IIIB, and 20% for stage IV. This indicates that improved diagnostic methods need to be developed to detect this cancer in its early stages, when it is treatable and survival rates are high.

Mammography is currently the standard screening tool for breast cancer. When suspicious masses are found during the mammographic screening, various other tests are performed to confirm and stage the disease. Among these tests, biopsy has proven to be the best way to determine whether a suspicious area seen in an image is in fact cancer. Breast biopsies provide very useful diagnostic information and can be comfortably performed with intravenous sedation and local anesthesia. However, while the biopsy of relatively large palpable masses usually has a high diagnostic yield, up to 99% (Ref. 3), the biopsy of smaller infiltrating masses has a much lower diagnostic yield (below 60% in some studies).^{4, 5, 6, 7, 8, 9, 10} Without a priori knowledge of the 3-D location of small cancer masses, it is unlikely that a biopsy protocol will yield consistently high cancer detection rates. This is the primary reason why biopsy has relatively high rates of false-negative diagnoses when no guidance modalities are used. Since the majority of lesions for which women undergo biopsy prove to be benign and many women have multiple biopsies during their lifetimes, very reliable and less invasive techniques are required. Currently, three types of biopsies are used on a large scale: fine needle aspiration biopsy (FNAB), core needle biopsy (CNB), and vacuum biopsy (VB).

FNAB is the least invasive and best tolerated procedure, typically using a 23-gauge needle or smaller,¹¹ and it is preferred by many patients because it does not produce discomfort or bleeding. However, its diagnostic yield largely depends on the clinician's skill.^{10, 12, 13, 14} FNAB is a cost-effective and rapid procedure, and the samples are quickly available for cytopathologic analysis. Therefore, the patient usually has a diagnosis by the end of the visit. The number of annual FNAB procedures varies from one clinic to another within a range of a few hundred to thousands,¹⁵ but even for small clinics, it is usually over one hundred per year.

CNB is the most used technique and has high reliability because it allows for collection of larger tissue samples (up to 1.5 mm in diameter and 1 to 2 cm in length) for histological analysis. However, CNB is often associated with deleterious side effects, including serious discomfort and bleeding, and potential tissue morbidity. Also, CNB requires ultrasound or computed tomography (CT) needle guidance to sample nonpalpable masses and to avoid perforating major vessels.

VB has recently gained acceptance by clinicians because it provides more reliable results than FNAB or CNB.^{16, 17} However, it is relatively expensive and produces even greater discomfort than CNB.

Currently, most FNAB interventions are done without any guidance modality. However, due to the inability to identify tissue type by manual palpation and the challenges of positioning the needle tip within a viable tumor, which may be admixed with normal, reactive, and necrotic tissue, nondiagnostic results occur in about 20% of aspirates and in 5 to 15% of patients.³ Therefore, a relatively simple but efficient method for FNAB guidance would substantially increase the diagnostic yield of this simple, minimally invasive, and affordable procedure. Proper needle placement could substantially reduce the number of nondiagnostic aspirates and improve the sensitivity and specificity of the procedure. Ultrasound and stereotactic CT guidance of all three types of biopsy procedures have produced enhanced outcomes.^{18, 19, 20} However, ultrasound and stereotactic CT guidance are not always available, and when they are available, they substantially increase the overall cost of the biopsy. Therefore, simpler and less expensive guidance methods are desirable.

Optical methods have been developed for many years to improve biopsy outcome. They can detect tissue abnormalities with relatively good accuracy, and therefore they offer a viable alternative for biopsy guidance. Among various techniques developed to date, spectroscopic-based methods have shown real promise for tissue-type discrimination. For example, several diagnostic studies have found significant differences in both the emission and excitation spectra from normal, benign, and/or malignant breast tissues. Alfano et al.^{21, 22} first showed differences between the fluorescence emission spectra of normal and malignant breast tissues. Yang et al.^{23, 24} used the emission spectra within the 300 to 400-nm spectral region to discriminate between malignant and fibrous samples. They found 93% sensitivity and 95% specificity, but results were worse for discriminating normal fatty and malignant tissues. Gupta et al.²⁵ measured emission spectra when normal, benign (fibroadenoma), and malignant (IDC) breast tissue samples were excited at 337 nm. Using the integrated emission intensity from the 337-nm excitation, malignant tissues were separated in a binary fashion from both benign and normal with a specificity of 98%. Diffuse reflectance methods have also shown promise for use in tissue discrimination. Several studies have determined that diffuse reflectance spectroscopy can detect changes in scattering and absorption due to malignancy-associated alterations in levels and organization of hemoglobin, β-carotene, DNA, and other proteins.^{21, 22, 23, 25, 26, 27, 28, 29} Bigio et al.²⁶ used in vivo measurements to distinguish between malignant and normal tissue with sensitivities up to 69% and specificities up to 93%. Ramanujam³⁰ has combined reflectance and fluorescence measurements but has found no significant improvement in diagnostic performance using this multimodal approach. However, none of the aforementioned reflectance technologies can probe the tissue in depth over a distance of at least 1 to 2 mm, and therefore their role is mostly limited to epithelial malignancy biopsy guidance (for example, for colon polyps, esophageal, oral, or cervical cancer). Another limitation of most of these techniques is that they cannot be performed through the lumen of a fine gauge needle. Only optical reflectance and the reduced scattering coefficient have been investigated using a needle-like probe for tissue characterization.³¹

More recently, optical coherence tomography (OCT) and low-coherence interferometry (LCI), the nonscanning, nonimaging version of OCT, have been applied to tissue discrimination toward optical biopsy^{32, 33, 34, 35, 36} and image-guided surgery of breast cancer.³⁷ OCT is an optical ranging technique that is capable of imaging depth-resolved (axial, $z$) tissue structure, birefringence, flow (Doppler shift), and spectra at a resolution of several microns. The tissue probing depth with this technology is on the order of 2 to 3 mm, which is almost one order of magnitude higher than the probing depth of the spectroscopic approaches. Besides this advantage, OCT systems can be constructed using fiber-optic components, and therefore OCT probes can easily fit within the bore of a fine gauge needle (for both scanning and nonscanning modes), allowing diagnostic information to be obtained directly from the FNAB site. Such very small fiber-optic-based probes have numerous clinical diagnostic and therapeutic applications.

Automated interpretation of OCT findings is, nonetheless, a very challenging issue. Previous studies^{32, 33, 34, 35, 36} suggested that this technology has the potential to substantially increase the diagnostic yield of FNAB procedures. However, until now, only the differentiation between adipose and tumor has been demonstrated on both OCT and LCI scans.^{33, 34, 35, 36} A previous study³² demonstrated the possibility of differentiating between tumor, adipose, and stroma (connective tissue) using elaborate algorithms illustrated on a very limited number of samples and without axial discrimination. The capability of differentiating between the multiple tissue types (adipose, fibrotic, tumor, necrotic, etc.) that could be present within the same OCT or LCI scan would add more value to an optical guidance tool.

In this paper, we demonstrate an advanced algorithm for automated differentiation of the three major tissue constituents usually found admixed in suspicious breast masses: adipose, fibrous, and tumor. The algorithm was tested ex vivo on 137 samples of human breast tissue and provides spatial discrimination of the tissue both laterally and axially. With this algorithm, the pathologist/clinician performing the biopsy will be able to determine more precisely what tissue type is present at the tip of the needle before performing the biopsy. Therefore, this algorithm could help substantially decrease the number of nondiagnostic aspirates and increase the overall biopsy yield.

2. Methods

2.1. Instrumentation and Measurement Protocol

An ex vivo study on excised tissue specimens was conducted to test the capability of the OCT technology for tissue differentiation. The main objectives of this study were to identify characteristic features of each of the tissue types (fibrous, adipose, tumor), develop quantitative metrics for tissue differentiation using a training set of tissue specimens, and test these metrics on a validation set of tissue specimens.

OCT measurements were performed on over 150 fresh tissue samples from patients undergoing breast cancer surgery (lumpectomy and mastectomy). A 1310-nm spectral-domain OCT (SD-OCT) system, presented elsewhere,³⁸ was used for this study. This system provided an axial resolution of 10 μm, a lateral resolution of 25 μm, and an imaging range of about 2.2 mm at a line rate of 5.12 kHz. The SD-LCI/OCT system can accommodate both scanning and nonscanning probes, and therefore it can work in both LCI and OCT modes. A bench-top OCT probe employing a free-space scanning mechanism in the sample arm was used in this study. An imaging rate of 5 frames/s at 512 × 1024 pixels/frame was achieved with this system and was limited by the relatively low reading rate of the InGaAs line detector (SU512LX, Sensors Unlimited, Inc., Princeton, New Jersey). However, we are currently upgrading this system, and the new SU1024-LDH camera will allow for a frame rate of 45 frames/s.

The breast tissue samples were obtained from the Pathology Department, Massachusetts General Hospital (MGH), and National Disease Research Interchange (NDRI). No information about tissue donors was provided. Tissue procurement, handling, and data collection were performed according to an MGH-approved Institutional Review Board protocol (2002P000487 from 09/14/2007) and NDRI protocol (DIFN1-001-005 from 03/05/2007). Tissue samples from NDRI were shipped overnight within a few hours from excision. The tissue was kept in saline and shipped in a box with ice bags and arrived in very good pathological condition.

Our tissue measurement protocol consisted of OCT B-scans over small areas of each sample (3-mm lateral scans). Each tissue sample was kept hydrated in saline solution at 37 °C during the measurements. After completion of OCT measurements, each tissue sample was marked with india ink on the OCT imaging locations (usually 3 to 5 locations on each sample) and fixed with formalin (10% formalin in PBS). Histologic preparation of each tissue specimen was then performed at the MGH histology department.

Typical OCT images for each tissue type are shown in Fig. 3. The OCT image was cropped to 1.5 mm depth to keep only the tissue part of the image. Representative depth reflectivity profiles (A-scans) are shown in the first row of Fig. 1. A clear difference can be observed between the adipose tissue and the other two tissue types (fibrous and tumor). However, this difference is less significant between fibrous and tumor tissue. In many cases, tissue differentiation is difficult, especially within breast masses that consist of admixed tissue types, when more than one tissue type is present within the same reflectivity profile. Therefore, a set of key metrics (signal slope and variance, mean spatial frequency of the intensity peaks, mean area of the power spectrum peaks, etc.) was developed to find specific characteristics for each tissue type.

Fig. 1

Graphical illustration of the main steps in the signal processing sequence. First column (position 1 in Fig. 3, shown later): adipose tissue; second column (position 2 in Fig. 3): fibrous and adipose tissue; third column (position 3 in Fig. 3): tumor tissue. First row: unprocessed depth profile (continuous line) and smoothed version (dotted line); second row: smoothed profile with linear fit for each region; third row: depth profile variation around the linear fit; fourth row: normalized power spectra for each linear fit window.

Fig. 3

Representative examples of OCT diagnosis on breast tissue specimens. (a) adipose; (b) admixed fibroadipose; (c) fibroglandular; (d) tumor.

2.2. Data Processing

An elaborate signal processing scheme was designed to determine a set of key metrics for tissue differentiation, and a data analysis algorithm was developed to analyze the key metrics and assign tissue type. This data processing scheme is summarized in the following. The OCT spectra are first processed following the standard SD-OCT procedure to produce the depth reflectivity logarithmic profiles (shown in Fig. 1 in linear scale using arbitrary units). The next step in data analysis is to remove the background, which bears no relevant information. A constant background is subtracted from all A-lines. After background subtraction, a low-pass filter is applied to each depth profile to generate a smoothed depth reflectivity profile (Fig. 1, first row). The smoothed profile is used only to determine the slope of the signal decay (Fig. 1, second row).
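The background subtraction and smoothing steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not specify the low-pass filter, so a centered moving average is assumed here, and the function names, the clipping of negative values to zero, and the sample values are all hypothetical.

```python
def subtract_background(a_line, background):
    """Subtract a constant background level from one A-line.

    Clipping negative values to zero is an assumption of this sketch.
    """
    return [max(v - background, 0.0) for v in a_line]


def moving_average(profile, width=5):
    """Assumed low-pass filter: centered moving average, truncated at the edges."""
    half = width // 2
    smoothed = []
    for k in range(len(profile)):
        lo = max(0, k - half)
        hi = min(len(profile), k + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


# Made-up A-line in arbitrary units, with a background level of 10.
raw = [12.0, 10.0, 60.0, 40.0, 55.0, 30.0, 35.0, 15.0, 12.0, 10.0]
clean = subtract_background(raw, background=10.0)
smooth = moving_average(clean, width=3)
```

In this sketch the unsmoothed profile `clean` would be kept for the later peak and spectral analysis, while `smooth` would feed only the slope estimation.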

2.2.1. Depth reflectivity profile parameters

The slope of the reflectivity profile is the first parameter used in our algorithm. It provides information related to the depth attenuation of the signal, which is a function of tissue type. If different slopes are found at different depths, it might indicate the presence of two or more tissue types within the same depth reflectivity profile. Therefore, linear fitting is performed on several windows, each window corresponding to a portion of the depth reflectivity profile that has the same slope (Fig. 1, second row). The first depth where the signal reaches 10% of the maximum intensity of the smoothed profile is used as the starting point for the linear fit. Alternatively, the starting point can be selected at the maximum of the signal; however, the maximum of the signal might not always reflect the real tissue surface but could lie deeper in the tissue, in which case the first portion of the tissue would be missed [Figs. 1A, 1B]. The end point of the linear fit is initially selected a predetermined distance (a quarter of the total depth) away from the starting point. Then the goodness of the linear fit, $R^{2}$, is calculated, and the end point of the linear fit is varied to maximize $R^{2}$ (minimize $1/R^{2}$) using a standard optimization procedure (the fminbnd function in MATLAB, for example) (Fig. 1, second row). The end point of the optimized linear fit in the first window is then used as the starting point of the second window. The end point of the second window is first automatically selected at a predetermined distance away from the starting point as described earlier, and the $R^{2}$ optimization procedure is performed in the second window. The process continues until the end of the profile is reached, the intensity of the linear fit becomes negative, or the signal is identically zero over the entire window. To avoid fitting very short segments of the smoothed profile, the minimum size of a window is set to 20% of the whole depth range. If the first positive slope is shorter than this minimum length, it can be safely neglected, since it denotes the tissue surface. All the other parameters used in our tissue differentiation algorithm are calculated in each window found in this initial step of the signal processing algorithm.
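The windowed fitting procedure can be illustrated with a short sketch. It makes several assumptions that are not in the paper: an exhaustive search over end points stands in for the bounded optimizer (fminbnd), ties in $R^{2}$ are resolved in favor of the longest window, and the stop conditions on negative fit intensity and all-zero windows are omitted for brevity. The names `linear_fit` and `segment_profile` are hypothetical.

```python
def linear_fit(y, start, end):
    """Least-squares line through y[start:end]; returns (slope, intercept, r2)."""
    n = end - start
    xs = range(start, end)
    mx = sum(xs) / n
    my = sum(y[start:end]) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y[x] - my) for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y[x] - (slope * x + intercept)) ** 2 for x in xs)
    ss_tot = sum((y[x] - my) ** 2 for x in xs)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r2


def segment_profile(y, min_frac=0.2):
    """Split a smoothed profile into consecutive windows with one linear fit each."""
    n = len(y)
    min_len = max(3, int(min_frac * n))  # minimum window: 20% of the depth range
    # Start at the first depth where the signal reaches 10% of its maximum.
    start = next(k for k, v in enumerate(y) if v >= 0.1 * max(y))
    windows = []
    while n - start >= min_len:
        # Exhaustive search replaces the bounded optimizer; iterating from the
        # longest candidate makes max() prefer longer windows when R^2 ties.
        candidates = reversed(range(start + min_len, n + 1))
        end = max(candidates, key=lambda e: linear_fit(y, start, e)[2])
        slope, _, r2 = linear_fit(y, start, end)
        windows.append((start, end, slope, r2))
        start = end  # the next window begins where this one ended
    return windows


# Synthetic profile: a shallow decay followed by a steeper one.
profile = [100.0 - x for x in range(20)] + [76.0 - 5.0 * (x - 20) for x in range(20, 30)]
windows = segment_profile(profile)
```

On this synthetic profile the sketch recovers two windows, one per slope. Real profiles are noisy, so the choice of optimizer and tie handling would matter more than it does here.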

In general, the adipose tissue is characterized by smaller slopes [1 and 2 in Fig. 1D and 3 in Fig. 1E], while fibrous and tumor tissue exhibit steeper slopes [1 and 2 in Figs. 1E, 1F].

The second parameter used in the tissue differentiation algorithm is the standard deviation (Std) of the depth profile variations around the linear fit (Fig. 1, third row). These variations, obtained by subtracting the linear fit from the depth profile, may also provide information about the nature of the tissue being investigated. Adipose tissue produces strong reflection peaks with low reflectivity zones between them because of the relatively high differences between the refractive indices of the fat cell cytoplasm and membrane, while fibrous and tumor tissues produce lower peaks. The spread of the depth profile variations around the linear fit is the largest for adipose tissue (large Std), is significantly smaller for fibrous tissue and is the smallest for tumor tissue. Notice here the change of vertical scale in Fig. 1, third row.
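As a small illustration of the Std parameter, the sketch below subtracts a given linear fit from a depth profile and takes the standard deviation of the residual. Whether the population or the sample form of the standard deviation was used is not stated in the paper; the population form is assumed here, and the function name and data are hypothetical.

```python
import statistics


def residual_std(profile, slope, intercept, start, end):
    """Return (Std, residuals) of profile[start:end] around the line slope*k + intercept."""
    residuals = [profile[k] - (slope * k + intercept) for k in range(start, end)]
    return statistics.pstdev(residuals), residuals


# Made-up profile oscillating around the line 9 - k.
profile = [10.0, 6.0, 8.0, 4.0, 6.0, 2.0]
std, residuals = residual_std(profile, slope=-1.0, intercept=9.0, start=0, end=6)
```

The residual array is the same "depth profile variation around the linear fit" on which the peak-based parameters are computed.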

The mean distance between peaks is expected to be a characteristic size of the fat cells. Therefore, this mean distance is determined between consecutive peaks of the depth profile variations around the linear fit. This third parameter is called MeanPeakDistance. This parameter is expected to be relatively large for adipose tissue, medium for fibrous tissue, and small for tumor. (Tumor tissue is optically denser than fibrous or fibroglandular tissue.) Clearly, the mean distance between peaks in Fig. 1, third row, is the largest for adipose tissue (G) and is the smallest for tumor tissue (I).

The fourth parameter is the standard deviation of the peak spreading over depth (StdPeakDistance). A more homogeneous tumor tissue is expected to have a reduced spread of the peaks. Therefore, a peak finder was developed as a signal processing routine to identify the position of the peaks while ignoring the small local variations that would otherwise be interpreted as false peaks. The profile to be analyzed is first zero-padded by a factor of 5 to increase the number of points within the profile. The first derivative of the profile is then computed. Since the profile is a discrete array of points and not a continuous function, it is unlikely that the exact zeros of the first derivative will be found. However, the zero-padding allows us to get close enough. The algorithm searches for pairs of neighboring points for which the first derivative is positive for the left point and negative for the right point. The two points necessarily contain the zero-crossing between them, and the requirements on the first derivative ensure a negative second derivative, which identifies a peak and not a valley. The valleys can be identified in the same way by reversing the signs required of the first derivative. If the height difference between the peak and the neighboring valleys is smaller than a predetermined threshold, the peak is treated as a minor local fluctuation and is disregarded.
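A peak finder of this kind, together with the two peak-distance parameters, might look like the sketch below. It is an illustration under stated assumptions: the interpolation (zero-padding) step is skipped, a peak is compared against the higher of its two neighboring valleys, profile end values are used when a valley is missing on one side, and StdPeakDistance is interpreted as the standard deviation of the distances between consecutive peaks. All function names and the pixel size are hypothetical.

```python
import statistics


def find_extrema(y):
    """Locate peaks and valleys from sign changes of the first difference."""
    peaks, valleys = [], []
    for k in range(1, len(y) - 1):
        left = y[k] - y[k - 1]
        right = y[k + 1] - y[k]
        if left > 0 and right < 0:      # rising then falling: peak
            peaks.append(k)
        elif left < 0 and right > 0:    # falling then rising: valley
            valleys.append(k)
    return peaks, valleys


def significant_peaks(y, min_height):
    """Keep peaks that rise at least min_height above the higher neighboring valley."""
    peaks, valleys = find_extrema(y)
    kept = []
    for p in peaks:
        left_v = [v for v in valleys if v < p]
        right_v = [v for v in valleys if v > p]
        # Fall back to the profile ends when there is no valley on one side.
        left_level = y[left_v[-1]] if left_v else y[0]
        right_level = y[right_v[0]] if right_v else y[-1]
        if y[p] - max(left_level, right_level) >= min_height:
            kept.append(p)
    return kept


def peak_distance_stats(peak_idx, pixel_size_um):
    """Return (MeanPeakDistance, StdPeakDistance) in micrometers, or (None, None)."""
    gaps = [(b - a) * pixel_size_um for a, b in zip(peak_idx, peak_idx[1:])]
    if not gaps:
        return None, None
    std = statistics.pstdev(gaps) if len(gaps) > 1 else 0.0
    return statistics.mean(gaps), std


# Made-up residual profile with one small false peak at index 3.
residual = [0.0, 5.0, 0.0, 0.4, 0.2, 6.0, 1.0, 7.0, 0.0]
kept = significant_peaks(residual, min_height=1.0)
mean_dist, std_dist = peak_distance_stats(kept, pixel_size_um=10.0)
```

Here the small bump at index 3 is rejected, leaving peaks at indices 1, 5, and 7.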

2.2.2. Power spectrum parameters

The power spectrum calculation is the next step in the signal processing algorithm. The power spectrum is normalized to its maximum (Fig. 1, last row), and the peak detector is used to identify the frequency peaks. These calculations are performed for each window where different slopes were found. The weighted mean frequency (MeanFrequency) and the standard deviation around this mean (StdFrequency) are two further parameters that are evaluated. The power spectrum is used as the probability function for calculating the mean frequency. The power spectrum is expected to have a dominant low frequency for adipose tissue, corresponding to large spatial distances between the fat cell walls, while for tumors it is expected to exhibit multiple high frequencies (a broad spectrum with relatively high mean and standard deviation), as seen in the last row of Fig. 1.
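Using the power spectrum as a probability function amounts to computing its first and second moments. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math


def spectral_moments(freqs, power):
    """Return (MeanFrequency, StdFrequency), weighting each frequency by its power."""
    total = sum(power)
    weights = [p / total for p in power]  # spectrum rescaled to sum to 1
    mean = sum(w * f for w, f in zip(weights, freqs))
    var = sum(w * (f - mean) ** 2 for w, f in zip(weights, freqs))
    return mean, math.sqrt(var)


# Two equally strong components at frequencies 1 and 4 (arbitrary units).
mean_f, std_f = spectral_moments([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 0.0, 1.0])
```

Because the weights are rescaled to sum to one, the result does not depend on whether the spectrum was first normalized to its maximum.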

Only the frequency peaks above a certain threshold (0.3, as shown by the horizontal line in the last row of Fig. 1) are counted (PeakNr), indicating the number of dominant strong frequencies. Their total area above the threshold is calculated (PeakArea) with the purpose of identifying the spread of the dominant frequencies. Sharp peaks (smaller area) or broad peaks (larger area) for the same number of dominant frequencies may indicate the presence of different tissue types within the reflectivity profile. For example, breast cancerous tissue is generally denser and stiffer than the surrounding tissue, and therefore the OCT signal exhibits an increased number of dominant frequencies, resulting in a broad normalized spectrum with large PeakArea [Fig. 1M]. Fibroglandular tissue [slopes 1 and 2 in Figs. 1E, 1L] is more heterogeneous than cancerous tissue but more homogeneous than adipose tissue. As a result, its frequency peaks are sharper, with smaller PeakArea, than those of tumor tissue, but broader, with larger PeakArea, than those of adipose tissue [slopes 1, 2, and 3 in Fig. 1K and slope 3 in Fig. 1L].
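One possible reading of PeakNr and PeakArea is sketched below. The paper does not give the exact definitions, so two assumptions are made: a peak is a local maximum of the normalized spectrum that exceeds the threshold, and the area is a rectangle-rule sum of the spectrum's excess over the threshold. The function name and the frequency step `df` are hypothetical.

```python
def dominant_peaks(power, threshold=0.3, df=1.0):
    """Return (PeakNr, PeakArea) for a power spectrum normalized to its maximum."""
    top = max(power)
    norm = [p / top for p in power]
    # Count interior local maxima that exceed the threshold.
    peak_nr = sum(
        1
        for k in range(1, len(norm) - 1)
        if norm[k] > threshold and norm[k] > norm[k - 1] and norm[k] >= norm[k + 1]
    )
    # Rectangle-rule area of the part of the spectrum above the threshold.
    peak_area = sum(p - threshold for p in norm if p > threshold) * df
    return peak_nr, peak_area


# Made-up spectrum with two components above the 0.3 threshold.
peak_nr, peak_area = dominant_peaks([0.1, 2.0, 0.2, 0.4, 1.0, 0.5, 0.1])
```

With this definition, a broad peak contributes several bins above the threshold and therefore a larger area than a sharp peak of the same height.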

2.3. Decision Algorithm

As a result of the analysis presented earlier, eight parameters (Slope, Std, MeanPeakDistance, StdPeakDistance, MeanFrequency, StdFrequency, PeakNr, PeakArea) were calculated and assigned to each pixel in the OCT images for each tissue specimen used in the training set. The training set is selected based on the histological diagnosis provided by an experienced pathologist.

Mean values ${\bar{x}}_{i}$ of each parameter are calculated for the three tissue types: adipose, fibrous, and tumor, $i = 1,2,3$ . ${\bar{x}}_{i}$ is a column vector made of the eight means. Covariance matrices are also calculated for each tissue type accounting for all eight parameters³⁹:

Eq. 1

$$S_{i} = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} (x_{i, j} - {\bar{x}}_{i}) {(x_{i, j} - {\bar{x}}_{i})}^{T},$$

where $n_{i}$ is the number of elements in each tissue class within the training set, and the superscript $T$ indicates matrix transpose. The mean values for each parameter used in our algorithm, corresponding to the three tissue types in the training set, are listed in Table 1.
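The class mean vector and the covariance matrix of Eq. 1 can be computed directly from the training vectors of one tissue class. The sketch below uses two parameters instead of eight to keep the numbers readable; the function name and the sample values are hypothetical.

```python
def class_statistics(samples):
    """Return (mean vector, covariance matrix) for one class, normalized by n as in Eq. 1."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [
        [sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / n for b in range(d)]
        for a in range(d)
    ]
    return mean, cov


# Three made-up training vectors with two parameters each.
mean, cov = class_statistics([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
```

Note that Eq. 1 divides by $n_{i}$ rather than $n_{i} - 1$, and the sketch follows that convention.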

Table 1

Mean values for the eight parameters.

Tissue type	Slope (a.u.)	Std (a.u.)	PeakArea (a.u.)	PeakNr	MeanPeakDistance (μm)	StdPeakDistance (μm)	MeanFrequency ($10^{5}$ m$^{-1}$)	StdFrequency ($10^{5}$ m$^{-1}$)
Adipose	-1.027	128.461	7.977	1.875	154.8	17.7418	7.795	8.374
Fibrous	-3.031	124.675	8.611	1.924	94.6	23.4178	7.581	7.279
Tumor	-3.308	70.78	15.251	2.761	55.9	13.76	9.348	7.542

For each sample to be diagnosed, the mean values and the covariance matrices are used to calculate a quadratic discrimination score³⁹:

Eq. 2

$$d_{i}^{Q} = - \frac{1}{2} \ln \lvert S_{i} \rvert - \frac{1}{2} {(x - {\bar{x}}_{i})}^{T} S_{i}^{- 1} (x - {\bar{x}}_{i}),$$

where $\lvert \cdot \rvert$ indicates the matrix determinant, $S_{i}^{- 1}$ is the inverse matrix of $S_{i}$, and $x$ is the column vector made of the eight calculated parameters for that sample. Three quadratic discrimination scores are obtained for each pixel, corresponding to the three tissue classes, and the maximum score is selected to assign each pixel of the image to a tissue type. Up to an additive constant, the quadratic discrimination score is the logarithm of the probability that the tissue at that pixel belongs to a tissue class, and the maximum probability is used for tissue assignment.
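A direct evaluation of Eq. 2 needs the log-determinant of $S_{i}$ and the quadratic form with its inverse. The sketch below obtains both from one Gaussian elimination instead of forming the inverse explicitly, which is a design choice of this example rather than something stated in the paper. The class means are loosely rounded from the Slope and Std columns of Table 1, while the covariance values are invented purely for illustration.

```python
import math


def solve_and_logdet(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination; return (z, ln|det(matrix)|)."""
    n = len(rhs)
    a = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]  # augmented copy
    log_det = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        log_det += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        z[r] = (a[r][n] - sum(a[r][c] * z[c] for c in range(r + 1, n))) / a[r][r]
    return z, log_det


def quadratic_score(x, mean, cov):
    """Quadratic discrimination score of Eq. 2 for one class."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    z, log_det = solve_and_logdet(cov, diff)  # z = S^{-1} (x - mean)
    return -0.5 * log_det - 0.5 * sum(d * zi for d, zi in zip(diff, z))


def classify(x, classes):
    """Assign x to the class with the maximum quadratic discrimination score."""
    return max(classes, key=lambda name: quadratic_score(x, *classes[name]))


# Two-parameter toy model (Slope, Std); covariances are made-up values.
toy_cov = [[0.25, 0.0], [0.0, 400.0]]
classes = {
    "adipose": ([-1.0, 128.0], toy_cov),
    "fibrous": ([-3.0, 125.0], toy_cov),
    "tumor": ([-3.3, 71.0], toy_cov),
}
label = classify([-3.0, 80.0], classes)
```

In this toy model the test vector has a steep slope and a low Std, so the tumor class receives the highest score.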

Figure 2 shows a scatter plot illustrating the clustering of the three main tissue types (adipose in green, fibrous in blue, tumor in red) and their projections on the $x$, $y$, and $z$ planes for only three parameters: Slope, Std, and PeakArea. The points represented here are for each pixel of the OCT images corresponding to the training set of tissue samples. With only three parameters, there is significant overlap among the three tissue types. One can notice, however, that there is a reasonable degree of separation among the three tissue types in the Slope–Std projection plane. The adipose tissue is also well separated from fibrous and tumor tissue in the Slope–PeakArea projection plane, while the tumor tissue is better separated from adipose and fibrous tissue in the Std–PeakArea projection plane. Using more parameters in a multidimensional space is therefore expected to produce better clustering of the three tissue types with much less overlap.

Fig. 2

Scatter plot illustrating the clustering of the three main tissue types (adipose—green, fibrous—blue, tumor—red) and their projections on the $x$ , $y$ , and $z$ planes for three parameters: Slope, Std, and PeakArea.

Multiple depth reflectivity profiles are acquired and processed in each OCT frame in either scanning or nonscanning mode. The result of the algorithm calculation is a numerical value (1 to 3) representing a tissue type. A specific color corresponding to each numerical value is attributed to every pixel in the frame: light blue (1) for adipose tissue, yellow (2) for fibrous and fibroglandular tissue, and red (3) for tumor tissue. Dark blue (0) corresponds to pixels that were masked due to low signal value. Averaging schemes, user selectable over a window of 20 × 20 to 50 × 50 pixels, are applied before displaying the results. Each pixel is finally assigned the dominant tissue type within the averaging window. The algorithm also calculates the percentage of each tissue type present in each frame.
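Assigning each pixel the dominant tissue type within a window is a majority vote. The sketch below assumes a sliding window centered on each pixel and computes percentages over unmasked pixels only; the paper specifies neither detail. A 3 × 3 window is used on a tiny made-up label map, whereas the paper's windows are 20 × 20 to 50 × 50 pixels.

```python
from collections import Counter


def majority_filter(label_map, window=3):
    """Replace each label by the most frequent label in a window centered on it."""
    rows, cols = len(label_map), len(label_map[0])
    half = window // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = Counter(
                label_map[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            )
            out[r][c] = votes.most_common(1)[0][0]
    return out


def tissue_percentages(label_map):
    """Percentage of labels 1 to 3 among unmasked (nonzero) pixels."""
    flat = [v for row in label_map for v in row if v != 0]
    counts = Counter(flat)
    total = len(flat) or 1  # avoid division by zero on a fully masked frame
    return {label: 100.0 * counts[label] / total for label in (1, 2, 3)}


# 1 = adipose, 2 = fibrous, 3 = tumor; one isolated tumor pixel at (1, 1).
labels = [
    [1, 1, 1, 2],
    [1, 3, 1, 2],
    [1, 1, 2, 2],
]
smoothed = majority_filter(labels, window=3)
percentages = tissue_percentages(labels)
```

The isolated tumor pixel is outvoted by its adipose neighbors, which is the kind of pixel-to-pixel variability the averaging step is meant to suppress.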

We note that no image processing schemes were used here. The described algorithm was designed and implemented for A-line processing and is capable of identifying different tissue types within each A-line whether the A-lines were acquired with or without scanning (OCT or LCI mode). Averaging was done only at the end before displaying the results. It was performed to remove variability between neighboring pixels and to ensure a locally smooth display map of the tissue assignment. Alternatively, multiple reflectivity profiles can be averaged first and then only the average profile is processed. This modality is applicable to the LCI mode, where multiple A-lines are collected from the same tissue location. This can speed up processing, since 50 to 100 A-lines are sufficient to get a correct estimation, and the acquisition and processing of a small number of A-lines is very fast.

3. Results and Discussions

Selected OCT frames for several cases of single tissue type [(A), (C), (D)], as well as of admixed tissues (B), are presented in Fig. 3. The diagnostic maps reflect the spatial distribution of each tissue type. It can be observed that each tissue type is well recovered. In Fig. 3, the adipose (A) and fibroglandular (C) tissue types can be reasonably well differentiated by a trained OCT reader by examining the OCT image only. However, it is very difficult to distinguish the relatively small differences between the fibrous (C) and tumor tissue (D) based on the OCT appearance only. For these cases, the consensus among OCT readers is relatively low. Our algorithm, however, correlates well with the histology findings.

OCT measurements were performed on 152 tissue samples to test the capability of our algorithm for tissue differentiation. Each measurement site was marked with ink, and histology was performed to correlate OCT measurements with histology findings. However, for 15 samples the technician could not find the ink marking when slicing the tissue, and therefore these samples were removed from the study. Of the remaining 137 samples, 48 were assigned to a training set and 89 to a validation set. The training set allocation was based on pathologist recommendation. These samples showed the best representation of the three tissue types: adipose, fibrous, and tumor. A correlation of over 95% was found when the algorithm was retrospectively applied to the training set. The pathologist, blind to the algorithm findings, also performed histology readings on the validation set. The trained algorithm was then applied to the validation set, and algorithm findings were correlated with histology readings. 93% of the adipose samples were correctly diagnosed, while fibrous and tumor tissues were correctly identified in 75.5% and 88% of the samples, respectively. The same set of samples was measured in an LCI (nonscanning) configuration following the same protocol.³⁸ However, the OCT mode seems to provide better results. We attribute this to the larger tissue volume sampled in the OCT mode.

Our primary interest in this study was to train the algorithm to distinguish between normal (adipose, fibroadipose, or fibroglandular) and abnormal tissue (tumor or tumor admixed with normal tissue), and also to preferentially recognize adipose tissue, which usually creates nondiagnostic aspirates (fatty fluid or fatty cells). The sensitivity and the specificity of the algorithm findings were calculated as:

Eq. 3

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

where $TP$ is the true positive value that was correctly attributed as positive to cancer findings, $TN$ is the true negative value that was properly attributed to normal tissue, $FN$ is the false negative value that was falsely ascribed as negative to cancer sites, and $FP$ is the false positive value that was falsely assigned as positive to normal tissue samples.

The results of our study are summarized in Table 2. We note here that the results were obtained by processing multiple A-lines in each specimen, generated by scanning a relatively large tissue section. Sensitivity and specificity of 0.88 were found. These are very good values considering that the tissue differentiation parameters were based on a relatively small number of samples in the training set. The algorithm can be further improved by using a larger training set and by applying a weighting function to each key parameter used in the algorithm.³⁹ Some of the parameters may provide redundant information, and their uniqueness is still under investigation. However, given the large variability in biological tissue, they seem to behave differently across a large number of tissue samples. The algorithm can also be trained to minimize the FN results (increase the sensitivity of the findings) by using log-linear modeling to determine weighting factors for each tissue feature.³⁹ This will indicate how strongly each feature correlates with the histopathologic diagnosis. The classification approach used here follows from the assumption that each class is described by a multivariate normal distribution.³⁷ Determining more accurate probability distribution functions for each class based on a larger training set might improve the algorithm performance.

Table 2

The results of the automated algorithm.

| Histology-based diagnosis | Samples | Algorithm result: adipose | Algorithm result: normal fibrous, fibroglandular, or fibroadipose tissue | Algorithm result: tumor or tumor admixed tissue |
|---|---|---|---|---|
| Adipose | 15 | 14 / $TN$ | 1 / $TN$ | 0 / $FP$ |
| Normal fibrous, fibroglandular, or fibroadipose tissue | 49 | 4 / $TN$ | 37 / $TN$ | 8 / $FP$ |
| Tumor or tumor admixed tissue | 25 | 1 / $FN$ | 2 / $FN$ | 22 / $TP$ |

Total samples: 89. Algorithm diagnostic results: 56 $TN$; 22 $TP$; 8 $FP$; 3 $FN$. Sensitivity = 0.88; Specificity = 0.88.
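As a worked check of Eq. 3 against Table 2, the short Python sketch below collapses the three-class table into tumor versus non-tumor counts and recomputes both figures. The counts are transcribed from Table 2. The dictionary layout and function names are illustrative choices, not part of the original study.

```python
# Counts transcribed from Table 2.
# Outer key: histology-based diagnosis; inner key: algorithm result.
TABLE_2 = {
    "adipose": {"adipose": 14, "normal": 1, "tumor": 0},
    "normal": {"adipose": 4, "normal": 37, "tumor": 8},
    "tumor": {"adipose": 1, "normal": 2, "tumor": 22},
}


def binary_counts(table, positive="tumor"):
    """Collapse the three-class table into TP, TN, FP, FN (tumor vs. non-tumor)."""
    tp = tn = fp = fn = 0
    for truth, row in table.items():
        for predicted, n in row.items():
            if truth == positive and predicted == positive:
                tp += n
            elif truth == positive:
                fn += n
            elif predicted == positive:
                fp += n
            else:
                tn += n
    return tp, tn, fp, fn


def sensitivity_specificity(tp, tn, fp, fn):
    """Eq. 3: sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)


tp, tn, fp, fn = binary_counts(TABLE_2)
sens, spec = sensitivity_specificity(tp, tn, fp, fn)
print(tp, tn, fp, fn)                  # 22 56 8 3
print(round(sens, 3), round(spec, 3))  # 0.88 0.875
```

The sensitivity is 22/25 = 0.88. The specificity is 56/64 = 0.875, which matches the reported 0.88 after rounding to two decimals.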

The current version of the algorithm, implemented in MATLAB (The MathWorks, Inc., Natick, Massachusetts), is not yet fully optimized. Post-processing on a laptop with a 2.0-GHz dual-core processor currently takes 56 s for an OCT image of 1000 A-lines and 256 depth points. The processing time could be significantly reduced through parallel processing of A-scans, a faster processor, a smaller number of A-scans, and algorithm optimization. The algorithm can also be implemented in hardware for real-time processing suitable for clinical applications.⁴⁰

The study presented here was applied to OCT images. However, since the processing algorithm was intentionally designed for line processing and does not use any region-based analysis techniques (e.g., texture or kernel processing schemes), it is equally well suited for nonscanning protocols in FNAB applications, where a single fiber is inserted through a biopsy needle and data are taken from one fixed location at a time.^{34, 38} In the nonscanning case (LCI mode), several A-lines are acquired, individually processed, and averaged in the end, so processing is faster than for the 1000-A-line OCT images described here. Alternatively, multiple depth reflectivity profiles could be averaged in the nonscanning case to reduce noise, and only the average profile processed with our algorithm. Therefore, even the current version of the algorithm might become suitable for real-time guidance of needle biopsy.
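The two nonscanning options can be illustrated with a small Python sketch. It averages several depth reflectivity profiles into one and gives a rough processing-time estimate from the reported 56 s per 1000 A-lines. The linear scaling of processing time with the number of A-lines is an assumption made here, not a measured result, and the function names are hypothetical.

```python
SECONDS_PER_IMAGE = 56.0   # reported time for 1000 A-lines x 256 depth points
A_LINES_PER_IMAGE = 1000


def average_profile(a_lines):
    """Average several depth reflectivity profiles point by point."""
    n = len(a_lines)
    depth = len(a_lines[0])
    if any(len(line) != depth for line in a_lines):
        raise ValueError("all A-lines must have the same number of depth points")
    return [sum(line[i] for line in a_lines) / n for i in range(depth)]


def estimated_seconds(n_a_lines):
    """Rough estimate, ASSUMING time scales linearly with the A-line count."""
    return SECONDS_PER_IMAGE / A_LINES_PER_IMAGE * n_a_lines


# Toy profiles with three depth points each (real profiles have 256).
profiles = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(average_profile(profiles))  # [2.0, 3.0, 4.0]

# Processing 10 A-lines individually vs. one averaged profile.
print(estimated_seconds(10), estimated_seconds(1))
```

Under this assumption, one A-line costs about 0.056 s, so processing a single averaged profile would take roughly one tenth of the time needed for ten individually processed A-lines.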

4. Conclusions

A novel algorithm for automated classification of breast tissue types based on OCT data was demonstrated. The algorithm successfully differentiated three breast tissue types (adipose, fibrous, and tumor), providing both lateral and depth discrimination. Normal tissue was distinguished from cancerous tissue with a sensitivity and specificity of 0.88. An increase in both sensitivity and specificity might be possible by further refining the algorithm (as described earlier).

The algorithm was preliminarily tested on OCT images. However, since it is based on the processing of individual A-lines (it does not use image features for tissue classification), it is well suited for automatic interpretation of LCI data. This enables the use of simpler probes for biopsy guidance, consisting of a bare fiber inserted through a biopsy needle.^{34, 36, 38} Besides the use of a simpler probe, the LCI mode allows for processing fewer A-lines than in the OCT mode, as well as for first averaging multiple A-lines and processing only the average profile. This mode makes the current algorithm faster and suitable for real-time tissue classification, and therefore for guidance of the biopsy needle, by providing the physician with relevant information about the type of tissue present at the tip of the needle. This could positively impact the diagnostic yield of FNAB procedures. Since FNAB is much less expensive and faster than CNB and VB, a comparable yield on palpable masses could make LCI-guided FNAB the preferred diagnostic modality.

The application of the algorithm to automated interpretation of OCT data could have a significant impact on clinical translation of OCT. Even the reported level of accuracy in differentiating tissue types could increase the yield of the biopsy procedure if OCT is used as a guidance tool. With this simple technology, the pathologist or clinician performing the biopsy will be able to guide the needle or the biopsy forceps to the most representative diagnostic area of the mass based on the instrument’s ability to determine the tissue type in real time. This will avoid unnecessary biopsy and increase the effectiveness of the procedure.

Acknowledgment

This research was supported in part by a research grant from the National Institutes of Health (1R41CA114896-01A1).

References

1. http://www.cancer.gov/cancertopics/types/breast

2. http://www.nlm.nih.gov/MEDLINEPLUS/ency/article/000913.htm

3. G. Farshid, P. Downey, P. G. Gill, and S. Pieterse, “Assessment of 1183 screen-detected, category 3B, circumscribed masses by cytology and core biopsy with long-term follow up data,” Br. J. Cancer, 98 (7), 1182–1190 (2008). https://doi.org/10.1038/sj.bjc.6604296

4. S. Boerner and N. Sneige, “Specimen adequacy and false-negative diagnosis rate in fine-needle aspirates of palpable breast masses,” Cancer, 84 (6), 344–348 (1998). https://doi.org/10.1002/(SICI)1097-0142(19981225)84:6<344::AID-CNCR5>3.0.CO;2-R

5. E. Castella, M. C. Gomez-Plaza, A. Urban, and M. Llatjos, “Fine-needle aspiration biopsy of metaplastic carcinoma of the breast: report of a case with abundant myxoid ground substance,” Diagn. Cytopathol., 14 (4), 325–327 (1996). https://doi.org/10.1002/(SICI)1097-0339(199605)14:4<325::AID-DC9>3.0.CO;2-E

6. R. K. Gupta, “Fine needle aspiration cytodiagnosis of primary and metastatic squamous cell carcinoma of the breast,” Acta Cytol., 41 (3), 692–696 (1997).

7. W. H. Hindle and E. C. Chen, “Accuracy of mammographic appearances after breast fine-needle aspiration,” Am. J. Obstet. Gynecol., 176 (6), 1286–1290 (1997). https://doi.org/10.1016/S0002-9378(97)70347-8

8. E. D. Pisano, L. L. Fajardo, D. J. Caudry, N. Sneige, W. J. Frable, W. A. Berg, I. Tocino, S. J. Schnitt, J. L. Connolly, C. A. Gatsonis, and B. J. McNeil, “Fine-needle aspiration biopsy of nonpalpable breast lesions in a multicenter clinical trial: results from the radiologic diagnostic oncology group V,” Radiology, 219 (3), 785–792 (2001).

9. H. Yen, B. Florentine, L. K. Kelly, X. Bu, J. Crawford, and S. E. Martin, “Fine-needle aspiration of a metaplastic breast carcinoma with extensive melanocytic differentiation: a case report,” Diagn. Cytopathol., 23 (1), 46–50 (2000). https://doi.org/10.1002/1097-0339(200007)23:1<46::AID-DC11>3.0.CO;2-F

10. B. M. Ljung, A. Drejet, N. Chiampi, J. Jeffrey, W. H. Goodson, K. Chew, D. H. Moore, and T. R. Miller, “Diagnostic accuracy of fine-needle aspiration biopsy is determined by physician training in sampling technique,” Cancer Cytopathol., 93 (4), 263–268 (2001).

11. A. Abati and A. Simsir, “Breast fine needle aspiration biopsy: prevailing recommendations and contemporary practices,” Clin. Lab Med., 25 (4), 631–654 (2005). https://doi.org/10.1016/j.cll.2005.08.003

12. G. Vlastos and H. M. Verkooijen, “Minimally invasive approaches for diagnosis and treatment of early-stage breast cancer,” Oncologist, 12 (1), 1–10 (2007). https://doi.org/10.1634/theoncologist.12-1-1

13. A. R. Hatmaker, R. M. J. Donahue, J. L. Tarpley, and A. S. Pearson, “Cost-effective use of breast biopsy techniques in a veterans’ health care system,” Am. J. Surg., 192 (5), e37–41 (2006). https://doi.org/10.1016/j.amjsurg.2006.08.028

14. V. S. Klimberg, “Advances in the diagnosis and excision of breast cancer,” Am. Surg., 69 (1), 11–14 (2003).

15. http://www2.massgeneral.org/pathology/FNAservice.htm

16. R. J. Jackman and J. Rodriguez-Soto, “Breast microcalcifications: retrieval failure at prone stereotactic core and vacuum breast biopsy—frequency, causes, and outcome,” Radiology, 239 (1), 61–70 (2006). https://doi.org/10.1148/radiol.2383041953

17. F. M. Lomoschitz, T. H. Helbich, M. Rudas, G. Pfarl, K. F. Linnau, A. Stadler, and R. J. Jackman, “Stereotactic 11-gauge vacuum-assisted breast biopsy: influence of number of specimens on diagnostic accuracy,” Radiology, 232 (3), 897–903 (2004). https://doi.org/10.1148/radiol.2323031224

18. I. Grady, H. Gorsuch, and S. Wilburn-Bailey, “Ultrasound-guided, vacuum-assisted, percutaneous excision of breast lesions: an accurate technique in the diagnosis of atypical ductal hyperplasia,” J. Am. Coll. Surg., 201 (1), 14–17 (2005). https://doi.org/10.1016/j.jamcollsurg.2005.02.025

19. S. M. Roe, J. A. Mathews, P. Burns, M. P. Sumida, P. Craft, and M. S. Greer, “Stereotactic and ultrasound core needle breast biopsy performed by surgeons,” Am. J. Surg., 174 (6), 699–704 (1997). https://doi.org/10.1016/S0002-9610(97)00199-2

20. E. Azavedo, G. Svane, and G. Auer, “Stereotactic fine-needle biopsy in 2594 mammographically detected non-palpable lesions,” Lancet, 1 (8646), 1033–1036 (1989). https://doi.org/10.1016/S0140-6736(89)92441-0

21. R. R. Alfano, A. Pradhan, G. C. Tang, and S. J. Wahl, “Optical spectroscopic diagnosis of cancer and normal breast tissues,” J. Opt. Soc. Am. B, 6 (5), 1015–1023 (1989). https://doi.org/10.1364/JOSAB.6.001015

22. R. R. Alfano, G. C. Tang, A. Pradhan, W. Lam, D. S. J. Choy, and E. Opher, “Fluorescence-spectra from cancerous and normal human-breast and lung tissues,” IEEE J. Quantum Electron., 23 (10), 1806–1811 (1987). https://doi.org/10.1109/JQE.1987.1073234

23. Y. L. Yang, E. J. Celmer, J. A. Koutcher, and R. R. Alfano, “DNA and protein changes caused by disease in human breast tissues probed by the Kubelka-Munk spectral function,” Photochem. Photobiol., 75 (6), 627–632 (2002). https://doi.org/10.1562/0031-8655(2002)075<0627:DAPCCB>2.0.CO;2

24. Y. L. Yang, E. J. Celmer, J. A. Koutcher, and R. R. Alfano, “UV reflectance spectroscopy probes DNA and protein changes in human breast tissues,” J. Clin. Laser Med. Surg., 19 (1), 35–39 (2001). https://doi.org/10.1089/104454701750066929

25. P. K. Gupta, S. K. Majumder, and A. Uppal, “Breast cancer diagnosis using N-2 laser excited autofluorescence spectroscopy,” Lasers Surg. Med., 21 (5), 417–422 (1997). https://doi.org/10.1002/(SICI)1096-9101(1997)21:5<417::AID-LSM2>3.0.CO;2-T

26. I. J. Bigio, S. G. Bown, G. Briggs, C. Kelley, S. Lakhani, D. Pickard, P. M. Ripley, I. G. Rose, and C. Saunders, “Diagnosis of breast cancer using elastic-scattering spectroscopy: preliminary clinical results,” J. Biomed. Opt., 5 (2), 221–228 (2000). https://doi.org/10.1117/1.429990

27. N. Ghosh, S. K. Mohanty, S. K. Majumder, and P. K. Gupta, “Measurement of optical transport properties of normal and malignant human breast tissue,” Appl. Opt., 40 (1), 176–184 (2001). https://doi.org/10.1364/AO.40.000176

28. G. M. Palmer, C. F. Zhu, T. M. Breslin, F. S. Xu, K. W. Gilchrist, and N. Ramanujam, “Comparison of multiexcitation fluorescence and diffuse reflectance spectroscopy for the diagnosis of breast cancer (March 2003),” IEEE Trans. Biomed. Eng., 50 (11), 1233–1242 (2003). https://doi.org/10.1109/TBME.2003.818488

29. C. F. Zhu, G. M. Palmer, T. M. Breslin, F. S. Xu, and N. Ramanujam, “Use of a multiseparation fiber optic probe for the optical diagnosis of breast cancer,” J. Biomed. Opt., 10 (2), 024032 (2005). https://doi.org/10.1117/1.1897398

30. N. Ramanujam, “Fluorescence spectroscopy of neoplastic and non-neoplastic tissues,” Neoplasia, 2 (1–2), 89–117 (2000). https://doi.org/10.1038/sj.neo.7900077

31. M. Johns, C. A. Giller, D. C. German, and H. L. Liu, “Determination of reduced scattering coefficient of biological tissue from a needle-like probe,” Opt. Express, 13 (13), 4828–4842 (2005). https://doi.org/10.1364/OPEX.13.004828

32. A. M. Zysk and S. A. Boppart, “Computational methods for analysis of human breast tumor tissue in optical coherence tomography images,” J. Biomed. Opt., 11 (5), 054015 (2006).

33. B. D. Goldberg, N. V. Iftimia, J. E. Bressner, M. B. Pitman, E. Halpern, B. E. Bouma, and G. J. Tearney, “Automated algorithm for differentiation of human breast tissue using low coherence interferometry for fine needle aspiration biopsy guidance,” J. Biomed. Opt., 13 (1), 014014 (2008). https://doi.org/10.1117/1.2837433

34. N. V. Iftimia, B. E. Bouma, M. B. Pitman, B. Goldberg, J. Bressner, and G. J. Tearney, “A portable, low coherence interferometry based instrument for fine needle aspiration biopsy guidance,” Rev. Sci. Instrum., 76 (6), 064301 (2005). https://doi.org/10.1063/1.1921509

35. Y. J. Rao and D. A. Jackson, “Recent progress in fiber optic low-coherence interferometry,” Meas. Sci. Technol., 7 (7), 981–999 (1996). https://doi.org/10.1088/0957-0233/7/7/001

36. J. M. Schmitt, A. Knuttel, and R. F. Bonner, “Measurement of optical-properties of biological tissues by low-coherence reflectometry,” Appl. Opt., 32 (30), 6032–6042 (1993). https://doi.org/10.1364/AO.32.006032

37. S. A. Boppart, W. Luo, D. L. Marks, and K. W. Singletary, “Optical coherence tomography: feasibility for basic research and image-guided surgery of breast cancer,” Breast Cancer Res. Treat., 84 (2), 85–97 (2004). https://doi.org/10.1023/B:BREA.0000018401.13609.54

38. N. V. Iftimia, M. Mujat, D. X. Hammer, T. Ustun, and D. R. Ferguson, “Spectral-domain low coherence interferometry/OCT system for fine needle breast biopsy guidance,” Rev. Sci. Instrum., 80 (2), 024302 (2009). https://doi.org/10.1063/1.3076409

39. R. A. Johnson and D. W. Wichern, Applied Multivariate Statistical Analysis, 5th ed., Prentice Hall, Upper Saddle River, New Jersey (2002).

40. T. E. Ustun, N. V. Iftimia, R. D. Ferguson, and D. X. Hammer, “Real-time processing for Fourier domain optical coherence tomography using a field programmable gate array,” Rev. Sci. Instrum., 79 (11), 114301 (2008). https://doi.org/10.1063/1.3005996

Mircea Mujat, Robert Daniel Ferguson, Daniel X. Hammer, Christopher M. Gittins, and Nicusor V. Iftimia "Automated algorithm for breast tissue differentiation in optical coherence tomography," Journal of Biomedical Optics 14(3), 034040 (1 May 2009). https://doi.org/10.1117/1.3156821

KEYWORDS

Tissues

Optical coherence tomography

Tissue optics

Tumors

Biopsy

Breast

Reflectivity
