The reduction of false positive marks in breast mass CAD is an active area of research. Typically, the problem
can be approached either by developing more discriminative features or by employing different classifier designs.
Usually, the aim is to find an optimal combination of classifier configuration and a small number of features to
ensure high classification performance and a robust model with good generalization capabilities.
In this paper, we investigate the potential benefit of relying on a support vector machine (SVM) classifier
for the detection of masses. The evaluation is based on 10-fold cross-validation over a large database of
screen-film mammograms (10397 images). The purpose of this study is twofold: first, we assess the SVM performance
compared to neural networks (NNet), k-nearest neighbor classification (k-NN) and linear discriminant analysis
(LDA). Second, we study the classifiers' performances when using a set of 30 and a set of 73 region-based
features. The CAD performance is quantified by the mean sensitivity over the interval from 0.05 to 1 false positives per exam on
the free-response receiver operating characteristic curve.
The best mean exam sensitivities found were 0.545, 0.636, 0.648 and 0.675 for LDA, k-NN, NNet and SVM, respectively.
Both k-NN and NNet proved to be stable against variation of the feature sets. Conversely, LDA and SVM exhibited
an increase in performance when adding more features. It is concluded that with an SVM a more pronounced
reduction of false positives is possible, given that a large number of cases and features are available.
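The performance measure described above can be illustrated with a short sketch. It assumes the FROC curve is available as a list of (false positives per exam, sensitivity) operating points and reads it as a step function. The function names, the number of sample points, and the choice between linear and logarithmic spacing of the sampled false positive rates are assumptions made for illustration; this abstract does not state them.

```python
import math
from bisect import bisect_right


def sensitivity_at(froc, fp_rate):
    """Sensitivity at a given false positive rate, reading the curve as a step function.

    `froc` is a list of (false positives per exam, sensitivity) pairs sorted by
    increasing false positive rate.
    """
    rates = [rate for rate, _ in froc]
    i = bisect_right(rates, fp_rate)  # number of operating points with rate <= fp_rate
    return froc[i - 1][1] if i > 0 else 0.0


def mean_sensitivity(froc, lo=0.05, hi=1.0, n_samples=101, log_scale=True):
    """Average sensitivity over [lo, hi] false positives per exam.

    The spacing of the sampled rates (log or linear) is an assumption here.
    """
    if log_scale:
        step = (math.log(hi) - math.log(lo)) / (n_samples - 1)
        rates = [math.exp(math.log(lo) + k * step) for k in range(n_samples)]
    else:
        step = (hi - lo) / (n_samples - 1)
        rates = [lo + k * step for k in range(n_samples)]
    return sum(sensitivity_at(froc, r) for r in rates) / n_samples
```

With logarithmic spacing, operating points at low false positive rates carry more weight in the average than with linear spacing.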
Proc. SPIE. 7263, Medical Imaging 2009: Image Perception, Observer Performance, and Technology Assessment
KEYWORDS: Visual process modeling, Computer aided diagnosis and therapy, Cancer, Breast cancer, Detection and tracking algorithms, Visualization, Computing systems, Mammography, Computer aided design, CAD systems
Most computer aided detection (CAD) systems for mammographic mass detection display all suspicious regions
identified by computer algorithms and are mainly intended to avoid missing cancers due to perceptual oversights.
Considering that interpretation failure is recognized to be a more common cause of missing cancers in screening
than perceptual oversights, a dedicated mammographic CAD system has been developed that can be queried
interactively for the presence of CAD prompts using a mouse click. To assess the potential benefit of using CAD
in an interactive way, an observer study was conducted in which 4 radiologists and 6 non-radiologists evaluated
60 cases with and without CAD, to compare the detection performance of the unaided reader with that of the
reader with CAD assistance. Twenty cases had a malignant mass, and 40 were cancer-free. During the reading sessions
we recorded time and probed locations, which reveal information about the search strategy and detection process.
The purpose of this study is to determine a relation between detection performance and time to first probe of
the lesion and to investigate whether longer reading times lead to more reports of malignant lesions in lesion-free areas.
On average, 65.0% of the malignant lesions were found within 60 seconds, and this percentage stabilized after this
period. Results suggest that longer reading time did not lead to more false positives. 74.6% of the reported true
positive findings were hit by the first probe, and 93.2% were hit within 5 probes, which may suggest that many
of the correctly reported malignant masses were perceived immediately after image onset.
In breast cancer screening, radiologists not only look at local properties of suspicious regions in the mammogram
but also take into account more general contextual information. In this study we investigated the use of similar
information for computer aided detection of malignant masses. We developed a new set of features that combine
information from the candidate mass region and the whole image or mammogram. The developed context
features were constructed to give information about suspiciousness of a region relative to other areas in the
mammogram, the location in the image, the location in relation to dense tissue and the overall amount of dense
tissue in the mammogram. We used a step-wise floating feature selection algorithm to select subsets from the
set of available features. Feature selection was performed twice, once using the complete feature set (37
context and 40 local features) and once using only the local features. It was found that in the subsets selected
from the complete feature set, 30-60% of the features were context features. At most one local feature present in the subset
containing context features was not present in the subset without context features. We validated the performance
of the selected subsets on a separate data set using cross-validation and bootstrapping. For each subset size we
compared the performance obtained using the features selected from the complete feature set to the performance
obtained using the features selected from the local feature set. We found that subsets containing context features
performed significantly better than feature sets containing no context features.
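The floating selection procedure mentioned above can be sketched as follows. This is a minimal illustration of one common variant, sequential floating forward selection, in which each forward step is followed by conditional backward steps. Which variant was used in the study is not stated, so the variant, the function name `sffs`, and the caller-supplied `score` criterion are all assumptions.

```python
def sffs(features, score, k_max):
    """Sequential floating forward selection (illustrative sketch).

    `features` is a list of feature identifiers, `score` maps a list of
    features to a number (higher is better), and `k_max` is the largest
    subset size to reach (must not exceed len(features)).
    Returns a dict: subset size -> (best score found, subset).
    """
    selected = []
    best = {}
    while len(selected) < k_max:
        # Forward step: add the feature that gives the best score.
        candidates = [f for f in features if f not in selected]
        added = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [added]
        size = len(selected)
        current = score(selected)
        if size not in best or current > best[size][0]:
            best[size] = (current, list(selected))
        # Floating step: drop features while the reduced subset beats the
        # best subset of the same size found so far.
        while len(selected) > 2:
            dropped = max(
                selected,
                key=lambda f: score([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != dropped]
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (reduced_score, list(reduced))
            else:
                break
    return best
```

Because the result is indexed by subset size, the best subsets selected from two different feature pools can be compared size by size, as in the comparison described above.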
In this study we investigated different feature selection methods for use in computer-aided mass detection. The
data set we used (1357 malignant mass regions and 58444 normal regions) was much larger than those used in previous
research, in which feature selection did not directly improve the performance compared to using the entire feature set.
We introduced a new performance measure to be used during feature selection, defined as the mean sensitivity
in an interval of the free-response receiver operating characteristic (FROC) curve computed on a logarithmic scale. This
measure is similar to the final validation performance measure we were optimizing. Therefore it was expected
to give better results than more general feature selection criteria. We compared the performance of feature
sets selected using the mean sensitivity of the FROC curve to sets selected using the Wilks' lambda statistic
and investigated the effect of reducing the skewness in the distribution of the feature values before performing
feature selection. In the case of Wilks' lambda, we found that reducing skewness had a clear positive effect,
yielding performances similar to or exceeding those obtained when the entire feature set was used. Our
results indicate that a general measure like Wilks' lambda selects better performing feature sets than the mean
sensitivity of the FROC curve.
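To make the Wilks' lambda criterion and the effect of skewness more concrete, the sketch below computes the univariate form of Wilks' lambda (within-class sum of squares divided by total sum of squares, so lower values indicate better class separation) and the sample skewness of a feature. The abstract does not say how skewness was reduced; the logarithmic transform in the usage example is only an assumed illustration, and the function names are hypothetical.

```python
import math


def wilks_lambda(values, labels):
    """Univariate Wilks' lambda: within-class sum of squares / total sum of squares."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    within_ss = 0.0
    for c in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == c]
        group_mean = sum(group) / len(group)
        within_ss += sum((v - group_mean) ** 2 for v in group)
    return within_ss / total_ss


def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Usage: a right-skewed feature becomes symmetric after a log transform
# (the transform is an assumption, chosen only for illustration).
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
transformed = [math.log(v) for v in raw]
print(skewness(raw), skewness(transformed))
```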