When a computer-aided diagnosis (CAD) system is used in clinical practice, it is desirable that the system be constantly and automatically updated with newly obtained cases to improve its performance. In this study, the effect of different case selection methods for such system updates was investigated. The simulation used data for the classification of benign and malignant masses on mammograms. Six image features were used to train three classifiers: linear discriminant analysis (LDA), a support vector machine (SVM), and k-nearest neighbors (kNN). Three datasets were randomly sampled from the database: dataset I for initial training of the classifiers, dataset T for intermediate testing and retraining, and dataset E for evaluating the classifiers. On the basis of the intermediate testing, some cases from dataset T were selected and added to the previous training set in each classifier update. In each update, cases were selected by one of four methods: selection of (a) correctly classified samples, (b) incorrectly classified samples, (c) marginally classified samples, and (d) random samples. For comparison, system updates using all samples in dataset T were also evaluated. In general, the average areas under the receiver operating characteristic curves (AUCs) were almost unchanged with method (a), whereas they generally degraded with method (b). The AUCs improved with methods (c) and (d), although using all available cases generally provided the best or nearly the best AUCs. In conclusion, CAD systems may be improved by retraining with new cases accumulated during practice.