Comparison of two classifiers when the data sets are imbalanced: the power of the area under the precision-recall curve as the figure of merit versus the area under the ROC curve
10 March 2017
Abstract
In many two-class problems in automated classification and information retrieval, the classes are imbalanced, and the separation between the positive and negative classes is large. The precision-recall (PR) curve has been suggested as an alternative to the receiver operating characteristic (ROC) curve to characterize the performance of automated systems when the classes are imbalanced, and the area under the precision-recall curve (AUCPR) has been suggested as an alternative performance measure to the area under the ROC curve (AUCROC). AUCPR and AUCROC are distinct measures of performance, even though the relationship between the precision-recall and ROC curves is well known. In this study, we compared the statistical power of the AUCPR to that of the AUCROC. Our results indicate that the AUCPR can offer a small statistical advantage when the prevalence is low and the separation between the positive and negative classes is large. When the data set is more balanced or the separation between the classes is low or moderate, AUCROC has slightly higher power.
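As a rough illustration of the two figures of merit named in the abstract, the sketch below computes an AUCROC estimate and an AUCPR estimate on simulated imbalanced data. It is not the authors' procedure: the equal-variance normal score model, the sample sizes, the helper names (`auc_roc`, `auc_pr`, `simulate`), and the use of average precision as the AUCPR estimator are all assumptions made for this example. The average-precision helper also assumes scores without ties.

```python
import numpy as np


def auc_roc(scores, labels):
    """AUCROC estimate via the Mann-Whitney statistic (ties count 1/2)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def auc_pr(scores, labels):
    """AUCPR estimate via average precision (one of several possible estimators)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Rank cases from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    true_pos = np.cumsum(hits)
    precision = true_pos / np.arange(1, hits.size + 1)
    # Average the precision observed at the rank of each positive case.
    return float(precision[hits].sum() / hits.sum())


def simulate(n_pos, n_neg, separation, rng):
    """Assumed data model: unit-variance normal scores with means `separation` and 0."""
    scores = np.concatenate([
        rng.normal(separation, 1.0, n_pos),
        rng.normal(0.0, 1.0, n_neg),
    ])
    labels = np.concatenate([np.ones(n_pos, dtype=bool), np.zeros(n_neg, dtype=bool)])
    return scores, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative setting only: 5% prevalence and a large class separation.
    scores, labels = simulate(n_pos=50, n_neg=950, separation=2.0, rng=rng)
    print("AUCROC:", auc_roc(scores, labels))
    print("AUCPR: ", auc_pr(scores, labels))
```

A power comparison along the lines described in the abstract would repeat such a simulation many times for two classifiers and count how often a test based on each figure of merit detects the difference. That step is left out here because the abstract does not specify the test or the simulation settings.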
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Berkman Sahiner, Weijie Chen, Aria Pezeshk, Nicholas Petrick, "Comparison of two classifiers when the data sets are imbalanced: the power of the area under the precision-recall curve as the figure of merit versus the area under the ROC curve", Proc. SPIE 10136, Medical Imaging 2017: Image Perception, Observer Performance, and Technology Assessment, 101360G (10 March 2017); doi: 10.1117/12.2254742; https://doi.org/10.1117/12.2254742
