17 March 2015 Investigation of methods for calibration of classifier scores to probability of disease
Classifier scores in many diagnostic devices, such as computer-aided diagnosis systems, are usually on an arbitrary scale, the meaning of which is unclear. Calibration of classifier scores to a meaningful scale such as the probability of disease is potentially useful when such scores are used by a physician or another algorithm. In this work, we investigated the properties of two methods for calibrating classifier scores to probability of disease. The first is a semiparametric method in which the likelihood ratio for each score is estimated based on a semiparametric proper receiver operating characteristic model, and an estimate of the probability of disease is then obtained using Bayes' theorem, assuming a known prevalence of disease. The second is a nonparametric method in which isotonic regression via the pool-adjacent-violators algorithm is used. We employed the mean square error (MSE) and the Brier score to evaluate the two methods under two paradigms: (a) resubstitution, in which the dataset used to construct the score-to-probability mapping function is also used to calculate the performance metric (MSE or Brier score); and (b) independent testing, in which an independent test dataset is used to calculate the performance metric. Under our simulation conditions, the semiparametric method was found to be superior to the nonparametric method at small to medium sample sizes, and the two methods appeared to converge at large sample sizes. Our simulation results also indicate that the resubstitution bias may depend on the performance metric and that, for the semiparametric method, the resubstitution bias is small when a reasonable number of cases (> 100 cases per class) is available.
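The main ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and all function names are hypothetical. The first function applies Bayes' theorem in odds form to a likelihood ratio that is assumed to be already available; the paper estimates it from a semiparametric proper ROC model, which is not reproduced here. The second is a plain pool-adjacent-violators fit of binary labels on scores, and the third computes the Brier score as the mean squared difference between predicted probability and binary outcome.

```python
from typing import List, Sequence


def posterior_from_likelihood_ratio(lr: float, prevalence: float) -> float:
    """Bayes' theorem in odds form: posterior odds = LR * prior odds.

    Assumes `lr` is given and 0 < prevalence < 1.
    """
    prior_odds = prevalence / (1.0 - prevalence)
    post_odds = lr * prior_odds
    return post_odds / (1.0 + post_odds)


def pav_calibrate(scores: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Non-decreasing (isotonic) fit of 0/1 labels on scores via
    pool-adjacent-violators; returns probabilities in the input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])

    # Pool tied scores first so equal scores map to the same probability.
    groups: List[List[float]] = []  # each entry: [label_sum, count]
    prev = None
    for i in order:
        if groups and scores[i] == prev:
            groups[-1][0] += labels[i]
            groups[-1][1] += 1
        else:
            groups.append([float(labels[i]), 1])
        prev = scores[i]

    # Merge adjacent blocks while the earlier block's mean exceeds the later one's.
    blocks: List[List[float]] = []
    for s, n in groups:
        blocks.append([s, n])
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s2, n2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2

    # Expand block means back to one value per case, then undo the sort.
    fitted_sorted: List[float] = []
    for s, n in blocks:
        fitted_sorted.extend([s / n] * int(n))
    calibrated = [0.0] * len(scores)
    for rank, i in enumerate(order):
        calibrated[i] = fitted_sorted[rank]
    return calibrated


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Toy data (made up for illustration, not from the paper).
toy_scores = [0.1, 0.2, 0.3, 0.4]
toy_labels = [0, 1, 0, 1]
toy_probs = pav_calibrate(toy_scores, toy_labels)
toy_brier = brier_score(toy_probs, toy_labels)  # resubstitution-style estimate
```

Because `toy_brier` is computed on the same cases used to fit the mapping, it corresponds to the resubstitution paradigm; an independent estimate would apply the fitted mapping to held-out scores, which this sketch does not implement.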
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Weijie Chen, Berkman Sahiner, Frank Samuelson, Aria Pezeshk, and Nicholas Petrick, "Investigation of methods for calibration of classifier scores to probability of disease", Proc. SPIE 9416, Medical Imaging 2015: Image Perception, Observer Performance, and Technology Assessment, 94161E (17 March 2015); doi: 10.1117/12.2082142
