Neural network training by maximization of the area under the ROC curve: application to characterization of masses on breast ultrasound as malignant or benign
26 February 2013
Proceedings Volume 8670, Medical Imaging 2013: Computer-Aided Diagnosis; 86701M (2013) https://doi.org/10.1117/12.2007615
Event: SPIE Medical Imaging, 2013, Lake Buena Vista (Orlando Area), Florida, United States
Abstract
Back-propagation neural networks (BPNs) are traditionally trained using error measures such as sum-of-squares or cross-entropy. If the training sample size is small and the neural network has a large number of hidden-layer nodes, the BPN may be overtrained, i.e., it may fit the training data well but generalize poorly to independent test data. In this study, we investigated a training technique that maximized the approximate area under the ROC curve (AUC) to reduce overtraining. In general, the non-parametric AUC is a discontinuous and non-differentiable function of the neural network output, which makes it unsuitable for gradient descent algorithms such as back-propagation. We used a semidifferentiable approximation to AUC, which appeared to provide reasonable training for the data sets explored in this study. We performed a simulation study using synthetic data sets consisting of Gaussian mixtures to investigate the behavior of this new technique with respect to overtraining. Our results indicated that an artificial neural network trained using the AUC-maximization method is less prone to overtraining. The advantage of the AUC-maximization method was consistently observed across the numbers of BPN hidden-layer nodes, the training sample sizes, and the feature-space dimensionalities evaluated in our simulation study. For a five-hidden-node BPN trained using 50 training samples per class, the average test AUC was 0.896 (standard deviation (SD): 0.026) with AUC-maximization and 0.856 (SD: 0.028) with the sum-of-squares method. The gain in test performance of the AUC-maximization method over traditional BPN training was greater when the training sample size was smaller. We also applied this new method to a data set previously acquired for characterization of masses on breast ultrasound as malignant or benign. Our results with this real-world data set showed the same trend as with our simulation data sets, in that the AUC-maximization technique was less prone to overtraining than the sum-of-squares method.
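The abstract does not specify the form of the semidifferentiable AUC approximation, so the following is only an illustrative sketch of the general idea, not the authors' method. It contrasts the non-parametric (Mann-Whitney) AUC, whose pairwise step function has no useful gradient, with a commonly used smooth surrogate that replaces the step with a sigmoid. The function names and the steepness parameter `beta` are assumptions introduced here for illustration.

```python
import numpy as np


def empirical_auc(pos_scores, neg_scores):
    """Non-parametric AUC: fraction of (positive, negative) pairs ranked
    correctly, with ties counted as one half. Piecewise constant in the scores."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    diff = pos[:, None] - neg[None, :]  # all pairwise score differences
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def soft_auc(pos_scores, neg_scores, beta=10.0):
    """Smooth surrogate (an assumption, not the paper's approximation):
    the step function on each pairwise difference is replaced by a sigmoid."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(1.0 / (1.0 + np.exp(-beta * diff))))


def soft_auc_grad(pos_scores, neg_scores, beta=10.0):
    """Gradient of soft_auc with respect to each positive and negative score.
    These are the quantities back-propagation would push through the network."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    diff = pos[:, None] - neg[None, :]
    s = 1.0 / (1.0 + np.exp(-beta * diff))
    w = beta * s * (1.0 - s) / diff.size  # d(soft_auc) / d(diff_ij)
    return w.sum(axis=1), -w.sum(axis=0)


if __name__ == "__main__":
    # Hypothetical network outputs for malignant (positive) and benign (negative) cases.
    pos = np.array([0.6, 0.3, 0.8])
    neg = np.array([0.4, 0.5, 0.1])
    print("empirical AUC:", empirical_auc(pos, neg))
    print("soft AUC     :", soft_auc(pos, neg))
    print("gradients    :", soft_auc_grad(pos, neg))
```

In a sketch like this, a larger `beta` makes the surrogate track the empirical AUC more closely but also makes the gradients vanish for pairs that are already well separated, which is one reason the choice of approximation matters for training.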
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Berkman Sahiner, Xin He, Weijie Chen, Heang-Ping Chan, Lubomir Hadjiiski, Nicholas Petrick, "Neural network training by maximization of the area under the ROC curve: application to characterization of masses on breast ultrasound as malignant or benign", Proc. SPIE 8670, Medical Imaging 2013: Computer-Aided Diagnosis, 86701M (26 February 2013); https://doi.org/10.1117/12.2007615
PROCEEDINGS
7 PAGES

