In recent years, convolutional neural networks (CNNs) have been applied increasingly in computer-aided diagnosis (CAD) research. General-purpose, high-performance detectors are typically designed with machine learning and trained on comprehensive sets of case images covering a wide range of variations. In this study, we show that, when constructing CNN training data, dividing the data into multiple subsets and adjusting their ratios, rather than supplying the data uniformly, can make learning more effective. We propose a learning method in which CNN training on these subsets is repeated incrementally. Subsets of breast cancer mass training data were created according to mass size and intensity. Using multiple data sets prepared for evaluating the trained CNN, we examined the optimal subset ratios and, on that basis, evaluated performance on actual unknown data. Next, the ratios of the subsets that produced numerous detection errors in evaluation were raised, and the CNN was retrained. This process was repeated for as long as the area under the curve (AUC) continued to increase, yielding a high-performance CNN. When applied to unknown data, this CNN exhibited a higher AUC than a CNN trained on data that was simply supplied comprehensively, demonstrating the effectiveness of the proposed learning method.
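
The incremental procedure described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the implementation used in the study: the proportional update rule, the `boost` parameter, and the `train_and_evaluate` callback (which stands in for CNN training and evaluation and returns an AUC plus per-subset detection error counts) are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# Maps a subset name (e.g. a size/intensity bin; names are hypothetical)
# to its share of the training data. Shares sum to 1.
Ratios = Dict[str, float]


def raise_error_prone_ratios(
    ratios: Ratios, errors: Dict[str, int], boost: float = 0.5
) -> Ratios:
    """Raise each subset's share in proportion to its detection errors.

    The proportional rule and `boost` are assumptions made for this sketch.
    The result is renormalized so the shares still sum to 1.
    """
    total_errors = sum(errors.values())
    if total_errors == 0:
        return dict(ratios)  # nothing to correct
    raised = {
        name: share * (1.0 + boost * errors.get(name, 0) / total_errors)
        for name, share in ratios.items()
    }
    norm = sum(raised.values())
    return {name: share / norm for name, share in raised.items()}


def incremental_relearning(
    train_and_evaluate: Callable[[Ratios], Tuple[float, Dict[str, int]]],
    initial_ratios: Ratios,
    max_rounds: int = 10,
) -> Tuple[Ratios, float]:
    """Repeat relearning for as long as the AUC keeps increasing.

    `train_and_evaluate` is a hypothetical callback: it trains a model on
    data mixed according to the given ratios and returns
    (AUC, detection error count per subset).
    """
    best_ratios = dict(initial_ratios)
    best_auc, errors = train_and_evaluate(best_ratios)
    for _ in range(max_rounds):
        candidate = raise_error_prone_ratios(best_ratios, errors)
        auc, candidate_errors = train_and_evaluate(candidate)
        if auc <= best_auc:
            break  # no further AUC gain: keep the previous ratios
        best_ratios, best_auc, errors = candidate, auc, candidate_errors
    return best_ratios, best_auc
```

In practice, `train_and_evaluate` would wrap the actual CNN training run and the AUC computation on the evaluation data sets; the loop itself only decides how the subset ratios change between rounds and when to stop.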