Virtual clinical trial for task-based evaluation of a deep learning synthetic mammography algorithm
7 March 2019
Abstract
Image processing algorithms based on deep learning techniques are being developed for a wide range of medical applications. Processed medical images are typically evaluated with the same kind of image similarity metrics used for natural scenes, disregarding the medical task for which the images are intended. We propose a computational framework to estimate the clinical performance of image processing algorithms using virtual clinical trials. The proposed framework may provide an alternative method for regulatory evaluation of non-linear image processing algorithms. To illustrate this application of virtual clinical trials, we evaluated three algorithms to compute synthetic mammograms from digital breast tomosynthesis (DBT) scans based on convolutional neural networks previously used for denoising low-dose computed tomography scans. The inputs to the networks were one or more noisy DBT projections, and the networks were trained to minimize the difference between the output and the corresponding high-dose mammogram. DBT and mammography images simulated with the Monte Carlo code MC-GPU using realistic breast phantoms were used for network training and validation. The denoising algorithms were tested in a virtual clinical trial by generating 3000 synthetic mammograms from the public VICTRE dataset of simulated DBT scans. The detectability of a calcification cluster and a spiculated mass present in the images was calculated using an ensemble of 30 computational channelized Hotelling observers. The signal detectability results, which took into account anatomic and image reader variability, showed that the visibility of the mass was not affected by the post-processing algorithm, but that the resulting slight blurring of the images severely impacted the visibility of the calcification cluster.
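As a rough illustration of the detectability figure of merit mentioned above, the following Python sketch estimates a Hotelling detectability index from channel outputs. It is not the authors' implementation: the function names, the synthetic Gaussian channel outputs, and the use of a single observer trained and tested on the same data (rather than an ensemble of 30 observers) are all assumptions made for illustration.

```python
import numpy as np


def hotelling_detectability(v_signal, v_background):
    """Estimate the Hotelling detectability index d' from channel outputs.

    Both inputs are hypothetical arrays of shape (n_images, n_channels),
    holding the channel outputs of signal-present and signal-absent images.
    """
    v_signal = np.asarray(v_signal, dtype=float)
    v_background = np.asarray(v_background, dtype=float)

    # Difference between the mean channel outputs of the two classes.
    delta = v_signal.mean(axis=0) - v_background.mean(axis=0)

    # Average of the two class covariance matrices of the channel outputs.
    cov = 0.5 * (np.cov(v_signal, rowvar=False)
                 + np.cov(v_background, rowvar=False))
    cov = np.atleast_2d(cov)

    # Hotelling template w = cov^-1 * delta, then d'^2 = delta^T * w.
    template = np.linalg.solve(cov, delta)
    return float(np.sqrt(delta @ template))


def simulate_channel_outputs(n_images, mean_shift, seed):
    """Toy data: unit-variance Gaussian channel outputs (an assumption)."""
    rng = np.random.default_rng(seed)
    mean_shift = np.asarray(mean_shift, dtype=float)
    background = rng.standard_normal((n_images, mean_shift.size))
    signal = rng.standard_normal((n_images, mean_shift.size)) + mean_shift
    return signal, background


if __name__ == "__main__":
    sig, bkg = simulate_channel_outputs(5000, [1.0, 0.0, 0.0], seed=0)
    print("estimated d':", hotelling_detectability(sig, bkg))
```

In this toy setting, the estimated index should be close to the size of the mean shift in the first channel. With real images, the channel outputs would come from applying a channel set to image regions, and the observer would normally be trained and tested on separate image sets.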
The evaluation of the algorithms using the pixel-based metrics peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) in image patches did not predict the reduced detectability of the calcifications. These two metrics are computed over the whole image and do not consider any particular task, so they might not be adequate for estimating the diagnostic performance of the post-processed images.
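The toy Python example below suggests why a metric averaged over all pixels, such as peak signal-to-noise ratio, can miss the loss of a small signal. The image size, signal size, noise level, and function names are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def make_toy_images(seed=0):
    """Build a hypothetical reference image and two processed versions."""
    rng = np.random.default_rng(seed)

    # Uniform background with one small bright 3x3 spot standing in for
    # a calcification.
    reference = np.full((256, 256), 0.5)
    reference[126:129, 126:129] = 1.0

    # Version A: the spot is completely erased, everything else is perfect.
    erased = np.full((256, 256), 0.5)

    # Version B: the spot is preserved, but mild noise is added everywhere.
    noisy = reference + rng.normal(0.0, 0.02, size=reference.shape)

    return reference, erased, noisy


if __name__ == "__main__":
    ref, erased, noisy = make_toy_images(seed=0)
    print("PSNR, spot erased   :", psnr(ref, erased))
    print("PSNR, spot preserved:", psnr(ref, noisy))
```

In this sketch the image with the erased spot scores a higher PSNR than the mildly noisy image that still contains it, because nine wrong pixels out of 65,536 contribute very little to the mean squared error. A task-based measure of signal detectability would rank the two images the other way.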
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Andreu Badal, Kenny H. Cha, Sarah E. Divel, Christian G. Graff, Rongping Zeng, and Aldo Badano "Virtual clinical trial for task-based evaluation of a deep learning synthetic mammography algorithm", Proc. SPIE 10948, Medical Imaging 2019: Physics of Medical Imaging, 109480O (7 March 2019); https://doi.org/10.1117/12.2513062
PROCEEDINGS
10 PAGES + PRESENTATION
