Patch-based 3D fully convolutional neural networks (FCNs) were optimized to predict pseudo-CT images from 3D gradient-echo pelvic MRI data, and the predictions were compared with ground-truth CT data from 26 patients. CT data were non-rigidly registered to MRI for training and evaluation. We compared the voxelwise L1 and L2 loss functions with contextual multi-scale L1 and L2 (MSL1 and MSL2) losses and with an SSIM loss. Performance was evaluated using MAE, PSNR, SSIM, and the overlap of segmented cortical bone in the reconstructions, measured by the Dice similarity metric. Evaluation was carried out in cross-validation.
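To make the compared quantities concrete, the following is a minimal NumPy sketch of a multi-scale L1/L2 loss and of the MAE, PSNR, and Dice metrics. It is an illustration under stated assumptions, not the implementation used in this work: the abstract does not specify how the scales are formed, so the sketch assumes non-overlapping average pooling at factors 1, 2, and 4 with equal weighting. All function names and the `scales` and `data_range` parameters are hypothetical.

```python
import numpy as np

def avg_pool3d(vol, k):
    """Non-overlapping k x k x k average pooling (edges not divisible by k are cropped)."""
    d, h, w = (s // k for s in vol.shape)
    v = vol[:d * k, :h * k, :w * k]
    return v.reshape(d, k, h, k, w, k).mean(axis=(1, 3, 5))

def multiscale_loss(pred, target, scales=(1, 2, 4), p=1):
    """Mean |difference|^p, averaged over several pooled resolutions.

    ASSUMPTION: pooling factors and equal weights are illustrative only.
    p=1 gives an MSL1-style loss, p=2 an MSL2-style loss.
    """
    losses = []
    for k in scales:
        diff = np.abs(avg_pool3d(pred, k) - avg_pool3d(target, k))
        losses.append(np.mean(diff ** p))
    return float(np.mean(losses))

def mae(pred, target):
    """Mean absolute error, in the units of the input (HU for CT)."""
    return float(np.mean(np.abs(pred - target)))

def psnr(pred, target, data_range):
    """Peak signal-to-noise ratio in dB; undefined when pred equals target."""
    mse = np.mean((pred - target) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def dice(mask_a, mask_b):
    """Dice overlap of two boolean masks (e.g. segmented cortical bone)."""
    inter = np.logical_and(mask_a, mask_b).sum()
    return float(2.0 * inter / (mask_a.sum() + mask_b.sum()))
```

With `scales=(1,)` the multi-scale loss reduces to the ordinary voxelwise L1 or L2 loss, which is one way to see that the contextual variants change only the objective and not the network.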
All optimizations converged, with PSNR between 25 and 30 dB, except for one fold of the SSIM optimization. MSL1 and MSL2 perform at least on par with their single-scale counterparts. MSL1 overcomes some of the instabilities of the L1-optimized prediction models. MSL2 optimization is stable and, on average, outperforms all other losses, although quantitative evaluations based on MAE, PSNR, and SSIM show only minor differences. Direct optimization using SSIM visually excelled in terms of subjective perceptual image quality, at the expense of a drop in voxelwise quantitative performance.
Contextual loss functions can improve the prediction performance of FCNs without changing the network architecture. The suggested subjective superiority of contextual losses in reconstructing local structures merits further investigation.