Deep learning models are traditionally trained in a purely data-driven fashion: the information used to train a model typically comes from a single source, the training data itself. In this work, we investigate how to supply additional clinical knowledge associated with the training data. Our goal is to train deep learning models for breast cancer diagnosis from mammogram images. Along with the main classification task of distinguishing clinically proven cancer from negative/benign cases, we design two auxiliary tasks, each capturing a form of additional knowledge to facilitate the main task. Specifically, one auxiliary task classifies images according to the radiologist-assigned BI-RADS assessment scores, and the other classifies images by BI-RADS breast density category. We customize a multi-task learning model to jointly perform the three tasks (the main task and the two auxiliary tasks). We test four deep learning architectures, CBR-Tiny, ResNet18, GoogLeNet, and DenseNet, and we investigate the benefit of incorporating such knowledge both for ImageNet pre-trained models and for randomly initialized models. We run experiments on an internal dataset of screening full-field digital mammography images comprising 1,380 images in total (341 cancer and 1,039 negative or benign). Our results show that adding the clinical knowledge conveyed through the two auxiliary tasks to the training process improves performance on the target task of breast cancer diagnosis, highlighting the benefit of incorporating clinical knowledge into data-driven learning to enhance deep learning model training.
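As a minimal sketch of how such joint training is commonly arranged, the three per-task losses (main diagnosis task plus the two auxiliary BI-RADS tasks) can be combined into a single training objective as a weighted sum. The weighting scheme and the specific weight values below are illustrative assumptions, not the configuration used in this work:

```python
def multi_task_loss(loss_main, loss_birads, loss_density,
                    w_main=1.0, w_birads=0.5, w_density=0.5):
    """Combine the main cancer-diagnosis loss with the two auxiliary
    losses (BI-RADS assessment score and breast density category).

    NOTE: the default weights here are hypothetical placeholders;
    in practice they would be tuned on a validation set.
    """
    return (w_main * loss_main
            + w_birads * loss_birads
            + w_density * loss_density)


# Example: per-task losses from one training batch (illustrative values).
total = multi_task_loss(loss_main=0.8, loss_birads=1.2, loss_density=0.6)
print(total)  # 0.8 + 0.5*1.2 + 0.5*0.6 = 1.7
```

In a typical multi-task setup, all three classification heads share a common backbone (e.g. one of the four tested architectures), and this combined loss is backpropagated through the shared weights so the auxiliary signals shape the features used by the main task.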