Breast mass detection in mammography and digital breast tomosynthesis (DBT) is an essential step in computerized breast cancer analysis. Deep learning-based methods incorporate feature extraction and model learning into a unified framework and have achieved impressive performance in various medical applications (e.g., disease diagnosis, tumor detection, and landmark detection). However, these methods require large-scale accurately annotated data. Unfortunately, it is challenging to get precise annotations of breast masses. To address this issue, we propose a fully convolutional network (FCN) based heatmap regression method for breast mass detection, using only weakly annotated mass regions in mammography images. Specifically, we first generate heat maps of masses based on human-annotated rough regions for breast masses. We then develop an FCN model for end-to-end heatmap regression with an F-score loss function, where the mammography images are regarded as the input and heatmaps for breast masses are used as the output. Finally, the probability map of mass locations can be estimated with the trained model. Experimental results on a mammography dataset with 439 subjects demonstrate the effectiveness of our method. Furthermore, we evaluate whether we can use mammography data to improve detection models for DBT, since mammography shares similar structure with tomosynthesis. We propose a transfer learning strategy by fine-tuning the learned FCN model from mammography images. We test this approach on a small tomosynthesis dataset with only 40 subjects, and we show an improvement in the detection performance as compared to training the model from scratch.