Improved prediction of the “most harmful” breast cancers that cause the most substantive morbidity and mortality would enable physicians to target more intense screening and preventive measures at those women who have the highest risk; however, such prediction models for the “most harmful” breast cancers have rarely been developed. Electronic health records (EHRs) represent an underused data source that has great research and clinical potential. Our goal was to quantify the value of EHR variables in predicting the risk of the “most harmful” breast cancers. We identified 794 subjects who had breast cancer with primary non-benign tumors with their earliest diagnosis on or after 1/1/2004 from an existing personalized medicine data repository, including 395 “most harmful” breast cancer cases and 399 “least harmful” breast cancer cases. For these subjects, we collected EHR data comprising six components: demographics, diagnoses, symptoms, procedures, medications, and laboratory results. We developed two regularized prediction models, Ridge Logistic Regression (Ridge-LR) and Lasso Logistic Regression (Lasso-LR), to predict the “most harmful” breast cancer one year in advance. The area under the ROC curve (AUC) was used to assess model performance. We observed that the AUCs of the Ridge-LR and Lasso-LR models were 0.818 and 0.839, respectively. For both the Ridge-LR and Lasso-LR models, the predictive performance of the full set of EHR variables was significantly higher than that of each individual component (p<0.001). In conclusion, EHR variables can be used to predict the “most harmful” breast cancer, providing the possibility to personalize care for those women at the highest risk in clinical practice.
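Because AUC is the performance measure throughout, a minimal sketch of how it can be computed from predicted risks may be useful. It uses the rank-sum identity (the AUC equals the probability that a randomly chosen case is scored above a randomly chosen control, with ties counted as one half). The function name `auc` and the toy scores in the usage note are illustrative assumptions, not the study's code or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: predicted risk, where higher means more likely to be a case.
    labels: 1 for a case, 0 for a control.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs that are ranked correctly; a tie counts one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For example, with made-up scores `[0.1, 0.4, 0.35, 0.8]` and labels `[0, 0, 1, 1]`, three of the four case/control pairs are ordered correctly, so `auc` returns 0.75. The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few hundred subjects.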
Technological advances in genome-wide association studies (GWAS) have engendered optimism that we have entered a new age of precision medicine, in which the risk of breast cancer can be predicted on the basis of a person’s genetic variants. The goal of this study is to evaluate the discriminatory power of common genetic variants in breast cancer risk estimation. We conducted a retrospective case-control study drawing from an existing personalized medicine data repository. We collected variables that predict breast cancer risk: 153 high-frequency/low-penetrance genetic variants, reflecting the state-of-the-art GWAS on breast cancer, as well as mammography descriptors and BI-RADS assessment categories in the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We trained and tested naïve Bayes models by using these predictive variables. We generated ROC curves and used the area under the ROC curve (AUC) to quantify predictive performance. We found that genetic variants achieved predictive performance comparable to that of BI-RADS assessment categories in terms of AUC (0.650 vs. 0.659, p-value = 0.742), but significantly lower predictive performance than the combination of BI-RADS assessment categories and mammography descriptors (0.650 vs. 0.751, p-value < 0.001). A better understanding of the relative predictive capability of genetic variants and mammography data may help clinicians and patients make appropriate decisions about breast cancer screening, prevention, and treatment in the era of precision medicine.
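To make the naïve Bayes step concrete, the following is a minimal sketch of a categorical naïve Bayes classifier with Laplace smoothing. It is an assumption-laden illustration rather than the study's implementation: the function names, the smoothing constant `alpha`, and the encoding of each subject as a tuple of categorical values (for instance a risk-allele count and a BI-RADS category) are all hypothetical.

```python
import math
from collections import defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing (alpha)."""
    n_features = len(rows[0])
    classes = sorted(set(labels))
    class_counts = {c: labels.count(c) for c in classes}
    # values[j] is the set of categories observed for feature j.
    values = [set(r[j] for r in rows) for j in range(n_features)]
    # counts[c][j][v] is the number of class-c rows whose feature j equals v.
    counts = {c: [defaultdict(int) for _ in range(n_features)] for c in classes}
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            counts[y][j][v] += 1
    return {"classes": classes, "class_counts": class_counts,
            "values": values, "counts": counts, "alpha": alpha, "n": len(rows)}


def predict_proba(model, row, positive=1):
    """Posterior probability of the positive class for one row."""
    log_post = {}
    for c in model["classes"]:
        lp = math.log(model["class_counts"][c] / model["n"])  # class prior
        for j, v in enumerate(row):
            num = model["counts"][c][j].get(v, 0) + model["alpha"]
            den = model["class_counts"][c] + model["alpha"] * len(model["values"][j])
            lp += math.log(num / den)  # features treated as independent given class
        log_post[c] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return math.exp(log_post[positive] - m) / z
```

The posterior probabilities returned by `predict_proba` are the scores from which an ROC curve and its AUC would be computed. Working in log space keeps the product of 153 or more small likelihood terms from underflowing.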
Breast cancer risk prediction algorithms are used to identify subpopulations that are at increased risk for developing breast cancer. They can be based on many different sources of data such as demographics, relatives with cancer, gene expression, and various phenotypic features such as breast density. Women who are identified as high risk may undergo a more extensive (and expensive) screening process that includes MRI or ultrasound imaging in addition to the standard full-field digital mammography (FFDM) exam. Given that there are many ways that risk prediction may be accomplished, it is of interest to evaluate them in terms of expected cost, which includes the costs of diagnostic outcomes. In this work, we perform an expected-cost analysis of risk prediction algorithms that is based on a published model that includes the costs associated with diagnostic outcomes (true-positive, false-positive, etc.). We assume the existence of a standard screening method and an enhanced screening method with higher scan cost, higher sensitivity, and lower specificity. We then assess the expected cost of using a risk prediction algorithm to determine who gets the enhanced screening method, under the strong assumption that risk and diagnostic performance are independent. We find that if risk prediction leads to a high enough positive predictive value, it will be cost-effective regardless of the size of the subpopulation. Furthermore, in terms of the hit rate and false-alarm rate of the risk prediction algorithm, iso-cost contours are lines whose slope is determined by properties of the available diagnostic systems for screening.
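The linearity of the iso-cost contours can be seen in a small sketch. Under the stated independence assumption, expected cost per woman is a linear function of the risk algorithm's hit rate and false-alarm rate, so contours of equal cost are straight lines. The cost structure below (one scan cost per method plus one cost per diagnostic outcome) is a simplified stand-in, not the published model the abstract refers to, and all class names, function names, and numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Screen:
    scan_cost: float
    sensitivity: float
    specificity: float


@dataclass
class OutcomeCosts:
    tp: float
    fp: float
    fn: float
    tn: float


def cost_given_cancer(s, c):
    """Expected cost of screening one woman who has cancer."""
    return s.scan_cost + s.sensitivity * c.tp + (1 - s.sensitivity) * c.fn


def cost_given_healthy(s, c):
    """Expected cost of screening one woman who does not have cancer."""
    return s.scan_cost + s.specificity * c.tn + (1 - s.specificity) * c.fp


def expected_cost(prevalence, hit_rate, false_alarm_rate, standard, enhanced, c):
    """Expected cost per woman when flagged women get the enhanced screen.

    hit_rate: fraction of women with cancer flagged as high risk.
    false_alarm_rate: fraction of women without cancer flagged as high risk.
    """
    cancer = (hit_rate * cost_given_cancer(enhanced, c)
              + (1 - hit_rate) * cost_given_cancer(standard, c))
    healthy = (false_alarm_rate * cost_given_healthy(enhanced, c)
               + (1 - false_alarm_rate) * cost_given_healthy(standard, c))
    return prevalence * cancer + (1 - prevalence) * healthy


def iso_cost_slope(prevalence, standard, enhanced, c):
    """Slope d(hit rate)/d(false-alarm rate) along a line of constant cost."""
    saving_per_cancer = cost_given_cancer(standard, c) - cost_given_cancer(enhanced, c)
    extra_per_healthy = cost_given_healthy(enhanced, c) - cost_given_healthy(standard, c)
    return (1 - prevalence) * extra_per_healthy / (prevalence * saving_per_cancer)
```

With made-up inputs such as `Screen(100, 0.8, 0.9)` for the standard method, `Screen(600, 0.95, 0.85)` for the enhanced method, outcome costs `OutcomeCosts(tp=10000, fp=1000, fn=100000, tn=0)`, and a prevalence of 0.005, the slope is about 8.4: each added percentage point of false alarms must be offset by roughly 8.4 points of hit rate to hold cost constant. The slope depends only on prevalence and on the two screening methods, which is consistent with the abstract's claim; the risk algorithm lowers cost only when its operating point lies above the iso-cost line through the origin.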
Combining imaging and genetic information to predict disease presence and progression is being codified into an emerging discipline called “radiogenomics.” Optimal evaluation methodologies for radiogenomics have not been well established. We aim to develop a decision framework based on utility analysis to assess predictive models for breast cancer diagnosis. We gathered Gail risk factors, single nucleotide polymorphisms (SNPs), and mammographic features from a retrospective case-control study. We constructed three logistic regression models built on different sets of predictive features: (1) Gail, (2) Gail + Mammo, and (3) Gail + Mammo + SNP. We then generated receiver operating characteristic (ROC) curves for the three models. After assigning utility values to each category of outcomes (true negatives, false positives, false negatives, and true positives), we identified the optimal operating points on the ROC curves that achieve the maximum expected utility of breast cancer diagnosis. We performed McNemar’s test based on threshold levels at the optimal operating points, and found that SNPs and mammographic features played a significant role in breast cancer risk estimation. Our study, comprising utility analysis and McNemar’s test, provides a decision framework to evaluate predictive models in breast cancer risk estimation.
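McNemar’s test compares two models on the same subjects by looking only at the discordant pairs, that is, subjects one model classifies correctly and the other does not. A minimal sketch of the chi-square form follows; the function name and the optional continuity correction are assumptions for illustration, and the abstract does not say which variant of the test was used.

```python
import math


def mcnemar(b, c, continuity=True):
    """McNemar's chi-square test for paired binary outcomes.

    b: subjects model A classified correctly and model B incorrectly.
    c: subjects model B classified correctly and model A incorrectly.
    Returns (statistic, p_value) with one degree of freedom.
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    stat = diff * diff / (b + c)
    # Survival function of chi-square with 1 dof: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

The counts `b` and `c` would come from classifying every subject with each model at its own optimal operating point. For small discordant counts an exact binomial version is usually preferred over the chi-square approximation.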
Combining imaging and genetic information to predict disease presence and behavior is being codified into an emerging discipline called “radiogenomics.” Optimal evaluation methodologies for radiogenomics techniques have not been established. We aim to develop a clinical decision framework based on utility analysis to assess prediction models for breast cancer. Our data come from a retrospective case-control study that collected Gail model risk factors, genetic variants (single nucleotide polymorphisms, SNPs), and mammographic features in the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We first constructed three logistic regression models built on different sets of predictive features: (1) Gail, (2) Gail+SNP, and (3) Gail+SNP+BI-RADS. Then, we generated ROC curves for the three models. After assigning utility values to each category of findings (true negative, false positive, false negative, and true positive), we identified the optimal operating points on the ROC curves that achieve the maximum expected utility (MEU) of breast cancer diagnosis. We used McNemar’s test to compare the predictive performance of the three models. We found that SNPs and BI-RADS features augmented the baseline Gail model in terms of the area under the ROC curve (AUC) and MEU. Adding SNPs to the Gail model improved sensitivity (0.276 vs. 0.147) but reduced specificity (0.855 vs. 0.912). When mammographic features were also added, sensitivity increased to 0.457 and specificity to 0.872. SNPs and mammographic features played a significant role in breast cancer risk estimation (p-value < 0.001). Our decision framework, comprising utility analysis and McNemar’s test, offers a novel way to evaluate prediction models in the realm of radiogenomics.
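The search for the operating point with maximum expected utility can be sketched as follows. Expected utility at an ROC point weights the four outcome utilities by prevalence, sensitivity (true-positive rate), and false-positive rate. The function names, the dictionary keys, and any utility or prevalence values supplied by a caller are assumptions here; the abstract does not report the utilities that were assigned.

```python
def expected_utility(fpr, tpr, prevalence, u):
    """Expected utility at one ROC operating point.

    u: dict with keys "tp", "fp", "fn", "tn" holding the outcome utilities.
    """
    with_disease = tpr * u["tp"] + (1 - tpr) * u["fn"]
    without_disease = fpr * u["fp"] + (1 - fpr) * u["tn"]
    return prevalence * with_disease + (1 - prevalence) * without_disease


def best_operating_point(roc_points, prevalence, u):
    """Return the (fpr, tpr) pair on the curve with maximum expected utility."""
    return max(roc_points,
               key=lambda pt: expected_utility(pt[0], pt[1], prevalence, u))
```

Because expected utility is linear in the false-positive and true-positive rates, the optimum sits where a line with slope ((1 - prevalence) / prevalence) × (u_tn - u_fp) / (u_tp - u_fn) touches the ROC curve. Lower prevalence or costlier false positives therefore push the optimum toward the conservative, low false-positive end of the curve.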
A 2% threshold has traditionally been used to recommend breast biopsy in mammography. We aim to characterize how the biopsy threshold varies to achieve the maximum expected utility (MEU) of tomosynthesis for breast cancer diagnosis. A cohort of 312 patients, imaged with standard full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT), was selected for a reader study. Fifteen readers interpreted each patient’s images and estimated the probability of malignancy using two modes: FFDM versus FFDM+DBT. We generated receiver operating characteristic (ROC) curves with the probabilities for all readers combined. We found that FFDM+DBT provided improved accuracy and MEU compared with FFDM alone. When DBT was included in the diagnosis along with FFDM, the optimal biopsy threshold increased to 2.7%, compared with the 2% threshold for FFDM alone. While understanding the optimal threshold from a decision-analytic standpoint will not help physicians improve their performance without additional guidance (e.g., decision support to reinforce this threshold), the discovery of this level does demonstrate the potential clinical improvements attainable with DBT. Specifically, DBT has the potential to lead to substantial improvements in breast cancer diagnosis, since it could reduce the number of patients recommended for biopsy while preserving the maximum expected utility.
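The link between a biopsy threshold and the underlying utilities can be illustrated with standard expected-utility reasoning. If probability estimates are well calibrated, biopsy is preferred once the probability of malignancy exceeds harm / (harm + benefit), where harm is the utility lost by biopsying a benign lesion and benefit is the utility gained by biopsying a malignant one. This is the textbook relationship only; the study derived its thresholds from reader ROC curves, and the function names below are hypothetical.

```python
def threshold_from_utilities(u_tp, u_fp, u_fn, u_tn):
    """Probability of malignancy above which biopsy has higher expected utility.

    Biopsy is preferred when p*u_tp + (1-p)*u_fp > p*u_fn + (1-p)*u_tn.
    """
    harm_of_unneeded_biopsy = u_tn - u_fp
    benefit_of_needed_biopsy = u_tp - u_fn
    return harm_of_unneeded_biopsy / (harm_of_unneeded_biopsy + benefit_of_needed_biopsy)


def implied_benefit_to_harm_ratio(threshold):
    """Benefit-to-harm ratio that makes `threshold` the break-even probability."""
    return (1 - threshold) / threshold
```

Under this reading, a 2% threshold implies that detecting a cancer is valued at 49 times the harm of an unnecessary biopsy (0.98 / 0.02), while a 2.7% threshold corresponds to a ratio of about 36.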
In this paper, we describe a novel technique for vision-based UAV (unmanned aerial vehicle) navigation. In this technique, the navigation (position estimation) problem is formulated as a tracking problem and solved by a particle filter. The state and observation models of the particle filter are established based on a stereo analysis of the image sequence generated by the UAV's video camera in connection with a DEM (digital elevation map) of the flight area, which helps to control the accumulation of estimation error. The efficacy of this technique is demonstrated by simulation experiments.
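The general idea of a particle filter that localizes against an elevation map can be sketched in one dimension. This is a heavily simplified stand-in, not the technique described above: the stereo image analysis is replaced by a single noisy terrain-height measurement per step, the DEM is replaced by the hypothetical function `terrain_height`, and the motion model, noise levels, and particle count are all made-up values.

```python
import math
import random


def terrain_height(x):
    """Hypothetical 1-D elevation profile standing in for a DEM lookup."""
    return 0.5 * x + 5.0 * math.sin(x / 20.0)


def particle_filter_step(particles, speed, measured_height, rng,
                         process_sigma=0.5, obs_sigma=1.0):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: move every particle with the known speed plus process noise.
    moved = [x + speed + rng.gauss(0.0, process_sigma) for x in particles]
    # Weight: Gaussian likelihood of the measured height at each particle.
    weights = [math.exp(-0.5 * ((measured_height - terrain_height(x)) / obs_sigma) ** 2)
               for x in moved]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # every likelihood underflowed; keep all
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def run_demo(steps=30, n_particles=500, seed=0):
    """Simulate a flight and return (true position, estimated position)."""
    rng = random.Random(seed)
    true_x, speed = 50.0, 2.0
    # The initial position is unknown, so particles start spread over the map.
    particles = [rng.uniform(0.0, 200.0) for _ in range(n_particles)]
    for _ in range(steps):
        true_x += speed + rng.gauss(0.0, 0.5)
        z = terrain_height(true_x) + rng.gauss(0.0, 1.0)
        particles = particle_filter_step(particles, speed, z, rng)
    estimate = sum(particles) / len(particles)
    return true_x, estimate
```

Calling `run_demo()` returns the simulated true position and the filter's estimate after 30 steps. Because every step compares the measurement against the map, position error stays bounded instead of accumulating as it would with dead reckoning alone, which is the role the abstract attributes to the DEM.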
Radiation therapy (RT) is an important procedure in the treatment of cancer in the thorax and abdomen. However, its efficacy can be severely limited by breathing-induced tumor motion. Tumor motion causes uncertainty in the tumor's location and consequently limits the radiation dosage (for fear of damaging normal tissue). This paper describes a novel signal model for tumor motion tracking and prediction that can potentially improve RT results. Using CT and breathing sensor data, it provides a more accurate characterization of breathing and tumor motion than previous work, and it is non-invasive. The efficacy of our model is demonstrated on patient data.