While the wealth of information from multiple imaging and non-imaging data streams presents an opportunity to discover fused multimodal, multiscale biomarkers, these streams also introduce multiple independent sources of noise that hinder their collective utility. The goal of this work is to create fused predictors of disease diagnosis and prognosis by combining multiple data streams, which we hypothesize will outperform predictors built from individual data streams. To achieve this goal, we introduce supervised multiview canonical correlation analysis (sMVCCA), a novel data fusion method that seeks a common representation for multiscale, multimodal data in which class separation is maximized while noise is minimized. In doing so, sMVCCA assumes that the different sources of information are complementary and therefore act synergistically when combined. Although this method can be applied to any number of modalities and to any disease domain, we demonstrate its utility using three datasets. We fuse (i) 1.5 Tesla (T) magnetic resonance imaging (MRI) features with cerebrospinal fluid (CSF) proteomic measurements for early diagnosis of Alzheimer’s disease (n = 30), (ii) 3T dynamic contrast-enhanced (DCE) MRI and T2-weighted (T2w) MRI for in vivo prediction of prostate cancer grade on a per-slice basis (n = 33), and (iii) quantitative histomorphometric features of glands and proteomic measurements from mass spectrometry for prediction of 5-year biochemical recurrence after radical prostatectomy (n = 40). A Random Forest classifier applied to the sMVCCA fused subspace yielded the highest classification AUC when compared with the same classifier applied to the subspaces from multiview canonical correlation analysis (MVCCA), principal component analysis (PCA), and linear discriminant analysis (LDA): 0.82 ± 0.05, 0.76 ± 0.01, and 0.70 ± 0.07 for the three datasets, respectively. In addition, the sMVCCA fused subspace provided increases in AUC of 13.6%, 7.6%, and 15.3% over the best-performing individual view in each of the three datasets, respectively.
For the biochemical recurrence dataset, Kaplan-Meier curves generated from classifier predictions in the fused subspace reached the significance threshold (p = 0.05) for distinguishing between patients with and without 5-year biochemical recurrence, unlike those generated from classifier predictions on the individual modalities.
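The abstract does not spell out the sMVCCA formulation, so the following is only a minimal sketch of one way a supervised multiview CCA could be set up. It assumes that supervision enters by treating one-hot class labels as an additional view, and that the embedding comes from a ridge-regularized generalized eigenproblem over between-view and within-view covariance blocks. This is an illustration under those assumptions, not necessarily the authors' exact method. The function names `fit_supervised_mvcca` and `transform`, the `reg` parameter, and the synthetic two-view data are all hypothetical.

```python
import numpy as np
from scipy.linalg import block_diag, eigh


def fit_supervised_mvcca(views, labels, n_components=2, reg=1e-3):
    """Fit one projection per data view.

    Assumption: class labels are one-hot encoded and treated as an extra view.
    """
    classes = np.unique(labels)
    one_hot = (labels[:, None] == classes[None, :]).astype(float)

    # Center every view; keep the means so new samples can be centered too.
    means = [v.mean(axis=0) for v in views]
    blocks = [v - m for v, m in zip(views, means)]
    blocks.append(one_hot - one_hot.mean(axis=0))

    n_samples = len(labels)
    stacked = np.hstack(blocks)
    full_cov = stacked.T @ stacked / (n_samples - 1)
    # Block-diagonal matrix of within-view covariances.
    within = block_diag(*[b.T @ b / (n_samples - 1) for b in blocks])
    # Off-diagonal blocks only: covariances between different views.
    between = full_cov - within
    # Ridge term keeps the right-hand matrix positive definite.
    within_reg = within + reg * np.eye(within.shape[0])

    # Generalized symmetric eigenproblem; eigenvalues come back ascending.
    _, vecs = eigh(between, within_reg)
    top = vecs[:, ::-1][:, :n_components]

    # Split the stacked eigenvectors back into one block per view.
    split_points = np.cumsum([b.shape[1] for b in blocks])[:-1]
    per_view = np.split(top, split_points, axis=0)
    # Drop the label block: labels are not available at prediction time.
    return means, per_view[:-1]


def transform(views, means, projections):
    """Project each centered view and sum the results into one fused embedding."""
    return sum((v - m) @ w for v, m, w in zip(views, means, projections))


# Synthetic two-view data for illustration only (not from the study).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 30)
view_a = rng.normal(size=(60, 5)) + 3.0 * labels[:, None]
view_b = rng.normal(size=(60, 4)) - 3.0 * labels[:, None]

means, projections = fit_supervised_mvcca([view_a, view_b], labels)
embedding = transform([view_a, view_b], means, projections)
```

In a workflow like the one described above, a classifier such as a Random Forest would then be trained on `embedding`, with the projections fitted on training folds only so that held-out labels do not leak into the fused subspace.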