Heterogeneous multi-modal medical imaging data must be handled properly in classification. Building models from multi-modal imaging data has become common practice and benefits the diagnosis of brain disorders, where it holds considerable clinical potential. Although the majority of classification studies still use features from a single modality, there is substantial evidence that classification based on multi-modal features is on an upward trend. Hence, effective integration of heterogeneous data is in urgent demand. Here, we propose a multi-kernel SVM for schizophrenia classification, evaluated with nested 10-fold cross-validation, which integrates multi-modal data using the subspace similarity of the decomposed components in each MRI modality. To validate the proposed method, we performed experiments on two independent datasets with three modalities to classify schizophrenia patients and healthy controls. Specifically, a multi-modal fusion method was first applied to preprocessed fMRI, DTI and sMRI data to generate components that could be used for classification. Multi-kernel SVM models were then trained on the selected component features using subspace similarity measures and tested on independent validation data across sites. When all three modalities were combined, our method achieved accuracies of 87.6% and 79.9% on the two datasets, respectively, outperforming alternative methods. The method might also provide potential biomarkers for cross-site classification, as well as co-varying components among different modalities.
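To make the general idea of a multi-kernel SVM under nested cross-validation concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a non-negative weighted sum of per-modality kernels, with the weights chosen by an inner cross-validation loop and the accuracy estimated on outer folds. The subspace similarity measure is not specified here, so a plain linear kernel stands in for it. All function names, the weight grid, the fixed `C=1.0`, and the synthetic data are illustrative assumptions.

```python
import itertools

import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC


def linear_kernel(a, b):
    """Stand-in per-modality kernel (assumption: NOT the subspace similarity)."""
    return a @ b.T


def combined_kernel(mods_a, mods_b, weights):
    """Non-negative weighted sum of per-modality kernels (stays PSD)."""
    return sum(w * linear_kernel(a, b) for w, a, b in zip(weights, mods_a, mods_b))


def weight_grid(n_modalities, steps=4):
    """All weight vectors on a coarse grid that are non-negative and sum to 1."""
    return [tuple(t / steps for t in combo)
            for combo in itertools.product(range(steps + 1), repeat=n_modalities)
            if sum(combo) == steps]


def fit_and_score(mods, y, train, test, weights):
    """Train a precomputed-kernel SVM on `train` rows and score it on `test` rows."""
    tr_mods = [m[train] for m in mods]
    te_mods = [m[test] for m in mods]
    k_train = combined_kernel(tr_mods, tr_mods, weights)  # (n_train, n_train)
    k_test = combined_kernel(te_mods, tr_mods, weights)   # (n_test, n_train)
    clf = SVC(kernel="precomputed", C=1.0).fit(k_train, y[train])
    return clf.score(k_test, y[test])


def cv_accuracy(mods, y, weights, n_splits, seed):
    """Mean accuracy over stratified folds for one fixed weight vector."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return float(np.mean([fit_and_score(mods, y, tr, te, weights)
                          for tr, te in skf.split(mods[0], y)]))


def nested_cv(mods, y, n_outer=10, n_inner=10, seed=0):
    """Outer folds estimate accuracy; inner folds pick the kernel weights."""
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed)
    grid = weight_grid(len(mods))
    scores = []
    for tr, te in outer.split(mods[0], y):
        inner_mods = [m[tr] for m in mods]
        # Weights are selected using training rows only, so test rows never leak.
        best = max(grid, key=lambda w: cv_accuracy(inner_mods, y[tr], w, n_inner, seed))
        scores.append(fit_and_score(mods, y, tr, te, best))
    return float(np.mean(scores))


def make_toy_data(n_per_class=50, n_features=5, seed=0):
    """Synthetic three-'modality' data with decreasing class separation."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_per_class)
    mods = [rng.normal(size=(2 * n_per_class, n_features)) + shift * y[:, None]
            for shift in (1.0, 0.5, 0.0)]
    return mods, y
```

Calling `nested_cv(*make_toy_data())` returns the mean outer-fold accuracy on the synthetic data. Selecting the weights inside the inner loop keeps the outer test folds untouched during model selection, which is the purpose of nesting.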