Prostate cancer remains the most common noncutaneous malignant tumor in the Western world accounting for approximately one in five of newly diagnosed tumors in men and resulting in an estimated 29,430 deaths in 2018.1 In the United States, approximately one in seven men will be diagnosed with this disease.1 Based on Gleason score, prostate specific antigen (PSA) value, tumor stage, age, and race, patients with prostate cancer are stratified into low-, intermediate-, and high-risk groups.2,3
A strong predictor of survival among men with prostate cancer is the Gleason score rendered by a pathologist based upon a microscopic evaluation of a representative histopathology specimen.4 These scores are based solely upon morphology and structural patterns of the constituent cells and glands. Patients with Gleason score 6 or lower often undergo active surveillance as there is reduced risk of tumor progression for those patients compared to patients with score 7 or higher.5,6 Tumors that are assigned Gleason score 7 can be delineated into a primary region exhibiting a histopathology pattern graded as 4 and a secondary region exhibiting a histopathology pattern graded as 3. Such samples are referred to as Gleason tumors, whereas the inverse pattern exhibiting a primary pattern of 3 and a secondary pattern of 4 would constitute a Gleason tumors. Patients with Gleason tumors have an increased risk of recurrence and progression leading to an increased risk of prostate cancer-specific mortality when compared to those afflicted with Gleason tumors.7–9 The literature clearly shows that predicting disease recurrence in a man with Gleason score 7 prostate cancer can have a significant impact on his disease management and survival.8–10
Phenotypically, tumor regions with Gleason pattern 3 are composed of single glands with distinct size and shape whereas ones with Gleason pattern 4 exhibit large irregular cribriform glands or fused, ill-defined glands with poorly formed glandular lumina.11–13 In spite of established guidelines, Gleason grading remains a relatively subjective process that results in an grading discrepancy among the scores provided by pathologists.11–15 There have been many attempts to develop computer-aided Gleason grading methods and systems11,16–21 in order to introduce objective, reproducible criteria into the process of Gleason pattern quantification, and grading. One previous study has explored an integration of image features along with protein expression to predict recurrent prostate cancer.22 However, to date, there has been no study focused on utilizing patients’ pathology images and genomic pathway analyses in combination to predict recurrence-free survival (RFS) for men with prostate cancer.
Microarray-based gene expression signatures have been used in various studies to identify cancer subtypes, determine the RFS of disease, and characterize response to specific therapies.23 Multiple investigations have also shown that gene expression signatures can be used to analyze oncogenic pathways and these signatures have been used to identify differences between specific cancer types and tumor subtypes. Moreover, patterns of oncogenic pathway activity have been used to identify differences in underlying molecular mechanisms and have been shown to correlate with clinical outcomes of patients afflicted with specific cancers.24–27
In recent years, whole-slide image (WSI) has been more widely used in histopathology diagnosis. With a fast development of deep learning, histopathology image analysis approaches have demonstrated significant advances in cellular segmentation28–31 and tissue classifications30–34 using convolutional neural networks (CNN). Some research groups reported their studies using histopathology WSI for many applications.35–37 Due to a giga-pixel size of a WSI’s, it is often impractical to train the CNN using WSIs directly. Consequently, patch-based algorithms are widely applied in histopathology image analysis.34,38–43
In this study, we developed a computational biomarker quantification system by integrating histopathology WSIs and genomic data into one deep neural network. In order to use the distribution of Gleason patterns on a WSI, we applied patches as inputs to the network. The patches were forwarded through a CNN to get the images features. Then based on the patches’ spatial relationship, the image features were modeled using a recurrence neural network (RNN),44 namely long short-term memory (LSTM).45 The pathway scores calculated from the genomic data were forwarded to a multilayer perceptron (MLP) to get the genomic features. And the image and genomic features were integrated together to get the computational biomarkers. Moreover, we used RFS (months) since their initial treatment as the time-to-recurrence variable for a survival model. We chose a Cox proportional-hazard regression model46,47 since it is commonly used in medical research for investigating associations between the survival time of patients and predictor variables.
In this section, we introduced our approach on building a unified system using WSI and genomic data through deep neural networks to quantify computational biomarkers, which were fed into a survival model for patients’ recurrence analysis. Our methods consisted of four steps. First, the pathway activities of prostate cancer were quantified by pathway scores using RNA sequences. Second, the histopathology WSIs were preprocessed to obtain the region-of-interest as the image patches preparation. Third, the image patches and pathway scores were integrated into a unified system using the deep learning approach to extract computational biomarkers. Finally, we used the computational biomarkers in conjunction with clinical prognostic factors as the input of the survival model to calculate the disease recurrence ratios and probabilities. Figure 1 shows the overview of the pipeline of the whole study.
In this study, we used publicly available prostate cancer data downloaded from the data portal of the Genomic Data Commons (GDC).48 GDC is the largest public available data portal that includes image data from The Cancer Genome Atlas (TCGA),49 genomic data, and clinical data. The TCGA barcode48 is the primary identifier of GDC data acquisition protocol. For this study, in total, there were 43 Gleason , 146 Gleason , 101 Gleason , and 49 Gleason , which contain 1229, 4753, 2997, and 1597 patches, respectively. For the recurrence study of patients with Gleason 7, we used all the data from Gleason 6, 7, and 8 to train the networks to extract the computational biomarkers. In this way, the training data contained more images of Gleason patterns 3 and 4 compared to a training data if only use patients’ data with Gleason 7 ( or ). For the recurrence study of patients with Gleason 7, the computational biomarkers of patients with Gleason 7 were fed into a survival model while the patients with other Gleason score were withheld.
The patients were randomly divided into the training set, validation set, and testing test with the ratio of 70%, 10%, and 20%; these groups were utilized for the recurrence analyses. In addition to the Gleason score, we compared the computational biomarkers quantified from the unified image and genomic data system with other clinical factors including patients’ PSA, age, and tumor stage, which are publicly available from GDC data portal.
The WSI patches preparation was a two-step cropping-selection process. First, the image patches within each WSI were automatically cropped under objective magnification with a patch size . The patches were cropped with a stride as 4096 to avoid overlapping. We resized all the patches to the size of using Lanczos filtering.50 Second, any specimens with insufficient tissue patches were automatically eliminated from the experiments due to the heterogeneous quality of the prostate WSIs. The patches with the tissue accounting for at least 20% of the whole area were selected.
Pathway Score Quantification from RNA Sequencing Data
To quantify pathway scores, we used the gene expression data, which were RNA (Illumina HiSeq) sequencing data from patients with Gleason score 7. The data are publicly available through GDC data portal. We preprocessed the RNA data by log transformed and median centered. A panel of previously published 265 experimentally derived gene expression signatures was applied to the entire cohort to identify patterns of oncogenic signaling in each tumor. To apply a given signature, the expression data were filtered to contain only those genes included in the given signature and the mean expression value of these genes was calculated to provide a score for each sample.25,26
Computational Biomarker Extraction
In order to obtain computational biomarkers from the WSIs and genomic data, we built a unified feature quantification system using CNN to model WSI histopathology image patches and genomic data together. Furthermore, we leveraged the RNN to model the spatial relationship of the cropped patches within the WSI. The network architecture is shown in Fig. 2.
Modeling histopathology image patches and genomic data
In order to combine the image information along with the genomic data, we used the patches and pathway scores as the input to the network. We forwarded the pathway scores into an MLP that includes three fully connected (FC) layers, with 1024, 512, and 256 hidden units, respectively. The genomic features were the output of the last FC layer. Meanwhile, we incorporated the AlexNet51 to extract the features from image patches. We concatenated the genomic features obtained from the pathway scores with the image features from the second to the last layer of the AlexNet. The concatenated features served as the input to an FC layer before LSTM.
Due to the giga-pixel WSI’s, we considered an integrity of the whole tissue regions on a single WSI instead of using the individual patches to quantify image features as shown in previous studies.39,41 The spatial relationship of the adjacent patches was modeled as an image sequence. We adopted a type of RNN,44 LSTM,45 to model the features extracted from the image patches and genomic data given LSTM has shown its successes among various applications including speech recognition,52,53 language translation models,54 image captioning,55 and video classification.56 Compared with the traditional RNN that has vanishing and exploding gradients problems, LSTM is more effectively in sequence modeling by incorporating memory cells with several gates to obtain long-range dependencies.
More formally, for the input feature sequence () that represents the activations from the CNN of the ’th patch, we used LSTM to compute the output sequence (), where the layer of LSTM was computed recursively from to following the equations:Fig. 3.
Since it is a sequential task to train LSTM, patches from a WSI were formed by a specific routine. As shown in Fig. 2, we used center coordinates of patches to remark the location of each patch. The sequence of patches within a single WSI was arranged from right up patch down to lower left one, which was illustrated by the dotted lines with black arrows on an example of a WSI on Fig. 2. In this way, it allowed us to consider both unique characteristics of each patch and fine-grained variations among patches within a single WSI. For each tumor WSI, the patches and the pathway scores were fed into the network to get features and then incorporated into the LSTM recursively. In addition, the average pooling layer was applied on top of the network to get the computational biomarkers for the WSI and the genomic data. The number of hidden units for each LSTM was 1024. During the training process, we applied the multitasks loss and assigned the primary pattern and the Gleason score for the WSIs and genomic data.
Multitask loss function
For the TCGA prostate WSIs, the primary Gleason pattern, the secondary Gleason pattern, and the sum of both patterns (i.e., Gleason score) were publicly available from GDC data portal. To model the variations among Gleason patterns, we utilized the multitask loss to enable the network to learn as much information about the Gleason pattern distributions from the patches of a WSI as possible. Therefore, we gave the primary pattern and the sum score as labels for each patch along with the pathway score and use the following multitask loss function:
In conjunction with clinical prognostic factors including the primary and secondary Gleason patterns, PSA, age, and tumor stage, computational biomarkers were fed into a Cox regression model46,47 for studying patient’s RFS. In our study, we used RFS (months) as the time variable for a survival model. For high dimensional data, only those with Wald test57,58 -value were selected and used in conjunction with clinical prognostic factors as input variables for the Cox regression model.
One of the most popular regression techniques for survival analysis is Cox proportional hazards regression, which is used to relate several risk factors or exposures, considered simultaneously, to assess differences in overall survival. In a Cox proportional hazards regression model, the measure of effect is the hazard ratio, which is the risk of failure (i.e., here is the risk or probability of the recurrence of the disease), given that the participant has survived up to a specific time. Given the patients (, , ), where , we have the as the patient’s recurrence time for individual ; as the label of the censored data that equals 1 if the recurrence occurred at that time and 0 if the patient has been censored; as the vector of covariates of the selected image features and clinical factors. The hazard function is the nonparametric part of the Cox proportional hazards regression function corresponding to
In the study, we assessed the computational biomarkers in conjunction with other clinical prognostic factors by their recurrence hazard ratios and concordance indices (-index).59,60 The hazard ratio and -index both are global indices for validating the predictive ability of prognostic features of a given survival model. Under a given survival model, higher values mean that prognostic features predict higher risks and probabilities of survival for higher observed survival times. In our study, we examined RFS; the higher the hazard ratio and -index, the greater the likelihood of disease recurrence.
Experiments and Results
In this section, we validated our approach on a publicly available prostate cancer dataset from the GDC data portal. The experimental results showed that the computational biomarkers discovered by the proposed method were effective for recurrence correlation for patients with Gleason score 7.
The training process of our network included two steps. We first trained the CNN using minibatch Stochastic gradient descent with batch size as 32, momentum as 0.9, and weight-decay as . The initial learning rate was and annealed by 0.1 after every 10,000 iterations. We trained the CNN for total of 50,000 iterations until the loss converge. Then, we utilized the genomic data to train the MLP to extract the genomic features and used image and genomic features to train LSTM. We kept the same momentum, weight-decay, and learning rate except, we annealed the learning rate by 0.1 after every 2000 iterations and trained the network for a total of 5000 iterations. The implementation was based on Caffe toolbox.61
Multiple studies have shown that gene expression signatures reflect the activation status of oncogenic pathways irrespective of specific mutations driving signaling.24–26 Thus, we examined genomic-based patterns of oncogenic pathway activity in prostate cancer patients with Gleason score 7 using a panel of previously published 265 gene expression signatures.
In order to qualitatively assess unique patterns of pathway activity that define the and subset of Gleason score seven tumors, pathway signatures in each group, using all tumors across the entire cohort (i.e., training, test, and validation tumors) were assessed by a Student’s two tailed -test. Significant pathway scores were clustered using Cluster 3.062 and visualized by Java TreeView.63 Quantitative assessment of patterns of pathway activity of Gleason score and subgroups is shown in Fig. 4, which displayed a heatmap identifying 27 differentially expressed signatures (). Of these, we determined that 14 signatures including three unique proliferation signatures (Wirapati,64 UNC,65 and murine proliferation65) as well as several proliferation-associated signatures predicative of BMYB,66 RB-LOSS,67 PIK3CA,68 and HERI69 signaling were significantly higher in patients with Gleason score . We further determined that 13 signatures were upregulated in Gleason patients including immune systems signatures associated with Th17 cells,70 Tcm,70 NK-CD56,70 HGF,71 and STAT326 signaling. Consistent with our findings, many studies report72–74 that Gleason tumors have a better prognosis than Gleason tumors, which would correlate with relatively higher levels of proliferation as well as lower levels of immune-related signaling evident in Gleason tumors compared to Gleason samples.
Integrated Recurrence Analysis in Conjunction with Clinical Factors
Image data on recurrence analysis
For the integrated recurrence analysis using a survival model, we first conducted the experiments where only the WSIs of tissue slides were used. Thus, the networks were trained without integrating the genomic features. This setting of experiment is denoted as CNN-LSTM. We also considered the setting that only CNN was applied on the image patches without considering their spatial relation on a WSI and the image features were extracted from the second to the last layer of AlexNet. The setting is denoted as CNN-Only. To compare the effectiveness of the feature extraction from the images, we applied three texture feature methods including SURF,75 HOG,76 and LBP77 on the WSIs to obtain image features. The image features were concatenated with clinical prognostic factors as multivariate inputs of the Cox regression model. During each iteration, each image feature in conjunction with clinical factors was fed into the Cox regression model to calculate the corresponding hazard ratios and -indices. The survival model implementation was based on an R survival package.78
The maximum hazard ratios of recurrence of image features in conjunction with clinical factors are shown in Table 1. Within our study, we used the disease RFS times as the time variable in the Cox regression model, the higher values of hazard ratio and -index of the features indicated that the image features had the higher correlations with the disease recurrence and progression. From the result of using texture features, there were no significance differences among LBP, HOG, and SURF for recurrence ratios. CNN-LSTM analysis determined that image features identified by computational image analysis outperformed other texture features and CNN-Only with higher hazard ratio and -index. When conjunction with CNN-LSMT, the primary pattern still showed greater hazard ratio and -index relative to those identified using other methods.
Recurrence hazard ratios and corresponding C-indices of clinical prognostic factors and different image features from various image quantification methods. The results are obtained by using image features quantified from the WSIs. LBP, HOG, and SURF are the texture methods. CNN-LSTM is using the image features obtained from CNN with LSTM while CNN-Only is using the image features obtained from CNN without considering patches’ spatial relation on a WSI.
|Methods||Primary pattern||Secondary pattern||PSA||Age||Tumor stage||Image features||C-index|
Image and genomic data on recurrence analysis
Before integrating image features and pathway scores, we first analyzed the correlation between them. Because the number of image features and the number of pathway scores were different, to calculate their correlation coefficients, we randomly chose the same number of image features paired with the same number of pathway scores and repeated the process times until all image features had been paired. Here, the image features included features quantified from texture methods (LBP, HOG, and SURF) and CNN-LSTM. Using a -test on correlation coefficients, the mean and standard deviation of -values is shown in Table 2. Because -value , there was no significant correlations between image features and pathway scores. This showed that the two types of data provided complementary information for prostate cancer diagnosis and prognosis. It was reasonable to integrate image and genomic data together for predicting patients’ recurrence.
Correlation analysis of image features and pathways scores using a test-test on their correlation coefficients.
|Image features||Mean of p-value||Standard deviation of p-value|
Then, we showed the experimental results by combining image features obtained from WSIs and the genomic features obtained from the pathway scores. We utilized all 265 gene expression signatures integrated with image data to identify the computational biomarkers as shown in Fig. 2. The setting was denoted as CNN-LSTM + PS. We also considered the setting where LSTM was deactivated when obtained the biomarkers from image and genomic data. We denoted the approach as CNN-Only + PS. The methods using texture features obtained from WSIs together with pathway scores for the recurrence analysis were denoted as LBP-PS, HOG-PS, and SURF-PS. We also considered only using pathway scores with clinical factors together as the input of the Cox regression model and denote it as PS. The maximum hazard ratios of the computational biomarkers from WSIs and pathway scores in conjunction with clinical factors are shown in Table 3.
Recurrence hazard ratios and corresponding C-indices of clinical prognostic factors and computational biomarkers under a Cox regression model using different image feature quantification methods along with the genomic data. Given the genomic data, we show the results using image features with pathway scores (PS). Here, LBP + PS, HOG + PS, SURF + PS, CNN-Only + PS, and CNN-LSTM + PS are image features quantified from LBP, HOG, SURF, CNN-Only, and CNN-LSTM methods with PS.
|Methods||Primary pattern||Secondary pattern||PSA||Age||Tumor stage||Biomarkers||C-index for biomarkers|
|LBP + PS||1.04||1.00||0.87||1.00||1.02||1.08||0.69|
|HOG + PS||1.04||1.00||0.87||1.00||1.02||1.08||0.65|
|SURF + PS||1.07||1.00||0.86||1.00||1.03||1.07||0.62|
|CNN-Only + PS||1.13||1.11||0.80||1.00||1.17||2.58||0.71|
|CNN-LSTM + PS||2.56||0.63||0.66||1.01||1.05||5.73||0.74|
|C-index for clinical factors||0.61||0.59||0.66||0.55||0.53||—||—|
Compared with other clinical factors, using pathway scores alone achieved equivalent hazard ratio. For the texture methods, the recurrence hazard ratios were equivalent to the ones without pathway scores. Using CNN-LSTM + PS, the Gleason primary pattern and computational biomarkers showed the increased recurrence ratios compared to the ones without pathway scores. In addition, the Gleason primary pattern and computational biomarkers showed the highest recurrence ratios compared to other clinical factors. The result showed that CNN-LSTM-PS outperformed other methods in prostate cancer recurrence analysis due to its highest recurrence hazard ratio.
Furthermore, we show the -index of the clinical factors and computational features under the Cox regression model for prostate cancer recurrence probability prediction in the last column of Table 1 and the last row and column of Table 3. As a global index for validating the predictive ability of a survival model, in our study, the -index was equivalent to a rank correlation of the risk of a recurrence of disease. High values mean that the model predicts higher probabilities of recurrence for higher observed recurrence times. From the clinical results, PSA showed higher -index values, which were correlated to a higher recurrence prediction probability compared to other clinical factors. Interestingly, texture features on WSIs or pathway scores individually showed an equivalent recurrence probability.
From the experimental results, our proposed method achieved the highest recurrence hazard ratio and the strongest C-index related to prostate cancer recurrence probability compared to other clinical prognostic factors and methods. It demonstrated that the approach was beneficial for recurrence analysis on the patients with Gleason score 7. The unified WSIs and genomic data analysis through the proposed networks could be applied to other prostate cancer risk group such as Gleason 679–81 or other cancer recurrence analysis, such as breast cancer.82
Pathway analysis, albeit with the caveat of a small sample size, identified 27 differentially expressed pathway activities in tumors with Gleason score and . Thus, these signatures could be utilized to differentiate patients with Gleason score 7 as two subgroups, which corresponds with a favorable or unfavorable prognosis.83 The recurrence analysis (Table 3) using pathway scores alone did not show an advantage over other clinical prognostic factors. The integration of pathway score with WSIs achieved the best recurrence prediction on patients with Gleason score 7. The comparison indicated that using the pathway scores directly had a limited contribution in recurrence prediction on patients with Gleason score 7. However, the embedded genomic features obtained through MLP were more effective for prostate cancer recurrence analysis.
There are other clinical factors for prostate cancer prognosis besides those used in the study, such as patients’ race. Because in the study, men were Asian or African, 30% were Caucasian, and the rest were unknown, we excluded patients’ race factor in the recurrence analysis. Other clinical factors, such as American joint committee on cancer metastasis stage, neoplasm disease stage codes, and so on, were not available for all the patients in the GDC prostate cancer datasets.
The prostate cancer datasets were acquired from various institutions and each institution may have different scanners or WSI scanning protocols. Thus, there was color heterogeneity among the prostate cancer WSIs. In the study, we did not adopt color normalization84,85 on the randomly selected testing set because it was not feasible to determine the reference image from the training set for color normalization. When we apply the approach to a new dataset, we could fine-tune the network based on the training data from that dataset. Given the limited size of the public prostate dataset, the results achieved from our experiments were preliminary. In order to further validate the generalizability of our approach on a wider population of prostate cancer patients, we will collect more prostate images from local institutions to perform extensive experiments.
In this study, we performed recurrence analyses for prostate cancer patients with Gleason score 7 integrating histopathology WSIs and genomic data. The image features and genomic features were obtained using CNN and MLP, respectively. The combination of the features was modeled using LSTM to get the computational biomarkers. Experimental results utilizing on publicly available prostate cancer dataset showed that the computational biomarkers extracted by our approach were more closely correlated with patients recurrence risk compared to standard clinical prognostic factors and engineered image texture features. The results of our study suggest that these approaches could be utilized to predict recurrence and progression for patients with prostate cancer.
Dr. Singer is the principal investigator on an investigator-initiated clinical trial that is funded by Astellas/Medivation (NCT02885649).86 The other authors declare that they have no competing interests. The public prostate GDC data have ethics approval with the NIH/NCI, which can be accessible from Ref. 48.
This research was funded, in part, by grants from NIH contracts 4R01LM009239-08, 4R01CA161375-05, and 1UG3CA225021-01, and P30CA072720.
Jian Ren is a PhD candidate of electrical and computer engineering at Rutgers University. He received his BS degree in electronic science and technology from the University of Science and Technology of China in 2014.
David J. Foran is CIO and executive director of computational imaging and biomedical informatics at Rutgers Cancer Institute of New Jersey. He also serves as professor and chief medical informatics officer of pathology, Laboratory Medicine and Radiology at Rutgers-Robert Wood Johnson Medical School. His research focuses on the design, development, and implementation of new approaches in computer-assisted diagnostics, medical imaging, and precision medicine for resolving challenging clinical problems in pathology, radiology, and oncology.
Xin Qi is assistant professor of Department of Pathology and Laboratory Medicine at Rutgers Cancer Institute of New Jersey. She also serves as adjunct research assistant professor of radiology at Robert Wood Johnson University Hospital. She received her BS degree in precision instrument and opto-electronics engineering at Tianjin University and her master’s and PhD in biomedical engineering at Case Western Reserve University.