Magnetic resonance imaging (MRI) has several important advantages over computed tomography (CT) for radiation treatment planning. The principle advantage is a more accurate and reliable organ-at-risk and target delineation with superior soft tissue contrast offered by MRI over CT.1 A potential treatment planning process with MRI as a sole imaging modality2,3 could eliminate systematic MRI-CT coregistration errors (for example, an inherent uncertainty of in cranial MRI-CT registration was reported by Ulin et al.4), reduce medical cost, minimize patient radiation exposure, and simplify clinical workflow. Despite these advantages, MRI contains neither unique nor quantitative information on the attenuation properties which are required for accurate dose calculations and the generation of reference images for patient setup based on two-dimensional planar images.2 Synthesizing PCT images with electron density information from MRIs has been proposed.5,6 Conventional PCT synthesis techniques can be broadly categorized as atlas-, segmentation-, sequence-based methods, and hybrid methods. Atlas-based methods map a single or multiple CT atlases to a newly acquired MRI to generate the corresponding PCT of the new patient based on deformable registration algorithms.7,8 However, these methods’ efficiency is highly limited by the accuracy of intersubject registration algorithms. Segmentation-based methods first manually or automatically segment MR images into different classes, such as bone, air, fat, and soft tissue, and then assign a uniform electron density value to each class.9–11 However, segmentation-based methods cannot reliably distinguish between bone and air regions due to their similar MRI intensities. Sequence-based methods use specialized MR sequences, such as the ultrashort echo time (UTE), or a mixture of standard and specialized sequences to acquire different types of MRIs and perform some postprocessing on these MRIs to generate the final PCT.12–14 The current image quality of UTE sequences is inadequate for accurate delineation of blood vessels from bone, as they both appear dark.15,16 Moreover, the use of nonstandard MR sequences may introduce additional scanning time to the existing MR scanning workflow and may increase the probability of discomfort and motion degradation. Hybrid methods combine two or more of the above methods into a mixed framework to generate a PCT.10,11,17
Recently, as machine learning has become more and more popular in the medical imaging field, using machine-learning-based methods to generate PCTs from MRIs has been developed and can be classified into the following three categories:
1. Dictionary-learning (DL)-based methods: These methods first rank the corresponding CT patches based on the similarity of MR image patches and then linearly or nonlinearly combine these top-ranked patches together to generate a PCT patch and finally reconstruct a whole PCT through combining all synthetic patches together.18–22 However, these methods are sensitive to MR intensities which can vary as a function of scanning parameters for a given tissue.
2. Random-forest (RF)-based methods: RFs constitute an intuitive model that offers a flexible probabilistic framework for solving the learning tasks that can be used for learning MRI-CT mapping.23–25 However, these methods are imperfect due to uninformative and redundant components of extracted feature vectors, which may introduce ambiguous binary splitting in the decision trees of an RF to result in an erroneous synthesis.
3. Convolutional-neural-network (CNN)-based methods: A CNN provides a complex nonlinear mapping from MRI to CT through a multilayer and fully trainable model.26–28 However, CNN-based methods suffer from a long training time and their performance can be affected by many parameters.
In this work, we incorporate anatomical signature, alternating random forest (ARF), and iterative refinement (IR) model into a learning-based framework to derive PCTs from standard MRIs for MRI-based treatment planning. The contributions of the paper are as follows: (1) a feature selection (FS) is introduced to identify the most informative components of features, which are served as an anatomical signature. (2) To reduce the uncertainty of both MRI anatomical signature and CT target, a joint information gain (JIG) is incorporated into the decision trees of an RF. (3) To cope with a lack of global optimization in RF-based methods, an ARF scheme is used to train an RF-based model. (4) An IR model is introduced to enhance the accuracy of PCT synthesis by combining the contextual information of a PCT image and MRI anatomical signature.
This paper is organized as follows: we first provide an overview of our proposed MR-based PCT synthesis framework, followed by the details on the FS, the construction of the ARF with a JIG, and the IR model for synthesis improvement. We then compare our methods with DL- and RF-based PCT methods and display our contribution within the experimental results.
For a given pair of MR and CT training images, the CT image was used as the regression target of the MR image. Prior to the training stage, we performed image preprocessing to improve the training quality by removing noise and uninformative regions by a nonlocal means method.29 An intrasubject registration was performed to align each pair of MR and CT images, and all pairs were then aligned into a common space. This registration is performed by a commercial software Velocity AI 3.2.1 (Varian Medical Systems, Palo Alto, California) using rigid registration. In the training stage, for each MRI, we first extracted multilevel features, i.e., discrete cosine transform (DCT), local binary pattern (LBP), and pair-wise voxel difference features23 from multiscale images, which consisted of an original and three derived images with a sequence of downsampling factors (0.75, 0.5, and 0.25). Then, these extracted features were concatenated as a patch-based feature vector. (The patch size of MRI is voxel size.) Next, we performed an FS process through least absolute shrinkage and selection operator to identify the anatomical signature from these extracted features. The anatomical signature of each voxel, together with the corresponding CT target, was used to train a sequence of ARFs under an IR model. In the synthesizing stage, the selected features (anatomical signature) were extracted from a new MR image and then were fed into the sequence of well-trained ARFs for the PCT synthesis. Figure 1 outlines the workflow schematic of our synthesis method.
The patch-based features may contain noisy and redundant features that could overfit a learning model, affecting the prediction accuracy and degenerate the final PCT synthesis performance. Therefore, an FS should be performed to identify the most informative features, which are called as anatomical signature to differentiate with original feature vector. As recommended in Ref. 22, minimizing a logistic sparsity regression is used for our FS, as follows:30 is a regularization parameter and we set , is an intercept scalar, denotes an anatomical label, and is a sparse coefficient vector. Each component denotes the ’th component’s importance of and denotes the length of .
We first used label if belongs to nonair regions and if belongs to air regions. After first FS, we used if belongs to bone regions and if belongs to soft tissue regions. The corresponding label was obtained by setting the CT intensity range of as air, as soft tissue, and as bone. As studied in our previous work,22 the is obtained by Eq. (1). The anatomical signature, denoted as , is composed by a subset of feature components with corresponding nonzero entries in . The total dimensionality of the original feature space is 2400. After FS, 63 selected informative features are adequate. These selected components have superior power to distinguish among bone, air, and soft tissue. An example is given in Fig. 2, (a) and (b) show the sample voxels drawn from a pair of MR/CT images. A sample belonging to the bone is highlighted by a red asterisk, and a sample belonging to the soft tissue is represented by a green circle. Figure 2(c) shows a scatter plot of two random features of the corresponding samples without any FS. Figure 2(d) shows a scatter plot of the two top-ranked features with a high Fisher’s separation score.30 It can be observed that the selected features can better separate bone and soft tissue voxels in an MRI.
Alternating Regression Forest
Joint information gain
An RF includes a series of decision trees, each of which is fed with a random subset of training data and is trained independently from the others. Each decision tree recursively splits a training sample from a parent node into two disjoint child nodes, such that the uncertainty of the CT target variables in the resulting subsets (child node) is minimized. However, if the splitting of decision trees is only based on CT target variables, two CT patches with a similar CT intensity in their central voxels may come from two totally different locations with significantly different MRI anatomical signature and vice versa. In Figs. 3(a) and 3(b), the red and yellow circles denote the central voxel of the patches of MRI and CT, respectively. In Fig. 3(a), these two MRI patches from two different locations have totally different extracted features and anatomical signatures, while in Fig. 3(b) the central voxel values of the two corresponding CT patches have the similar intensity values. Based on similar CT intensities, these two samples with different MRI anatomical signatures will be distributed in a uniform partition (child node), no matter which feature in anatomical signature is chosen and which splitting function is applied for a separation. Thus, the splitting procedure pertaining to the uncertainty reduction of CT targets may induce ambiguity for training a decision tree.
To cope with this issue, we proposed a JIG strategy to reduce the uncertainty between an MRI anatomical signature and a CT target:
Alternating random forest
Since how data are further split is only decided at a local node level for an RF-based method, the training procedure is not globally supervised by an appropriate measurement from its whole regressor’s performance. To cope with this issue, we introduced an alternating RF concept to provide a global optimization.31
The global loss function is computed as a breadth-first missing error that is measured by all weak inference learners (regressor) at each depth:31Figure 4 briefly shows the brief architecture of an ARF.
Iterative Refinement Model
It has been known that the context, i.e., the surrounding information with respect to an object of interest, plays a vital role in interpreting image content.32 Similarly, the synthesis of a PCT could be enhanced by the information from its surrounding neighbors. An IR model was used to iteratively leverage the surrounding information by utilizing an autocontext method (ACM) to improve the synthesis performance.32 An IR model is used to characterize the context information from the previous synthesis results and then uses such information as context features,32 together with the anatomical signature extracted from the input MR image, to recursively train a series of ARFs. The process is then repeated to train a series of ARFs until a synthesis error criterion is met. In the synthesizing stage, a new MR image is fed into these well-trained ARFs with the IR model to iteratively generate and refine the PCT. The architecture of an IR model is shown in Fig. 5.
To test the proposed method, we applied it to 14 patients’ brain data and 12 patients’ pelvic data. For the brain images, T1-weighted MRI was captured using a GE MRI scanner with magnetization-prepared rapid gradient echo (MP-PAGE) sequence and voxel size (TR/TE: 950/13 ms, flip angle: 90 deg) and CT was captured with a Siemens CT scanner with voxel size. For the pelvic images, MRI was acquired using a Siemens standard T2-weighted MRI scanner with three-dimensional T2-SPACE sequence and voxel size (TR/TE: 1200/123 ms, flip angle: 95 deg) and CT was captured with a Siemens CT scanner with voxel size. We performed the leave-one-out cross-validation method to evaluate the proposed method. Our PCTs were compared with the original planning CT images which served as ground truth. Three widely used metrics, mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and normalized cross correlation (NCC), were used to quantify the absolute difference, relative difference, and image similarity within the body outline. They are defined as
Generally, the maximum number of training data in a leaf node is set to 5 for regression tasks, as recommended in the guidance of a regression forest.23 In theory, the synthesis performance can be improved with more decision trees and a deeper depth for each decision tree. However, we need to balance the trade-off between the computation time and the regression performance. To achieve the optimal balance between the two, we conducted a fourfold cross validation to decide the number of trees and the maximum depth of each tree, as shown in Fig. 6. Figures 6(a1)–6(a3) show the changes of the mean MAE, PSNR, and NCC metrics with varying numbers of trees while fixing the other parameters to its default values as shown in Table 1. Based on these curves, the tree number of 20 is sufficient for a good PCT synthesis. Figures 6(b1)–6(b3) show the changes of the MAE, PSNR, and NCC metrics with different maximum depth of trees while fixing the other parameters to the default as shown in Table 1. A maximum depth of 20 is adequate for a good PCT synthesis.
Default RF parameter setting.
|Number of trees||20|
|Maximum number of training data in a leaf node||5|
|Maximum depth of tree||20|
Influence of a Feature Selection
Figure 7 provides a detailed visual comparison between the proposed RF method with FS (RF + FS) and without FS (RF) on the brain dataset. To demonstrate the detailed difference with and without an FS, two regions of interest (ROIs) inside red boxes were zoomed in Figs. 7(c1)–7(c4). As illustrated in Fig. 7, the RF + FS can better preserve the continuity, coalition, and smoothness in the synthesis results as it preserves informative features in training the model. In Table 3, the comparison between RF and RF + FS quantitatively describes the averaged MAE, PSNR, and NCC results of RF with and without an FS based on leave-one-out experiments. Compared with the RF without an FS, the FR with an FS could significantly reduce the MAE ().
Influence of the Proposed Method Based on a JIG and ARF
To evaluate the influence of the proposed JIG and ARF, all the experiments were performed with the same training and test samples images in our brain dataset. Figure 8 provides a comparison among the traditional RF-based method, the RF-based method with a JIG (RF + JIG), and the ARF-based method with a JIG (ARF + JIG). Again, Figs. 8(c1)–8(c6) provide the detailed comparison of two ROIs in Figs. 8(b1)–8(b6) among these three methods. In Table 3, the comparison between RF, RF + JIG, and ARF + JIG quantitatively describes the averaged MAE, PSNR, and NCC results of these three methods based on leave-one-out experiments. A smaller MAE, higher PSNR, and NCC demonstrate the consistent improvement of the ARF + JIG-based method over the RF-based method as well as the RF + JIG-based method. The significant difference ( via -test) in the MAE, PSNR, and NCC values between the RF + JIG-based and the RF-based methods demonstrates that the JIG could significantly improve the synthesis performance of an RF-based method. There is also a significant improvement of the NCC between the ARF + JIG-based and RF + JIG-based methods ( via -test). Since the joint information from both MRI and CT, as well as a breadth-wise global loss, were introduced into our proposed ARF + JIG-based method, the proposed method significantly shows a synthesis performance improvement over the RF-based method.
Influence of an Iterative Refinement Model
The influence of the IR model was demonstrated by comparing the generated PCTs with different refinement iterations. Figure 9 shows the PCT results at different refinement iterations in the sagittal plane. Table 2 quantitatively shows the MAE, PSNR, and NCC at different refinement iterations.
Performance at different iterations of the IR model on the brain dataset.
Comparison of State-of-the-Art Methods
To further evaluate our proposed PCT generation method, we compared our ARF + FS + JIG + IR-based methods with a state-of-the-art DL-based approach,21 an RF-based approach (RF + ACM),23 and a CNN-based approach, which is called the generative adversarial networks (GAN) method.28 The parameters of all the algorithms were set according to their best performance. The comparison results of these three methods are shown in Fig. 10. Due to linear averaging of similar patches, the DL-based method often results in a loss in image quality and erroneous synthesis as shown in Figs. 10(b1), 10(b3), and 10(b5). Due to the uninformative patch-based features, the RF + ACM-based method has some errors around the bone boundary as shown in Figs. 10(c1), 10(c3), and 10(c5). Compared with the results from the DL- and the RF + ACM-based methods, the PCTs generated by our ARF + FS + JIG + IR-based method are much closer to the ground truth CTs as shown in Figs. 10(d1), 10(d3), and 10(d5). Table 3 quantitatively shows the averaged MAE, PSNR, and NCC results of these three methods. Our method obtained an average MAE of , a significant improvement over an MAE of using the DL-based method, using the RF + ACM-based method ( via -test), and using the GAN-based method ( via -test). Our method also resulted in an average PSNR of versus , , and ( via -test), and an average NCC of versus , , and obtained by the DL-, the RF + ACM-, and the GAN-based methods ( via -test), respectively. Since the bone has the most significant effect on the PCT estimation, we specifically showed the bone () comparison of generated PCTs in Fig. 11. Table 4 shows the averaged MAE in the three different regions (bone, soft tissue, and air). Again, our method based on ARF + FS + JIG + IR significantly reduces the MAE of the bone, soft tissue, and air regions compared with the DL- and RF + ACM-based methods (, -test).
Numerical results of the different methods on the brain dataset.
|Method||MAE (HU)||PSNR (dB)||NCC|
|RF + FS|
|RF + JIG|
|ARF + JIG|
|RF + ACM|
|ARF + FS + JIG + IR|
Numerical results of the three methods on the three different regions of brain CT images.
|Method||Bone region||Soft tissue region||Air region|
|RF + ACM|
|ARF + FS + JIG + IR|
Results of the Pelvic Data
To further test our proposed method, we applied our proposed method to the pelvic data. Figure 12 shows visual comparison results between PCT and CT. Figures 13(a2)–13(c2) show the quantitative results of the MAC, PSNR, and NCC for each patient’s pelvic images. Overall, the average MAE was , the average PSNR was , and the average NCC was within the body outline.
We presented a machine-learning-based approach to synthesize PCT images from routine MR images. To the best of our knowledge, this is the first work that incorporates the FS and JIG into the RF-based learning framework for this PCT synthesis task. Several mechanisms were also incorporated to improve the synthesis performance, such as the IR strategy and the alternating RF scheme. Our method significantly outperformed the state-of-the-art DL- and RF-based methods for PCT synthesis. Recently, Nie et al.28 proposed to use the GAN to generate PCT and reported that the MAE was for the brain data and for the pelvic data. The MAE from our proposed method was for the brain data and for the pelvic data. Compared with the deep-learning-based method proposed by Nie et al., our method obtained a much smaller MAE for both sites, which further demonstrates the performance of our PCT synthesis method.
Although the proposed JIG and ARF can enhance the performance, optimizing the proposed binary splitting procedure is more complex than binary splitting in a traditional RF-based method and takes more computation time (about 7 h versus 2 h in the RF-based method by Huynh et al.23 and 19 h in the GAN-based method by Nie et al.28). Although we have tested our algorithm on 14 patients’ brain data and 12 patients’ pelvic data, in the future, we plan to enroll more patients to further ascertain the robustness of our algorithm. In addition, in this study, we only used routine T1-weighted (brain) or T2-weighted (pelvis) MRIs to synthesize our PCTs. However, it is possible for our method to generate PCTs using other sequence-based MRIs if our training database includes these images. We plan to combine several types of MRIs based on widely used sequences into our training database to generate PCTs using multisequence MRIs to test whether this could improve our accuracy of PCT.
The accuracy of MRI-CT registration may affect our proposed method from two folds: (1) the mismatches between MRI and CT may affect our model training and (2) the mismatches between MRI and CT will affect our MAE, PSNR, and NCC accuracy. To deal with such small mismatch effects for our model training, we divided the whole image into multiple small patches with dense overlap. Most of the patch pairs between MRI and CT match very well. Moreover, not all mismatched patch pairs are located in the similar anatomical regions for each patient, which do not significantly affect our model training. In the synthesizing stage, for a new MRI feature arriving at each leaf node followed by the previous splitting rules, to infer its correspondence to the CT target, an ensemble model is introduced to generate the final estimation by computing the median of all the nodes within each decision tree, and then computing the median of all the decision trees. Thus, even if there are some positioning differences in pair-wise MRI and CT patches, the median estimation can avoid most of the potential bias. However, our PCT is directly generated from MRI and has same anatomical structures as the MRI, so the mismatches will directly affect our MAE evaluation, which calculates the voxel-based HU difference through a subtract processing between planning CT and PCT.
The MR imaging acquisition parameters as well as magnetic field inhomogeneity and patient-specific distortion may influence the performance of the proposed method, with implications on dosimetry calculations and patient setup. In our study, all MRI images were preprocessed using an N3 algorithm33 to effectively reduce distortion before training or synthesizing. Other innovative methods, such as a real-time image distortion correction method,34 have been reported to have excellent performance, and combining these preprocessing methods with our method could increase the accuracy of the PCTs. Due to the high fraction of air, large motion, and distortion, the lung with low resolution and intensity in MRI is very unique and difficult site for our MRI-based radiotherapy, which will be one of our future plans. In this study, we demonstrated the accuracy of PCT in HU numbers. Accurate HU numbers in PCT is of great importance for its usage for dose calculation in MRI-only radiation therapy treatment planning.35 For example, dose calculation is sensitive around tissue and bone boundaries due to the significant changes in electron density. In MR images, bony tissues pose a significant susceptibility artifact, and its ambiguous boundary with air may introduce synthesizing error of perturbation and shift. Such effect on dose calculation accuracy especially for surrounding tissues needs more related studies. In future, we need to conduct studies to investigate how much those factors on MR imaging will affect the dose calculation and patient setup during our MRI-only-based cancer radiotherapy. These studies would be important to further evaluate the clinical utility of our method.
In addition, the PCT generated by our method has potential to be used in dose calculation for other treatment modality, such as proton radiation therapy. However, proton dose deposition is widely known to be more sensitive and has more uncertainty than photon beams, thus a comprehensive study is necessary to test its feasibility. The PCT generation in this study could also be used to generate an attenuation correction map of PET. This application was not the focus of this study and using our method for PET attenuation correction will be a potential future study.
We proposed a learning-based approach to synthesize a pseudo-CT (PCT) image from a routine MR image for potential MRI-based treatment planning, which combines alternating RF with patch-based anatomical signature to effectively capture the relationship between the CT and MR images. We demonstrated that the proposed method is capable of reliably generating a CT image from its MRI counterpart on brain and pelvic data. This PCT synthesis technique could be a useful tool for MRI-based radiation treatment planning.
This research was supported in part by the National Cancer Institute of the National Institutes of Health under Award No. R01CA215718 and the Department of Defense (DoD) Prostate Cancer Research Program (PCRP) Award No. W81XWH-13-1-0269.