The accelerating complexity and variety of medical imaging devices and methods have outpaced the ability to evaluate and optimize their design and clinical use. This is a significant and increasing challenge for both scientific investigations and clinical applications. Evaluations would ideally be done using clinical imaging trials. These experiments, however, are often not practical due to ethical limitations, expense, time requirements, or lack of ground truth. Virtual clinical trials (VCTs) (also known as in silico imaging trials or virtual imaging trials) offer an alternative means to efficiently evaluate medical imaging technologies virtually. They do so by simulating the patients, imaging systems, and interpreters. The field of VCTs has been constantly advanced over the past decades in multiple areas. We summarize the major developments and current status of the field of VCTs in medical imaging. We review the core components of a VCT: computational phantoms, simulators of different imaging modalities, and interpretation models. We also highlight some of the applications of VCTs across various imaging modalities.
Medical imaging involves some of the most beneficial and advanced technologies used in medicine today. However, the design and implementation of new imaging technology is highly complex. Evaluating new technology through clinical trials (experiments using human subjects) is often not practical or definitive due to ethical limitations, expense, time requirements, difficulty in accruing enough subjects, or a fundamental lack of ground truth (knowledge of the exact anatomy and condition of the patient). Most current approaches to assess imaging technologies outside of clinical trials rely on simplistic physical phantoms, the results from which cannot readily predict clinical efficacy. Meanwhile, the complexity of medical imaging technologies has continued to accelerate, outpacing our ability to assess them and optimize their design and clinical use. By the time the ideal trials have been completed, the technology has moved on again. Therefore, we either work with “old validated” technology that may be less effective or potentially put patients at risk with newer unvalidated technology.
Virtual clinical trials (VCTs) are an efficient methodological alternative to clinical trials for evaluating and optimizing imaging concepts and technologies. In a VCT, the human subject is replaced with a virtual digital phantom, the imaging system with a virtual simulated scanner, and the clinical interpretation with a virtual interpretation. In that way, a “subject” can be “imaged” and the image can be “interpreted,” emulating the clinical process without an actual clinical trial. A framework of a VCT in medical imaging is illustrated in Fig. 1. VCTs can be conducted quickly and cost-effectively on a computer, providing researchers a practical way to answer fundamental questions using the precise controls and the known ground truth, which is possible only in the virtual domain. These virtual trials enable objective optimization of new and existing imaging technologies (hardware and software) and their utility in terms of the desired diagnostic task or accuracy, while minimizing risk (e.g., radiation dose).
VCTs, as a general term, can be ascribed to simulation studies that emulate clinical experiments. These simulation experiments could be in the context of human models being imaged with imaging devices or could be focused on the interactions of a treatment with human models (e.g., pharmacokinetic and pharmacodynamic models).1 This paper covers only the VCTs that are done in the context of medical imaging, with the purpose of imaging technology advancements, clinical utility evaluations, and optimizations. Broader applications of VCTs, such as outcome prediction or comparison of alternative treatment options, are possible future extensions, but they are beyond the scope of this paper.
Over the past decades, there have been extensive efforts in development and application of VCTs in medical imaging, from creating models of humans and imaging scanners to designing and using interpretation models. VCTs are also challenged by computational complexity, questions relating to simulation realism, and difficulties in validations.
Computational, Anthropomorphic Phantoms
In VCTs in imaging, the virtual patient population is provided by computational, anthropomorphic phantoms that model the patient anatomy and physiology. The advantage of computational phantoms is that, unlike actual patients, their exact anatomy is known, providing a “gold standard” or “ground truth” from which to quantitatively evaluate and improve imaging devices and techniques. Imaging data of a computer phantom can be generated using a computerized scanner model under various scanning parameters or protocols, and the effects quantified in comparison with the known phantom. The user knows precisely what simulated images should reveal in terms of organ volumes or boundaries, tumor locations, sizes, shapes, extent and frequency of motion, presence and location of disease indicators, etc. The dose to the organs and structures from different procedures can also be calculated to assess patient risk from radiation exposure. None of these things is possible using live subjects.
For VCTs, it is essential to have computational phantoms that are realistic so that simulated results emulate what should occur in actual subjects. Phantoms must realistically model patient anatomy and physiology including the geometry of the organs and structures, the material properties of the tissues, patient motions, blood flow or contrast perfusion, alterations of the anatomy due to disease, and any other factors that could affect medical imaging. Computational phantoms must also be able to model the anatomical and physiological variability indicative of a clinical population as anatomy and function varies from person to person, with these factors also impacting imaging results.
Types of Phantoms
Different methods have been used over the years to create computational phantoms. Phantoms are first constructed by defining objects to represent the necessary organs and structures of a given subject. The anatomical objects can then be assigned tissue material properties (density, elemental composition, radioactivity uptake, magnetic resonance, acoustical properties, etc.) for input into corresponding imaging simulations [e.g., x-ray, computed tomography (CT), nuclear medicine, magnetic resonance imaging (MRI), or ultrasound].
Computational phantoms have been typically categorized based on how they define the anatomical structures of the body.2–4 The three main categories of models are mathematical, voxelized, and boundary representation (BREP) phantoms, demonstrated in Fig. 2. Mathematical phantoms use equations or simple geometric primitives to define the organs and structures in the body. They can easily be manipulated through these equations to simulate changes in anatomy (alterations in organ size and shape) or motion (voluntary or involuntary), but they are lacking in terms of realism. Voxelized phantoms use 3-D cuboids or voxels to define the anatomical structures based on the segmentation of patient medical images. Voxelized phantoms are more realistic, but they are not as flexible. For example, it requires great effort to modify numerous phantom voxels to simulate anatomical variations or motion. In addition, since they are based on segmented imaging data, voxelized phantoms are set to a particular resolution. Generation of the phantom at other resolutions requires interpolation, which might induce error. BREP phantoms were introduced to combine the advantages of voxelized and mathematical models. Based on segmented patient data, they go a step further using advanced surface representations such as nonuniform rational b-splines or polygon meshes to define each organ or structure. The advanced surfaces can realistically model the anatomy while providing a mathematical basis to simulate anatomical changes or motion.
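The trade-off between mathematical and voxelized representations can be illustrated with a minimal sketch. The Python snippet below defines a hypothetical ellipsoidal "organ" by its equation and samples it onto voxel grids at two resolutions; the function name, semi-axes, and voxel sizes are illustrative assumptions and do not correspond to any published phantom.

```python
import math

def voxelize_ellipsoid(semi_axes_mm, voxel_mm):
    """Sample the ellipsoid (x/a)^2 + (y/b)^2 + (z/c)^2 <= 1 at voxel centres.

    Returns the number of voxels inside and the voxelized volume in mm^3.
    """
    a, b, c = semi_axes_mm
    # Number of voxels needed to cover each half-axis
    nx, ny, nz = (math.ceil(s / voxel_mm) for s in (a, b, c))
    inside = 0
    for i in range(-nx, nx):
        x = (i + 0.5) * voxel_mm
        for j in range(-ny, ny):
            y = (j + 0.5) * voxel_mm
            for k in range(-nz, nz):
                z = (k + 0.5) * voxel_mm
                if (x / a) ** 2 + (y / b) ** 2 + (z / c) ** 2 <= 1.0:
                    inside += 1
    return inside, inside * voxel_mm ** 3

semi_axes = (30.0, 20.0, 40.0)  # hypothetical organ semi-axes, mm
exact_volume = 4.0 / 3.0 * math.pi * 30.0 * 20.0 * 40.0
coarse = voxelize_ellipsoid(semi_axes, 5.0)  # 5 mm voxels
fine = voxelize_ellipsoid(semi_axes, 2.0)    # 2 mm voxels
```

Because the organ is defined by an equation, it can be regenerated at any voxel size (or reshaped by changing the semi-axes) without interpolation. A phantom that exists only as a segmented voxel grid would have to be resampled instead.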
Beyond these categories, computational phantoms have been recently developed using volumetric tetrahedral meshes8 as opposed to surfaces. These phantoms are advantageous in that they can be directly input into some commonly used Monte Carlo (MC) simulation codes [e.g., Geometry and Tracking (Geant4),9 Monte Carlo N-Particle (MCNP6),10 and Particle and Heavy-Ion Transport code System (PHITS)11]. The volumetric definition of the structures also provides a framework for users to more easily define spatially varying material properties within the organs and tissues, providing an added level of realism in medical simulations.
Whole Body Phantoms
Using the above techniques, different universities and companies have developed hundreds of computational human phantoms for use in medical imaging simulations. The review articles by Xu3 and Kainz et al.2 provide a comprehensive guide to the various models that have been created over the past 50 years, with phantoms growing in their ability to realistically model the human anatomy. To achieve the level of realism necessary for VCTs in imaging, modern computational phantoms are typically constructed as a variation of the BREP method, defining surfaces or meshes based on the segmentation of 3-D patient imaging data (e.g., MRI and CT). Figure 3 shows some example whole body phantoms developed by the Rensselaer Polytechnic Institute (RPI), the University of Florida, the IT’IS Foundation, Duke University, and Johns Hopkins University (JHU) that have been commonly used for imaging research. Details on these and other phantoms can be found in the above review articles.
In the creation of such models, the segmentation of patient data is a time-consuming process. It can take months to a year to create detailed phantoms as most of the segmentation work is done manually. As mentioned previously, it is important to model many different types of people for VCTs so as to represent the population at large. To model the variability in populations, one can simply deform existing surface or mesh-based phantoms to create new ones. Rigid or nonrigid transforms can be applied to the surfaces, manipulating the anatomy to match certain anthropometric characteristics, such as height, weight, BMI, or organ mass. An initial phantom can serve as a springboard to create any number of models representing population statistics. For example, the University of Florida created a library of 351 computational phantoms by manipulating a series of male and female template phantoms.12 To this end, the development of population imaging biobanks such as the UK Biobank13 or the German National Cohort14 provides a vital tool toward the further development of phantom populations constructing model variability from epidemiologically sound studies involving multiple organ systems.15–17
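As a minimal sketch of such a deformation, the snippet below scales the vertices of a template surface so that the result matches a target height and body mass. It assumes uniform tissue density, so that mass scales with volume, and a single in-plane scale factor; both are simplifying assumptions made here for illustration, and the function names and numerical values are hypothetical rather than those of any phantom library.

```python
import math

def anthropometric_scale(template_height_cm, template_mass_kg,
                         target_height_cm, target_mass_kg):
    """Return (sx, sy, sz) scale factors matching the target height and mass.

    Assumes uniform density, so mass is proportional to volume = sx * sy * sz.
    """
    s_z = target_height_cm / template_height_cm
    # The remaining volume change is shared equally by the two in-plane axes
    s_xy = math.sqrt((target_mass_kg / template_mass_kg) / s_z)
    return s_xy, s_xy, s_z

def scale_vertices(vertices, scale):
    """Apply an anisotropic scaling to a list of (x, y, z) surface vertices."""
    sx, sy, sz = scale
    return [(x * sx, y * sy, z * sz) for x, y, z in vertices]

# Hypothetical template (175 cm, 70 kg) morphed to a 160 cm, 80 kg subject
scale = anthropometric_scale(175.0, 70.0, 160.0, 80.0)
template_vertices = [(100.0, 50.0, 1750.0), (-100.0, -50.0, 0.0)]  # mm
morphed_vertices = scale_vertices(template_vertices, scale)
```

In practice, nonrigid and organ-specific transforms are needed to match organ masses as well as whole-body measures; a global scaling of this kind is only the simplest case.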
Image registration methods have also been used to more efficiently create phantom populations. For this method, easy to discern organs and structures are quickly segmented from target patient data. An existing detailed computational phantom that best matches the patient characteristics is then selected. Using image registration, a high-level transform is calculated from the template phantom to the segmented target. The transform is then used to fill in the unsegmented anatomy (muscles, blood vessels, and other small details). The XCAT library of adult and pediatric phantoms was created in this manner.18,19
Recent works have focused on deep learning algorithms for automatic multiorgan image segmentation.20–22 If successful, such algorithms can replace the time-consuming manual methods previously used in the phantom development process. Patient data may be thoroughly segmented within seconds or minutes and used to define computational phantoms. In this manner, the deep learning-based segmentation works have shown promising performance toward the high throughput development of phantom populations.
Modeling Intraorgan Structures
Earlier versions of computational phantoms included only major organs and structures. Although sufficient for some applications (e.g., dosimetry studies or low-resolution imaging modalities), the lack of intraorgan structures considerably limits these phantoms for VCT studies assessing image quality in higher resolution imaging modalities. Ideally, intraorgan structures should be segmented from clinical cases, similar to how major organs are incorporated in computational phantoms. However, it is challenging to segment these small structures due to the limitations of the current clinical images (e.g., noise, resolution, and contrast) and segmentation algorithms. Over the years, many mathematical and anatomically informed models have been developed to incorporate intraorgan heterogeneity in various organs. Specific organs are highlighted below to illustrate these methods.
For the breasts, it is particularly important to include intraorgan structures as they are mainly used for high-resolution imaging applications (e.g., mammography and tomosynthesis). In addition, the breast is a soft tissue organ lacking obvious anatomical landmarks. To address these challenges, researchers have created two main types of phantoms: procedurally generated and patient-based.
Procedurally generated phantoms are created from mathematical or statistical principles, such that the resulting structures resemble the general appearance of anatomy. This approach has three key advantages: (1) phantoms can be generated at an arbitrary resolution, which is important for meeting the high-resolution needs for mammography; (2) it is possible to generate infinite numbers of independent phantoms with low computational cost; and (3) phantoms can be customized to provide desired characteristics such as breast size, density, or parenchymal distribution patterns. Several groups have each created phantoms, including those from UPenn,23–25 FDA,26,27 and Patras.28,29 Some examples of these phantoms can be seen in Fig. 4.
Alternatively, patient-based phantoms are derived from human subject images, which for breast imaging are typically breast MR or dedicated breast CT. Since each phantom recreates a real human breast, the appearance is inherently realistic, including distributions of parenchyma that cannot be readily reproduced by procedural techniques. However, this patient-based approach has some key limitations: (1) each subject yields one phantom, so the number and diversity of phantoms are limited by finite human subject data; (2) the process of generating phantoms can be computationally expensive; and (3) source data come from medical images, whose limitations in contrast, resolution, noise, and artifacts may in turn affect the quality of the phantom, typically by limiting its resolution. Patient-based phantoms include those from UMass31 and Duke.32 Given the one-to-one correspondence between subject and phantom, this approach may not scale up readily for virtual trials that require thousands of cases. To address this limitation, investigators have created augmented patient-based phantoms using deformations and morphing,33 addition of procedurally generated details,34 and principal components analysis.35,36 Examples of phantoms and simulated mammograms from a principal components approach are shown in Fig. 5.
For the lungs, it is feasible to segment the lobes and initial branches of the vasculature and airways from volumetric images (e.g., from CT). To incorporate the remaining vasculature and airway branches, multiple groups developed physiologically based algorithms that model airways,37–39 vasculature,40 or both together.41,42 These algorithms are volume-filling branching methods in which the parameters (branching angle, diameters, lengths, branching order, etc.) are informed by physiological laws (e.g., flow dynamics) and anatomical measurement studies. As with the nonparenchyma structures, models of parenchyma structures have been developed by synthesizing pulmonary lobules and alveolar regions, informed by high-resolution images of lung specimens and morphometry measurements.43–45 Similar approaches have been implemented to incorporate intraorgan structures for other organs such as the liver,46–48 brain,49–51 heart,52–54 and bones.55–57
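To illustrate how a physiological law can set branching parameters, the sketch below assigns branch diameters recursively using Murray's law (the cube of the parent diameter equals the sum of the cubes of the child diameters), which is one commonly used flow-based rule. It is an assumption made here for illustration and is not necessarily the rule used in the cited algorithms. The sketch covers only diameter assignment; it does not place branches in space, which is the volume-filling part of those methods. The function names, root diameter, and termination threshold are hypothetical.

```python
def murray_children(parent_d_mm, flow_fraction=0.5):
    """Split a parent branch so that d_parent^3 = d1^3 + d2^3 (Murray's law).

    flow_fraction is the share of the parent flow carried by the first child.
    """
    d1 = parent_d_mm * flow_fraction ** (1.0 / 3.0)
    d2 = parent_d_mm * (1.0 - flow_fraction) ** (1.0 / 3.0)
    return d1, d2

def grow_tree(root_d_mm, min_d_mm, flow_fraction=0.5):
    """Return (generation, diameter) for every branch at least min_d_mm wide."""
    branches = []
    pending = [(0, root_d_mm)]
    while pending:
        generation, d = pending.pop()
        branches.append((generation, d))
        for child_d in murray_children(d, flow_fraction):
            if child_d >= min_d_mm:
                pending.append((generation + 1, child_d))
    return branches

# Hypothetical symmetric tree: 18 mm root, grown until branches fall below 2 mm
tree = grow_tree(root_d_mm=18.0, min_d_mm=2.0)
deepest_generation = max(g for g, _ in tree)
```

Setting flow_fraction away from 0.5 produces asymmetric trees, which is one way the anatomical measurement studies mentioned above could inform the model.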
Recently, some studies have shown that deep learning approaches, such as generative adversarial network (GAN) models, can synthesize images with visual and statistical features similar to those of a set of training data.58,59 These techniques can also be utilized to model intraorgan heterogeneities within the organs and structures of computational phantoms, particularly for the parenchymal regions where organs usually have “textural” appearances. For example, Fig. 6 shows synthetic intraorgan “textures” added to an XCAT phantom using a dual-discriminator conditional GAN trained on 3-D CT images.59 For the purpose of creating computational phantoms, it would be more effective to train the networks on higher resolution and higher quality images (e.g., micro-CT pathology images) than on standard medical images (e.g., CT). The synthesized textures added to the computational phantoms would then be true anatomical textures, free of the artifacts and noise of a particular imaging device and not limited by its resolution.
A realistic VCT requires representative models of patients with diseased conditions and pathologies, especially if the VCT study is targeted on a specific application or task. Over the years, some diseased models (lesions, cardiac diseases, pulmonary diseases, etc.) have been developed and incorporated in the computational phantoms.
As with the evolution of computational phantoms, the first generation of lesion models (oncological, cardiac plaques, kidney stones, etc.) was based on mathematical forms representing their general shapes.60–63 These lesion models are easy to create but provide only a rough representation of an actual lesion. To make these lesions more realistic, researchers have segmented them from clinical images.64,65 Compared with simple mathematical models, these segmented lesions offer a more realistic rendition of the lesion. However, current scanners are not able to preserve all of the morphological or textural attributes of the diseases (e.g., lesion spicules or lesion texture). Therefore, the segmented lesions have been enhanced to include that high spatial frequency content, either using higher resolution images (e.g., digital pathology) or with assumptions informed by morphometry studies.66,67 Figure 7 shows examples of simulated oncological lesions using these methodologies. Lesion models have been further enhanced by incorporating cell-level biological parameters and physiologically realistic growth mechanisms.68 Such a model determines the most probable approximation to the complete time evolution of a solid lesion based on known results from imaging and biology measurements.
In addition to the presentation of lesions or other disease-related abnormalities within the body, disease also manifests itself as an alteration in the anatomy or physiology of the organs. To model these conditions, phantoms can be created by segmenting datasets from a specific patient cohort. Alternatively, these abnormalities can be generated by altering phantoms that are originally based on healthy datasets. To simulate changes due to disease, computational phantoms can be deformed69 in a fashion similar to that described in Sec. 2.2. For example, organs can be deformed to accommodate tumors in the lesion-local environment or structures can grow or shrink in size.70,71 Models for physiological functions as presented below can also be altered to simulate abnormalities within them.
Modeling Functions and Deformations
In addition to human anatomy and pathologies, it is also important to model physiological functions (motions, blood flow, and perfusion) and their variations as these factors can also affect medical imaging technologies and results. Motions such as the cardiac, respiratory, and patient voluntary motions are an important factor in medical imaging as they can cause artifacts in the resulting images that can lead to the misdiagnosis of patients. Motion is also an important consideration in radiation therapy. Tumors must be optimally targeted, sparing healthy tissues, in the context of a changing anatomy due to patient motion.
To simulate patient voluntary and involuntary motions for research, transformations (rigid and nonrigid) can be applied to the phantom’s anatomical structures to simulate motion over time. For surface-based models,7,72 the transformations are applied to the surface control or vertex points defining the objects. For voxel-based phantoms, the transformations are applied to the individual voxels and interpolation is used to generate subsequent images.73 Transformations defining motion are typically based on patient imaging data, such as 4-D CT or MRI. The anatomical structures are deformed to follow what is observed in patient images. Figure 8 shows examples of computational phantoms modeling the cardiac and respiratory motions (RMs).
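A minimal sketch of applying a time-varying transformation to surface control points is given below. It assumes a raised-cosine displacement curve for the diaphragm and a per-point weight that attenuates the motion away from the diaphragm. The curve, amplitude, period, weights, and function names are hypothetical values chosen for illustration, whereas the models described above derive their transformations from 4-D patient data.

```python
import math

def respiratory_displacement_mm(t_s, period_s=5.0, amplitude_mm=20.0):
    """Superior-inferior diaphragm displacement at time t (assumed raised cosine)."""
    return amplitude_mm * 0.5 * (1.0 - math.cos(2.0 * math.pi * t_s / period_s))

def move_control_points(points, t_s, weights, period_s=5.0, amplitude_mm=20.0):
    """Shift each control point along z by its weight times the diaphragm displacement."""
    dz = respiratory_displacement_mm(t_s, period_s, amplitude_mm)
    return [(x, y, z - w * dz) for (x, y, z), w in zip(points, weights)]

# Hypothetical control points (mm): one on the diaphragm dome, one near the lung apex
points = [(0.0, 0.0, 100.0), (0.0, 0.0, 300.0)]
weights = [1.0, 0.1]  # the apex moves far less than the diaphragm
end_expiration = move_control_points(points, 0.0, weights)
end_inspiration = move_control_points(points, 2.5, weights)
```

Sampling t_s over one period yields a sequence of deformed surfaces, that is, a 4-D phantom. Changing the amplitude or period gives a simple handle on breathing variability.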
Motions can vary from individual to individual, vary in health and disease, and can even vary within the same individual (varying levels of breathing, for example). To simulate variations in a given motion, parameters can be set up to alter the deformations of a given phantom’s anatomical objects. Such alterations can be based upon the analysis of several sets of patient motion data.74 Finite-element techniques75,76 are also being investigated to create physiologically based models for patient motions that can be altered in a physiologically informed way to realistically simulate normal and abnormal variations in individuals.
Another physiological factor that can affect medical imaging is blood flow and, therefore, contrast perfusion within the body. Computational phantoms provide anatomical vessel models with which to simulate blood flow77 and organ compartments to simulate contrast perfusion.78 Contrast perfusion is an important determinant of image quality as well as dose in imaging applications.79 For instance, over 60% of CT images are acquired with a contrast agent. Depending on patient attributes, each patient anatomy and physiology can produce different dynamics in the distribution and perfusion of the contrast agent throughout the body as a function of time, which can subsequently affect imaging results. Machine learning methods are currently being applied to develop robust models of contrast perfusion in patients as a function of patient attributes.80
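The compartmental view of contrast perfusion can be sketched with a single well-mixed compartment obeying V dC/dt = F (C_in(t) - C), integrated here with a forward-Euler scheme. This is a deliberately simplified, hypothetical model: the flow, volume, bolus shape, and function name are assumptions for illustration, and published perfusion models couple many such compartments with patient-specific parameters.

```python
def simulate_enhancement(input_conc, flow_ml_per_s, volume_ml, dt_s):
    """Forward-Euler solution of V dC/dt = F (C_in(t) - C) for one compartment.

    input_conc holds the inflow concentration at each time step; the returned
    list has one more entry (the initial concentration, assumed to be zero).
    """
    k = flow_ml_per_s / volume_ml  # exchange rate, 1/s
    conc = [0.0]
    for c_in in input_conc:
        conc.append(conc[-1] + dt_s * k * (c_in - conc[-1]))
    return conc

dt = 0.1  # s
# Hypothetical arterial input: unit concentration for a 30 s bolus, then zero for 90 s
arterial = [1.0] * 300 + [0.0] * 900
organ = simulate_enhancement(arterial, flow_ml_per_s=20.0, volume_ml=400.0, dt_s=dt)
peak_time_s = organ.index(max(organ)) * dt
```

Varying the flow and volume parameters changes both the peak enhancement and its timing, which is how patient attributes translate into different contrast dynamics in such a model.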
Beyond physiological functions, computational phantoms must also have the ability to be deformed to simulate the different positions of patients for various imaging procedures. Depending on the procedure, the arms may need to be overhead or at the sides, the legs to be straight or bent at a certain angle, the head to be tilted, etc. Various methods have been developed to position computational phantoms.81–83 In addition to positioning, certain breast imaging modalities, such as mammography and tomosynthesis, also require the patient to undergo varying degrees of breast compression. This type of deformation is typically simulated within a computational phantom using finite-element methods.84
With the trials taking place in silico, VCTs require simulators of the imaging system to “virtually image” the virtual subjects. Imaging simulators can be utilized to systematically evaluate and optimize the performance of the current and emerging technologies, including both hardware and processing implementations. Simulators are also beneficial for optimization of new technologies prior to the production phase, making the design process more cost-effective and rapid.
For effective VCTs, imaging simulators should generate simulated images rapidly and at low cost while producing realistic images close to those obtained from real scanners. With recent advances in computer technologies, simulators are able to image a large number of cases rapidly and cost-effectively. The realism requirement can be met by accurate and detailed modeling of the scanner, in addition to having realistic computational phantoms.
In general, the development of an imaging simulator consists of several components. They include: (a) modeling the physical and geometrical components of the imaging system, (b) the physics of the imaging process (models of scanner–object interactions) that generates the data, and in some imaging modalities, (c) additional image reconstruction and/or image processing applied to form the final images.
Over the years, significant progress has been made in accurately modeling imaging systems and image formation processes. Simulators now exist for various imaging modalities, including x-ray-based imaging (e.g., radiography, mammography, fluoroscopy, tomosynthesis, CT), positron emission tomography (PET), single-photon emission computed tomography (SPECT), MRI, and ultrasound.
The system components needed for x-ray-based simulators are the x-ray source physics, detector physics, and acquisition geometry. For the x-ray source, models of the polyenergetic spectrum85–100 and focal spot shape and size101–104 are essential. Detector models require estimation of the detector response (quantum efficiency) to polyenergetic photons, quantum and electronic noise, crosstalk between adjacent pixels, afterglow, antiscatter grid, and pulse pile-up.105–114 These can be modeled using either experimental measurements or MC simulations, accounting for the geometry and materials of the detector. Scanner-specific and accurate models of these components are necessary to achieve simulated images with quality that is close to that of images obtained from actual scanners. Depending on the modality and conditions, however, some of these components have a greater effect on the realism of the simulations than others. For example, the effects of crosstalk are more prominent in modalities with smaller detectors (e.g., radiography and mammography) than in modalities with larger detector sizes (e.g., CT), and the accuracy of the electronic noise model is more critical in low-dose simulations than in high-dose simulations.
To generate the simulated images, the acquisition geometry, system components, and computational phantoms are input to an x-ray interaction simulation framework. For simulating these interactions, the common, trusted approaches are MC methods. These methods model the transport of individual x-ray photons through an object, modeling several orders of x-ray-tissue interactions (primary, secondary, and higher). A variety of MC codes have been developed for simulating images and estimating organ doses in radiography,115–118 mammography,119–125 tomosynthesis,126–129 fluoroscopy,130–132 and CT.133–137 Although accurate, MC can be too slow computationally for applications such as tomosynthesis or CT, in which tens to thousands of projections are needed for an acquisition.
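The core sampling step of photon-transport MC can be sketched for the simplest possible case: monoenergetic photons at normal incidence on a homogeneous slab, where the distance to the first interaction is drawn from an exponential distribution. The sketch counts only photons that cross the slab without interacting (the primary signal) and ignores scatter, energy dependence, and detector response, all of which the MC codes cited above handle. The attenuation coefficient, thickness, and function name are hypothetical.

```python
import math
import random

def transmitted_fraction(mu_per_mm, thickness_mm, n_photons, seed=0):
    """Estimate the unattenuated (primary) fraction through a homogeneous slab."""
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_photons):
        # Free path to the first interaction, sampled from p(s) = mu * exp(-mu * s)
        path_mm = -math.log(1.0 - rng.random()) / mu_per_mm
        if path_mm > thickness_mm:
            transmitted += 1
    return transmitted / n_photons

mc_estimate = transmitted_fraction(mu_per_mm=0.02, thickness_mm=50.0, n_photons=20000)
analytic = math.exp(-0.02 * 50.0)  # expected primary fraction for the same slab
```

The statistical error of the estimate falls only as the inverse square root of the number of photon histories, which is the origin of the computational cost noted above.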
To overcome this bottleneck, researchers have developed ray-tracing138–140 algorithms in which only the primary signal, an analytical approximation of the x-ray-tissue interactions, is estimated using the Beer–Lambert law. To include the scatter signal, hybrid approaches have been developed in which the primary signal (from ray-tracing) is combined with a scatter signal obtained from either analytical scatter estimations or MC methods with a limited number of histories.141–143 Another limitation of ray-tracing methods is that they do not account for the finite size of the focal spot and detector pixels, making the simulated images undersampled and unrealistically sharp. This can be remedied by a subsampling strategy in which each source-to-detector ray is replaced by multiple rays sampling the area of the focal spot and detector pixel.
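The primary-signal calculation and the subsampling remedy can be sketched in two dimensions. The snippet below evaluates the Beer–Lambert law, I/I0 = exp(-mu L), for rays crossing a uniform disk, and optionally averages over several source and detector positions to mimic a finite focal spot and pixel. The geometry, attenuation coefficient, focal spot and pixel sizes, and function names are hypothetical, and the disk is assumed to lie entirely between source and detector.

```python
import math

SRC_Y, DET_Y = 0.0, 1000.0            # source and detector planes, mm
CENTRE, RADIUS = (0.0, 500.0), 50.0   # hypothetical uniform disk

def chord_length(src, det, centre, radius):
    """Length of the ray src->det inside a circle (disk between src and det)."""
    (sx, sy), (dx, dy), (cx, cy) = src, det, centre
    ux, uy = dx - sx, dy - sy
    norm = math.hypot(ux, uy)
    ux, uy = ux / norm, uy / norm
    # Perpendicular distance from the circle centre to the ray
    dist = abs((cx - sx) * uy - (cy - sy) * ux)
    if dist >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - dist ** 2)

def primary_signal(src_x, det_x, mu_per_mm,
                   focal_spot_mm=0.0, pixel_mm=0.0, n_sub=1):
    """Mean Beer-Lambert transmission over n_sub x n_sub source/pixel samples."""
    total = 0.0
    for i in range(n_sub):
        fs_x = src_x + ((i + 0.5) / n_sub - 0.5) * focal_spot_mm
        for j in range(n_sub):
            px_x = det_x + ((j + 0.5) / n_sub - 0.5) * pixel_mm
            length = chord_length((fs_x, SRC_Y), (px_x, DET_Y), CENTRE, RADIUS)
            total += math.exp(-mu_per_mm * length)
    return total / n_sub ** 2

central = primary_signal(0.0, 0.0, 0.02)   # single ray through the disk centre
edge_sharp = primary_signal(0.0, 101.0, 0.02)   # single ray just missing the disk
edge_blurred = primary_signal(0.0, 101.0, 0.02,
                              focal_spot_mm=1.0, pixel_mm=5.0, n_sub=5)
```

With a single ray per pixel, the pixel just outside the disk edge sees no attenuation at all. With subsampling, part of the pixel is shadowed, so the edge is blurred as it would be on a real detector.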
After simulating the x-ray interactions, raw images need appropriate processing and corrections depending on the modality, scanner model, and imaging task. These include scatter corrections, water calibration, beam hardening corrections, histogram corrections, and image reconstructions. Discussions of these methods are beyond the scope of this paper. In this section and the following, we focus on the techniques used to simulate the acquisition of the raw imaging data. Figure 9 shows examples of scanner-specific simulated images of mammography, tomosynthesis, and CT using state-of-the-art x-ray-based simulators.
Although the promise and capabilities of x-ray simulators are evident, they can be further improved in terms of scanner-specificity and compatibility with more advanced phantoms. To date, limited scanner models have been developed and validated.141–143 For comprehensive virtual trials, models of more diverse scanners are needed. Further, as computational phantoms advance, they become more detailed (higher resolution) with more realistic capabilities (e.g., motion and perfusion models). Developers need to alter the simulators to be compatible with these additional phantom functionalities.
PET and SPECT
Simulation of the projection data in PET and SPECT requires an accurate model of the photon generation and detection processes, i.e., those related to the imaging system and the physics.144–146 A typical PET imaging system consists of a detector system surrounding the patient, whereas a typical SPECT imaging system consists of a detector additionally fitted with a collimator. The imaging characteristics of the systems can be modeled by their respective detector response functions and the response function of the collimator in SPECT. The physics of the image formation process can be characterized by the effects of photon attenuation and scatter photons that emit from the radioactivity source inside the patient and traverse through the body and the collimator (for SPECT) before reaching the detector and registering as detected signals or counts.
The most accurate means to simulate PET or SPECT image formation is the use of photon transport MC simulation methods.147,148 They allow accurate simulation of the photon attenuation and scattering through patient’s body149,150 as well as the response of the collimator (SPECT) and the radiation detector of the imaging systems.151
There is a wide selection of MC simulation software that is available for various PET and SPECT imaging applications. For example, the relatively small simulation of imaging nuclear devices MC software package152 is designed for a standard clinical SPECT system with simple imaging configurations and applications. It is easy to use with a relatively fast processing time. The simulation system for emission tomography MC software package153 is designed for both PET and SPECT applications and allows more complicated imaging configuration and imaging applications. It is also the most efficient (by factors of to ) photon tracking system since it is customized for PET and SPECT scanner simulations, although it has less flexibility than more general purpose systems described next. The large MCNP11 and Geant4 MC software packages,154 which were originally designed for high-energy physics and nuclear energy research, have also been applied to PET and SPECT simulation. Although they provide more accurate and complete modeling of the transport of all radiations, they are difficult and cumbersome to use due to their general purpose application and relatively large software package size.
The Geant4 application for emission tomography (GATE) MC simulation toolkit for PET and SPECT155 was developed by the OpenGATE collaboration. It consists of a user application layer with an extensible set of C++-based tools that wrap around the Geant4 MC simulation toolset. The user application layer allows modeling of complex PET and SPECT system designs with various detector geometries and a large number of individual detector units that are difficult or impossible to implement using the other MC software packages. The GATE software has become a popular MC simulation toolkit for novel PET and SPECT image systems.
A disadvantage of photon-transport tracking simulations is the required computation time. An alternative approach, as described above for x-ray systems, is the use of ray-tracing methods with very similar trade-offs in bias versus computation time. However, the large advantage for VCTs is the ability to rapidly generate many (tens, hundreds, or thousands) statistically independent but identically distributed realizations. The most well-established of these methods is the analytic simulator (ASIM).156,157 This simulator has been successfully used in several VCTs of PET imaging evaluation as described below.
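The value of rapid analytic simulation for VCTs can be illustrated with a minimal sketch. It assumes that a noise-free projection (expected counts per detector bin) has already been computed by ray tracing, and it draws many independent Poisson noise realizations from it. The function names and the five-bin count profile are hypothetical and are not taken from ASIM.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count (Knuth's method; suitable for small means)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def noisy_realizations(noise_free_counts, n_realizations, seed=0):
    """Return independent Poisson-noise realizations of one noise-free projection."""
    rng = random.Random(seed)
    return [[poisson_sample(m, rng) for m in noise_free_counts]
            for _ in range(n_realizations)]

# Hypothetical expected counts in five detector bins (not from any real scanner).
expected = [2.0, 5.0, 12.0, 5.0, 2.0]
realizations = noisy_realizations(expected, n_realizations=2000)

# The per-bin sample mean over the ensemble should approach the expected counts.
bin_means = [sum(r[i] for r in realizations) / len(realizations)
             for i in range(len(expected))]
```

Because the expensive step (computing the noise-free projection) is done once, each additional realization costs only a noise draw, which is what makes ensembles of hundreds or thousands of images practical.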
Figure 10 shows examples of realistic simulated PET and SPECT images generated from different human phantoms using accurate MC photon transport and ray tracing models. The images are presented with similar clinical images from patient and phantom studies for comparison.
For SPECT and PET simulation, there remain challenges that require continuing research and development efforts. As with CT, simulators need to adjust to work with more complicated and detailed phantoms and to be able to handle populations of models. MC simulations of high-resolution phantoms can be time intensive. To find an optimal balance between accuracy and simulation efficiency, hybrid modeling of the imaging processes that combine analytical and MC simulation methods158 as well as integrating different MC packages159 is being investigated. In addition, there are substantial technology changes with time-of-flight (TOF) imaging and changes in the photon detection and processing hardware systems.
Magnetic Resonance Imaging
Several simulators have been developed for MRI since the 1980s when the foundations of MRI were laid.160–173 The fundamental component of any MRI simulator is an efficient solver of the generalized Bloch–Torrey equation.174–176 Several solvers have been proposed in the literature. Most solvers operate on regular169,172,177,178 (e.g., Cartesian grids) or irregular179,180 (e.g., tetrahedral grids) grids. The first approach is typical of finite difference solvers for which there are efficient numerical schemes. This is the most common strategy presented in the literature. Tetrahedral elements are more suitable for irregular geometries, and some authors have proposed finite-element discretization schemes based on tetrahedra. Some authors have focused on developing simulated MRI data with emphasis on accurately modeling the geometry while achieving closed-form expressions using polyhedral shapes.181,182 These models, however, assume simplified models of the MR physics, viz., piecewise constant image intensities.
The first general purpose MRI simulators were SIMRI,161 mainly developed for medical training and subsequently extended to produce 3-D images on an IBM Blue Gene,160 and Jülich Extensible MRI Simulator (JEMRIS),169 offering a comprehensive, open-source solution for complex pulse sequence design on dedicated multicore CPU clusters. These systems were targeted to technically savvy MRI researchers and usually required code modification to adapt simulations to new problems.
A number of MRI simulators have been developed since the late 2000s, mostly focused on 2-D MRI and the Bloch equations. Cao et al.,162 for instance, utilized a Bloch-based 2-D MRI solver to estimate signal, noise, and specific absorption rates when designing MRI systems. They demonstrated application of their solver to various coil types and with parallel transmission and reception pulse sequences/hardware. More recent developments in graphics processing units (GPUs) have triggered efficient numerical implementations and a shift toward cloud-based, user-friendly simulation environments. For instance, MRISIMUL by Xanthis and Aletras is a solver of the Bloch equations183 based on MATLAB and CUDA-C, exploiting GPUs. MRISIMUL was developed for cardiac MRI and includes extensions to account for cardiac, respiratory, and blood flow motion.171 However, it remains a Bloch solver that is useful for producing realistic cine MRI but does not incorporate MR diffusion terms.
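To make concrete what a Bloch solver computes, the following is a minimal single-isochromat sketch in the rotating frame. It assumes an ideal instantaneous RF pulse and uses the closed-form solution for free precession with T1 and T2 relaxation; diffusion (the Torrey term), gradients, and spatial encoding are omitted. The function names and parameter values are illustrative and do not correspond to any of the simulators cited above.

```python
import math

def excite(m, flip_angle_rad):
    """Ideal instantaneous RF pulse: rotate magnetization about the x axis."""
    mx, my, mz = m
    c, s = math.cos(flip_angle_rad), math.sin(flip_angle_rad)
    return (mx, c * my + s * mz, -s * my + c * mz)

def bloch_free_precession(m, dt, t1, t2, off_resonance_hz=0.0, m0=1.0):
    """Advance (mx, my, mz) by dt seconds with no RF applied (exact solution)."""
    mx, my, mz = m
    phi = 2.0 * math.pi * off_resonance_hz * dt  # precession angle in the rotating frame
    e1, e2 = math.exp(-dt / t1), math.exp(-dt / t2)
    mx_new = e2 * (mx * math.cos(phi) + my * math.sin(phi))
    my_new = e2 * (-mx * math.sin(phi) + my * math.cos(phi))
    mz_new = m0 + (mz - m0) * e1                 # longitudinal recovery toward m0
    return (mx_new, my_new, mz_new)

def simulate_fid(t1, t2, off_resonance_hz, dt, n_steps):
    """90-degree excitation followed by free induction decay; returns |Mxy| samples."""
    m = excite((0.0, 0.0, 1.0), math.pi / 2)
    signal = []
    for _ in range(n_steps):
        m = bloch_free_precession(m, dt, t1, t2, off_resonance_hz)
        signal.append(math.hypot(m[0], m[1]))
    return signal, m

# Hypothetical tissue parameters: T1 = 1 s, T2 = 100 ms, 25 Hz off resonance.
fid, final_m = simulate_fid(t1=1.0, t2=0.1, off_resonance_hz=25.0, dt=0.001, n_steps=200)
```

A full simulator repeats this update for a very large number of isochromats and time steps, which is why GPU implementations are attractive.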
Recent work by Xanthis and Aletras184 repackaged MRISIMUL as a simulation as a service system on a GPU cloud, thus providing the required scalability for large-scale in silico trials. BlochSolver by Kose et al. provides an efficient implementation of the Bloch–Torrey equation for Cartesian177 and non-Cartesian185 3-D MRI readouts with acceleration factors of over CPU-based implementations for the same number of processing units. Compared with physical experiments, the authors could reproduce in their simulations the effects of the static magnetic field inhomogeneity, radiofrequency field inhomogeneity, gradient field nonlinearity, and fast repetition times. The possibility of simulating non-Cartesian acquisitions paves the way for simulating advanced MRI sequences like ultrafast imaging, zero echo-time imaging, functional MRI, real-time imaging, and MR fingerprinting. Kose et al.177 demonstrated that it is possible to simulate an acquisition with matrix acquisitions in a time comparable to a real acquisition. Xanthis et al. and Kose et al. both focused on solvers of the Bloch equation, hence disregarding diffusion and bulk flow effects. Beltrachini et al.179 developed a parametric finite-element solver of the generalized Bloch–Torrey equations with application, for instance, in simulating intravoxel incoherent motion and diffusion-weighted MR imaging (DWI). Recently, other similar FE solvers have emerged,180,186 some of which are open source and available on the cloud.187
Simulators have contributed greatly to the understanding, optimization, and assessment of MRI, although important limitations remain. The major limitation of previous MRI simulators is their simple representation of biological tissue. All previous simulators assume that all protons belong to a single compartment. However, tissue biology suggests that a better model is that of multiple exchanging proton pools. Multipool modeling becomes critical when trying to simulate advanced MRI techniques that aim to accurately characterize tissue composition, microstructure, or microenvironment; examples of these techniques are quantitative magnetization transfer, quantitative T1 and T2 relaxometry, and chemical exchange saturation transfer. To this end, Liu et al.178 presented a generalized multipool exchange tissue model. Liu et al., however, noted that the same fundamental problem affects even basic MRI sequences. For diffusion-weighted MRI, recent works employ MC188,189 and other techniques.190 These methods are impractical for tissue models with realistic microstructural complexity, needing many hours (or even days) of processing for single simulations.191 This has hindered development of more detailed microstructural models and optimization of MR pulse sequences. Existing frameworks are not flexible enough to deal with arbitrary meshes due to the difficulties imposed by the periodic boundary conditions.
Several specialized phantoms or simulators that complement the Bloch–Torrey equations, enabling simulation or calibration of advanced MR imaging techniques, have been proposed in the literature. For instance, Klepaczko et al.192 and Fortin et al.193 developed MRI simulators for magnetic resonance angiography (MRA) of the cerebral circulation. TOF MRA data simulated by Klepaczko et al.194 were used to generate ground-truth data for evaluating vascular segmentation algorithms. Fortin et al.193 extended the JEMRIS simulator to produce flow-related MRA images for the three main techniques, viz., TOF MRA, phase contrast MRA, and contrast enhanced MRA. Pannetier et al.195 developed a simulator to predict dynamic contrast enhancement in MRI with bolus tracking. Cheng et al.196 developed a hardware simulator to generate reference functional blood oxygen level-dependent (BOLD) imaging data using a quadrature digital RF generator. Drobniak et al.197 offered a software simulator for BOLD fMRI signal using the Bloch equation and accounting for field inhomogeneity induced by magnetic susceptibility variations (via Maxwell’s equations), rigid-body motion, chemical shift, RF field inhomogeneity, eddy currents, and noise. Walker et al.198 developed a simulator for magnetic resonance spectroscopy (MRS) of hyperpolarized agents, which allows real-time detection of metabolism in vivo. The MRS simulator is based on the Bloch–McConnell equations coupled to a pharmacokinetic model of tissue perfusion of hyperpolarized substrates.
Realistic 3-D MRI simulations of computational phantoms with complex heterogeneous tissue models can be extremely computationally expensive. To handle such a challenge, techniques, such as parallelized computing and GPU programming, are being studied to work with such phantoms so as to produce simulations more efficiently under a reduced computational load.
The primary goal of ultrasound simulation software is to mimic the physical processes that govern imaging with a transducer, including interactions of the acoustic field with the target medium. The first step in the simulation is modeling the response of the transducer materials to electrical excitation and the resulting properties of the acoustic field. This step can be done using finite-element analysis tools [e.g., PzFlex (Onscale, California)], particularly to compute the physical behavior of an array and its corresponding acoustic response, for a wide variety of materials and transducer designs including piezoelectric and capacitive micromachined ultrasonic transducer array technologies. Other tools have been developed to design the specific device characteristics, such as PiezoCAD (Sonic Concepts, Inc., Woodinville, Washington) for piezoelectric stacks and PRAP (TASI Technical Software Inc., Kingston, Ontario, Canada) for the complex impedance of various piezoelectric materials. Imaging simulations often model arbitrary transducer geometries using small elements with simplified physical characterizations or include precomputed geometries.
The spatiotemporal acoustic field produced by the transducer is then input into a physics-based model of propagation to describe how the field evolves through time. The transmitted field propagates away from the transducer governed by an acoustic wave equation that relates the evolution of acoustic pressure through space and time to material properties. The simulated field can then be used to study the spatial distribution of energy with respect to a target.
Acoustic field simulation tools are divided into linear and nonlinear methods. One of the standard tools for simulating the linear wave equation is Field II,199 which uses the spatial impulse response model. The transducer is divided into sufficiently small elements such that the field points are in the “far-field,” where approximations can be made to simplify the numerical computation of the response. Temporal responses at a given point are then given by the superposition of the responses of the individual elements and can be quickly calculated. Because propagation is not directly modeled, Field II199 can only simulate homogeneous media although it can apply frequency-dependent attenuation. Another tool for linear simulation is DELFI,200 which takes a similar approach to Field II but is optimized for calculating the spatial response at a single point in time. FOCUS201 is another linear tool that provides high accuracy at lower temporal sampling rates using a time-space decomposition approach with the fast nearfield method for certain transducer geometries.
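The superposition principle behind these linear tools can be sketched as follows. This is a simplified illustration only: it assumes single-frequency (continuous-wave) point-like elements in a homogeneous, lossless medium and sums their spherical-wave contributions with focusing phases. It is not the spatial impulse response calculation used by Field II, and the function names and array parameters are hypothetical.

```python
import cmath
import math

def element_positions(n_elements, pitch):
    """Lateral coordinates (m) of point-like elements centred on the origin at depth 0."""
    half = (n_elements - 1) / 2.0
    return [(i - half) * pitch for i in range(n_elements)]

def focused_field(point, focus, elements, frequency, sound_speed=1540.0):
    """Complex pressure at point=(x, z) from elements phased to focus at focus=(x, z)."""
    k = 2.0 * math.pi * frequency / sound_speed  # wavenumber (rad/m)
    total = 0j
    for ex in elements:
        d_point = math.hypot(point[0] - ex, point[1])
        d_focus = math.hypot(focus[0] - ex, focus[1])
        # Focusing phase cancels the propagation phase exactly at the focus,
        # so all element contributions add coherently there.
        total += cmath.exp(1j * k * (d_point - d_focus)) / d_point
    return total

# Hypothetical 64-element array, 0.25 mm pitch, 3 MHz, focused at 30 mm depth.
array = element_positions(64, 0.25e-3)
focal_point = (0.0, 30e-3)
on_axis = abs(focused_field(focal_point, focal_point, array, 3e6))
off_axis = abs(focused_field((5e-3, 30e-3), focal_point, array, 3e6))
```

Evaluating `focused_field` over a grid of points gives a beam pattern; the on-axis magnitude at the focus is much larger than the magnitude a few millimeters off axis at the same depth.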
Nonlinear simulation methods include additional terms in the wave equation to model nonlinear propagation as well as other effects such as absorption and diffraction. The Khokhlov–Zabolotskaya–Kuznetsov (KZK) equation provides a numerical simplification of the Westervelt equation through the assumption of directionality of the transmitted beam and is used by many of the available toolboxes. The KZK equation can be solved both in the time domain (e.g., Texas code202) and frequency domain (e.g., Bergen code203). To remove the paraxial assumption made in the KZK equation, the Abersim204 simulation software was developed. This simulator solves the equation using the angular spectrum method. The Texas code, the Bergen code, and Abersim all assume propagation through homogeneous media. A more recent ultrasound simulator, k-Wave,205 uses pseudospectral methods to efficiently solve the nonlinear time domain equation for propagation through heterogeneous media (sound speed, density, attenuation, and nonlinearity). The solution of this full-wave equation also includes the effects of multiple scattering in the wave field, simulating reverberation acoustic clutter.
Many models allow not only for forward propagation of a wave but also for its reflection and/or scattering. Backward propagating waves return to the transducer, undergoing transduction from an acoustic signal to an electrical one (the complementary process of the transducer simulation described previously). An imaging simulation outputs the recorded electrical signal(s) from this process for further signal processing, just as would be required from a physical ultrasound scanner. For example, Field II199 computes the response to individual scatterers of selected amplitude in the field just as it does the transmit field, linearly combining the transmit and receive impulse response. CREANUIS206 is designed like Field II to provide the response to individual scatterers except using both the fundamental and harmonic field (computed in the frequency domain), including heterogeneity in the nonlinear coefficient. k-Wave205 can be used for imaging simulation by recording the spatial field signals at the array surface that have been multiply scattered and nonlinearly propagated through the heterogeneous media, making it useful for simulating realistic human body imaging. Figure 11 shows a real image of a breast as well as a simulated image created by solving a second-order linear wave equation with heterogeneous media, qualitatively demonstrating the visual realism of the simulated images.207,208 The simulation creates a realistic scattering field from the complex numerical breast phantom.
Several other tools exist to model various acoustic interactions with specific targets. For example, BubbleSim209 provides the nonlinear response of ultrasound contrast agents to ultrasound excitation. Finite-element method tools from Palmeri et al.210 are available to model the mechanical response of tissue to radiation force, as calculated using the acoustic field simulation tools above. The FDA provides a high-intensity therapeutic ultrasound simulation software211 that integrates the bioheat transfer equation with continuous wave nonlinear simulation.
Ultrasound simulators still face several challenges. Acoustic scattering depends on subresolution features, so most simulations are based solely on relative echogenicity or bulk material properties. It is, therefore, common to approximate using multiple realizations of the scatterer position and/or scattering strength, increasing computational cost. A particular simulation tool may limit the complexity of the targets to be simulated, such as describing tissue as either a collection of discrete points or on a fixed property grid. Multiphysics simulations are increasingly important, combining effects such as transducer simulation, acoustic propagation, scattering, and target response (thermal, motion, etc.). These tools are still fairly rudimentary, making simplifications such as using the output of an acoustic simulation as the input to a finite-element tissue simulation to model shear wave generation210 or the interpolation of a computational fluid dynamics model to update scatterer positions in a flow imaging simulation.212
Medical images are valuable to the extent they can be used for their intended purposes: detecting an abnormality, qualifying a disease, or assessing its progress or remission. As an analogue to clinical imaging trials, virtual imaging trials in medical imaging should likewise provide a mechanism to render a judgment (or a set of judgments) about a virtual imaging case, a function that we characterize here as “interpretation.” Without such a provision, VCT cannot deliver its promised utility to provide answers to image-based clinical or technological questions.
Image interpretation is a process by which an imaging case (or a combination of cases from the same patient) is understood in the context of the clinical task at hand. In the real domain, this interpretation is primarily performed by an expert imaging physician (usually a radiologist). In the virtual domain, the physician is replaced by a virtual observer, with its performance aspired to match that of a real human expert, just like virtual patients and virtual imaging systems aim to emulate their corresponding real counterparts as closely as possible.
Observer models refer to a class of mathematical constructs that aim to emulate diagnostic tasks performed by human observers.213–236 The term “model” here is a substitute for the term “virtual” in the VCT framework. These models are grounded in the definition of task-based image quality, i.e., the effectiveness with which an image can be used for its intended task,237 with the premise of predicting the performance of human observers (e.g., radiologists) for a specific task. In their most common implementation, they provide binary decisions (e.g., signal present/signal absent, normal/abnormal) across an ensemble of images for which the ground truth is known. The most common observer model paradigm is the signal-known-exactly/location-known-exactly paradigm in which the observer “knows” the size, shape, contrast, and location of the signal to be detected. Given an image, the observer is tasked to decide if the signal is present or not. More advanced (and perhaps more realistic) paradigms include signal-known-statistically, location-known-statistically, and various combinations of the above.238–241 There have also been formulations of the observer models incorporating visual search,242–244 visual discrimination,245,246 nonbinary tasks,247,248 and estimation249 that can be considered depending on the goals of the VCT.
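The signal-known-exactly/location-known-exactly paradigm can be sketched with one of the simplest observer models, a non-prewhitening matched filter, which uses the known signal as its template. The sketch below assumes uncorrelated Gaussian noise (under which this observer is also the ideal linear observer) and a toy five-pixel signal; these choices, and all function names, are illustrative assumptions rather than a method from the studies cited above.

```python
import math
import random

def npw_statistic(image, template):
    """Decision variable: inner product of the image with the known signal template."""
    return sum(p * t for p, t in zip(image, template))

def make_image(signal, noise_sigma, present, rng):
    """Flattened image: white Gaussian noise, plus the signal when present."""
    return [(s if present else 0.0) + rng.gauss(0.0, noise_sigma) for s in signal]

def auc_from_scores(present_scores, absent_scores):
    """Nonparametric AUC (Mann-Whitney): fraction of correctly ordered pairs."""
    wins = 0.0
    for sp in present_scores:
        for sa in absent_scores:
            if sp > sa:
                wins += 1.0
            elif sp == sa:
                wins += 0.5
    return wins / (len(present_scores) * len(absent_scores))

def run_ske_study(signal, noise_sigma, n_cases, seed=0):
    """Score n_cases signal-present and n_cases signal-absent images; return AUC."""
    rng = random.Random(seed)
    present = [npw_statistic(make_image(signal, noise_sigma, True, rng), signal)
               for _ in range(n_cases)]
    absent = [npw_statistic(make_image(signal, noise_sigma, False, rng), signal)
              for _ in range(n_cases)]
    return auc_from_scores(present, absent)

def theoretical_auc(signal, noise_sigma):
    """For white Gaussian noise: d' = ||s|| / sigma and AUC = Phi(d' / sqrt(2))."""
    d_prime = math.sqrt(sum(s * s for s in signal)) / noise_sigma
    return 0.5 * (1.0 + math.erf(d_prime / 2.0))

# Hypothetical toy signal and noise level.
toy_signal = [0.0, 1.0, 2.0, 1.0, 0.0]
estimated = run_ske_study(toy_signal, noise_sigma=2.0, n_cases=400, seed=1)
predicted = theoretical_auc(toy_signal, noise_sigma=2.0)
```

With enough cases, the estimated AUC approaches the closed-form prediction, which is the kind of check that known ground truth makes possible.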
Observer models have been deployed in numerous studies across a variety of clinical imaging tasks.215,217,227,230,231,233,250–261 Traditionally, there have been two general approaches to observer model estimation using spatial or frequency domain computations. In the spatial domain, a large ensemble of image data is used to estimate task performance directly from the signal-present and signal-absent images.240,262 This approach is well-suited to virtual trials in which a vast number of images with known ground truth can be synthesized under clinically relevant conditions. Frequency domain computation is more practical when a smaller number of images are available, enabling practical comparison of image quality across patients, abnormalities, and imaging systems, further expandable to cross-system and cross-modality comparisons.227,263,264 They, however, are restricted to conditions of local stationarity and small-signal linearity, conditions that can be met in many imaging applications.106,227,265,266 Samei and Krupinski248 provided a comprehensive review of observer models.
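The frequency-domain approach can be illustrated with the standard detectability-index expressions for a prewhitening observer and a non-prewhitening observer, written here as discrete sums over one frequency axis for brevity (in practice the integration is over 2-D or 3-D frequency space). The task function, MTF, and NPS values below are hypothetical, and local stationarity and small-signal linearity are assumed, as noted above.

```python
import math

def dprime_prewhitening(task, mtf, nps, df):
    """d'^2 = sum( |W|^2 MTF^2 / NPS ) df  (prewhitening observer)."""
    total = sum((w * m) ** 2 / n for w, m, n in zip(task, mtf, nps))
    return math.sqrt(total * df)

def dprime_npw(task, mtf, nps, df):
    """d'^2 = [sum(|W|^2 MTF^2) df]^2 / [sum(|W|^2 MTF^2 NPS) df]  (non-prewhitening)."""
    numerator = (sum((w * m) ** 2 for w, m in zip(task, mtf)) * df) ** 2
    denominator = sum((w * m) ** 2 * n for w, m, n in zip(task, mtf, nps)) * df
    return math.sqrt(numerator / denominator)

# Hypothetical sampled curves on a frequency grid with spacing df (cycles/mm).
df = 0.05
freqs = [df * i for i in range(1, 21)]
task = [math.exp(-f) for f in freqs]            # task function |W(f)|
mtf = [math.exp(-2.0 * f * f) for f in freqs]   # system MTF
nps_white = [4.0] * len(freqs)                  # flat (white) noise power spectrum
nps_ramp = [1.0 + 5.0 * f for f in freqs]       # correlated noise

same_for_white = (dprime_prewhitening(task, mtf, nps_white, df),
                  dprime_npw(task, mtf, nps_white, df))
differs_for_ramp = (dprime_prewhitening(task, mtf, nps_ramp, df),
                    dprime_npw(task, mtf, nps_ramp, df))
```

For white noise the two indices coincide; for correlated noise the non-prewhitening index cannot exceed the prewhitening one.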
Most observer models have primarily been oriented toward the detection of abnormalities. Image interpretation, however, often goes beyond detection to the tasks of characterization. In characterization, images are quantified in terms of features that are deemed most relevant to the diagnostic process. One form of this characterization is through radiomics and image quantifications. Radiomics or image quantification in general is not a subsection of a VCT; rather, it is one way of quantifying images that can be and has been applied to virtual data. Readers are encouraged to read further on radiomics through Refs. 267–269.
Although most image interpretations today are based on human observers, the process is increasingly positioned to be aided and even replaced (currently in niche applications) by computational algorithms. This is primarily due to the increasing power of computers and the utility of machine learning in pattern recognition and quantification.270–272 As AI interpretation of images continues to progress, VCTs, to be effective and relevant, should adapt and include provisions for virtual images to be interpreted by these emerging “AI observers.” In fact, the fields of computer-aided diagnosis (which includes computer-aided detection, machine learning, and AI) and image perception (which includes model observers) have shared many of the same methodologies for several decades.224,273 That work has continued to coalesce in recent years with “deep learning model observers” that report better agreement with or outperform human observers.274–277
The developments in virtual humans, virtual scanners, and virtual interpretations, as summarized in previous sections, have enabled medical imaging researchers to conduct clinical trials virtually for various applications. In the following, we present examples of VCT studies that demonstrate their potential to substitute for clinical trials across imaging modalities.
One of the earliest applications of VCTs was in the area of breast imaging for investigations of image quality, dosimetry, optimization, and technology evaluation.30,213,278–289 Recent VCTs have attempted to predict the ranking and the magnitude of improvement of breast imaging technologies as seen by human observers.26,278,290–292
In one example, VCTs were used in the Optimam project to evaluate the smallest detectable diameter of various lesions, showing that digital breast tomosynthesis (DBT) is superior to digital mammography (DM) for masses,290 while the converse is true for calcifications.291 Subsequent work292 confirmed a significant difference between DM and DBT in mass detection but showed no significant difference between narrow and wide angle DBT although a trend toward superior performance for wide angle DBT was noted. This work also showed that detectability was affected by the radiation dose, with lower detectability (larger diameters) at lower radiation doses. The results, reported in terms of the smallest lesion diameter, yielded rankings concordant with clinical trials of DM and DBT. This work showed that VCTs could replicate the ranking of modalities in terms of lesion type and radiation dose.
To accurately predict the degree of improvement that new technologies afford radiologists, VCTs need to be conducted in terms of metrics used in clinical trials, such as the receiver operating characteristic (ROC) curve and the area under the curve (AUC). Researchers at the US Food and Drug Administration (FDA) compared the performance of DBT and DM with predicate premarket approval data for masses and calcifications based on differences in AUC.26 In further work, Bakic et al.213 compared the performance of DBT and DM in the detection of calcifications and masses, simulating the Hologic Selenia Dimensions under clinically realistic conditions. The results of the VCT were compared with data reported by Rafferty et al.293 To compare differences in AUC, the VCT performance was calibrated to the predicted DM results; the DBT results were calculated for matching conditions. The results of the VCT closely match those of the clinical trial, with the VCT predicting the AUC for masses and calcifications to within 4% (Table 1). Note that while the results match the difference in AUC, they do not predict the shape of the ROC curves accurately (Fig. 12). ROC shape is determined by the admixture of lesion complexity. In Fig. 12, the slope of the ROC curve is greater near the origin for the clinical results than for the VCT; this implies that the clinical cases varied in difficulty, while the VCT cases were more homogeneous. Thus while VCTs have now been shown to predict human performance in terms of changes in AUC and , future work is still needed to improve VCT realism.
Table 1. Detectability of calcifications and masses in terms of AUC for the Hologic Selenia Dimensions (adapted from Ref. 213).
| DM (AUCp) | DBT (AUCs) | AUCs−AUCp | DM (AUCp) | DBT (AUCs) | AUCs−AUCp |
The above presents just some representative examples for the use of VCTs in breast imaging research. Many additional studies have been conducted using breast imaging VCT pipelines and pipeline components, including the assessment of image processing, image registration, imaging device design, and optimization.30,213,278–289
In CT imaging, a broad range of VCT studies have been conducted with more focus on dosimetry and image quality assessments. With CT being the single largest source of medical radiation exposure,294 reducing the dose to patients without sacrificing image quality is desired. Dose can be studied using VCTs in which computational phantoms are “imaged” using MC-based CT simulators. Studies of this nature cannot be performed using live subjects due to ethical concerns.
Organ doses have been estimated under various imaging protocols across virtual populations of adult,295–297 pediatric,296,298,299 and pregnant patients.300,301 In addition, these studies investigated the relationship between the estimated organ doses and CT parameters and patient attributes. These dosimetry studies showed an exponential relationship between the organ doses (as well as effective dose) and body diameter.298,299 This relationship was found to be stronger for the organs inside the scan coverage.295 Based on these studies, a smartphone application295,302 was developed to estimate organ doses given the patient attributes and the imaging protocols. Further, Zhang et al. investigated the uncertainties in organ dose estimations for four computational phantoms with matched organ mass, body weight, and height. Results showed that variation in organ locations and anatomy, as well as dose approximation, can result in large differences in the estimations, especially for partially irradiated organs.303
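An exponential dose-diameter relationship of this kind is typically characterized by a log-linear least-squares fit. The sketch below assumes a model of the form dose = a·exp(b·diameter) and uses synthetic, noise-free values generated from assumed coefficients; the numbers are purely illustrative and are not results from the cited studies.

```python
import math

def fit_exponential(diameters, doses):
    """Least-squares fit of dose = a * exp(b * diameter) via a line fit to ln(dose)."""
    logs = [math.log(d) for d in doses]
    n = len(diameters)
    mean_x = sum(diameters) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in diameters)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(diameters, logs))
    b = sxy / sxx                     # slope of ln(dose) versus diameter
    a = math.exp(mean_y - b * mean_x) # intercept mapped back to dose units
    return a, b

def predict_dose(a, b, diameter):
    """Evaluate the fitted model at a given body diameter."""
    return a * math.exp(b * diameter)

# Synthetic example: normalized organ dose versus body diameter (cm),
# generated from assumed coefficients a = 3.0, b = -0.04.
example_diameters = [20.0, 25.0, 30.0, 35.0, 40.0]
example_doses = [3.0 * math.exp(-0.04 * d) for d in example_diameters]
a_fit, b_fit = fit_exponential(example_diameters, example_doses)
```

A fitted pair of coefficients per organ and protocol is the kind of compact model that allows dose to be estimated from patient size without rerunning an MC simulation.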
Another VCT study investigated the dose reduction to breast while using an organ-based tube current modulation (TCM) and a breast-positioning technique. TCM was set up to reduce the current within a 120 deg anterior zone. The breasts in the computational phantoms were morphed to model a support brassiere, constraining the majority of the organ to be inside the 120 deg anterior zone. The study showed that compared with angular TCM, the combination of organ-based TCM and the breast positioning technique reduced the dose by .
In a recent VCT study, Sahbaee et al.304 investigated the effects of an iodinated contrast agent on organ dosimetry. The study incorporated a contrast material propagation model in a library of computational phantoms (Sec. 2.5). Organ doses were estimated at different injection times. Results showed that dose increased due to the presence of iodine, suggesting the need for considering both image quality and patient dose while optimizing contrast-enhanced CT protocols.
Several groups have utilized VCTs to evaluate their novel CT image reconstruction algorithms.305–307 Abadi et al.141,308 characterized the noise texture across filtered back projection and iterative reconstruction algorithms. In this study, an XCAT phantom41,55 was imaged 50 times using a validated CT simulator, set up to mimic the parameters and settings of a specific scanner model (Siemens Definition Flash). The simulated images were reconstructed with both filtered backprojection and iterative reconstruction algorithms using commercial software. The results showed nonstationarity of noise texture in iterative reconstructions and spatial dependence of the peak frequencies in the noise power spectra. The images with iterative reconstruction had lower noise in general but higher noise in the high spatial frequency (edges) regions.
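Repeated simulations of the same phantom make ensemble-based noise analysis straightforward. The following minimal sketch estimates a noise power spectrum from repeated realizations by subtracting the ensemble mean and averaging the squared Fourier magnitude of the residuals. For brevity it works on 1-D profiles with a direct DFT, and the data are synthetic white noise; it is an illustration of the general ensemble approach, not the analysis pipeline of the cited study.

```python
import cmath
import math
import random

def dft_power(samples):
    """Squared magnitude of the discrete Fourier transform (direct O(N^2) sum)."""
    n = len(samples)
    power = []
    for k in range(n):
        acc = 0j
        for j, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * j / n)
        power.append(abs(acc) ** 2)
    return power

def ensemble_nps(realizations, pixel_size):
    """NPS(f_k) = (pixel_size / N) * mean over realizations of |DFT(residual)|^2."""
    n_real = len(realizations)
    n = len(realizations[0])
    # The ensemble mean estimates the noise-free profile; subtracting it leaves noise only.
    mean_profile = [sum(r[j] for r in realizations) / n_real for j in range(n)]
    nps = [0.0] * n
    for r in realizations:
        residual = [r[j] - mean_profile[j] for j in range(n)]
        for k, p in enumerate(dft_power(residual)):
            nps[k] += p
    scale = pixel_size / (n * n_real)
    return [v * scale for v in nps], mean_profile

# Synthetic example: 50 repeated "scans" of a 32-pixel profile with white noise.
demo_rng = random.Random(0)
demo_scans = [[100.0 + demo_rng.gauss(0.0, 10.0) for _ in range(32)] for _ in range(50)]
demo_nps, demo_mean = ensemble_nps(demo_scans, pixel_size=0.5)
demo_variance = sum(demo_nps) / (32 * 0.5)  # integral of the NPS over frequency
```

Integrating the NPS over frequency recovers the pixel variance of the residuals, a useful sanity check on the normalization. Computing the NPS separately in different regions of the image is one way to reveal the nonstationarity described above.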
In nuclear imaging, VCT studies have been performed to study the effects of anatomical parameters, PET309–311 and SPECT312,313 image reconstruction methods, and acquisition energy windows on SPECT image quality, and the ability of observers to detect myocardial perfusion (MP) defects in MP SPECT.314–316 The results from VCT studies have propelled the clinical implementation of new quantitative image reconstruction methods and acquisition protocols that are designed to compensate for image degrading factors to improve detection of defects or lesions in nuclear imaging, leading to improved clinical diagnosis and improved statistical power for clinical trials.317–320
Recently, VCT studies have played an important role in understanding the blurring effects of RM on static 3-D image quality and in developing and evaluating 4-D image reconstruction methods that reduce RM blurring and improve image quality in both SPECT and PET.321,322 They have also been used to develop a new generation of 4-D image reconstruction methods that additionally compensate for cardiac motion, yielding 4-D cardiac-gated MP SPECT and PET images with significantly reduced RM blur and lower noise levels.323 Similar VCT studies have also been done to evaluate motion-compensated reconstruction algorithms in the context of head motion in PET brain imaging.324
VCTs have also contributed significantly to the research and development of radiopharmaceuticals used in diagnostic nuclear medicine and recently in targeted radionuclide therapy or radioimmunotherapy. Using more realistic human phantoms and combining them with biodistribution data of a given radiopharmaceutical, VCT studies have provided more accurate estimates of average radiation dose to different organs of humans of different sexes, ages, and body builds.325,326 For diagnostic imaging purposes, the results are useful in setting guidelines for the maximum allowable injected dose for the best possible image quality while protecting patients from the harmful effects of radiation, especially to critical organs that are most sensitive to radiation. Accurate radiation dosimetry estimation of the radiopharmaceutical to different organs and cancer tissue is also important in targeted radionuclide therapy. For individualized treatment planning and precision medicine, accurate radiation dosimetry estimation before treatment is important in determining the maximum possible injected dose for the individual patient,327,328 especially children and newborns,329 and after treatment for predicting treatment success.
Modeling and simulation of MR imaging physics is a rich and mature field as shown in Sec. 3.3. Applications of these simulation techniques to VCT, however, are confined to relatively few areas. A key application is breast imaging in which VCTs addressed image quality, dosimetry, optimization, and technology evaluation studies. Realistic breast models were developed by Elangovan et al.285 for these purposes.
Other virtual studies in MRI have focused on brain applications. Kwan330 and Aubert-Broche331 developed simulators to quantitatively evaluate image analysis methods in brain MRI under different imaging conditions by varying scan parameters. Such simulations allow for the testing of different methods with complete user control over the imaging parameters and with the known ground truth offered by the computational phantoms.
Studies have also investigated 4-D MRI techniques and their applicability to radiation therapy.332–334 For example, Lui et al.334 investigated the feasibility of a 4-D diffusion-weighted MR imaging (4D-DWI) technique for imaging RM for radiation therapy applications. In evaluating their technique, the authors utilized the 4-D XCAT computational phantom, set up to include a pancreatic tumor and to simulate different RMs. The tumor motion trajectories from the simulated images were extracted and compared with the known RM from the phantoms. Through the simulations and additional patient studies, it was shown that 4D-DWI can lead to more accurate RM measurement, which can improve the visualization and delineation of tumors for radiotherapy.
Beyond RM, recent VCTs in MRI have focused on cardiac applications.53,335–339 In such studies, simulation methods, providing the known anatomy and cardiac motion, are used to investigate acquisition and reconstruction methods in cardiac imaging. Image reconstruction is a significant area of research in MRI as different techniques are being investigated to reduce scan times and increase spatial and/or temporal resolution. Figure 13 from Ref. 338 shows a comparison of two reconstruction methods, PCA340 and SPARSE,341 for use in MP imaging.
Another area seeing considerable work is the study of the electromagnetic compatibility of medical devices within an MRI scanner, particularly to understand device and tissue heating and to evaluate device safety as part of regulatory processes.342,343 Once more, we see the role that modeling and simulation play in settings in which experimental data are impractical or unethical to collect.
Modeling and simulation in MRI have been used amply in the development and validation of novel MRI technology.344–352 With the advancement of phantoms in becoming more and more realistic, VCTs will find even more applications in MRI.
VCTs in ultrasound are primarily used in the development of new ultrasound transmission sequences, beamforming strategies, and postprocessing algorithms given a ground truth target with which to compare.
The FDA, in line with several professional societies, recommends the “as low as reasonably achievable” principle for acoustic output to minimize the risk of tissue heating, cavitation damage, and other possible bioeffects. The FDA also mandates maximum exposure levels for diagnostic imaging, which requires an understanding of the spatial and temporal average intensities as well as peak pressures achieved.353 Acoustic field modeling provides estimates of these quantities during the design phase, with experimental measurements for further validations. Nonlinear simulation, especially including heterogeneous media, is particularly valuable for understanding the distribution of acoustic energy in the body, as shown in Fig. 14. The complex acoustic environment often violates simplifying assumptions made in the conventional derating scheme.354
It is a challenging task to control the focus of sound through the skull due to the complex aberrations and reverberations induced. A patient-specific understanding of these phenomena is essential for effective high-intensity focused ultrasound therapy in which localized energy deposition is required. It has been demonstrated that CT scans of ex vivo skull samples can be used with nonlinear simulation software to perform adaptive focusing through the skull by modeling the distortions of the propagating wave, both increasing the energy delivered and reducing the spatial spot size.355 The use of these models for pulse echo imaging is even more difficult due to higher frequencies and two-way propagation, but it is an important application as well.
VCTs are instrumental in improving image quality through beamforming and image postprocessing algorithm development. Fundamental mechanisms of image degradation due to acoustic clutter are just beginning to be understood through simulation study using nonlinear tools356 combined with digitized histological samples357 or tissue models derived from other imaging methods.358 Point targets can provide information on resolution not available in clinical imaging, whereas anechoic and echogenic targets with known geometry and scattering contrast predict clinical imaging performance using a ground truth with which to compare across imaging methods.359–363 Blood vessels of varying geometries have been simulated by pairing computational fluid dynamics software with pulse echo acoustic simulation for the development of flow estimation techniques, mimicking in a controlled environment the complex flow patterns observed in vivo.212
The Quantitative Imaging Biomarkers Alliance is developing standards for using ultrasound to estimate shear wave speed as an indicator of disease state. To support the development of algorithms to estimate various tissue properties from these data, they have published simulation tools that combine finite-element methods with acoustic simulation and have provided standardized digital phantoms.364 These digital phantoms are also being provided to physical phantom manufacturers to ensure that they are designed and manufactured with accurate performance.
As in other medical imaging fields, machine learning is poised to revolutionize the processing and interpretation of ultrasound images. Development of these algorithms would be greatly accelerated if repositories of large numbers of well-labeled ultrasound images and data were made available. Medical privacy concerns and competitive advantage are both likely factors in limiting the widespread distribution of these types of data sets, but a few have been made publicly available.365–367 With sufficiently accurate human models and imaging simulators, VCT techniques can drastically increase the amount of data available for such training.
The above studies provide many different examples showing the use of virtual tools toward improved medical imaging devices and techniques. VCTs are still a relatively new concept, with challenges to be overcome in terms of their components (phantoms, simulators, and image analysis). As such, they are still not quite at the point where they can fully replace human trials. As they stand now, however, they do provide a key mechanism with which to comprehensively study the vast number of factors (patient, scanner, and physical) that can affect medical imaging, providing a means to narrow these factors down to the ones most likely to succeed, paving the way toward more targeted and efficient patient trials.
Verification, Validations, and Inference
Any model can be trusted to the extent that it can replicate or predict reality. VCTs aim to reflect the output of actual clinical trials. As such their effectiveness and utility hinge on their representational ability. Toward that objective, VCTs are expected to follow certain processes and expectations:
In the simulation and scientific computing community, verification is done to confirm that the “equations were solved correctly,” and validation is done to confirm that the “correct equations were solved.”368 In other words, verification ensures that there are no major misassumptions or coding errors in the algorithms, and validation demonstrates how close the outputs of the simulation are to experimentally measured data. Each component of a VCT, whether the patient, the imaging system, or the image interpreter, should be independently verified and validated. Verification and validation should ideally take place at multiple levels of granularity. They can be applied to a simulation in its subcomponents41,369,370 (e.g., model of x-ray spectrum), whole component141,142,262,371,372 (e.g., accuracy in creating realistic simulated images), or multicomponent26,278 (e.g., accuracy in creating the complete human imaging process from the patient to the output of the imaging task).
One approach for these validations has been through the simulation of IEC standard tests. In this process, computational models of physical phantoms with known properties are simulated and compared with actual measurements.141 Further, the American Society of Mechanical Engineers (ASME) V&V 40 subcommittee373 provides a comprehensive standard framework for evaluating the relevance and adequacy of verification and validation for medical devices, suggesting that the credibility of a simulation framework should be judged based on its context and application.
The choice of evaluatory metrics is critical in designing and validating VCTs. For example, the Optimam results290–292 were reported in terms of minimum detectable diameter. Although these VCTs accurately predicted rankings of the imaging technologies that were concordant with clinical data, precise validation was not possible since ground truth is lacking for the minimum detectable diameter of lesions in clinical cases (such data do exist for phantoms). As discussed in Sec. 5.1, the use of AUC as a metric requires that the VCT be calibrated to the predicate technology in terms of the AUC, or the results must be reported in terms of .278 Additionally, accurate prediction of the ROC curve requires that the VCT match the admixture of case difficulty seen in a given clinical population. The goal is to achieve and claim equivalency.
Ascertaining the equivalency of a VCT to a corresponding real trial is a statistical task. Equivalency is never 100% assured. Even if two clinical trials are undertaken at the same time, the results will likely not match exactly. So how close is close enough? Although this question may not be answerable perfectly, it can be answered practically. Equivalency can be established based on the variability expected in an actual trial. A VCT process can be considered valid and reliable to the extent that it replicates a clinical scenario: if the VCT results statistically fall within the range of variability expected of a trial, the VCT can be considered valid. For example, if an observer model output falls within the range of results from multiple observers, one can claim that the observer model is as good as any of those observers.
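The observer example above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `within_observer_range` and all AUC values are hypothetical, and a simple min-to-max span is assumed as the stand-in for "expected variability"; a real study would more likely use a prespecified statistical equivalence margin.

```python
def within_observer_range(model_score, reader_scores):
    """Return True if the model score lies inside the span of reader scores."""
    return min(reader_scores) <= model_score <= max(reader_scores)


# Hypothetical AUC values for five human readers and one observer model.
reader_aucs = [0.81, 0.84, 0.79, 0.86, 0.83]
model_auc = 0.82

# The model is treated as "as good as any reader" only if it falls in the span.
print(within_observer_range(model_auc, reader_aucs))
```

The min-to-max span grows with the number of readers, so a criterion of this kind becomes more permissive as readers are added; that is one reason a fixed margin may be preferable in practice.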
The statistical reliability of VCT can be ascertained through uncertainty analysis in which simulation parameters are perturbed and the corresponding effects on final results are evaluated. In the context of imaging simulations, these parameters could be attributes of a computational phantom (e.g., organ shape) or characteristics of an imaging simulator (e.g., source spectrum of an x-ray system). The goal is to identify the sources of uncertainties and determine how much they influence the final performance measure by repeating simulations while perturbing these parameters and examining the results.368 Such studies can enhance the confidence in the VCT predictions and thus support their validation.
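A one-at-a-time perturbation study of this kind might look like the following sketch. Everything here is an assumption made for illustration: `toy_vct` is a made-up linear stand-in for a full VCT pipeline, the parameter names `organ_scale` and `kvp` are hypothetical, and Gaussian perturbations with a 5% relative standard deviation are chosen arbitrarily.

```python
import random
import statistics


def toy_vct(organ_scale, kvp):
    """Hypothetical stand-in for a VCT pipeline; returns a figure of merit."""
    return 0.80 + 0.05 * (organ_scale - 1.0) - 0.001 * (kvp - 120.0)


def output_spread(param, nominal, rel_sd, n=500, seed=0):
    """Perturb one parameter around its nominal value; return the output SD."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n):
        params = dict(nominal)
        params[param] = rng.gauss(nominal[param], rel_sd * abs(nominal[param]))
        outputs.append(toy_vct(**params))
    return statistics.stdev(outputs)


nominal = {"organ_scale": 1.0, "kvp": 120.0}
spread = {name: output_spread(name, nominal, rel_sd=0.05) for name in nominal}

# Parameters with the largest output spread dominate the uncertainty budget.
for name, sd in sorted(spread.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: output SD = {sd:.4f}")
```

Perturbing one parameter at a time ignores interactions between parameters; a joint sampling scheme would be needed to capture those.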
In verification and validation of VCT, one may seek concordance with absolute performance. This is often challenging, as it is nearly impossible to model all of the nuances and permutations of the patient, the technology, or the interpretation. One may obtain absolute performance concordance by tweaking model parameters with the goal of matching the results. However, such a study provides little confidence in the generalizability of the approach to other data when such tweaking is not possible. In contrast, matching the differentials across conditions and relative rankings in the performance between system configurations is easier because these rankings or differences tend to be less sensitive to small biases in the VCT models. Therefore, the validity and applicability of VCT are higher when targeted to predict rankings of technologies or conditions.
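To illustrate why rankings are more robust than absolute values, the sketch below computes Kendall's tau (the tau-a variant, which assumes no tied values) between VCT and clinical performance for four system configurations. The numbers are hypothetical and are chosen so that the VCT is biased low by several AUC points yet still orders the systems identically.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a rank correlation for two equal-length sequences."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical AUCs for four system configurations (same order in both lists).
vct_auc = [0.70, 0.75, 0.82, 0.88]
clinical_auc = [0.78, 0.81, 0.87, 0.93]

tau = kendall_tau(vct_auc, clinical_auc)
largest_gap = max(abs(v - c) for v, c in zip(vct_auc, clinical_auc))
print(f"tau = {tau:.2f}, largest absolute AUC gap = {largest_gap:.2f}")
```

Here the absolute values disagree for every configuration, but the rank agreement is perfect, which is the situation the paragraph above describes.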
Related to closeness is the question of generalizability. Can a VCT be reliably applied to answer a question for which there is no clinical data? After all, if every VCT required a validation of its own to be deemed reliable, the very purpose of VCT to make the process of trials easier and more efficient is defeated. The answer lies in the diversity of the space within which the prior VCT is validated. A VCT can be considered reasonably reliable if it is applied to conditions and claims that are in close proximity of a validated space. For example, an MC simulation of dose validated for one CT scan can be expected to be reliably applied to another CT scan. There are of course different levels of closeness here as well. The generalizability of such a simulator will be strongest when it is applied to the same scanner and less when applied to other scanners, geometries, energy ranges, etc. Ideally, as practically as possible, a VCT should represent and be validated within the diversity of conditions (in patient, technology, and analysis) within which it is expected to be applied. In that way, the VCT is generalized to an “interpolated” set of conditions as opposed to an “extrapolated” set, e.g., VCTs validated for 5- and 12-year-old models can be readily trusted (unvalidated) for 7-year-old patients but not for bariatric adult models.
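The distinction between "interpolated" and "extrapolated" conditions can be made concrete with a simple check. The sketch below is a crude proxy under stated assumptions: the function name `is_interpolated`, the parameter names, and all values are hypothetical, and a per-parameter bounding box is used in place of a proper multidimensional test such as a convex hull.

```python
def is_interpolated(condition, validated):
    """True if every parameter lies within the range spanned by validated cases."""
    for name, value in condition.items():
        seen = [case[name] for case in validated]
        if not (min(seen) <= value <= max(seen)):
            return False
    return True


# Hypothetical validated phantom conditions (5- and 12-year-old models).
validated = [
    {"age_years": 5, "bmi": 15.0},
    {"age_years": 12, "bmi": 18.0},
]

seven_year_old = {"age_years": 7, "bmi": 16.0}
bariatric_adult = {"age_years": 40, "bmi": 42.0}

print(is_interpolated(seven_year_old, validated))
print(is_interpolated(bariatric_adult, validated))
```

A bounding box can accept combinations of parameters that were never validated together, so passing this check is a necessary rather than sufficient condition for trusting a VCT at a new condition.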
In the discussion of validation and generalizability, it should further be noted that the goal of a VCT is often not to predict the outcome of the imaging process for an individual patient for which 100% realism is unachievable. The most frequent objective is rather to reasonably represent a variety of human imaging conditions with diversity beyond what is possible with simple phantoms so that the outcome of imaging processes can be more reliably understood and optimized in the context of clinically relevant tasks. VCTs generally do not claim perfect realism nor individual realism, rather results that are close enough to offer imaging technology assessment from a population perspective.
In a VCT, simulation parameters can be tweaked to match almost any desired result. Therefore, an important feature of a credible virtual trial is a formalized study plan that is designed before the trial starts and is followed through. Patient models, imaging simulators, and observer models should be tested, verified, and validated individually before the entire sample is run through the pipeline and the final predetermined performance metric is calculated. In addition, pilot testing sets should be separate from pivotal testing sets, and deviations from the protocol should be explained in the results.
It can be argued that VCTs in general and VCTs in the context of medical imaging (so-called virtual imaging trials) are still in their infancy. VCTs are taking an increasing role in ascertaining and qualifying the effectiveness of medical imaging technologies, as evidenced by a few recent FDA approvals based on VCTs. However, they are still far from being trusted as a mainstream, primary method to answer qualification, research, or clinical questions. Yet the promise is worthwhile as their use can significantly advance medicine and medical science. One can imagine a future in which VCTs are embraced as a mainstream methodology in medical science to provide reliable experimentation without excessive cost or ethical roadblocks. To attain that level of reliability, much still needs to be done. Progress is needed to increase the realism and the diversity of the space covered, in terms of modeling both patients (individuals and populations) as well as systems and analyses.
For patient modeling, work remains in progress on modeling subjects that diversely sample the population, the comprehensive suborgan anatomy and function, and the disease, all with adequate targeted diversity for the questions at hand. For imaging simulators, efforts are being spent on creating more detailed, accurate, and system-specific models of imaging systems. Similarly, it is a challenging task to simulate the myriad of observer models (from residents to attendings to domain leaders to AI). There is also a need to include all aspects of clinical interpretation beyond simple detection and classification of focal abnormalities. These remain the exciting prospects for the future role and potential of VCTs in medicine and in advancing human health.
Further, VCTs would be more impactful if the methods are standardized and disseminated. A disseminated platform enables concurrent development by multiple groups, thus promoting innovation. Today, most algorithms are published without a reference implementation. The scientific process is undermined when published results cannot be reproduced by others, and it is exceedingly difficult to evaluate how a VCT or the pipeline components will perform given different input data. For researchers to build upon the work of others, reimplementation of previous work is frequently required, but this is often difficult or infeasible and is prone to errors. VCT researchers are encouraged to collaborate to establish standards for conducting VCTs. As more work is done to advance and standardize the individual components of VCTs achieving greater levels of realism, VCTs stand to alter the paradigm of medical imaging research and applications.
The authors have no relevant financial interests in the manuscript and no other relevant conflicts of interest to disclose.
This work was partly supported by the National Institutes of Health (Nos. R01EB001838 and R01HL131753).