Cancer imaging phenomics toolkit: quantitative imaging analytics for precision diagnostics and predictive modeling of clinical outcome

Christos Davatzikos; Saima Rathore; Spyridon Bakas; Sarthak Pati; Mark Bergman; Ratheesh Kalarot; Patmaa Sridharan; Aimilia Gastounioti; Nariman Jahani; Eric Cohen; Hamed Akbari; Birkan Tunc; Jimit Doshi; Drew Parker; Michael Hsieh; Aristeidis Sotiras; Hongming Li; Yangming Ou; Robert K. Doot; Michel Bilello; Yong Fan; Russell T. Shinohara; Paul Yushkevich; Ragini Verma; Despina Kontos

doi:10.1117/1.JMI.5.1.011018

11 January 2018


Journal of Medical Imaging, Vol. 5, Issue 1, 011018 (January 2018). https://doi.org/10.1117/1.JMI.5.1.011018

Abstract

The growth of multiparametric imaging protocols has paved the way for quantitative imaging phenotypes that predict treatment response and clinical outcome, reflect underlying cancer molecular characteristics and spatiotemporal heterogeneity, and can guide personalized treatment planning. This growth has underlined the need for efficient quantitative analytics to derive high-dimensional imaging signatures of diagnostic and predictive value in this emerging era of integrated precision diagnostics. This paper presents the cancer imaging phenomics toolkit (CaPTk), a new and dynamically growing software platform for analysis of radiographic images of cancer, currently focusing on brain, breast, and lung cancer. CaPTk leverages the value of quantitative imaging analytics along with machine learning to derive phenotypic imaging signatures, based on two-level functionality. At the first level, image analysis algorithms are used to extract comprehensive panels of diverse and complementary features, such as multiparametric intensity histogram distributions, texture, shape, kinetics, connectomics, and spatial patterns. At the second level, these quantitative imaging signatures are fed into multivariate machine learning models to produce diagnostic, prognostic, and predictive biomarkers. Results from clinical studies in three areas are shown: (i) computational neuro-oncology of brain gliomas for precision diagnostics, prediction of outcome, and treatment planning; (ii) prediction of treatment response for breast and lung cancer; and (iii) risk assessment for breast cancer.

1. Introduction

Modern medical images are complex, often derived from different and complementary acquisition protocols or modalities, and can elucidate multifaceted phenotypic aspects of cancer. Traditional measurements (e.g., tumor diameter and volume) capture only a small fraction of such multifaceted and heterogeneous phenotypes, and therefore, limit evaluation to basic features of a tumor and its progression during treatment. Extensive literature over the past decade has shown that diverse and complementary multiparametric imaging features, beyond the traditional visually observable measurements, such as volumetric, textural, morphologic, kinetic, connectomics, spatial patterns, and intensity histograms (i.e., radiomic features), may result in comprehensive phenotypic imaging signatures that can offer additional diagnostic, prognostic, and predictive value for many types of cancer.¹ This emerging field, which we will herein refer to as quantitative imaging phenomics (QIP), consistently shows that phenotypic imaging signatures of various cancers relate to underlying molecular characteristics, treatment response, and patient survival, with the potential to augment conventional prognostic and predictive assays.²

Although such QIP signatures and their use are progressively reported throughout scientific literature,¹ they have yet to be adopted in clinical studies and practice. The increasingly complex nature of computational imaging algorithms and the challenge of accessing clinical datasets for training and validating these algorithms limit the availability of QIP signatures for both clinical researchers and practitioners. The current paper describes an evolving effort in the development of the cancer imaging phenomics toolkit (CaPTk), an imaging analytics suite of open-source software algorithms, designed to derive extensive panels of QIP features and integrate them into noninvasive diagnostic and predictive models, as well as systems supporting optimized personalized cancer treatment planning. Appendix B presents a detailed overview of the features currently supported by CaPTk.

Quantitative cancer imaging phenomics, and hence the CaPTk software, builds upon work often referred to as radiomics and radiogenomics, which use various textural and shape features to build a comprehensive representation of the tumor. Additional features used in CaPTk, such as spatial patterns obtained after atlas registration, shapes of histograms in various subregions, and peritumoral heterogeneity indices, have also been shown to offer highly informative feature sets, especially when properly integrated via machine learning tools.³,⁴ In this paper, we present results from brain, breast, and lung cancer studies that underline the value of CaPTk imaging signatures as precision diagnostic and predictive tools.

2. Methods and Software

2.1. Overview

CaPTk can be viewed as a two-level software platform (Fig. 1). The first level targets basic image processing and extraction of various features capturing different aspects of local, regional, and global imaging patterns, resulting in an extensive QIP panel. These features range from standard multiparametric image intensities and their histogram distributions, to commonly used radiomic features, such as various types of textural, morphologic, and functional descriptors,⁴ spatial patterns obtained from deformable registration,⁵,⁶ biophysical models of tumor growth and infiltration,⁷,⁸ and connectomic signatures, among others. The second level focuses on the integration of these features into multivariate machine learning models and systems, with specific application-oriented goals. Examples include (i) precision diagnostics and risk assessment for developing cancer,⁹ (ii) predictive models of treatment response and patient survival,⁴,¹⁰–¹² and (iii) detection of phenotypic imaging surrogates of underlying cancer molecular characteristics.⁴,¹³ The following sections further explain some of the analytical capabilities of CaPTk. Furthermore, Appendix A provides application-specific descriptions and examples of commands, targeted toward both technical and clinical audiences. In addition, CaPTk’s webpage¹⁴ hosts several screenshots of CaPTk, which provide details of user interaction and currently supported applications/features.

Fig. 1

Overview of CaPTk’s functions: at the first level, CaPTk provides image preprocessing and feature extraction functions that can be used to generate an extensive QIP panel of features capturing various aspects of imaging signals, ranging from segmentation of tumors and their partitions, to extraction of textural and perfusion dynamic features, to population-wide spatial patterns of cancer, and fiber tracts. At the second level, these QIP features and maps are integrated into algorithmically complex diagnostic and predictive models, aiming to achieve precision diagnosis and guidance of treatment, prediction of clinical outcome, and estimation of molecular characteristics of tumors.

2.2. Extraction of an Extensive QIP Panel

2.2.1. Image segmentation

Image segmentation is a fundamental process in automated image analysis, enabling the precise delineation of a tumor, its subregions, and the surrounding infiltrated anatomy. For example, segmentation of a glioblastoma, the most malignant brain tumor, can delineate the enhancing tumor (ET) and nonenhancing tumor (NET) (possibly necrotic) parts of the tumor core, its surrounding edematous (ED) tissue, and the distant normal-appearing tissue regions.¹⁵,¹⁶ Such delineations enable the extraction of features specifically from each tumor subregion and the surrounding anatomy, thus allowing for more accurate quantification of the tumor’s entirety and spatiotemporal changes. It is important to note that the term delineation refers here to regions defined by their radiographic appearance, which may differ from the actual tumor boundaries.

The CaPTk suite offers several segmentation modules, ranging from general purpose, user-guided segmentation, to specialized segmentation methods tuned to the specific characteristics of certain tumors and organs. A representative example of general purpose segmentation is ITK-SNAP,¹⁷ a well-established interactive tool, which has been integrated into the main interface of CaPTk. ITK-SNAP is based on random forest classifiers, trained using a manual definition of tissues of interest, to produce an initial segmentation subsequently refined using level-sets. GLISTRboost,¹⁵,¹⁶,¹⁸,¹⁹ available through the web-based image processing portal²⁰ of the Center for Biomedical Image Computing and Analytics (CBICA), leverages CaPTk as a user-friendly means of initialization. GLISTRboost performs multimodal brain glioma segmentation and atlas registration using a semiautomatic hybrid generative-discriminative method. The generative part is based on an expectation–maximization framework to segment brain scans into tumor (i.e., edema, enhancing and nonenhancing tumor), as well as “healthy” tissue labels (i.e., white and gray matter, cerebrospinal fluid, vessels, and cerebellum), and incorporates the modeling of tumor growth and infiltration via reaction–diffusion–advection equations.⁷,⁸ The discriminative part is based on a gradient boosting²¹,²² multiclass classification scheme, to refine tumor labels based on information from multiple patients. Last, a Bayesian strategy²³ is employed to further refine and finalize the tumor segmentation labels, based on patient-specific intensity statistics from the multiple modalities available.
GLISTRboost provides estimates of the segmentation labels, parameters of underlying tumor growth models, as well as the tumor anatomical location in a standardized anatomical system via deformable atlas registration.⁶ In addition to providing segmentation labels, such application-specialized segmentation approaches allow evaluation of cancer spatial patterns in standardized coordinate systems, which are receiving increasing attention as predictors of clinical outcome, as well as biomarkers of molecular characteristics of the underlying tumor.²⁴ Future work will result in tighter integration between the CaPTk desktop client and the web server of the CBICA image processing portal (IPP).

2.2.2. Imaging features

Analytic functions in CaPTk are based on an extensive QIP panel of features (Fig. 1), which are integrated into imaging signatures, using analytics and machine learning to produce a variety of diagnostic, prognostic, and predictive biomarkers. Although features can be common across many diagnostic and predictive tasks, the way in which they are integrated into a specific imaging signature depends entirely on the task of interest, such as predicting response to treatment¹³ or estimating underlying mutations.¹⁰,¹²,²⁵ Examples of QIP features implemented in CaPTk include: (i) multiparametric imaging signals of different coregistered protocols/modalities, such as native T1- and T2-weighted images, T1 with gadolinium (T1-Gd), T2-fluid attenuated inversion recovery (T2-FLAIR), diffusion tensor imaging (DTI), dynamic susceptibility contrast (DSC), or dynamic contrast-enhanced (DCE) MRI; (ii) textural features [e.g., co-occurrence, run-length, size-zone matrices, local binary patterns (LBPs), fractal dimensions, wavelets], capturing characteristics of the local microarchitecture of tissue. Such features have been used extensively in mammographic image analysis and breast cancer risk assessment,⁹ as well as in predictive modeling of glioblastoma;³,⁴,²⁶ (iii) histograms, reflecting various imaging signal distributions within different delineated tumor subregions.
The shapes of these histograms express anatomical and functional changes caused by the tumor that result in signal changes and have demonstrated a connection to clinical endpoints, such as survival, risk factors, and underlying cancer molecular characteristics;⁴ (iv) temporal perfusion dynamics captured via principal component analysis (PCA), which have been related to recurrence and infiltration, as well as molecular tumor characteristics;³,¹⁰,¹¹,²⁵,²⁶ (v) DTI-derived features, including fractional anisotropy (FA), radial diffusivity (RAD), axial diffusivity (AX), apparent diffusion coefficient (ADC), water-free diffusion, fiber tract connectivity and other properties; (vi) DSC-MRI derived features: peak height (PH), percent signal recovery (PSR), and relative cerebral blood volume (rCBV); (vii) spatial patterns of cancer distribution:²⁴ although previously relatively unappreciated, such spatial patterns, which capture the spatial distribution of a tumor’s entirety (i.e., the parts of the brain to which the tumor extends) and are obtained via deformable registration to a standardized atlas space, are receiving increasing attention due to their relationship to prognosis and genotype.²⁷ Connectomic signatures, later discussed in Secs. 3.3 and 5.3.2, will also be incorporated in the immediate future.
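To make the textural features in item (ii) more concrete, the following is a minimal sketch of a gray-level co-occurrence matrix and one statistic derived from it (contrast). It is written in plain Python for illustration only and is not CaPTk's implementation; the function names `glcm` and `contrast`, the single-offset design, and the assumption that the image is already quantized to integer gray levels are all choices made for this example.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray-level co-occurrence matrix for a single pixel offset.

    `image` is a 2-D list of integer gray levels in [0, levels).
    `offset` is (row_step, col_step); (0, 1) pairs each pixel with its
    right-hand neighbor. Illustrative helper, not a CaPTk function.
    """
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """GLCM contrast: sum over (i, j) of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Small 4-level example image.
example = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(example, levels=4)
print(round(contrast(matrix), 4))
```

In practice such matrices are usually computed for several offsets and directions within each delineated subregion, and many statistics besides contrast are derived from them.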

An essential step in the aforementioned segmentation and feature extraction processes is image normalization and harmonization. In particular, image characteristics vary, often considerably, across different scanners, acquisition protocols, clinical centers, as well as patients. This renders the extraction of reproducible imaging features and signatures challenging. Appropriate histogram normalization is therefore particularly important. Toward this end, CaPTk provides the WhiteStripe approach²⁸ (Appendix A, Table 7), which normalizes conventional MRI by detecting a latent subdistribution of normal tissue and linearly scaling the histogram of images. In addition, histogram matching techniques available through the insight toolkit (ITK)²⁹ are also provided.
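The idea of normalizing by a subdistribution of normal tissue can be sketched as follows. This is a deliberately simplified illustration and not the WhiteStripe implementation: it assumes that the tallest histogram bin corresponds to normal-appearing tissue, takes a "stripe" of intensities whose quantile ranks lie within a fixed width of that peak, and linearly rescales all intensities by the stripe mean and standard deviation. The function name `stripe_normalize` and the parameters `bins` and `width` are hypothetical.

```python
import statistics


def stripe_normalize(intensities, bins=50, width=0.05):
    """Linearly rescale intensities using a stripe around the histogram peak.

    Simplifying assumption: the tallest histogram bin marks the reference
    (normal-appearing) tissue. The published WhiteStripe method selects the
    peak in a modality-specific way on a smoothed histogram.
    """
    values = sorted(intensities)
    n = len(values)
    low, high = values[0], values[-1]
    if high == low:
        raise ValueError("constant image cannot be normalized")

    # Coarse histogram and location of its tallest bin.
    bin_width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - low) / bin_width), bins - 1)] += 1
    peak_bin = counts.index(max(counts))
    mode = low + (peak_bin + 0.5) * bin_width

    # Quantile rank of the mode, then a stripe of +/- `width` around it.
    rank = sum(1 for v in values if v <= mode) / n
    first = int(max(0.0, rank - width) * (n - 1))
    last = int(min(1.0, rank + width) * (n - 1))
    stripe = values[first:last + 1]

    mu = statistics.fmean(stripe)
    sigma = statistics.pstdev(stripe)
    if sigma == 0:
        raise ValueError("stripe has zero variance; widen the stripe")

    # The transform is linear, so intensity ordering is preserved.
    return [(v - mu) / sigma for v in intensities]
```

Because the rescaling is driven by the reference tissue rather than by the whole image, a large or bright lesion has less influence on the normalized scale than it would under a whole-image z-score.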

2.2.3. Image registration

Coregistration of different imaging sequences, i.e., alignment of corresponding anatomical regions across them, is essential for the joint voxel-by-voxel analysis of features derived from the intensity signals of the aligned sequences. For example, building QIP features for analysis of brain tumors often requires the coregistration of conventional MRI scans, such as T1-weighted and T2-FLAIR, as well as various diffusion- and perfusion-based images. In addition to rigid registration (or fusion) of multichannel images, deformable registration is also important in cancer imaging and is used in two contexts: (i) the evaluation of temporal changes between longitudinal scans, as the tumor, its surrounding anatomy, and patient positioning may have changed between consecutive scans, and hence, appropriate registration can augment therapy response evaluation and prediction of survival and (ii) formation of population-based atlases of the spatial distribution and pattern of cancer, in order to evaluate the relationship between such distributions and molecular characteristics or clinical outcomes. CaPTk has access to a variety of image similarity metrics offered through ITK. Moreover, specialized deformable registration methods optimized for specific types of problems are available on CBICA’s image processing portal (IPP).²⁰ For example, the deformable registration described in Ref. 30, which is based on DRAMMS,³¹,³² enables follow-up scans to be superimposed onto baseline scans in order to evaluate tumor volume changes over time as a measure of response to neoadjuvant chemotherapy for breast cancer. DRAMMS is a separate, standalone package, which is compatible with CaPTk and has been tested along with the breast CaPTk modules as a means for finding breast MRI changes.

3. Feature Integration and Modeling

As described above (Fig. 1), CaPTk has a two-level organization, with the first level offering various image processing and analysis tools that lead to the formation of comprehensive QIP feature sets. The second-level functions aim at integrating these features into decision support tools. A number of models and indices are offered, based on our prior studies.³,⁴,¹⁰,²⁴,²⁶,³³–³⁵ In Sec. 5, we present results obtained in four specific contexts, which are described below. Through CaPTk, a wide range of QIP features can be extracted directly from any set of images, with user-defined parameters, and external clinical and genomic variables can be easily used for training. These QIP features can form the basis for a CaPTk user to develop their own decision support systems, by using the QIP along with machine learning algorithms in different contexts, for example, to predict clinical outcome or genomic variables of interest. CaPTk’s software architecture provides interfaces that make it relatively straightforward to access the machine learning module of OpenCV.

3.1. Precision Diagnostics and Risk Assessment

In parallel with developing precision diagnostics³³,³⁵,³⁶ driven by genomics, risk assessment of cancer has seen similar advances in the recent past. CaPTk incorporates routines that characterize properties of the normal tissue, predisposed to a higher risk of cancer. For example, the heterogeneity of the breast parenchymal patterns has been shown to augment established risk factors, including breast density.³⁷,³⁸

3.2. Predictive Modeling

The ability to predict patient outcome, particularly after receiving a specific treatment, is important for treatment planning, patient management, and enrollment of relatively homogeneous patient subgroups into clinical trials to increase the detection of treatment effects. CaPTk applies machine learning methods to its QIP panel to predict various outcome measures. Examples include prediction of patient survival after glioblastoma treatment,⁴ patient response, and prediction of long-term survival after breast cancer neoadjuvant chemotherapy,¹³ response to stereotactic body radiation therapy for lung cancer,³⁹ and peritumoral infiltration and probability of cancer recurrence.³

3.3. Optimized Neurosurgical Planning

Knowing the tumor location and, more importantly, the peritumoral infiltrated functional brain tissue, in relation to important structures and fiber tracts, is critical in neurosurgical planning. CaPTk offers algorithms for neurosurgeons and radiation oncologists to plan extensive tumor resection and peritumoral radiation while preserving neurological function.⁴⁰ This CaPTk functionality aims to allow clinicians to simultaneously evaluate peritumoral glioblastoma infiltration in edematous brain tissue and target functional brain tissue likely to present early recurrence while considering the location of fiber tracts that should be preserved as much as possible. To this end, CaPTk will provide tools for edema invariant tractography⁴¹,⁴² and automated tract detection based on connectivity signatures,⁴⁰,⁴³ to extract fiber tracts, even distorted or broken, in the presence of mass effect and edema (Fig. 2). The edema invariant tractography⁴¹,⁴² is based on multicompartment modeling of diffusion data, which fits a free-water compartment representing the edema and a second compartment representing the underlying tissue, the latter modeled with a tensor or a higher-order diffusion model, depending on the acquisition. Separating the edema compartment from the tissue compartment, which is the one used for tracking, enables tracking through the edema regions. Existing tracking algorithms that are not based on multicompartment models are unable to track through edema. This is currently being validated in the clinic using direct electrical stimulation. Once the tracts are created, our connectivity-based signatures of tracts enable clustering of tracts that have been distorted by the tumor.
These displaced/distorted tracts cannot be readily captured by existing tract clustering algorithms that are based on shape/geometric information.⁴⁴–⁴⁹ Thus, the edema invariant tractography in combination with the connectivity-based clustering produces tracts that are robust in the presence of edema and mass effect. These are expected to go beyond the capabilities of existing planning tools, once they have been validated in the clinic. Currently, connectomic-signature-based tract extraction is available, with the other diffusion-based surgical planning components being in the optimization phase. The surgical planning tool is proposed as a separate visualization environment within CaPTk that will combine the tumor, the tracts around it, and the vulnerability and recurrence maps. Related processing and analysis tools will be provided as a diffusion toolkit, expanding upon CaPTk.
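The multicompartment idea behind edema invariant tractography can be illustrated with a forward model of the diffusion signal. The sketch below uses a commonly used two-compartment form, a tissue tensor plus an isotropic free-water term, and assumes the tensor variant mentioned above; it is not taken from the CaPTk code, and the function names and the fixed free-water diffusivity of 3.0e-3 mm²/s are assumptions of this example.

```python
import math

# Assumed isotropic diffusivity of free water near body temperature (mm^2/s).
FREE_WATER_DIFFUSIVITY = 3.0e-3


def directional_diffusivity(tensor, g):
    """Apparent diffusivity g^T D g along the unit gradient direction g."""
    return sum(g[i] * tensor[i][j] * g[j] for i in range(3) for j in range(3))


def two_compartment_signal(s0, b, g, tensor, tissue_fraction):
    """Signal predicted by a tissue-tensor plus free-water model.

    s0: non-diffusion-weighted signal; b: b-value in s/mm^2;
    tissue_fraction: volume fraction of the tissue compartment in [0, 1].
    Fitting (estimating tensor and fraction from data) is not shown.
    """
    tissue = math.exp(-b * directional_diffusivity(tensor, g))
    water = math.exp(-b * FREE_WATER_DIFFUSIVITY)
    return s0 * (tissue_fraction * tissue + (1.0 - tissue_fraction) * water)


# A fiber-like tensor aligned with x, measured along x at b = 1000 s/mm^2.
fiber = [[1.7e-3, 0.0, 0.0], [0.0, 0.3e-3, 0.0], [0.0, 0.0, 0.3e-3]]
healthy = two_compartment_signal(1.0, 1000.0, (1.0, 0.0, 0.0), fiber, 1.0)
edematous = two_compartment_signal(1.0, 1000.0, (1.0, 0.0, 0.0), fiber, 0.5)
print(healthy, edematous)
```

In such a model, the tensor estimated for the tissue compartment keeps its anisotropy even when the free-water fraction is high, which is why tracking on the tissue compartment can continue through edema where a single-tensor fit would appear nearly isotropic.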

Fig. 2

Various aspects of the diffusion based surgical planning tools: (a) tracking through edema made possible with multicompartment modeling of diffusion data; (b) atlas-based reconstruction of tracts, resilient to mass effect induced tract distortions, and vulnerability map of the brain indicating the global effect of the resection and treatment; and (c) the surgical plan with the tumor and surrounding eloquent tract.

3.4. Radiogenomics

Following the rapid growth of radiogenomics, where imaging features are correlated with genomic information, an expanding part of CaPTk focuses on deriving imaging signatures of clinically relevant cancer molecular characteristics. Although conventional anatomical and physiological imaging does not specifically label molecular targets, their presence can be derived indirectly through the application of machine learning methods to the QIP features extracted from clinically acquired imaging, as described above. Section 5 presents results from a radiogenomic signature of epidermal growth factor receptor (EGFR) mutations,¹⁰,¹² which also captures overexpression of wild-type EGFR,¹¹ and an imaging signature of Oncotype DX in breast cancer.²⁵

4. Software Architecture

CaPTk employs widely used, community- and industry-driven libraries, including ITK, the visualization toolkit (VTK), and OpenCV, as the foundation for basic functions, such as data input/output, preprocessing tasks (registration, bias correction, etc.), rendering, and machine learning (Fig. 3), which make up many of the first-level image processing operations, as shown in Fig. 1. The advantage of using these broadly established libraries is that they are large-scale projects with deep resources, extensive testing and validation, and highly optimized algorithms. Locally developed libraries provide similar low-level functions that are specific to CaPTk (Fig. 3). All foundation routines are accessible as C++ objects, making complex algorithms available to higher-level CaPTk components and to software developed by external CaPTk users through well-defined, extensible interfaces. The documented APIs allow applications written by external users of CaPTk to access algorithms at each level of the toolkit through function calls and ITK image structures.

Fig. 3

An overview of the CaPTk software architecture. Command-line and GUI of CaPTk communicate with individual applications for preprocessing, basic analysis, and decision support outcomes via function calls (black arrows). Applications may be tightly integrated in CaPTk, accessed as C++ objects via a documented API, or applications may be external software, such as Confetti, launched via system calls. Integrated applications utilize low-level libraries, such as ITK or libraries developed specifically for CaPTk for common tasks. Data are passed between libraries and returned to integrated applications in the form of ITK and OpenCV data structures (green arrow). Results are presented graphically through the GUI (light blue arrow) or saved to disk (red arrows). External applications return data directly to disk storage. The documented APIs allow applications written by external users of CaPTk to access algorithms at each level of the toolkit through function calls and ITK image passing.

Internally, CaPTk uses both ITK and OpenCV data structures as appropriate to the particular image processing operation, allowing any user to extend the CaPTk code by calling any algorithm implemented in either of these libraries. Since both ITK and OpenCV are based on high quality C++ code, any other package developed using these tools can be tightly consolidated with CaPTk. This provides computational imaging researchers with a fast track to integrate their complex algorithms into a full-featured graphical environment, without the need to duplicate routine tasks, such as file I/O, image reorientation, etc. Another aspect of CaPTk’s architecture is the focus on modularity of the code. This ensures that CaPTk can be used as a very lightweight and efficient image viewer without the burden of the computationally expensive functions affecting the interactive experience.

All applications within the CaPTk graphical user interface (GUI) are exposed via command line wrappings, giving the option to researchers to also construct automated, scripted, and customized pipelines based on the same algorithms, for subject- or population-based studies. CaPTk is under active development, with frequent updates available to developers and clinical collaborators. The current status of the package and links to stable public releases can be found at Ref. 14.

5. Results Obtained Using CaPTk

In this section, we present several representative results obtained by using CaPTk, as described in Sec. 3, which highlight the value of QIP signatures for precision diagnostics, personalized predictions, and decision support for treatment planning.

5.1. Precision Diagnostics and Risk Assessment in Breast Cancer

Parenchymal pattern analysis, performed through CaPTk, was evaluated in a case-control dataset (106 cancer cases and 318 age-matched controls) of digital mammograms³⁸ (Appendix A, Table 5). The complete pipeline was named Laboratory for Individualized Breast Radiodensity Assessment (LIBRA). Prior to feature extraction, CaPTk applied a series of image standardization steps, where mammograms were log-transformed, then inverted, and, finally, intensity-normalized by a $z$-score transformation within the breast region.⁵⁰ CaPTk was then used to extract parenchymal pattern characteristics, including for each subject (i) breast percent density (PD) corresponding to the amount of radiographic dense tissue within the breast⁵¹ and (ii) parenchymal complexity feature maps representing the spatial distribution of the textural measurements as sampled by a regular lattice over the entire breast.³⁸ Case-control discriminatory capacity was assessed in a randomized split-sample setting (training set: $n = 300$; test set: $n = 124$) as follows. First, a logistic regression model was built using parenchymal pattern characteristics extracted from the training set, and the model was then evaluated on the test set via the area under the receiver operating characteristic curve (AUC). For this evaluation experiment, the extracted parenchymal texture feature maps were summarized using statistical measures (mean and standard deviation), and stepwise feature selection was applied to the training set prior to logistic regression modeling to limit potential overfitting.³⁸ Breast PD demonstrated modest case-control discriminatory capacity [$\mathrm{AUC} = 0.56$, 95% confidence interval (CI): 0.52 to 0.61], which was within the range of results from previously reported studies.⁹ Compared to PD, the classification performance of the lattice-based complexity texture feature maps was substantially higher (Fig. 4).
Specifically, when feature maps were summarized into simple statistical measures, the discriminative performance was equal to $\mathrm{AUC} = 0.79$ (95% CI: 0.69 to 0.89; DeLong’s test $p$-value $= 0.03$).
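For readers less familiar with the AUC used in this evaluation, the following sketch computes it directly from its rank-based definition: the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The function name `auc` and the toy scores are illustrative only and are not data from the study.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Toy scores, e.g. predicted probabilities from a logistic regression model.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why the value of 0.56 for PD alone is described as modest.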

Fig. 4

Example of (a) parenchymal complexity feature extraction and (b) breast cancer case-control classification based on breast percent density (PD) alone compared to the CaPTk texture feature extraction panel.

5.2. Predictive Modeling of Clinical Outcome

5.2.1. Predicting patient survival in glioblastoma patients

Glioblastoma is a disease with a grim prognosis, with a median survival of around 14 months after the standard of care, which comprises tumor resection and peritumoral radiation therapy along with chemotherapy. However, there is a fairly broad range of survival, from a few months to more than 2 years. Having baseline predictors of patient survival is important for patient management. It is also important for selecting patients with relatively homogeneous expected survival into a clinical trial, thereby likely increasing the trial’s ability to detect treatment effects, especially the ability of a treatment to prolong survival. Toward this end, preoperative multiparametric MRI scans of de novo glioblastoma patients were summarized in QIP signatures, capturing various characteristics of ET, NET, and ED, to estimate the likelihood of survival (Appendix A, Table 2).⁴ The initial features included (i) normalized volume of ET, NET, ED, and their combinations; (ii) distance of tumor (ET + NET) and ED to ventricles; (iii) mean and standard deviation of intensities of T1, T2, T1-Gd, T2-FLAIR, rCBV, PH, PSR, FA, RAD, AX, and ADC in ET, NET, and ED; (iv) frequency of intensities of T1, T2, T1-Gd, T2-FLAIR, rCBV, PH, PSR, FA, RAD, AX, and ADC in each distribution bin of ET, NET, and ED; (v) location of the tumor in the brain; and (vi) age. All features were integrated via a support vector machine (SVM) configuration to build two predictive classification models: a 6-month SVM model, to identify patients surviving less than 6 months (short-survivors), and an 18-month SVM model, to identify patients surviving more than 18 months (long-survivors). The group of subjects with survival between 6 and 18 months was considered the mid-survivor group. Forward feature selection was applied on the training set only to select important features. The SVM scores from each model were combined to calculate a composite survival prediction index (SPI), where higher SPI refers to relatively longer survival.
We developed our predictive models on a discovery cohort ($n = 110$) and tested these prospectively on a replication cohort ($n = 57$) of glioblastoma patients. The two-class balanced accuracy of the 6-month and 18-month models was 75.06% ($\mathrm{AUC} = 0.79$) and 77.85% ($\mathrm{AUC} = 0.77$) in the replication cohort, respectively. Overall three-class classification accuracy into long, medium, and short survival groups was $\sim 70.18\%$ in the replication cohort. Kaplan–Meier survival curves [$p$-value $< 0.001$, log-rank (Mantel–Cox)]⁵³ and hazard ratios with 95% CIs were also computed for survival analysis (Fig. 5).
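As background for the survival curves reported here, the Kaplan–Meier (product-limit) estimator can be sketched in a few lines. This is a generic textbook implementation and not the code used in the study; the function name `kaplan_meier` and the toy follow-up data are illustrative. Events are coded 1 for an observed death and 0 for a censored observation.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event_time, survival_probability).

    times: follow-up time per patient; events: 1 if death was observed at
    that time, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Toy cohort: follow-up in months, 1 = death observed, 0 = censored.
for time, prob in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(time, round(prob, 3))
```

Computing one such curve per predicted SPI group (short, medium, long) and comparing them with a log-rank test corresponds to the kind of analysis summarized in Fig. 5.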

Fig. 5

Kaplan–Meier survival curves for the replication cohort. Actual survival on the $x$-axis is compared among the three survival groups defined by the predictions generated by the SPI. med, medium SPI; HR, hazard ratio.

5.2.2. Predicting patient survival in breast neoadjuvant chemotherapy patients

The DCE-MRI images were analyzed for a subset of 106 women with complete imaging data available, who were recruited as part of the ACRIN 6657/I-SPY-1 trial.⁵⁴ A baseline model was created with age, race, hormone receptor status (ER/PR/Her2), and functional tumor volume (FTV) after neoadjuvant chemotherapy. Utilizing the DRAMMS deformable registration method³⁰,²⁰ adapted to breast MRI, features regarding spatial/temporal changes between longitudinal scans (before and during the first patient visit after the initiation of neoadjuvant chemotherapy in this study), including voxel-wise volume ratio (Jacobian), as well as parametric response maps (PRM) for kinetic features, were estimated. Kinetic features included signal enhancement ratio, peak enhancement (PE), and wash-in/wash-out slope (WIS/WOS). To quantify heterogeneity for each feature, discrete wavelet transformation was used and then PCA was applied to reduce the estimated wavelet coefficients to the top two principal components, expressing 60% to 80% of the total variance (Appendix A, Table 6). The Cox proportional hazards model was utilized to perform a time-to-event analysis and predict recurrence-free survival (33 events) by estimating the c-statistic.⁵⁵ The baseline model using the standard clinical covariates and FTV was compared with the model, where registration-derived and PRM features were added. The c-statistic was 0.70 ($p$-value $< 0.001$) for the baseline model, whereas the augmented models with the PRM features and with the Jacobian information improved the c-statistic to 0.73 ($p$-value $< 0.005$) and 0.74 ($p$-value $< 0.001$), respectively. A model including both Jacobian and PRM features had the highest c-statistic of 0.77 ($p$-value $< 0.001$; Fig. 6).
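The c-statistic used to compare these models measures how well predicted risk orders the observed event times. The sketch below assumes Harrell's definition, which may differ in detail from the estimator of Ref. 55: among all pairs in which the patient with the shorter follow-up had an observed event, it counts the fraction in which that patient also has the higher predicted risk. The function name and toy data are illustrative.

```python
def concordance_index(times, events, risks):
    """Harrell-style c-statistic for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time. The pair
    is concordant when the model assigns patient i the higher risk; tied
    risks count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: the third patient is censored at time 3.
print(concordance_index([1, 2, 3], [1, 1, 0], [0.9, 0.2, 0.5]))
```

A value of 0.5 indicates no better than random ordering and 1.0 perfect ordering, so the reported increase from 0.70 to 0.77 reflects a better ranking of patients by recurrence risk.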

Fig. 6

Survival curves as a function of (a) FTV, after first visit during neoadjuvant chemotherapy and (b) Jacobian heterogeneity when FTV is greater than the mean value (between images before and first visits).

5.2.3.

Predicting treatment response and survival of early-stage nonsmall cell lung cancer

To identify radiomic biomarkers for predicting treatment response and survival in patients with early-stage nonsmall cell lung cancer (NSCLC) who received stereotactic body radiation therapy (SBRT), we carried out a radiomic analysis using CaPTk to distinguish patients with different treatment response and investigated the association between subclusters of tumor phenotypes and clinical outcomes. This study was based on a longitudinal fludeoxyglucose-positron emission tomography (FDG-PET)/computed tomography (CT) dataset of 80 patients who were treated with SBRT for stage 1 NSCLC, with a median follow-up of over 2 years. All patients in this dataset had a solid component of their NSCLC tumor, and some also had an additional ground glass component. Although all these patients were treated uniformly ($12.5\,\text{Gy} \times 4$ fractions or $10\,\text{Gy} \times 5$ fractions), they had different primary tumor outcomes. From the standardized uptake values of each patient’s FDG-PET scan collected before the treatment, we extracted 343 radiomic features within the tumor region, including intensity statistics, gray-level co-occurrence matrix, gray-level run-length matrix, and LBP features. Then, the patients were grouped into two clusters with distinctive radiomic features using an unsupervised clustering analysis method³⁹ (Appendix A, Table 4). Kaplan–Meier survival analysis with respect to death and nodal failure at group level was performed for each cluster of patients (Fig. 7). Significant differences were observed for survival ($p$-value $= 0.0004$, log-rank test) and nodal failure ($p$-value $= 0.001$).
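
As a reminder of how the Kaplan–Meier curves compared above are obtained, the sketch below computes the product-limit estimate $S(t) = \prod_{t_i \le t} (1 - d_i / n_i)$, where $d_i$ is the number of events at time $t_i$ and $n_i$ the number of subjects still at risk. The function name `kaplan_meier` and the toy follow-up data are hypothetical, and the log-rank comparison between clusters is not included.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate (illustrative sketch).

    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event was observed.
    events: 1 = event observed (e.g., death), 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects that share the same follow-up time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set.
        n_at_risk -= n_removed
    return curve


# Toy follow-up data in months (hypothetical, not from the study)
months = [6, 7, 10, 15, 19, 25]
observed = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(months, observed)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

In a two-cluster analysis such as the one described here, one curve would be estimated per cluster and the curves then compared with a log-rank test.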

Fig. 7

Survival analysis of two clusters of the early-stage NSCLC patients with respect to (a) death and (b) nodal failure.

5.3.

Neurosurgical Planning

5.3.1.

Predictive modeling of peritumoral infiltration and recurrence

Current practice in treating glioblastoma includes resection guided by imaging-based tumor margins (typically defined via enhancement in T1-Gd images), followed by uniform radiation of peritumoral brain tissue. It is well known, however, that glioblastomas infiltrate their surrounding brain tissue, especially in the peritumoral edematous tissue defined by high T2-FLAIR signal (i.e., ED). The ability to predict the regions in this peritumoral tissue that are heavily infiltrated and most likely to present early recurrence can considerably change clinical practice, by guiding aggressive supratotal resection, i.e., targeted resection of peritumoral tissue, as well as targeted elevated radiation dose in nonresected peritumoral tissue that is more likely to present early recurrence. Based on functions provided by CaPTk, it has recently been shown that the combination of multiparametric MRI and machine learning can lead to predictive models of tumor infiltration that highlight peritumoral tissue that is $\sim 10$ times more likely to recur³ (Appendix A, Table 1). In particular, two regions were selected within the edema region to train the model. The near region was defined as the area immediately adjacent to the tumor, whereas the distal edge of edema was designated as the far region. The signal intensities of T1, T1-Gd, T2, T2-FLAIR, AX, FA, RAD, ADC, and rCBV, and the first five principal components derived from the DSC-MRI image, were combined via an SVM with a Gaussian kernel. The model was retrospectively cross-validated on a cohort of 31 patients [odds ratio $= 11.17$ (99% CI: 10.71 to 11.64)] and was subsequently evaluated on a prospective cohort of 34 patients [odds ratio $= 9.29$ (99% CI: 8.95 to 9.65)].
The rest of this section presents a case report from a clinical case that we recently processed and presented at the weekly Brain Tumor Conference of the Hospital of the University of Pennsylvania, using these methods and hence highlighting their potential for routine clinical use.

Case report: SN was 25 when she presented with headaches and dizziness. An outside brain MRI revealed a nonenhancing, expansile, right posteromedial occipital lesion, extending into gray matter. Low-grade glial neoplasm was at the top of the differential diagnosis. Three months later, a subsequent MRI with brain tumor protocol, including perfusion (DSC) and spectroscopy, was consistent with that diagnosis. Another 3 months later, a follow-up brain MRI demonstrated an increase in size of the lesion, with new intralesional enhancement, triggering a change in radiologic diagnosis to high-grade glial neoplasm. A repeat MRI confirmed that diagnosis, demonstrating increased rCBV within the enhancing component. The thought was that there had been malignant transformation of the original tumor, which was identified 6 months earlier. The lesion was resected 10 days later, and the postoperative MRI showed some residual abnormal T2 signal but no residual enhancement. The pathology diagnosis was glioblastoma. A follow-up MRI 2.5 months after the initial resection demonstrated new enhancement, consistent with tumor progression/recurrence. Using multiparametric analysis with CaPTk, we determined the areas of high likelihood for recurrence on the preoperative scan and superimposed the corresponding probability maps [Fig. 8(a)]. The actual recurrence [Fig. 8(b)] was centered precisely on the predicted high likelihood area.

Fig. 8

(a) Sagittal postgadolinium T1-weighted images with recurrence probability maps on preop scan, calculated via CaPTk. (b) Actual recurrence scan, about 3 months later.

5.3.2.

Fiber tracking

Two of the major challenges faced by fiber tracking in the realm of neurosurgical planning are that the reconstruction of tracts is affected by mass effect, when the tracts are displaced and distorted, and by edema and infiltration, when the tracts are broken as a result of the change of diffusion parameters due to the pathology. These underline the need for methods that can track through edema and reconstruct even partial and displaced tracts. We are developing tractography algorithms that are invariant to edema, based on multicompartment modeling of the diffusion data.⁴¹,⁴² The improvement in tracking with this modeling, over traditional tracking, can be seen in Fig. 2(a). We have developed an automated atlas-based tract reconstruction method called Confetti (connectivity-based fiber extraction and identification;⁴³ Appendix A, Table 8). The applicability, reliability, and repeatability of Confetti were validated in a dataset of healthy individuals acquired repeatedly.⁴³ Compared to the clustering of fibers for each scan independently, our framework provided better test–retest reproducibility results, with decreased (25%) mean intraindividual distance (i.e., disagreement of clusters between different timepoints of the same individual), while preserving interindividual differences. Additionally, Confetti was also tested in tumor patients⁴⁰ on six major fiber bundles: cingulum bundle, fornix, uncinate fasciculus (UF), arcuate fasciculus, inferior fronto-occipital fasciculus, and inferior longitudinal fasciculus (ILF). The agreement between clustering and experts as quantified by Cohen’s kappa ranged between 0.6 and 0.76. Except for two tracts, ILF and UF, the agreement between clustering and experts was higher than the agreement between the experts themselves, highlighting the reliability of the paradigm.
When the tumor demonstrated significant mass effect or shift, the automated approach was useful to provide an initialization to guide the expert with identification of the specific tract of interest.

We have developed a measure of injury called disruption index of the structural connectome (DISC), based on how the information transfer in the brain is affected by changes in different regions.⁵⁶ A representation of the vulnerability of the brain to injury can be seen in Fig. 2(b). We have tested the DISC on a dataset of traumatic brain injury (TBI) patients with moderate to severe TBI examined at 3 months postinjury.⁵⁶ DISC was significantly correlated with post-traumatic amnesia (Pearson $r = 0.52$ , $p$ -value: 0.0007), verbal learning (Pearson $r = - 0.42$ , $p$ -value: 0.0075), executive function (Pearson $r = - 0.41$ , $p$ -value: 0.0083), and processing speed (Pearson $r = - 0.58$ , $p$ -value: 0.0001), demonstrating that assessing structural connectivity alterations may be useful in development of patient-oriented diagnostic and prognostic tools.

These diffusion MRI-based tools are being developed as a separate suite in CaPTk because of special visualization needs for such data. A preliminary view of the proposed surgical plan can be seen in Fig. 2(c) in which the automatically extracted tracts are presented with regard to the tumor. The tracts, vulnerability maps, and recurrence maps will be incorporated in this plan in the future. Radiation plans can be created by using the resection cavity instead of the tumor.

5.4.

Radiogenomics

As described earlier, the emerging field of radiogenomics promises to develop an arsenal of imaging signatures reflecting underlying molecular characteristics of various cancers. CaPTk allows the construction of such imaging signatures. We present results from two such examples: imaging signatures of the EGFRvIII mutation and of molecular subtypes of glioblastoma⁴,¹⁰ (Fig. 9), as well as an imaging signature of OncotypeDX in breast cancer (Fig. 10).

Fig. 9

Examples of how QIP features are integrated into imaging signatures of molecular characteristics of glioblastoma. (a–c) Distributions of the PHI by EGFRvIII expression status. Statistical significance was evaluated via a two-tailed paired $t$ -test comparing between the two distributions in the (a) discovery, (b) replication, and (c) combined cohorts. (d) ROC curves of four-way classification of glioblastoma into its molecular subtypes, using extensive radiogenomic signatures synthesized using machine learning.

Fig. 10

Intrinsic imaging phenotypes of breast cancer tumors via unsupervised clustering of multiparametric MRI features. The columns represent tumors and the rows features, showing four distinct phenotypes, related to tumor gene expression and hormone (ER/PR) receptor status.

5.4.1.

Imaging signatures of the EGFRvIII mutation, as well as transcriptomic subtype, in glioblastoma

CaPTk offers an imaging signature highly distinctive of the EGFRvIII mutation in glioblastoma. This imaging signature leverages the heterogeneity of DSC-MRI signals throughout the peritumoral region, which is depicted by abnormal/bright T2-FLAIR signal¹⁰ (Appendix A, Table 3). In particular, this signature is based on the observation that the gradient of perfusion between tissue immediately adjacent to the active tumor and tissue distant from the tumor but within this peritumoral edematous region (bright FLAIR) is significantly higher in tumors that do not harbor the mutation (i.e., EGFRvIII- glioblastoma). This observation reflects the more local invasion and neovascularization of the EGFRvIII- tumors and, conversely, the potentially deeper invasion of EGFRvIII+ tumors. Principal component features are extracted from the DSC-MRI images using CaPTk and the Bhattacharyya distance is calculated to form the peritumoral heterogeneity index (PHI). The method was evaluated in preoperative perfusion scans of independent discovery ($n = 64$) and validation ($n = 78$) cohorts. Analysis in these cohorts demonstrated high accuracy (89.92%), specificity (92.35%), and sensitivity (83.77%), with significantly distinctive ability ($p$-value $= 4.0033 \times 10^{-10}$, AUC $= 0.8869$). Figures 9(a)–9(c) show the clear separation between PHI values of EGFRvIII(+) and EGFRvIII(−) tumors.
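
To make the separability measure behind the PHI concrete, the sketch below computes the Bhattacharyya coefficient $BC = \sum_i \sqrt{p_i q_i}$ and the corresponding distance $D_B = -\ln BC$ between two discrete distributions. This is only an illustration of the general measure: the exact formulation applied to the principal components of the near and far ROIs is not specified here, and the function names and the toy histograms are hypothetical.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Overlap between two histograms; 1 = identical, 0 = disjoint."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive mass")
    # Normalize to probabilities before summing sqrt(p_i * q_i).
    return sum(math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """D_B = -ln(BC); grows as the two distributions separate."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0 else -math.log(bc)


# Toy histograms standing in for "near" and "far" ROI distributions
# (hypothetical values, not from the study)
near_hist = [0.5, 0.5]
far_hist = [0.9, 0.1]
bc = bhattacharyya_coefficient(near_hist, far_hist)
db = bhattacharyya_distance(near_hist, far_hist)
print(f"coefficient = {bc:.4f}, distance = {db:.4f}")
```

Note that the coefficient and the distance move in opposite directions, so which of the two is reported determines whether a larger value indicates more or less near/far heterogeneity.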

Related work in glioblastoma obtained a more extensive QIP integrating T1, T1-Gd, T2, T2-FLAIR, DTI-extracted measurements, and principal components of DSC-MRI, along with parameters of the tumor growth model, spatial location, and other features. The obtained multiparametric signature aimed to detect molecular subtypes of glioblastoma, as described in Ref. 57. Figure 9(d) shows ROC curves obtained in this four-way classification experiment (baseline “chance” level is $\sim 25\%$). The AUC for classical, mesenchymal, neural, and proneural subtypes was 0.75, 0.89, 0.92, and 0.87, respectively.

5.4.2.

Breast DCE-MRI phenotypes correlate to gene-expression based breast tumor profiling

Preoperative breast DCE-MRI images from women with ER+ breast cancer were retrospectively analyzed.²⁵,⁵⁸ All women had their primary tumor tested with Oncotype DX,⁵⁹ an assay that measures RNA expression of 21 genes from formalin-fixed paraffin-embedded tissue and provides a score of recurrence risk 10 years after treatment (low $\leq 17\%$, $18\% <$ medium $< 30\%$, high $\geq 31\%$). Validated morphologic, kinetic, and spatial heterogeneity features were extracted from each primary tumor: tumor area and perimeter were used to measure tumor size, and ellipticity and convexity were computed to capture shape and structure.²⁵,⁵⁸ Voxel-wise kinetic tumor features of PE, time-to-peak, and WIS/WOS were also estimated, from which statistics were calculated to capture spatial kinetic tumor heterogeneity.²⁵ Multivariate linear regression was used to test associations between radiomic features and the gene-expression based recurrence score. To identify intrinsic imaging phenotypes, unsupervised clustering was applied on the extracted feature vectors.²⁵ There was significant correlation ($r = 0.78$, $p$-value $< 0.01$) between MRI features and the recurrence score.⁶⁰ Four imaging phenotypes were detected (Fig. 10), with two including only low and medium recurrence risk tumors. Tumors with a gene expression profile at high risk of recurrence showed a predominantly rapid contrast uptake, suggesting high levels of perfusion and vessel permeability. When phenotypes were used in a model to predict recurrence risk, the AUC reached up to 0.82 ($p$-value $< 0.01$).

6. Discussion

We have presented CaPTk, a software suite developed to derive extensive QIP signatures, synthesize them into diagnostic and predictive markers, and facilitate the clinical translation of such complex algorithms. The described software extends beyond standard radiomic signatures that primarily investigate shape and texture properties and incorporates multiparametric MRI measures of diffusion and perfusion that reflect underlying cell density, microarchitecture, and neovascularization. Importantly, CaPTk extracts spatial patterns of cancer distribution, obtained via deformable registration methods, as well as biophysical parameters obtained via tumor growth modeling. Integration of these comprehensive imaging and spatial signatures via machine learning methods was found to result in indices of diagnostic, prognostic, and predictive value.

The ability to predict clinical outcome, for example, as evidenced by our results in Sec. 5.2.1, is important as it can directly influence patient management and treatment decisions. Although glioblastoma has a grim prognosis, some patients live for just a few months, whereas others can survive for more than 2 years, under a combination of surgical resection, radiation therapy, and chemotherapy. Consequently, knowing in advance the likelihood of an individual patient responding positively to this combination treatment directly influences treatment decisions. Perhaps even more importantly, evaluation of potential therapies is tremendously confounded by high interpatient variability of survival. Given that early-stage testing typically involves a dozen or two patients, for treatment effects to be detectable under such heterogeneity, they must be very strong, which is typically not the case. However, having a baseline predictor of response/survival can significantly improve our ability to detect subtle treatment effects in two ways. First, relatively more homogeneous patient subgroups can be selected into a treatment trial. Second, post-treatment survival can be compared with baseline predicted survival, thereby providing a self-normalization mechanism that reduces heterogeneity and allows for treatment effects to be detected more easily.

Although the main scope of CaPTk is to provide decision support diagnostic and predictive indices, it was found to also enhance our understanding of disease mechanisms that might be related to various imaging signatures. Most notably, the EGFRvIII QIP revealed that tumors harboring the mutation might become more infiltrative and less prone to neovascularization, as evidenced by peritumoral perfusion MRI measures. On the contrary, EGFRvIII(−) tumors seem to build local vasculature to support the growth of the tumor, and hence might be more responsive to localized peritumoral treatment (resection and/or radiation). Moreover, future work will further investigate the spatial and temporal heterogeneity of this QIP signature of EGFRvIII mutation, via investigation of the spatial gradient of perfusion-derived metrics, as well as their change over time. The surgical and treatment planning environment (e.g., Figs. 2 and 8) will additionally provide a means for the surgeon to view (i) the placement of the tracts around the tumor/resection cavity, (ii) recurrence maps showing regions most likely to be affected in the future, and (iii) vulnerability maps depicting the global effect on the connectivity of the brain that could lead to future cognitive deficits.

For breast cancer, the goal of neoadjuvant chemotherapy is to down-stage locally advanced cancers prior to surgery to increase breast conservation rates and, ideally, achieve pathologic complete response (pCR), as patients with pCR have generally better long-term outcomes.⁶¹,⁶² Imaging can be useful both for determining disease extent to inform surgery and for monitoring tumor response in vivo for tailoring treatment to the individual patient.⁶³ As new anticancer therapies are increasingly introduced, including targeted and combination therapies, there is an opportunity for personalized treatment. In neoadjuvant chemotherapy, while several patients may exhibit a clinical response, the vast majority of patients do not achieve pCR solely on the basis of standard first-line chemotherapy.⁶¹,⁶² In an ideal personalized regimen, those are the patients we would like to be able to identify as early as possible during first-line neoadjuvant treatment, so that there is an opportunity to offer them alternative or supplemental therapies that could increase their chance of achieving pCR.⁶³ Breast cancer patients may now benefit from a number of novel therapies, such as aromatase inhibitors for ER+ cancer, trastuzumab plus lapatinib/pertuzumab with standard anthracycline/taxane chemotherapy for Her2+ tumors, and PARP inhibitors for triple-negative breast cancer and/or BRCA carriers, which have been shown to have significant benefits.⁶⁴ Results from I-SPY 1,⁶⁵ however, indicate that early prediction of pCR based on tumor volume and aggregate MRI features is far from perfect, having moderate discriminatory accuracy at the individual level. Our results suggest that the tools developed with CaPTk could allow more accurate characterization of the heterogeneous tissue changes induced by treatment.
The rationale is that this more comprehensive way of characterizing the complex biological properties targeted by treatment,⁶⁰,⁶⁶ especially changes related to functional angiogenic response, which occur prior to changes in tumor size,⁶³ will ultimately result in better prediction of response than the current standard imaging measures. Therefore, QIP signatures hold the promise of shifting the current paradigm in tailoring neoadjuvant treatment by introducing imaging biomarkers that are better and earlier predictors of response and survival. Ultimately, by integrating imaging with histopathologic and molecular markers, we will be able to develop integrative predictive models that can be more accurate for specific tumor subtypes and individual patients.

Early prediction of treatment response and survival for lung cancer patients is important in terms of optimal treatment planning and prognosis. Radiomics analysis using quantitative imaging features to predict clinical outcomes has been widely investigated recently.⁶⁷,⁶⁸ Most prior analyses were designed to predict clinical outcomes using univariate or multivariate analyses in a supervised manner. However, due to data imbalance and the curse of dimensionality (small sample size and large feature dimensionality), it is nontrivial to obtain a reliable prediction, and feature selection or dimension reduction techniques are typically used to improve the prediction performance. Unsupervised clustering of radiomic features is a promising alternative technique for risk stratification. As demonstrated by the results shown in Fig. 7, early-stage NSCLC patients clustered into different groups based on their radiomic features had distinctive treatment outcomes with respect to both survival and nodal failure, although they were treated uniformly with SBRT. Our results suggest that the tools developed with CaPTk could be used to stratify patients based on their PET scans collected before the treatment so that personalized treatment could be implemented.

The flexible architecture of CaPTk enables its use in different contexts. For example, individual components and pipelines can be used as command-line modules, either on a local computer or, ultimately, on a high performance computing infrastructure and even the cloud. CaPTk’s graphical interface can provide focused and minimally complex tools. For example, a combination of tractography with predictive maps of infiltration and recurrence can be used for neurosurgical and radiation therapy planning. LIBRA and related modules are currently used for breast cancer risk estimation purposes. Finally, longer-term goals of CaPTk development include its integration with other packages offering complementary capabilities, such as 3D Slicer. CaPTk is a platform that facilitates translation, enabling operators to conduct quantitative analyses in a straightforward manner without requiring a substantial computational background. Thus, CaPTk can be seamlessly integrated into the typical quantification, analysis, and reporting workflow of a radiologist, underscoring its clinical potential.

Appendices

Appendix A:

Applications’ Description and Command-Line Interface

This section provides application-specific descriptions and examples of commands, targeted toward audiences of both technical readers and imaging scientists. Tables 1–8 show the input, output, sequence of steps, and command-line interfaces for glioblastoma infiltration maps, SPI, EGFRvIII index, SBRT, LIBRA, PRM, WhiteStripe, and Confetti, respectively. More detail on how to use different applications from GUI and command-line interface can be found on CaPTk’s website.¹⁴ CaPTk’s NITRC documentation page⁶⁹ has extensive documentation, including tutorial videos to serve as a foundation for users beginning to work with CaPTk.

Table 1

Input, output, sequence of steps, and command-line interface for glioblastoma infiltration map.


Input:
Conventional imaging: T1, T2, T2-FLAIR, T1-Gd
Diffusion imaging: AX, RAD, FA, ADC
Perfusion imaging: DSC-MRI
Output:
Glioblastoma infiltration map (.nii.gz, .nii)
Sequence of steps:
1. Preprocessing:
a. Conversion to NIfTI and reorientation to left, posterior, superior (LPS) coordination.
b. Denoising via smallest univalue segment assimilating nucleus (SUSAN).⁵²
c. Bias correction and affine registration of T1, T2, T2-FLAIR, DTI and DSC-MRI images to T1-Gd.
d. Skull stripping.
e. Tumor segmentation (ET + NET).
2. Training using near and far ROIs of all the modalities:
a. Drawing an ROI (near) immediately adjacent to the tumor, within the peritumoral edema/invasion.
b. Drawing an ROI (far) at the farthest from the ET but still within the peritumoral edema/invasion.
c. Extracting values of conventional MRIs and DTI measures in the near and far ROIs.
d. Extracting perfusion signal (DSC-MRI) in all the time points in the near and far ROIs and scaling it down to five principal components.
e. $Z$ -score all features.
f. Leave one out cross validation via Gaussian SVM on a retrospective cohort of 31 patients.
3. Prospective study:
a. Training an SVM model using 31 patients of retrospective study.
b. Applying $Z$ -score on prospective data by using mean and standard deviation of the retrospective dataset.
c. Applying the trained model to the prospective dataset.
d. Evaluating the created infiltration maps.
Command-line interface:
Estimation of infiltration on new patients:
RecurrenceEstimator -t 0 -i input_directory_path -o output_directory_path -m model_directory_path
Infiltration maps of the test patients will be saved in the output_directory_path.
Preparing a new infiltration estimation model:
RecurrenceEstimator -t 1 -i input_directory_path -o output_directory_path
Model files will be saved in the output_directory_path.
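
Step 3b above normalizes prospective data with statistics estimated on the retrospective cohort, so that the trained model sees features on the same scale. A minimal sketch of that idea follows; the function names and toy feature rows are hypothetical, and the use of the population standard deviation is an assumption, since the table does not state which estimator is used.

```python
import statistics


def fit_zscore(train_rows):
    """Estimate per-feature mean and standard deviation on training rows."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation (not stated in the table).
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def apply_zscore(rows, means, stds):
    """Scale rows with previously fitted statistics (no refitting)."""
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy feature rows: one row per sample, one column per feature
# (hypothetical values, not from the study)
retrospective = [[1.0, 10.0], [3.0, 30.0]]
prospective = [[4.0, 20.0]]

means, stds = fit_zscore(retrospective)
scaled_prospective = apply_zscore(prospective, means, stds)
print(scaled_prospective)
```

Fitting the statistics on the retrospective cohort only, and reusing them unchanged, keeps information from the prospective cohort out of the training stage.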

Table 2

Input, output, sequence of steps, and command-line interface for SPI.


Input:
Conventional imaging: T1, T2, T2-FLAIR, T1-Gd
Perfusion imaging: PH, PSR, rCBV
Diffusion imaging: AX, RAD, FA, ADC
Patient’s age
Output:
SPI
Sequence of steps:
1. Preprocessing:
a. Conversion to NIfTI and reorientation to LPS coordination.
b. Denoising via SUSAN.⁵²
c. Bias correction, and affine registration of T1, T2, T2-FLAIR, DTI and DSC images to T1-Gd.
d. Skull stripping.
e. Histogram matching of T1, T2, T2-FLAIR, T1-Gd with a standard reference template.
f. GLISTR-based segmentation, both in patient’s and atlas space.
2. Feature extraction from various tumor compartments using all the modalities:
a. Volumetric features: size of ET, NET, and ED, and their combinations.
b. Statistical features: mean and standard deviation of intensities of all the modalities in ET, NET, and ED.
c. Histogram-based features: percentage of voxels in each distribution bin of the histogram calculated for all the modalities in ET, NET, and ED.
d. Location-based features: percentage of tumor located in different regions of the brain, calculated by registering segmentation to a standard atlas, and distance of tumor (ET + NET) and edema to ventricles.
e. Demographics: age.
3. Data formulation and classification:
a. $Z$ -score data normalization.
b. Generating two SVM classifiers using forward feature selection: one for patients with more and less than 6 months survival, and the second for patients having more and less than 18 months survival.
c. Combining scores from both models to compute the composite SPI.
d. Retrospective evaluation of models on a cohort of 110 patients.
e. Prospective evaluation of models on a cohort of 57 patients.
Command-line interface:
SPI calculation on new patients:
SurvivalPredictor -t 0 -i input_directory_path -o output_directory_path -m model_directory_path
SPI indices of the test patients will be saved in a .csv file in the output_directory_path.
Preparing a new SPI calculation model:
SurvivalPredictor -t 1 -i input_directory_path -o output_directory_path
Model files will be saved in the output_directory_path.
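
The histogram-based features in step 2c can be illustrated with a short sketch that reports the percentage of voxel intensities falling in each bin. The number of bins and the intensity range are not given in the table, so the values used here, like the function name and the toy intensities, are assumptions for illustration only.

```python
def histogram_percentages(values, n_bins, lo, hi):
    """Percentage of values in each of n_bins equal-width bins on [lo, hi].

    Values outside the range are clamped into the first or last bin.
    """
    if not values:
        raise ValueError("no values given")
    if n_bins <= 0 or hi <= lo:
        raise ValueError("invalid binning")
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)  # clamp, so v == hi lands in the last bin
        counts[idx] += 1
    return [100.0 * c / len(values) for c in counts]


# Toy intensities of voxels inside one tumor subregion
# (hypothetical values; bin count and range are assumptions)
roi_intensities = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
features = histogram_percentages(roi_intensities, n_bins=5, lo=0, hi=10)
print(features)
```

In the setting of Table 2, one such vector would be computed per modality and per subregion (ET, NET, and ED) and concatenated with the other feature groups.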

Table 3

Input, output, sequence of steps, and command-line interface for surrogate imaging marker for EGFRvIII.


Input:
Perfusion imaging: DSC-MRI
Output:
PHI
Sequence of steps:
1. Preprocessing:
a. Conversion to NIfTI and reorientation to LPS coordination.
b. Bias correction.
2. PHI estimation using near and far ROIs:
a. Drawing an ROI (near) immediately adjacent to the tumor, within the peritumoral edema/invasion.
b. Drawing an ROI (far) at the farthest from the ET but still within the peritumoral edema/invasion.
c. Extracting dynamic perfusion signal (DSC-MRI) from both the near and far ROIs.
d. Estimating the three principal components of the temporal perfusion dynamics for each ROI.
e. Calculating the Bhattacharyya coefficient (PHI value) between the selected principal components of each ROI.
3. Prospective study:
a. Estimation of PHI threshold on 64 patients that best distinguished EGFRvIII+ and EGFRvIII− cases.
b. Applying estimated PHI threshold on 78 test patients to find their EGFRvIII status.
Command-line interface:
EGFRvIIISurrogateIndex.exe -i input_Image_path.nii.gz -m maskWithNearAndFarLabels_path.nii.gz
System displays the output (PHI).

Table 4

Input, output, sequence of steps, and command-line interface for SBRT.


Input:
CT image of lung
PET image of lung
Output:
Prediction of lung nodal failure
Sequence of steps:
1. Preprocessing:
a. Tumor segmentation.
2. Feature extraction from each tumor region using both the modalities:
a. Texture features: gray level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), and LBP.
b. Intensity features: minimum, maximum, and mean.
c. Shape features: perimeter, eccentricity, principal moments.
3. Data formulation and classification:
a. Normalizing data using $Z$ -score.
b. Generating a linear SVM model on patients who responded/did not respond to the treatment.
c. Leave-one-out cross-validation on 80 subjects.
Note: This representation of the algorithm is slightly different from the one presented in Sec. 5.2.3, as a supervised learning method has been used in CaPTk instead of the clustering method in the actual study.³⁹
Command-line interface:
Lung analysis on test subject:
SBRT_Lung_Analyze.exe -c input_CT_image_path.nii.gz -p input_PET_image_path.nii.gz -m mask_image_path.nii.gz -t trained_model_path.xml -o output_file_path.txt
Provides prediction of lung nodal failure.

Table 5

Input, output, sequence of steps, and command-line interface for LIBRA.


Input:
Full-field digital mammography (FFDM)* in digital imaging and communications in medicine (DICOM) format.
Note: LIBRA also supports batch processing when the input is a folder of multiple DICOM images. LIBRA works on both raw (i.e., “for processing”) and vendor postprocessed (i.e., “for presentation”) FFDM images, and has thus far been validated for General Electric Healthcare and Hologic FFDM systems.
Output:
For each FFDM image processed by LIBRA, the software generates:
• Quantitative estimates of breast area, dense area, and breast PD% that are stored in a comma separated text file (.csv).
• A JPG image of the breast and density segmentation outlines overlaid on a window-leveled version of the digital mammogram.
• Binary masks of breast and density segmentations in a MATLAB-formatted (.mat) data file.
Optionally, LIBRA may also generate:
• Intermediate graphics in addition to the final segmentation outlines.
• A processing log file.
All results are stored in the output folder defined by the user.
Sequence of steps:
1. Preprocessing and breast segmentation:
a. Intensity-normalization of input image based on image type (raw or processed) and, then, resizing for faster computation.
b. Applying edge-detection algorithms to delineate the boundary of the breast.
c. Calculation of breast area.
2. Dense tissue segmentation:
a. Determining the number of dominant clusters of similar gray-level intensity as the number of local peaks in the gray-level histogram of the breast region.
b. Dividing the breast into regions of similar gray-level intensity using fuzzy c-means to assign each image pixel to one of the dominant clusters.
c. Estimating a wide range of texture and shape features for each cluster.
d. Finally aggregating the clusters by a pretrained support-vector machine classifier to the final dense tissue area.
e. Calculating the ratio of the segmented absolute dense area to the total breast area to obtain breast PD%.
Command-line interface:
libra.exe <input_directory_path> <output_directory_path> <saveIntermediate>
• saveIntermediate <bool> Boolean value to save the intermediate output (1) or not (0)
Saves the output maps in output_directory_path.
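
Steps 2a and 2e of the LIBRA sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LIBRA's implementation: the function names `count_histogram_peaks` and `percent_density`, the bin count, and the moving-average smoothing are hypothetical choices, and the fuzzy c-means and support-vector machine steps (2b to 2d) are omitted.

```python
import numpy as np

def count_histogram_peaks(pixels, bins=64, smooth=5):
    """Count local maxima of a smoothed gray-level histogram (step 2a)."""
    hist, _ = np.histogram(pixels, bins=bins)
    kernel = np.ones(smooth) / smooth  # moving-average smoothing (assumed)
    h = np.convolve(hist.astype(float), kernel, mode="same")
    inner = h[1:-1]
    # A peak is strictly above its left neighbor and not below its right one,
    # so a flat plateau is counted once. The two edge bins are never counted.
    is_peak = (inner > h[:-2]) & (inner >= h[2:])
    return int(is_peak.sum())

def percent_density(breast_mask, dense_mask):
    """Breast PD% = 100 * dense area / total breast area (step 2e)."""
    breast = np.asarray(breast_mask, dtype=bool)
    dense = np.asarray(dense_mask, dtype=bool) & breast  # dense tissue inside the breast only
    area = int(breast.sum())
    if area == 0:
        raise ValueError("breast mask is empty")
    return 100.0 * dense.sum() / area
```

The peak count would serve as the number of clusters passed to the clustering step, and `percent_density` takes the two binary masks that LIBRA stores in its output.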

Table 6

Input, output, sequence of steps, and command-line interface for PRM.


Input:
Conventional imaging: DCE-MRI
Segmentation: Segmented tumor
Output:
Prediction of recurrence-free survival
Sequence of steps:
1. Preprocessing:
a. Converting DICOM objects to a NIfTI image of given orientation.
b. Bias correction.
c. Histogram matching and image registration between the baseline and the follow-up images.
d. Tumor segmentation.
2. Training within ROIs using all the defined features:
a. Extracting values of kinetic descriptors for both baseline and follow-up images.
b. Drawing an ROI within the tumor.
c. Deforming ROI of follow-up image to the space of baseline image.
d. Creating PRM for each feature.
e. Applying univariate Cox proportional hazards analysis to select the best PRM features (106 patients).
f. Training a multivariate Cox regression model to predict recurrence-free survival (106 patients).
3. Prospective study:
a. Applying the trained model to the new subjects.
b. Evaluating the created PRM maps.
Command-line interface:
PRM on new patients:
ParametricResponseMap -t 0 -i input_directory_path -o output_directory_path -m model_directory_path
Parametric response maps of the test patients will be saved in the output_directory_path.
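
The table does not spell out how a PRM is formed from the registered baseline and follow-up feature maps. The sketch below assumes a common thresholded voxel-wise formulation, in which each ROI voxel is labeled as increased, decreased, or unchanged. The function names and the threshold are hypothetical and are not taken from CaPTk.

```python
import numpy as np

def parametric_response_map(baseline, followup, roi, threshold):
    """Label ROI voxels by their change between two co-registered feature maps.

    Returns an int8 map: +1 increased, -1 decreased, 0 unchanged or outside the ROI.
    """
    diff = np.asarray(followup, dtype=float) - np.asarray(baseline, dtype=float)
    roi = np.asarray(roi, dtype=bool)
    prm = np.zeros(diff.shape, dtype=np.int8)
    prm[roi & (diff > threshold)] = 1
    prm[roi & (diff < -threshold)] = -1
    return prm

def prm_fractions(prm, roi):
    """Summarize a PRM as the fraction of ROI voxels in each class."""
    roi = np.asarray(roi, dtype=bool)
    n = int(roi.sum())
    if n == 0:
        raise ValueError("ROI is empty")
    inside = prm[roi]
    return {
        "increased": float((inside == 1).sum()) / n,
        "decreased": float((inside == -1).sum()) / n,
        "unchanged": float((inside == 0).sum()) / n,
    }
```

Fractions of this kind, computed per kinetic feature, are the sort of scalar descriptors that could be passed to the Cox feature selection in steps 2e and 2f.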

Table 7

Input, output, sequence of steps, and command-line interface for WhiteStripe.

Input:
Conventional imaging: T1/T2
Output:
Normalized T1/T2 image
Sequence of steps:
1. Preprocessing:
a. Bias correction.
b. Either skull-stripping or rigid alignment to Montreal Neurological Institute atlas space.
2. Image normalization:
a. Getting the intensity values of voxels from the specified ROI of an image.
b. Computing the histogram of these intensity values to find the distribution of white matter voxels.
c. Using the mean and standard deviation of the identified distribution, normalizing the image as: normalized image = (image − mean)/standard deviation.
Command-line interface:
For T1 image (t=0):
WhiteStripe.exe -i input_file_path.nii.gz -o output_file_path.nii.gz -t 0
For T2 image (t=1):
WhiteStripe.exe -i input_file_path.nii.gz -o output_file_path.nii.gz -t 1
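
The normalization in step 2 can be sketched as follows. This is a simplified illustration, not the WhiteStripe implementation. It assumes that the tallest histogram bin inside the ROI marks the white matter peak, and that the "stripe" is the set of voxels within a fixed quantile band around that peak. The function names and the `bins` and `width` defaults are hypothetical.

```python
import numpy as np

def white_stripe_stats(values, bins=100, width=0.05):
    """Mean and standard deviation of the intensities around the histogram peak."""
    values = np.asarray(values, dtype=float)
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    peak = centers[np.argmax(hist)]          # assumed white matter mode
    q = np.mean(values <= peak)              # quantile at which the peak sits
    lo, hi = np.quantile(values, [max(q - width, 0.0), min(q + width, 1.0)])
    stripe = values[(values >= lo) & (values <= hi)]
    return stripe.mean(), stripe.std()

def white_stripe_normalize(image, roi, bins=100, width=0.05):
    """Apply normalized image = (image - mean) / standard deviation."""
    image = np.asarray(image, dtype=float)
    mean, std = white_stripe_stats(image[np.asarray(roi, dtype=bool)], bins, width)
    if std == 0:
        raise ValueError("white stripe has zero variance")
    return (image - mean) / std
```

After this step, the white matter intensities of different scans are centered near zero with roughly unit spread, which is what makes intensity-based features comparable across subjects.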

Table 8

Input, output, sequence of steps, and command-line interface for Confetti.


Input:
A fiber set: Includes fibers to be clustered.
A set of track density images (TDIs): Each TDI is a voxel map of the number of fibers reaching one of the 87 gray matter ROIs defined by FreeSurfer using the Desikan atlas.
Output:
Set of fiber bundles corresponding to known white matter tracts.
Sequence of steps:
1. Connectivity signature generation for fibers:
a. Representation of each fiber by the probabilities of connecting to the 87 gray matter regions of the Desikan atlas. TDIs are used for this purpose.
2. Clustering of fibers:
a. Randomly assigning fibers to bundles.
b. Representing each bundle as a multinomial distribution.
c. Using expectation maximization to refine bundle assignments at each iteration, while using the provided template as a prior, until convergence is achieved.
d. Assigning resulting bundles (clusters) to white matter tracts of interest, as defined in the provided template.
3. Extraction of predefined white matter tracts:
a. Saving fiber bundles that are assigned to white matter tracts of interest.
Command-line interface:
Clustering of the generated fibers into bundles:
Confetti cluster -s InputTractSignatures.csv -k 200 -o OutputClusterIDs.csv
Identification of specific tracts (requires an annotated example):
Confetti extract -t TemplateFolder/ -f InputFibers.Bfloat -c InputClusterIDs.csv -o OutputTracts.Bfloat
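
The clustering in step 2 can be illustrated with a small mixture-of-multinomials routine. This is a sketch under simplifying assumptions, not Confetti's code: it uses hard assignments rather than soft responsibilities, gives all bundles equal weight, and omits the template prior. The function name `cluster_fibers` and its parameters are hypothetical.

```python
import numpy as np

def cluster_fibers(signatures, k, n_iter=50, init_labels=None, seed=0, eps=1e-9):
    """Cluster fibers by connectivity signature with hard-assignment EM.

    signatures: (n_fibers, n_regions) nonnegative connectivity values.
    Returns one bundle index per fiber.
    """
    signatures = np.asarray(signatures, dtype=float)
    n, d = signatures.shape
    if init_labels is None:
        # Step 2a: random initial assignment of fibers to bundles.
        labels = np.random.default_rng(seed).integers(0, k, size=n)
    else:
        labels = np.asarray(init_labels, dtype=int)
    for _ in range(n_iter):
        # M-step (step 2b): each bundle becomes a multinomial over regions.
        theta = np.full((k, d), eps)
        for j in range(k):
            theta[j] += signatures[labels == j].sum(axis=0)
        theta /= theta.sum(axis=1, keepdims=True)
        # E-step (step 2c): reassign each fiber to its most likely bundle.
        loglik = signatures @ np.log(theta).T
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels
```

With the 87-region signatures described in step 1, `signatures` would have 87 columns, and `k` plays the role of the `-k` option shown in the command-line interface.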

Appendix B:

Features Supported by CaPTk

The feature panel in CaPTk enables clinicians and other researchers to easily extract feature measurements commonly used in image analysis and to conduct large-scale analyses. Although the feature panel is being actively expanded, it currently comprises (i) intensity-based, (ii) textural, and (iii) volumetric/morphologic features. The general idea is to keep the features generic and adaptable to different types of medical images simply by changing the input parameters. We provide preselected parameters, validated for brain, breast, and lung applications within CaPTk. Users can alter these preselected values or create their own set of parameters. The output of the feature extraction tab can be either a comma-separated values (.csv) file or an extensible markup language (.xml) file, each containing the feature names and values and recording the chosen parameters. Table 9 gives details about the currently available features. Intensity-based and textural features are extracted per modality, per annotated region, and per offset (the offset represents the radius around the center pixel; for a radius of 1, the offset is $\pm 1$).
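
As an illustration of the intensity-based part of the panel, the sketch below computes first-order statistics and histogram bin frequencies for one ROI and serializes them as CSV text. It is not CaPTk code. The feature names, the function names, and the use of non-excess (Pearson) kurtosis are assumptions made for this example.

```python
import csv
import io
import numpy as np

def first_order_features(image, roi, num_bins=10):
    """First-order statistics and bin frequencies (in percent) inside an ROI."""
    v = np.asarray(image, dtype=float)[np.asarray(roi, dtype=bool)]
    mean, std = v.mean(), v.std()
    z = (v - mean) / std if std > 0 else np.zeros_like(v)
    feats = {
        "Minimum": v.min(),
        "Maximum": v.max(),
        "Mean": mean,
        "StandardDeviation": std,
        "Variance": v.var(),
        "Skewness": (z ** 3).mean(),
        "Kurtosis": (z ** 4).mean(),  # non-excess kurtosis (assumed convention)
    }
    hist, _ = np.histogram(v, bins=num_bins)
    for i, count in enumerate(hist):
        feats[f"Bin_{i}_Frequency"] = 100.0 * count / v.size
    return feats

def features_to_csv(feats):
    """Serialize feature names and values as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["feature", "value"])
    for name, value in feats.items():
        writer.writerow([name, float(value)])
    return buf.getvalue()
```

Calling `first_order_features` once per modality and per annotated region mirrors the per-modality, per-region extraction described above.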

Table 9

Set of radiomic features supported by CaPTk.

Specific features	Parameters	Range	Default	Description
Morphologic⁷⁰,⁷¹
• Elongation	Dimensions	2-D:3-D	3-D	• Elongation is the length of the largest principal moment.
• Perimeter	Axis	$x$ , $y$ , $z$	$z$	• Perimeter is the convex hull that contours the region.
• Roundness				• Roundness = As/Ac = (area of a shape)/(area of circle), where circle has the same perimeter.
• Eccentricity				• $\mathrm{Eccentricity} = \sqrt{1 - (a \cdot b)/c^{2}}$, where $c$ is the longest semiprincipal axis of an ellipsoid fitted to an ROI, and $a$ and $b$ are the second- and third-longest semiprincipal axes of the ellipsoid.
Histogram-based⁴
• Bin frequency	Num_Bins	NA	10	• Percentage of voxels in each distribution bin of the histogram.
First-order statistics
• Minimum intensity	NA	NA	NA	• The raw minimum intensity of the ROI in an MRI volume.
• Maximum intensity				• The raw maximum intensity of the ROI in an MRI volume.
• Mean intensity				• The raw mean intensity of the ROI in an MRI volume.
• Standard deviation				• The standard deviation of the raw intensities of the ROI in an MRI volume.
• Variance				• Variance of the histogram of the raw intensities of the ROI in an MRI volume.
• Skewness				• Skewness of the histogram of the raw intensities of the ROI in an MRI volume.
• Kurtosis				• Kurtosis of the histogram of the raw intensities of the ROI in an MRI volume.
GLRLM⁷²
• SRE	Num_Bins	N.A.	10	• The “short run emphasis” within the ROI in an MRI volume.
• LRE	Num_Directions	3:13	13	• The “long run emphasis” within the ROI in an MRI volume.
• GLN	Radius	N.A.	2	• The “gray-level nonuniformity” within the ROI in an MRI volume.
• RLN	Dimensions	2-D:3-D	3-D	• The “run-length nonuniformity” within the ROI in an MRI volume.
• LGRE	Axis	$x$ , $y$ , $z$	$z$	• The “low gray-level run emphasis” within the ROI in an MRI volume.
• HGRE	Offset	Average/Individual	Average	• The “high gray-level run emphasis” within the ROI in an MRI volume.
• SRLGE	Distance_Range	1:5	1	• The “short run low gray-level emphasis” within the ROI in an MRI volume.
• SRHGE				• The “short run high gray-level emphasis” within the ROI in an MRI volume.
• LRLGE				• The “long run low gray-level emphasis” within the ROI in an MRI volume.
• LRHGE				• The “long run high gray-level emphasis” within the ROI in an MRI volume.
GLCM⁷³
• Energy	Num_Bins	N.A.	10	• The energy within the ROI in an MRI volume.
• Contrast	Num_Directions	3:13	13	• The contrast within the ROI in an MRI volume.
• Entropy	Radius	N.A.	2	• The entropy within the ROI in an MRI volume.
• Homogeneity	Dimensions	2-D:3-D	3-D	• The homogeneity within the ROI in an MRI volume.
• Correlation	Offset	Average/Individual	Average	• The correlation within the ROI in an MRI volume.
• Variance	Axis	$x$ , $y$ , $z$	$z$	• The variance within the ROI in an MRI volume.
• SumAverage				• The SumAverage within the ROI in an MRI volume.
• Variance				• The variance within the ROI in an MRI volume.
• Autocorrelation				• The AutoCorrelation within the ROI in an MRI volume.
Gray level size-zone matrix (GLSZM)⁷²,⁷⁴
• SZE	Num_Bins	N.A.	10	• The “small zone emphasis” within the ROI in an MRI volume.
• LZE	Num_Directions	3:13	13	• The “large zone emphasis” within the ROI in an MRI volume.
• GLN	Radius	N.A.	2	• The “gray-level nonuniformity” within the ROI in an MRI volume.
• ZSN	Dimensions	2-D:3-D	3-D	• The “zone-size nonuniformity” within the ROI in an MRI volume.
• ZP	Axis	$x$ , $y$ , $z$	$z$	• The “zone percentage” within the ROI in an MRI volume.
• LGZE	Distance_Range	1:5	4	• The “low gray-level zone emphasis” within the ROI in an MRI volume.
• HGZE				• The “high gray-level zone emphasis” within the ROI in an MRI volume.
• SZLGE				• The “small zone low gray-level emphasis” within the ROI in an MRI volume.
• SZHGE				• The “small zone high gray-level emphasis” within the ROI in an MRI volume.
• LZLGE				• The “large zone low gray-level emphasis” within the ROI in an MRI volume.
• LZHGE				• The “large zone high gray-level emphasis” within the ROI in an MRI volume.
• GLV				• The “gray-level variance” within the ROI in an MRI volume.
• ZSV				• The “zone-size variance” within the ROI in an MRI volume.
Volumetric features⁷⁰,⁷¹
• Volume/area	Dimensions	2-D:3-D	3-D	• Size of ROI in terms of number of voxels.
• Volume/area	Axis	$x$ , $y$ , $z$	$z$
LBP⁷⁵
	Radius	N.A.	N.A.	The LBP codes are computed using $N$ sampling points on a circle of certain radius and using mapping table.
	Neighborhood	2:4:8	8
Neighborhood gray-tone difference matrix (NGTDM)⁷⁶
• Coarseness	Num_Bins	N.A.	10	• The coarseness within the ROI in an MRI volume.
• Busyness	Num_Directions	3:13	13	• The busyness within the ROI in an MRI volume.
• Contrast	Dimensions	2-D:3-D	3-D	• The contrast within the ROI in an MRI volume.
• Complexity	Axis	$x$ , $y$ , $z$	N.A.	• The complexity within the ROI in an MRI volume.
• Strength	Distance_Range	1:5	1	• The strength within the ROI in an MRI volume.

Note: GLCM, GLRLM, and GLSZM are estimated within the ROI in an image, considering 26-connected neighboring voxels in the 3-D volume.
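
To make the GLCM rows of Table 9 concrete, the sketch below builds a co-occurrence matrix for a 2-D image and a single offset, then derives four of the listed features. It is a simplified illustration rather than CaPTk's implementation. It handles one direction in 2-D instead of averaging over 13 directions in 3-D, the quantization scheme is an assumption, and the feature formulas follow common Haralick-style definitions, which vary between libraries.

```python
import numpy as np

def glcm_features(image, roi, num_bins=10, offset=(0, 1)):
    """Energy, contrast, entropy, and homogeneity from a symmetric 2-D GLCM."""
    image = np.asarray(image, dtype=float)
    roi = np.asarray(roi, dtype=bool)
    lo, hi = image[roi].min(), image[roi].max()
    # Quantize ROI intensities into num_bins gray levels (assumed scheme).
    scale = num_bins / (hi - lo) if hi > lo else 0.0
    q = np.clip(np.floor((image - lo) * scale), 0, num_bins - 1).astype(int)
    dr, dc = offset
    rows, cols = image.shape
    glcm = np.zeros((num_bins, num_bins))
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols and roi[r, c] and roi[r2, c2]:
                glcm[q[r, c], q[r2, c2]] += 1
    glcm = glcm + glcm.T  # count each pair in both directions
    total = glcm.sum()
    if total == 0:
        raise ValueError("no voxel pairs inside the ROI for this offset")
    p = glcm / total
    i, j = np.indices(p.shape)
    nz = p > 0
    return {
        "Energy": float((p ** 2).sum()),
        "Contrast": float(((i - j) ** 2 * p).sum()),
        "Entropy": float(-(p[nz] * np.log2(p[nz])).sum()),
        "Homogeneity": float((p / (1.0 + (i - j) ** 2)).sum()),
    }
```

Running the function once per offset and averaging the results would correspond to the "Average" setting of the Offset parameter, while keeping the per-offset values corresponds to "Individual".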

Appendix C:

Clinician’s Opinion

This is a very powerful tool, with applications that support neuro-oncology, brain connectomics, and lung and breast radiomics. The applications are far reaching and could have significant clinical impact, having the potential for changing treatment guidelines. For example, the glioblastoma recurrence prediction tool draws probability maps highlighting areas at high risk for recurrence, which could be targeted with surgery or increased radiation therapy, and, in turn, could increase patient survival. The application on glioblastoma survival prediction may have significant prognostic value, which is important for patients and their families.

The software tool is very well-structured, with different tabs for preprocessing and applications, which allow the user to navigate easily and efficiently. Interactive tools are available to annotate anatomical regions or lesions, and images can be saved and shared.

The documentation that comes along with CaPTk package is detailed and well-organized, and step by step instructions are available for all the applications. CaPTk also provides a link to download sample data, so users can become familiar with the different applications, as well as with the mechanics of the software to run those applications.

What I would suggest, though, is that the tool be run by technologists who do postprocessing for more routine imaging studies (e.g., CT angiography); the results would then be made accessible to clinicians for making patient management decisions.

Appendix D:

Authors’ Discussion about the Clinician’s Opinion

We appreciate the valuable feedback provided by Dr. Michel Bilello about CaPTk. Such positive feedback on our software from clinicians is very encouraging and will help us improve the quality of the software and integrate it seamlessly into the clinical workflow. In fact, most of the modules of CaPTk were developed in very close collaboration with various clinicians at Penn (neurosurgeons, radiation oncologists, oncologists, and radiologists) and can therefore be readily used in clinical settings. This makes CaPTk a unique software platform with significant clinical impact.

We appreciate the concern raised by Dr. Bilello that the software is not meant to be used directly by the clinician; rather, the results generated by the software (such as glioblastoma recurrence probability maps, survival prediction, and breast density estimation) can be used. We strongly agree with Dr. Bilello that there should be a team of technologists who perform the preprocessing and generate the results, which would then be used by the clinician in the decision-making process.

Disclosures

The authors have no relevant conflicts of interest to disclose.

Acknowledgments

CaPTk is supported by NCI/NIH ITCR program with award number U24CA189523. Additional work has been supported by NIH Grant Nos. R01NS042645, R01NS096606, and R01CA197000.

References

1. H. J. Aerts, “The potential of radiomic-based phenotyping in precision medicine: a review,” JAMA Oncol., 2 (12), 1636–1642 (2016). http://dx.doi.org/10.1001/jamaoncol.2016.2631

2. J. P. O’Connor et al., “Imaging biomarker roadmap for cancer studies,” Nat. Rev. Clin. Oncol., 14 (3), 169–186 (2017). http://dx.doi.org/10.1038/nrclinonc.2016.162

3. H. Akbari et al., “Imaging surrogates of infiltration obtained via multiparametric imaging pattern analysis predict subsequent location of recurrence of glioblastoma,” Neurosurgery, 78 (4), 572–580 (2016). http://dx.doi.org/10.1227/NEU.0000000000001202

4. L. Macyszyn et al., “Imaging patterns predict patient survival and molecular subtype in glioblastoma via machine learning techniques,” Neuro-Oncology, 18 (3), 417–425 (2016). http://dx.doi.org/10.1093/neuonc/nov127

5. A. Gooya et al., “GLISTR: glioma image segmentation and registration,” IEEE Trans. Med. Imaging, 31 (10), 1941–1954 (2012). http://dx.doi.org/10.1109/TMI.2012.2210558

6. A. Gooya, G. Biros and C. Davatzikos, “Deformable registration of glioma images using EM algorithm and diffusion reaction modeling,” IEEE Trans. Med. Imaging, 30 (2), 375–390 (2011). http://dx.doi.org/10.1109/TMI.2010.2078833

7. C. Hogea, C. Davatzikos and G. Biros, “Brain-tumor interaction biophysical models for medical image registration,” SIAM J. Sci. Comput., 30 (6), 3050–3072 (2008). http://dx.doi.org/10.1137/07069208X

8. C. Hogea, C. Davatzikos and G. Biros, “An image-driven parameter estimation problem for a reaction-diffusion glioma growth model with mass effects,” J. Math. Biol., 56 (6), 793–825 (2008). http://dx.doi.org/10.1007/s00285-007-0139-x

9. A. Gastounioti, E. F. Conant and D. Kontos, “Beyond breast density: a review on the advancing role of parenchymal texture analysis in breast cancer risk assessment,” Breast Cancer Res., 18 (1), 91 (2016). http://dx.doi.org/10.1186/s13058-016-0755-8

10. S. Bakas et al., “In vivo detection of EGFRvIII in glioblastoma via perfusion magnetic resonance imaging signature consistent with deep peritumoral infiltration: the φ index,” Clin. Cancer Res., 23 (16), 4724–4734 (2017). http://dx.doi.org/10.1158/1078-0432.CCR-16-1871

11. S. Bakas et al., “Highly-expressed wild-type EGFR and EGFRvIII mutant glioblastomas have similar MRI signature, consistent with deep peritumoral infiltration,” Neuro-Oncology, 18 (6), vi125–vi126 (2016). http://dx.doi.org/10.1093/neuonc/now212.523

12. S. Bakas et al., “Identification of imaging signatures of the epidermal growth factor receptor variant III (EGFRvIII) in glioblastoma,” Neuro-Oncology, 17 (5), v154 (2015). http://dx.doi.org/10.1093/neuonc/nov225.05

13. A. Ashraf et al., “Breast DCE-MRI kinetic heterogeneity tumor markers: preliminary associations with neoadjuvant chemotherapy response,” Transl. Oncol., 8 (3), 154–162 (2015). http://dx.doi.org/10.1016/j.tranon.2015.03.005

14. C. Davatzikos, “Cancer imaging phenomics toolkit (CaPTk),” (December 2017).

15. S. Bakas et al., “Segmentation of gliomas in multimodal magnetic resonance imaging volumes based on a hybrid generative-discriminative framework,” in Proc. of the Multimodal Brain Tumor Image Segmentation Challenge (BRATS), 5–12 (2015).

16. S. Bakas et al., “GLISTRboost: combining multimodal MRI segmentation, registration, and biophysical tumor growth modeling with gradient boosting machines for glioma segmentation,” Lect. Notes Comput. Sci., 9556, 144–155 (2016). http://dx.doi.org/10.1007/978-3-319-30858-6

17. ITK-SNAP, http://www.itksnap.org/pmwiki/pmwiki.php (November 2014).

18. S. Bakas et al., “Advancing the cancer genome atlas glioma MRI collections with expert segmentation labels and radiomic features,” Sci. Data, 4, 170117 (2017). http://dx.doi.org/10.1038/sdata.2017.117

19. K. Zeng et al., “Segmentation of gliomas in pre-operative and post-operative multimodal magnetic resonance imaging volumes based on a hybrid generative-discriminative framework,” Lect. Notes Comput. Sci., 10154, 184–194 (2017). http://dx.doi.org/10.1007/978-3-319-55524-9

20. M. Bergman, “CBICA Image processing portal,” (March 2017).

21. J. H. Friedman, “Stochastic gradient boosting,” Comput. Stat. Data Anal., 38 (4), 367–378 (2002). http://dx.doi.org/10.1016/S0167-9473(01)00065-2

22. J. H. Friedman, “Greedy function approximation: a gradient boosting machine,” Ann. Stat., 29, 1189–1232 (2001). http://dx.doi.org/10.1214/aos/1013203451

23. S. Bakas et al., “Fast semi-automatic segmentation of focal liver lesions in contrast-enhanced ultrasound, based on a probabilistic model,” Comput. Meth. Biomech. Biomed. Eng., 5 (5), 329–338 (2017). http://dx.doi.org/10.1080/21681163.2015.1029642

24. M. Bilello et al., “Population-based MRI atlases of spatial distribution are specific to patient and tumor characteristics in glioblastoma,” Neuroimage Clin., 12, 34–40 (2016). http://dx.doi.org/10.1016/j.nicl.2016.03.007

25. A. B. Ashraf et al., “Identification of intrinsic radio-phenotypes for breast cancer tumors: preliminary associations with gene expression profiles,” Radiology, 272 (2), 374–384 (2014). http://dx.doi.org/10.1148/radiol.14131375

26. H. Akbari et al., “Pattern analysis of dynamic susceptibility contrast MRI reveals peritumoral tissue heterogeneity,” Radiology, 273 (2), 502–510 (2014). http://dx.doi.org/10.1148/radiol.14132458

27. B. M. Ellingson et al., “Probabilistic radiographic atlas of glioblastoma phenotypes,” Am. J. Neuroradiol., 34 (3), 533–540 (2013). http://dx.doi.org/10.3174/ajnr.A3253

28. R. T. Shinohara et al., “Statistical normalization techniques for magnetic resonance imaging,” Neuroimage Clin., 6, 9–19 (2014). http://dx.doi.org/10.1016/j.nicl.2014.08.008

29. L. G. Nyul, J. K. Udupa and X. Zhang, “New variants of a method of MRI scale standardization,” IEEE Trans. Med. Imaging, 19 (2), 143–150 (2000). http://dx.doi.org/10.1109/42.836373

30. Y. Ou et al., “Deformable registration for quantifying longitudinal tumor changes during neoadjuvant chemotherapy,” Magn. Reson. Med., 73 (6), 2343–2356 (2015). http://dx.doi.org/10.1002/mrm.25368

31. Deformable Registration via Attribute Matching and Mutual-Saliency Weighting (DRAMMS), http://www.med.upenn.edu/sbia/dramms.html (December 2017).

32. Y. Ou et al., “DRAMMS: deformable registration via attribute matching and mutual-saliency weighting,” Med. Image Anal., 15 (4), 622–639 (2011). http://dx.doi.org/10.1016/j.media.2010.07.002

33. S. Rathore et al., “Imaging pattern analysis reveals three distinct phenotypic subtypes of GBM with different survival rates,” Neuro-Oncology, 18 (Suppl. 6), vi128 (2016). http://dx.doi.org/10.1093/neuonc/now212.532

34. Z. A. Binder et al., “Extracellular EGFR289 activating mutations confer poorer survival and suggest enhanced motility in primary GBMs,” Neuro-Oncology, 18 (Suppl. 6), vi105–vi106 (2016). http://dx.doi.org/10.1093/neuonc/now212.441

35. S. Rathore et al., “Radiologic subtypes of glioblastoma calculated via multi-parametric imaging signatures reveal complementary information to current WHO classification,” Neuro-Oncology, 19 (Suppl. 6), vi155–vi156 (2017). http://dx.doi.org/10.1093/neuonc/nox168.633

36. H. Itakura et al., “Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities,” Sci. Transl. Med., 7 (303), 303ra138 (2015). http://dx.doi.org/10.1126/scitranslmed.aaa7582

37. B. M. Keller et al., “Preliminary evaluation of the publicly available Laboratory for Breast Radiodensity Assessment (LIBRA) software tool: comparison of fully automated area and volumetric density measures in a case-control study with digital mammography,” Breast Cancer Res., 17 (1), 117 (2015). http://dx.doi.org/10.1186/s13058-015-0626-8

38. Y. Zheng et al., “Parenchymal texture analysis in digital mammography: a fully automated pipeline for breast cancer risk assessment,” Med. Phys., 42 (7), 4149–4160 (2015). http://dx.doi.org/10.1118/1.4921996

39. H. Li et al., “Unsupervised machine learning of radiomic features for predicting treatment response and survival of early-stage non-small cell lung cancer patients treated with stereotactic body radiation therapy,” in Annual Meeting of the American Society for Radiation Oncology (ASTRO), (2017).

40. B. Tunc et al., “Individualized map of white matter pathways: connectivity-based paradigm for neurosurgical planning,” Neurosurgery, 79 (4), 568–577 (2016). http://dx.doi.org/10.1227/NEU.0000000000001183

41. J. Lecoeur et al., “Improving white matter tractography by resolving the challenges of edema,” in MICCAI Workshop: DTI Challenge, (2013).

42. J. Lecoeur et al., “Addressing the challenge of edema in fiber tracking,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI), DTI Tractography Challenge, (2014).

43. B. Tunc et al., “Automated tract extraction via atlas based adaptive clustering,” Neuroimage, 102, 596–607 (2014). http://dx.doi.org/10.1016/j.neuroimage.2014.08.021

44. A. Brun et al., “Clustering fiber traces using normalized cuts,” Lect. Notes Comput. Sci., 3216, 368–375 (2004). http://dx.doi.org/10.1007/978-3-540-30135-6_45

45. M. Liu, B. C. Vemuri and R. Deriche, “Unsupervised automatic white matter fiber clustering using a Gaussian mixture model,” in Proc. of IEEE Int. Symp. on Biomedical Imaging, 522–525 (2012). http://dx.doi.org/10.1109/ISBI.2012.6235600

46. M. Maddah et al., “A unified framework for clustering and quantitative analysis of white matter fiber tracts,” Med. Image Anal., 12, 191–202 (2008). http://dx.doi.org/10.1016/j.media.2007.10.003

47. L. J. O’Donnell et al., “A method for clustering white matter fiber tracts,” Am. J. Neuroradiol., 27 (5), 1032–1036 (2006).

48. Q. Wang et al., “Hierarchical fiber clustering based on multi-scale neuroanatomical features,” in Proc. of the 5th Int. Conf. on Medical Imaging and Augmented Reality, 448–456 (2010).

49. D. Wassermann et al., “Unsupervised white matter fiber clustering and tract probability map generation: applications of a Gaussian process framework for white matter fibers,” Neuroimage, 51 (1), 228–241 (2010). http://dx.doi.org/10.1016/j.neuroimage.2010.01.004

50. B. M. Keller et al., “Parenchymal texture analysis in digital mammography: robust texture feature identification and equivalence across devices,” J. Med. Imaging, 2 (2), 024501 (2015). http://dx.doi.org/10.1117/1.JMI.2.2.024501

51. B. M. Keller et al., “Estimation of breast percent density in raw and processed full field digital mammography images via adaptive fuzzy c-means clustering and support vector machine segmentation,” Med. Phys., 39 (8), 4903–4917 (2012). http://dx.doi.org/10.1118/1.4736530

52. S. M. Smith and J. M. Brady, “SUSAN - a new approach to low level image processing,” Int. J. Comput. Vision, 23 (1), 45–78 (1997). http://dx.doi.org/10.1023/A:1007963824710

53. J. M. Bland and D. G. Altman, “The logrank test,” Br. Med. J., 328 (7447), 1073 (2004). http://dx.doi.org/10.1136/bmj.328.7447.1073

54. N. M. Hylton et al., “Neoadjuvant chemotherapy for breast cancer: functional tumor volume by MR imaging predicts recurrence-free survival-results from the ACRIN 6657/CALGB 150007 I-SPY 1 TRIAL,” Radiology, 279 (1), 44–55 (2016). http://dx.doi.org/10.1148/radiol.2015150013

55. H. Uno et al., “On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data,” Stat. Med., 30 (10), 1105–1117 (2011). http://dx.doi.org/10.1002/sim.4154

56. B. Solmaz et al., “Assessing connectivity related injury burden in diffuse traumatic brain injury,” Hum. Brain Mapp., 38 (6), 2913–2922 (2017). http://dx.doi.org/10.1002/hbm.23561

57. R. G. Verhaak et al., “Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1,” Cancer Cell, 17 (1), 98–110 (2010). http://dx.doi.org/10.1016/j.ccr.2009.12.020

58. A. B. Ashraf et al., “A multichannel Markov random field approach for automated segmentation of breast cancer tumor in DCE-MRI data using kinetic observation model,” Lect. Notes Comput. Sci., 6893, 546–553 (2011). http://dx.doi.org/10.1007/978-3-642-23626-6

59. S. Paik et al., “Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer,” J. Clin. Oncol., 24 (23), 3726–3734 (2006). http://dx.doi.org/10.1200/JCO.2005.04.7985

60. L. R. Arlinghaus et al., “Current and future trends in magnetic resonance imaging assessments of the response of breast tumors to neoadjuvant chemotherapy,” J. Oncol., 2010, 1–17 (2010). http://dx.doi.org/10.1155/2010/919620

61. P. Cortazar et al., “Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis,” Lancet, 384 (9938), 164–172 (2014). http://dx.doi.org/10.1016/S0140-6736(13)62422-8

62. J. White and A. DeMichele, “Neoadjuvant therapy for breast cancer: controversies in clinical trial design and standard of care,” Am. Soc. Clin. Oncol. Educ. Book, 35, e17–e23 (2015). http://dx.doi.org/10.14694/EdBook_AM.2015.35.e17

63. V. Dialani, T. Chadashvili and P. J. Slanetz, “Role of imaging in neoadjuvant therapy for breast cancer,” Ann. Surg. Oncol., 22 (5), 1416–1424 (2015). http://dx.doi.org/10.1245/s10434-015-4403-9

64. A. Esposito, C. Criscitiello and G. Curigliano, “Highlights from the 14th St Gallen International Breast Cancer Conference 2015 in Vienna: dealing with classification, prognostication, and prediction refinement to personalize the treatment of patients with early breast cancer,” Ecancermedicalscience, 9, 518 (2015). http://dx.doi.org/10.3332/cancer.2015.518

65. N. M. Hylton et al., “Locally advanced breast cancer: MR imaging for prediction of response to neoadjuvant chemotherapy - results from ACRIN 6657/I-SPY TRIAL,” Radiology, 263 (3), 663–672 (2012). http://dx.doi.org/10.1148/radiol.12110748

66. G. von Minckwitz et al., “Response-guided neoadjuvant chemotherapy for breast cancer,” J. Clin. Oncol., 31 (29), 3623–3630 (2013). http://dx.doi.org/10.1200/JCO.2012.45.0940

67. G. Lee et al., “Radiomics and its emerging role in lung cancer research, imaging biomarkers and clinical management: state of the art,” Eur. J. Radiol., 86, 297–307 (2017). http://dx.doi.org/10.1016/j.ejrad.2016.09.005

68. M. Scrivener et al., “Radiomics applied to lung cancer: a review,” Transl. Cancer Res., 5 (4), 398–409 (2016). http://dx.doi.org/10.21037/tcr.2016.06.18

69. CBICA: Cancer Imaging Phenomics Toolkit (CaPTk), https://www.nitrc.org/docman/?group_id=1059 (December 2017).

70. G. Thibault et al., “Shape and texture indexes application to cell nuclei classification,” Int. J. Pattern Recognit. Artif. Intell., 27, 1357002 (2013). http://dx.doi.org/10.1142/S0218001413570024

71. M. Vallières et al., “A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities,” Phys. Med. Biol., 60 (14), 5471–5496 (2015). http://dx.doi.org/10.1088/0031-9155/60/14/5471

72. M. M. Galloway, “Texture analysis using grey level run lengths,” Comput. Graphics Image Process., 4, 172–179 (1975). http://dx.doi.org/10.1016/S0146-664X(75)80008-6

73. R. M. Haralick, K. Shanmugam and I. H. Dinstein, “Textural features for image classification,” IEEE Trans. Syst. Man Cybern., 3 (6), 610–621 (1973). http://dx.doi.org/10.1109/TSMC.1973.4309314

74. X. Tang, “Texture information in run-length matrices,” IEEE Trans. Image Process., 7 (11), 1602–1609 (1998). http://dx.doi.org/10.1109/83.725367

75. T. Ojala, M. Pietikainen and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Trans. Pattern Anal. Mach. Intell., 24 (7), 971–987 (2002). http://dx.doi.org/10.1109/TPAMI.2002.1017623

76. M. Amadasun and R. King, “Textural features corresponding to textural properties,” IEEE Trans. Syst. Man Cybern., 19 (5), 1264–1274 (1989). http://dx.doi.org/10.1109/21.44046

Biographies for the authors are not available.

Citation

Christos Davatzikos, Saima Rathore, Spyridon Bakas, Sarthak Pati, Mark Bergman, Ratheesh Kalarot, Patmaa Sridharan, Aimilia Gastounioti, Nariman Jahani, Eric Cohen, Hamed Akbari, Birkan Tunc, Jimit Doshi, Drew Parker, Michael Hsieh, Aristeidis Sotiras, Hongming Li, Yangming Ou, Robert K. Doot, Michel Bilello, Yong Fan, Russell T. Shinohara, Paul Yushkevich, Ragini Verma, and Despina Kontos "Cancer imaging phenomics toolkit: quantitative imaging analytics for precision diagnostics and predictive modeling of clinical outcome," Journal of Medical Imaging 5(1), 011018 (11 January 2018). https://doi.org/10.1117/1.JMI.5.1.011018

Received: 30 June 2017; Accepted: 5 December 2017; Published: 11 January 2018


JOURNAL ARTICLE
21 PAGES
