Co-registered photoacoustic and ultrasound imaging of human colorectal cancer

Abstract. Colorectal cancer is the second most common malignancy diagnosed globally. Critical gaps exist in diagnostic and surveillance imaging modalities for colorectal neoplasia. Although prior studies have demonstrated the capability of photoacoustic imaging techniques to differentiate normal from neoplastic tissue in the gastrointestinal tract, fast evaluation of deep tissue over a large field of view remains limited. To investigate the ability of photoacoustic technology to image deeper tissue, we conducted a pilot study using a real-time co-registered photoacoustic tomography (PAT) and ultrasound (US) system. A total of 23 ex vivo human colorectal tissue samples were imaged immediately after surgical resection. Co-registered photoacoustic images of malignancies showed significantly increased PAT signal compared to normal regions of the same sample. Several quantities differed significantly between untreated colorectal tumors and normal tissue: the quantitative relative total hemoglobin (rHbT) concentration computed from four optical wavelengths; spectral features extracted from PAT and US spectral data, such as the mean spectral slope and the 0.5-MHz intercept; and image features, such as the first- and second-order statistics and the standard deviation of the mean Radon transform of PAT images. Using either a logistic regression model or a support vector machine, the best feature set, rHbT and the PAT intercept, achieved area-under-the-curve (AUC) values of 0.97 and 0.95 for the training and testing data sets, respectively, for prediction of histologically confirmed invasive carcinoma.


Introduction
Photoacoustic imaging (PAI) is an emerging technique that can provide high optical absorption contrast images at reasonable microscale resolution and clinically relevant depths. 1 Several studies have established that optical absorption parameters are important biomarkers directly related to the tissue microvasculature, tumor angiogenesis, or tumor hypoxia. [2][3][4] In general, PAI is classified into photoacoustic microscopy (PAM) and photoacoustic tomography (PAT). 1 Previously, PAM and photoacoustic endoscopy have demonstrated the capability of detecting human colorectal cancer. 5,6 However, the low imaging speed (limited by the laser repetition rate and scanning scheme), small imaging area, and moderate penetration depth created obstacles for clinical applications.
Compared with PAM, PAT is able to penetrate deeper with a faster data acquisition speed and a larger field of view due to the use of ultrasonic arrays and a wide optical beam. Several studies have demonstrated that a PAT/US dual-modality imaging system can provide anatomical and functional information in tumors, [7][8][9][10][11][12][13] but no prior applications in the human distal GI tract have been reported using PAT/US dual-modality imaging.
Adenocarcinoma of the colon and rectum is the second most common malignancy diagnosed globally and the fourth leading cause of cancer mortality, with more than 100,000 new cases diagnosed annually in the U.S. 14,15 Accurate staging and post-treatment surveillance of this prevalent disease are critical because treatment strategies are predicated upon the stage at presentation and response to therapy; in some instances, detailed imaging allows certain patients to avoid surgery altogether. Although colonoscopy and biopsy are the gold-standard diagnostic tests for colorectal cancers, 16 multiple imaging modalities including optical imaging, 17,18 endoscopic ultrasound (EUS), pelvic magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET) are also utilized.
Unfortunately, each of these modalities has critical weaknesses when evaluating colorectal tumors. White light endoscopy only detects macroscopic morphology and provides no functional assessment of the imaged tissue. MRI has limited between-slice resolution and is often unable to differentiate early tumors from benign neoplasia, committing patients to potentially more invasive treatment regimens than needed. 19,20 Monitoring of tumors after chemotherapy and radiation with MRI is often confounded by fibrotic reaction and edema, which can appear similar to residual tumor. 21 CT has poorer resolution of the bowel wall layers in comparison to MRI, which limits its ability to describe circumferential resection margin status or serosal invasion in locally advanced cases. Additionally, CT cannot distinguish induration or peritumoral fibrosis from frank malignant disease with a high degree of specificity, further limiting its application in local tumor staging. 19 PET imaging is also plagued by poor resolution, and EUS remains highly user-dependent and unable to resolve small islands of the tumor. 19 Therefore, a critical need exists for precise imaging modalities of colorectal tumors for both staging and therapeutic response evaluations. PAT, in contrast, uniquely provides functional imaging at high resolution using hemoglobin as an endogenous contrast agent. By detecting the abnormal vasculature that accompanies colorectal malignancies, we hypothesized that this modality might be able to identify malignant or residual tumors otherwise undetectable by current clinical imaging. We therefore performed the following pilot study to test a real-time co-registered PAT/US system prototype and assess its ability to delineate differences between benign and malignant tissue. To the best of our knowledge, this study is the first utilizing co-registered PAT/US to evaluate human colon samples.

Human Sample Preparation
Freshly resected colon and rectum samples obtained from patients undergoing surgery at Washington University School of Medicine were imaged immediately after surgery. Patients with known benign neoplasia (polyps) as well as malignancies (adenocarcinoma) were eligible for imaging. Cancer patients who had received preoperative treatment with chemotherapy and/or radiation were also included. The study was approved by the Institutional Review Board at Washington University (#201707066). Informed consent was obtained from all patients. Specimens were obtained from the operating room as previously described. 6 A total of 23 tissue samples were imaged in the pilot study using the PAT/US system. This included untreated colorectal adenocarcinomas (n = 12), precancerous polyps (n = 6), colorectal cancer following chemotherapy or radiation and chemotherapy (n = 4), and postpolypectomy (n = 1). Two treated patients achieved a complete pathological response and two a partial response. The majority of patients underwent hemicolectomy for cancer and were found to have malignancy on histologic analysis (Table 1).

Co-Registered Ultrasound-Guided Photoacoustic Tomography System
Details of the real-time, co-registered PAT/US system used in this study were discussed previously. 11,22 The system consists of three main parts: a Ti:sapphire laser (Symphotics TII, LS-2134, Symphotics, Camarillo, California) optically pumped with a Q-switched Nd:YAG laser (Symphotics TII, LS-2122), an optimized optical fiber-based light delivery system, 23

Extraction of Functional, Spectral, and Textural Features
Several functional, spectral, and textural features were extracted from the PAT and US data and images as given in Table 2.

Functional features
The relative oxy-hemoglobin (rHbO2) and deoxy-hemoglobin (rHb) at each pixel can be calculated using the following equations: where $C(r,\theta) = \Gamma\, C_0(r,\theta)\, \phi(r,\theta)$, $\Gamma$ is the tissue's Grüneisen parameter, $C_0(r,\theta)$ is the system acoustic operator, and $\phi(r,\theta)$ is the local fluence, which can be approximated as wavelength independent over the narrow wavelength window we used. Based on these equations, deriving absolute oxy- and deoxy-hemoglobin concentrations requires a known tissue fluence distribution. Since this distribution is difficult to determine in human tissue due to wide variation in composition, we computed the relative values rHbO2 and rHb instead. Summing rHbO2 and rHb at each pixel gives the relative total hemoglobin (rHbT) for that pixel; the average rHbT for an ROI was then calculated by averaging the rHbT of all pixels in that ROI with a value at least half of the maximum rHbT. All PAT images shown in the co-registered US and PAT images are rHbT maps without any normalization.
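The per-pixel unmixing and ROI averaging described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear model in which the PAT signal at each wavelength is a weighted sum of the two hemoglobin species, and it solves that model by least squares. The wavelength list and the `EXTINCTION` values are placeholders chosen only for illustration (they are not tabulated extinction coefficients), and the function names are hypothetical.

```python
import numpy as np

# ASSUMPTION: four wavelengths inside the 730- to 830-nm window; the exact
# wavelengths are not stated in this section.
WAVELENGTHS_NM = np.array([730.0, 780.0, 800.0, 830.0])

# ASSUMPTION: placeholder extinction weights, columns = [HbO2, Hb].
# Replace with tabulated molar extinction coefficients for real data.
EXTINCTION = np.array([
    [0.4, 1.1],
    [0.7, 1.0],
    [0.8, 0.8],
    [1.0, 0.7],
])


def unmix_hemoglobin(pa_stack, extinction=EXTINCTION):
    """Least-squares unmixing of a (n_wavelengths, H, W) PAT stack.

    Returns two (H, W) maps: relative HbO2 and relative Hb.
    """
    n_wl, height, width = pa_stack.shape
    signals = pa_stack.reshape(n_wl, -1)  # one column per pixel
    coeffs, *_ = np.linalg.lstsq(extinction, signals, rcond=None)
    rhbo2 = coeffs[0].reshape(height, width)
    rhb = coeffs[1].reshape(height, width)
    return rhbo2, rhb


def mean_rhbt_in_roi(rhbt_roi):
    """Average rHbT over ROI pixels at or above half of the ROI maximum."""
    threshold = 0.5 * rhbt_roi.max()
    return float(rhbt_roi[rhbt_roi >= threshold].mean())
```

A typical call would sum the two maps, `rhbt = rhbo2 + rhb`, crop the ROI, and pass it to `mean_rhbt_in_roi`. Because the fluence is unknown, the outputs are relative values only.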

Spectral features
Ultrasound images were employed to select a proper ROI corresponding to the lesion for PAT spectral feature calculations. 9,24 First, PAT beam lines with a maximum value close to the background noise level of our co-registered US/PAT system (60 mV) were ignored. The remaining beam lines were gated by a Hamming window, and their FFTs were calculated within the −10-dB frequency range. Moreover, to cancel the frequency response of the transducer and electrical receiving system, 24 the spectra of PAT beams were normalized to the spectra of an approximately point-like target (a 250-μm black thread orthogonal to the PAT imaging plane, placed at distances from the transducer of 0.5 to 7 cm in steps of 0.25 cm). After calibration, each PAT spectrum was fitted with a straight line. The mean spectral slope (SS), midband fit (MBF), and 0.5-MHz spectral intercept [0.5-MHz SI (PAT)] were then calculated (Fig. 1). We chose the 0.5-MHz spectral intercept as a feature instead of the 0-MHz intercept because the lower bound of our transducer in PAT mode is ∼0.5 MHz.
US spectral features were also calculated, following the same method as for the PAT spectral features with two differences. First, the analysis was performed in the frequency range of 3.5 to 7 MHz, which is the −10-dB frequency range of the transducer in US mode. Second, the calibration was performed using a reference gelatin-based phantom constructed in our lab. 25
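The spectral-feature steps above (Hamming gating, FFT, calibration against a reference, linear fit) can be sketched as below. This is an illustrative outline under stated assumptions rather than the study's code: the sampling rate, the analysis band passed as `band_hz`, and all function names are hypothetical, and the midband fit is taken as the value of the fitted line at the center of the analysis band.

```python
import numpy as np


def power_spectrum_db(signal):
    """Hamming-gated magnitude spectrum of one beam line, in dB."""
    windowed = signal * np.hamming(signal.size)
    return 20.0 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)


def calibrated_spectrum_db(beam, reference, fs_hz):
    """Beam spectrum normalized by a reference spectrum (subtraction in dB)."""
    freqs = np.fft.rfftfreq(beam.size, d=1.0 / fs_hz)
    return freqs, power_spectrum_db(beam) - power_spectrum_db(reference)


def fit_spectral_line(freqs_mhz, spectrum_db, intercept_mhz=0.5):
    """Linear fit of a calibrated spectrum.

    Returns (slope in dB/MHz, midband fit in dB, intercept in dB evaluated
    at intercept_mhz).
    """
    slope, offset = np.polyfit(freqs_mhz, spectrum_db, 1)
    mid_mhz = 0.5 * (freqs_mhz.min() + freqs_mhz.max())
    midband_fit = slope * mid_mhz + offset
    intercept = slope * intercept_mhz + offset
    return slope, midband_fit, intercept


def spectral_features(beam, reference, fs_hz, band_hz, intercept_mhz=0.5):
    """Slope, midband fit, and intercept of one beam within band_hz."""
    freqs, spec_db = calibrated_spectrum_db(beam, reference, fs_hz)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    return fit_spectral_line(freqs[in_band] / 1e6, spec_db[in_band], intercept_mhz)
```

Per-lesion features would then be the mean of these values over all beam lines in the ROI that exceed the noise threshold.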

PAT image features
Journal of Biomedical Optics 121913-2 December 2019 • Vol. 24 (12)
After visual inspection of PAT frames of malignant and normal colon samples, we noticed that the textures of these images looked different between the two types of samples. To confirm our observation, we extracted PAT image features from available image frames. To do so, a proper ROI was first chosen. To find the center of this ROI, the region surrounding the lesion was determined based on the US image, and the Radon transforms of the PAT image in this region were calculated at the two angles of 0 deg and 90 deg. Each of these Radon transforms was then normalized to its own maximum value, and a Gaussian curve was fitted to each of them. The center of the square ROI where the image analysis was performed was determined by the means of these two Gaussian curves, and its side length was set to 1 cm for all cases (Fig. 2). Textural features of the normalized PAT images were calculated in the specified ROI. 24 The first step in calculating these features is to construct a gray-level co-occurrence matrix (GLCM). 26 The GLCM quantifies how often pairs of gray levels occur next to each other in the image. The size of this matrix was chosen as 16 × 16. The value of element (i, j) of this matrix was chosen to be the number of times that gray levels i and j are adjacent to each other in the PAT image. Note that we assumed that two gray levels g1 and g2 are adjacent if g1 is positioned at the immediate left of g2.
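The ROI-centering step can be illustrated with a short sketch. It rests on two assumptions that are not from the text: the Radon transforms at 0 deg and 90 deg are approximated by plain sums along the two image axes, and the Gaussian fit is done by fitting a parabola to the logarithm of the profile instead of a nonlinear least-squares fit. The function names and the ROI half-size in pixels are hypothetical, since the pixel spacing is not given here.

```python
import numpy as np


def gaussian_center(profile, floor=0.1):
    """Mean of a Gaussian fitted to a 1-D profile.

    The profile is normalized to its own maximum, and a parabola is fitted
    to log(profile) for samples above `floor` (log-parabola Gaussian fit).
    """
    y = np.asarray(profile, dtype=float)
    y = y / y.max()
    x = np.arange(y.size)
    keep = y > floor  # drop the tails, where the log is dominated by noise
    a, b, _ = np.polyfit(x[keep], np.log(y[keep]), 2)
    return -b / (2.0 * a)


def roi_center(pat_region):
    """(row, col) center estimated from the two axis projections."""
    row_profile = pat_region.sum(axis=1)  # projection as a function of row
    col_profile = pat_region.sum(axis=0)  # projection as a function of column
    return gaussian_center(row_profile), gaussian_center(col_profile)


def square_roi(image, center, half_size_px):
    """Square crop around `center`; half_size_px depends on pixel spacing."""
    r, c = (int(round(v)) for v in center)
    r0, c0 = max(r - half_size_px, 0), max(c - half_size_px, 0)
    return image[r0:r + half_size_px, c0:c + half_size_px]
```

For a 1-cm ROI, `half_size_px` would be set to 0.5 cm divided by the pixel spacing of the reconstructed image.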
After constructing the GLCM, four textural features were calculated for each PAT image frame using the following equations:

$$\mathrm{contrast} = \sum_{i,j=1}^{N} |i-j|^{2}\, c(i,j), \tag{3}$$

$$\mathrm{correlation} = \sum_{i,j=1}^{N} \frac{(i-\mu_i)(j-\mu_j)\, c(i,j)}{\sigma_i \sigma_j}, \tag{4}$$

$$\mathrm{energy} = \sum_{i,j=1}^{N} c(i,j)^{2}, \tag{5}$$

$$\mathrm{homogeneity} = \sum_{i,j=1}^{N} \frac{c(i,j)}{1+|i-j|}, \tag{6}$$

where $c(i,j)$ is the value of element $(i,j)$ of the GLCM, $N$ is the dimension of this matrix, and $\sigma_i$, $\sigma_j$ and $\mu_i$, $\mu_j$ are the standard deviations and means for row $i$ and column $j$ of the GLCM. The standard deviation of the mean Radon transform (Sig_rad) was the last image feature calculated in this study. To calculate this feature, the Radon transforms of the nonnormalized PAT image at angles from 0 deg to 90 deg in steps of 1 deg were calculated and averaged. A Gaussian curve was then fitted to this mean Radon transform, and the standard deviation of this curve was measured.
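A compact sketch of the GLCM construction and the four textural features follows. It is written under assumptions that the text does not confirm: the image is quantized linearly into 16 gray levels, and the co-occurrence counts are normalized to sum to one before the features are computed (a common convention). The function names are hypothetical.

```python
import numpy as np


def glcm_left_right(image, levels=16):
    """Counts of (left pixel level, right-neighbor level) pairs."""
    img = np.asarray(image, dtype=float)
    span = img.max() - img.min()
    norm = (img - img.min()) / span if span > 0 else np.zeros_like(img)
    quantized = np.minimum((norm * levels).astype(int), levels - 1)
    glcm = np.zeros((levels, levels))
    left = quantized[:, :-1].ravel()
    right = quantized[:, 1:].ravel()
    np.add.at(glcm, (left, right), 1)  # accumulate repeated index pairs
    return glcm


def glcm_features(glcm):
    """Contrast, correlation, energy, and homogeneity of a GLCM."""
    p = glcm / glcm.sum()  # ASSUMPTION: normalize counts to probabilities
    n = p.shape[0]
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    covariance = ((i - mu_i) * (j - mu_j) * p).sum()
    return {
        "contrast": (((i - j) ** 2) * p).sum(),
        "correlation": covariance / (sd_i * sd_j) if sd_i * sd_j > 0 else float("nan"),
        "energy": (p ** 2).sum(),
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),
    }
```

Only horizontal, left-to-right adjacency is counted, matching the adjacency rule stated in the text; other offsets would need additional index pairs.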

Feature Selection and Classification
A two-step approach was used to select features most likely to differentiate normal from untreated malignant tissue. In the first step, all previously discussed PAT/US features were tested in univariate analysis between untreated cancer and normal regions, and p values were generated by two-sample two-sided Student's t tests. The features with p > 0.05, which we concluded a priori not to be significantly associated with malignancy, were excluded from the classification model (Table 3).
Next, a logistic regression model (GLM) and a support vector machine (SVM) were used to evaluate the strength of association of each feature with the ultimate tissue diagnosis, and a prediction model was then constructed with significant covariates. In total, 18 normal areas selected from 18 specimens and 12 malignant areas from 12 untreated cancer specimens were used to construct and evaluate the prediction models. Out of these, 12 normal and 8 malignant areas were used for prediction

Qualitative Analysis: Baseline Characteristics of US and PAT Images
The colorectal tissues are composed primarily of fluid, lipid, collagen, and muscle. The general architecture (from superficial to deep) in a normal specimen is mucosa (fluid-filled cells surrounded by lipid bilayers), submucosa (largely composed of extracellular collagen matrix and some muscle fibers), muscularis propria (muscle), and adipose tissue (lipid). In malignancy, the individual cell types are similar but the architecture is distorted as cancerous cells of mucosal origin penetrate into the deeper layers of the organ. As these cells invade, the organized structure of the tissue is lost. Figure 3 shows specimen photographs, US images, coregistered PAT/US rHbT maps as well as histologic images from two representative regions of normal colon samples     images, normal tissue was found to have almost no detectable rHbT signal. In contrast, malignant tissue showed much higher concentrations of hemoglobin around the tumor bed. Again, these findings appear corroborated by histologic examination. In comparison to the relative paucity of large blood vessels in normal tissue, the malignancies were more vascular and contained large blood vessels [red arrows in images 3(l) and 3(p)].
It is interesting to note that fatty tissues have limited PAT signals in the outer portions of the specimens. This is not surprising since we are specifically targeting hemoglobin, which is not concentrated in fatty tissue, as our chromophore of interest and therefore image within the 730- to 830-nm wavelength range. Additionally, all PAT images are displayed with the same dynamic range of −10 dB, so anything below this level is not displayed. The fatty tissue, due to its lack of vascular structures, falls below this range.
demonstrated a return to the normal wall structure with complete tumor destruction [image 4(j)]. Histologic comparison among specimens also correlated with these findings; reduction in vasculature along with return to a semiorganized mucosal structure was noted throughout the treated specimens.

Quantitative Analysis
In addition to the above qualitative comparisons, quantitative features were extracted from a total of 23 areas obtained from 12 untreated malignant tumors, 6 polyps (one of which had a small invasive component), 2 post-treatment complete responders, 1 specimen with no residual tumor cells following prior polypectomy, and 2 post-treatment nonresponders, as well as from 18 areas in normal regions. Thus a total of 41 areas were used in Fig. 5. Note that one tumor area was selected from each specimen. Five specimens either had no normal regions or had normal regions too close to the tumor for imaging; therefore, 18 normal areas were selected from 18 specimens. All tumor and normal regions were identified by the attending pathologist. Treated tumors with complete response were found to have similar scores to normal tissue, whereas treated regions with residual cancer had scores similar to untreated cancers. Due to the limited number of treated cancers, statistics were not performed for these two treated categories.
To distinguish untreated malignant from normal colon tissues, GLM and SVM classifiers were established. These classifiers were developed using the independent features with a p-value <0.05 between malignant and normal colon tissues. To determine whether two features are independent, Spearman's correlation was calculated between each pair of features (Table 4). To train each classifier, we first used the feature with the lowest p-value and then added other features to the feature set one by one. We continued adding features until no increase in the AUC value for the testing data set was observed. We found that when rHbT is included in the feature set, the best performance of both GLM and SVM classifiers (the highest AUC value for the testing data set) is achieved when rHbT and 0.5-MHz SI (PAT) are employed to train the classifier, although SS (PAT) has a lower p-value than 0.5-MHz SI (PAT). Adding other features did not improve the AUC for the testing data set. Figure 6 shows the ROC curves and AUC values of the training (left) and testing (right) data sets using GLM (top) and SVM (bottom) classifiers. As shown in this figure, when the feature set includes only rHbT, the AUC values for the training and testing data sets are 0.95 and 0.93, respectively, for both classifiers. Adding 0.5-MHz SI (PAT) to the feature set improves the AUC values for both data sets (0.97 and 0.95 for the training and testing data sets, respectively, for both classifiers). The three image features (Sig_rad, homogeneity, and energy) did not improve the AUC values for either the training or the testing data set.
Finally, the performance of the GLM (top) and SVM (bottom) classifiers without rHbT (the single-wavelength model) is presented in Fig. 7. Note that although the difference between some of the PAT image features in malignant and normal samples is statistically significant, none of these features improves the AUC for the testing data sets. The best performance of the GLM classifier is achieved when SS (PAT) is the only spectral feature included in the feature set. The best performance of the SVM classifier is achieved when the spectral features SS (PAT), 0.5-MHz SI (PAT), and 0.5-MHz SI (US) are included in the feature set. The testing AUC in this case is 0.89 for the GLM classifier and 0.91 for the SVM classifier.
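The stepwise feature-addition procedure can be sketched as below for the logistic-regression case only; the SVM is omitted. This is a simplified illustration, not the study's code. It assumes repeated stratified random splits with 2/3 of each class used for training (which gives 12 normal and 8 malignant training areas for 18 and 12 samples), a plain gradient-descent logistic regression written in NumPy, and an AUC computed from pairwise score comparisons. All function names and hyperparameters are hypothetical.

```python
import numpy as np


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def fit_logistic(x, y, n_iter=500, lr=0.5, l2=1e-3):
    """Gradient-descent logistic regression; returns (weights, bias)."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        w -= lr * (x.T @ (p - y) / y.size + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def stratified_split(y, train_frac, rng):
    """Random split that keeps train_frac of each class for training."""
    train = []
    for cls in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == cls))
        train.extend(idx[: int(round(train_frac * idx.size))])
    train = np.array(train)
    return train, np.setdiff1d(np.arange(y.size), train)


def repeated_split_auc(x, y, n_rounds=100, train_frac=2 / 3, seed=0):
    """Mean testing AUC over repeated stratified random splits."""
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_rounds):
        tr, te = stratified_split(y, train_frac, rng)
        mu, sd = x[tr].mean(axis=0), x[tr].std(axis=0) + 1e-12
        w, b = fit_logistic((x[tr] - mu) / sd, y[tr])
        aucs.append(auc_score(y[te], ((x[te] - mu) / sd) @ w + b))
    return float(np.mean(aucs))


def forward_select(x, y, ranked_features, **kwargs):
    """Add features in ranked order until the testing AUC stops increasing."""
    chosen, best = [], 0.0
    for feature in ranked_features:
        auc = repeated_split_auc(x[:, chosen + [feature]], y, **kwargs)
        if auc <= best:
            break
        chosen, best = chosen + [feature], auc
    return chosen, best
```

Here `ranked_features` would hold column indices ordered by increasing p-value. With only 30 samples, the testing AUC varies noticeably between splits, which is why averaging over many rounds matters.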

Discussion and Summary
In this pilot study of co-registered ultrasound and PAT, we found significant qualitative and quantitative differences between malignant tumors and normal tissue within human colorectal specimens. Specifically, the parameters rHbT, 0.5-MHz SI (PAT), 0.5-MHz SI (US), and SS (PAT) differ between the two tissue types imaged, suggesting that PAT may be able to differentiate malignant from normal tissue in the colon and rectum. Combined with the PAT system's tissue penetration depth of over 4 to 5 cm (depending on the background tissue optical properties), these findings suggest that PAT may be able to augment extant radiographic technology in the diagnosis, management, and surveillance of colorectal cancer.
As demonstrated by Xu et al. 27 and Kumon et al., 28 PAT spectral features are related to the size and concentration of the optical absorbers. The slope decreases (becomes more negative) as PA absorber sizes increase, and the intercept increases (becomes less negative) as the sizes and concentrations of the absorbers increase. We believe that malignant lesions have larger absorber sizes and higher concentrations compared with normal colorectal tissues due to their increased microvessel networks. As introduced by Lizzi et al., 29 US SS depends on acoustic scatterer size, whereas the spectral intercept depends on the sizes, concentrations, and acoustic impedances of the tissue scatterers. These parameters have been found valuable for characterizing liver, eye, 29 prostate, 30 and breast lesions. 31 We believe that the distorted tissue architecture and abundance of cancerous cells are the source of the US spectral contrast between malignant and normal colorectal tissues. However, the findings of the PAT and US spectral features of colorectal diseases may or may not be applicable to diseases of other organs.
Several technical limitations must be considered with our data. First, we imaged colorectal specimens obtained from routine surgeries, and these tissues typically had large pathologic components that often appeared malignant by visual inspection after the specimens were opened. These lesions may or may not need advanced PAT and US features for diagnosis. However, these lesions are excellent examples for identifying PAT and US feature characteristics that differ between cancerous and normal tissue. With this information known, we can target less obvious lesions as we look to test the utility of the device in identifying cancer margins and residual tumors after chemoradiation treatment in patients.
The second limitation of this study is the low image resolution of our prototype. The image resolution is only ∼250 μm due to the commercial endocavity ultrasound transducer array (6-MHz central frequency, 80% bandwidth). Because this resolution will impact future clinical applications of the device, we plan to upgrade the ultrasound system with a transducer array of more than 15 MHz to address this problem in future studies. Third, hemoglobin oxygen saturation (sO2) was not calculated in this study since all specimens were imaged after resection, resulting in significantly altered oxygen saturation compared to normal living tissue. sO2 is a significant biomarker for characterization of cancer 11 and assessment of treatment response.
Fourth, the limited sample size could lead to overfitting of the classifiers if sufficient care is not taken in developing them. As a rule of thumb, overfitting is unlikely if the number of samples is at least 10 times the number of independent predictors. 32 Based on this rule, as we have a total of 30 samples (18 normal colorectal tissues and 12 untreated malignant colorectal tissues) for ROC analysis, the maximum number of predictors that should be used to avoid overfitting is three. Figure 6 shows that when rHbT is present in the feature set, the best performances of both GLM and SVM are achieved when rHbT and 0.5-MHz SI (PAT) are the only features used to train the classifiers. Adding SS (PAT) to these feature sets neither changes the AUC for the training data sets nor increases the AUC for the testing data set. Moreover, when rHbT is not included in the feature set, the combination of SS (PAT), 0.5-MHz SI (PAT), and 0.5-MHz SI (US) gives the best performance of the SVM classifier, and SS (PAT) alone gives the best performance of the GLM classifier (Fig. 7). Although adding Sig_rad increases the AUC value for the training data set, it decreases the AUC values for the testing data set in both classifiers. This suggests that our classifiers were most probably overfitted when four features were used. In this study, to further protect our classifiers from overfitting, repeated rounds (100 times) of cross validation were applied by randomly selecting 2/3 of the samples for training and 1/3 of the samples for testing. The average ROC and AUC values were reported as the results.
In summary, a real-time co-registered PAT/US system was used to image and characterize colorectal masses ex vivo in this pilot study.
Twenty-three colon and rectum samples (nineteen colon and four rectum) were imaged, rHbT was computed from four-wavelength data, and seven quantitative features were extracted from PAT and US power spectra and images. In untreated malignant colorectal tumors, we found the cross-section structure to be highly disorganized with a significantly higher rHbT concentration compared to normal and precancerous regions. We performed classifications on the malignant and normal colon regions using GLM and SVM classifiers both with and without rHbT in the feature set. When rHbT was employed to construct the classifiers with 0.5-MHz SI (PAT), GLM and SVM classifiers achieved optimal AUC values for the training and testing data sets (0.97 and 0.95, respectively). The small number of treated tumors included in this dataset limits the statistical power of the analysis, but the functional, spectral, and image parameters do appear more similar to normal colorectal tissue in tumors that have experienced complete responses compared to partial responders. These results indicate the potential of using PAT/US for future cancer screening and post-treatment surveillance of the colon and rectum. Moving forward, we plan to increase the resolution of our system using a high-frequency US array and then adapt the technology to an endorectal probe, which will allow us to test the functional and spectral feature differences in in vivo human tissue.

Disclosures
No potential conflicts of interest to disclose.