Real-time monitoring of functional tissue parameters, such as local blood oxygenation, based on optical imaging could provide groundbreaking advances in the diagnosis and interventional therapy of various diseases. Although photoacoustic (PA) imaging is a modality with great potential to measure optical absorption deep inside tissue, quantification of the measurements remains a major challenge. We introduce the first machine learning-based approach to quantitative PA imaging (qPAI), which relies on learning the fluence in a voxel to deduce the corresponding optical absorption. The method encodes relevant information of the measured signal and the characteristics of the imaging system in voxel-based feature vectors, which allow the generation of thousands of training samples from a single simulated PA image. Comprehensive in silico experiments suggest that context encoding-qPAI enables highly accurate and robust quantification of the local fluence and thereby the optical absorption from PA images.
Photoacoustic (PA) imaging is an imaging concept with a high potential for real-time monitoring of functional tissue parameters such as blood oxygenation deep inside tissue. It measures the acoustic waves arising from the stress-confined thermal response of optical absorption in tissue.1 More specifically, a PA signal in a location is a pressure response to the locally absorbed energy , which, in turn, is a product of the absorption coefficient , the Grueneisen coefficient and the light fluence
Given that the local light fluence not only depends on the imaging setup but is also highly dependent on the optical properties of the surrounding tissue, quantification of optical absorption based on the measured PA signal is a major challenge.2,3 So far, the field of quantitative PA imaging (qPAI) has focused on model-based iterative optimization approaches to infer optical tissue parameters from measured signals (cf. e.g., Refs. 220.127.116.11.18.104.22.168.–12). Although these methods are well suited for tomographic devices with high image quality (cf. e.g., Refs. 1314.–15) as used in small animal imaging, translational PA research with clinical ultrasound transducers or similar handheld devices (cf. e.g., Refs. 1 and 1622.214.171.124.21.–22) focuses on qualitative image analysis.
As an initial step toward clinical qPAI, we introduce a machine learning-based approach to quantifying PA measurements. The approach features high robustness to noise while being computationally efficient. In contrast to all other approaches proposed to date, our method relies on learning the light fluence on a voxel level to deduce the corresponding optical absorption. Our core contribution is the development of a voxel-based context image (CI) that encodes relevant information of the measured signal voxel together with characteristics of the imaging system in a single feature vector. This enables us to tackle the challenge of fluence estimation as a machine learning problem that we can solve in a fast and robust manner. Comprehensive in silico experiments indicate high accuracy, speed, and robustness of the proposed context encoding (CE)-qPAI approach. This is demonstrated for estimation of (1) fluence and optical absorption from PA images, as well as (2) blood oxygen saturation as an example of functional imaging using multispectral PA images.
Materials and Methods
A common challenge when applying machine learning methods to biomedical imaging problems is the lack of labeled training data. In the context of PAI, a major issue is the strong dependence of the signal on the surrounding tissue. This renders separation of voxels from their context—as in surface optical imaging23—impossible or highly inaccurate. Simulation of a sufficient number of training volumes covering a large range of tissue parameter variations, on the other hand, is computationally not feasible given the generally long runtime of Monte Carlo methods, which are currently the gold standard for the simulation of light transportation in tissue.11
Inspired by an approach to shape matching, where the shape context is encoded in a so-called spin image specifically for each node in a mesh,24 we encode the voxel-specific context in so-called CIs. This allows us to train machine learning algorithms on a voxel level rather than image level and we thus require orders of magnitude fewer simulated training volumes. CIs encode relevant information of the measured signal as well as characteristics of the imaging system represented by so-called voxel-specific fluence contribution maps (FCMs). The CIs serve as a feature vector for said machine learning algorithm, which is trained to estimate fluence in a voxel. The entire quantification method is shown in Fig. 1, which serves as an overview with details given in the following sections.
Fluence Contribution Map
An important prerequisite for computing the CI for a voxel is the computation of the corresponding FCM, referred to as . represents a measure for the likelihood that a photon arriving in voxel has passed . In other words, an FCM reflects the impact of a PA signal in on the drop in fluence in voxel . An illustration of an FCM corresponding to a typical handheld PA setup is shown in Fig. 2. The is dependent on how the PA excitation light pulse propagates through homogeneous tissue to arrive in given a chosen hardware setup. The FCMs per imaging plane are generated once for each new hardware setup and each voxel in the imaging plane.
In this first implementation of the CE-qPAI concept, FCMs are simulated with the same resolution as the input data assuming a background absorption coefficient of and a constant reduced scattering coefficient of .25 The number of photons is varied to achieve a consistent photon count in the target voxel. The FCMs are generated with the widely used Monte Carlo simulation tool mcxyz.26 We integrated mcxyz into the open-source Medical Image Interaction Toolkit MITK27 as mitkMcxyz and modified it to work in a multithreaded environment. Sample FCMs for three different voxels are shown in Fig. 2, which also shows the generation of CIs for those three example voxels.
The CI for a voxel in a PA volume is essentially a two-dimensional (2-D) histogram composed of (1) the measured PA signal in the tissue surrounding and (2) the corresponding . More specifically, it is constructed from the tuples where is defined as . This constraint is set to exclude voxels with a negligible contribution to the fluence in . The tuples are arranged by magnitude of and into a 2-D histogram and thereby encode the relevant context information in a compact form. In our prototype implementation of the CE-qPAI concept, the fluence contribution and signal axes of the histogram are discretized in 12 bins and scaled logarithmically to better represent the predominantly low signal and fluence contribution components. The ranges of the axes are set as and . Signals and fluence contributions larger than the upper boundary are included in the highest bin, whereas smaller signals and fluence contributions are not. Figure 2 shows the generation of CIs from FCMs and PA signals. Labeled CIs are used for training a regressor that can later estimate fluence, which, in turn, is used to reconstruct absorption [Eq. (1)].
Machine Learning-Based Regression for Fluence Estimation
During the training phase, a regressor is presented tuples of and corresponding ground truth fluence values for each voxel in a set of PAI volumes. For estimation of optical absorption in a voxel of a previously unseen image, the voxel-specific CI is generated and used to infer fluence using the trained algorithm.
In our prototype implementation of the CE-qPAI method, we use a random forest regressor. A random forest regressor is an ensemble of decision trees, where the weighted vote of the individual trees is used as the estimation.28 To train the random forest, all labeled CIs of the respective training set need to be evaluated at once. With voxel-based CIs, thousands of training samples can be extracted from a single slice of a simulated PA training volume. Ground truth training data generation is performed using a dedicated software plugin integrated into MITK and simulating the fluence with mitkMcxyz. It should be noted that the simulated images consist mainly of background voxels and not of vessel structures, which are our regions of interest (ROI). This leads to an imbalance in the training set. To avoid poor estimation for underrepresented classes,29 we undersample background voxels in the training process to ensure a 1:1 ROI/background sample ratio. The parameters of the random forest are set to the defaults of sklearn 0.18 using python 2.7, except for the tree count which was set to . CIs are used as feature vectors and labeled with the optical property to be estimated (e.g., fluence or oxygenation). The parameters were chosen based on a grid search on a separate dataset not used in the experiments of this work.
We assume a typical linear probe hardware setup,30 where the ultrasound detector array and the light source move together and the illumination geometry is the same for each image recorded. This is also the case for other typical tomographic devices.31,32 All simulations were performed on high-end CPUs (Intel i7-5960X).
Experiments and Results
In the following validation experiments, we quantify the fluence up to an imaging depth of 28 mm in unseen test images for each dataset. With our implementation and setup, all images comprise 3008 training samples, which results in an average simulation time of about 50 ms per training sample. This allows us to generate enough training samples in a feasible amount of time, to train a regressor that enables fluence estimation in a previously unseen image in near real time. The measured computational time for quantifying fluence in a single image slice is .
In the following, we present the experimental design and results of the validation of CE-qPAI. First, we will validate the estimation of absorption from PAI volumes acquired at a fixed wavelength and then estimate blood oxygenation from multispectral PAI volumes.
Monospectral Absorption Estimation
To assess the performance of CE-qPAI in PA images of blood vessels, we designed six experimental datasets (DS) with varying complexities as listed in Table 1. With the exception of , each of the six experimental DS is composed of 150 training items, 25 validation items, and 25 test items, where each item comprises a three-dimensional (3-D) simulated PA image of dimensions and 0.6-mm equal spacing as well as a corresponding (ground truth) fluence map.
The design parameters of the DS. All ranges denote sampling from uniform distributions within the given bounds.
As labels of the generated CIs, we used a fluence correction , where is a fluence simulation based on a homogeneous background tissue assumption. We used five equidistant slices out of each volume, resulting in a generation of a total of 2,256,000; 376,000 and 376,000 CIs for each dataset—for training, parameter optimization, and testing, respectively. To account for the high complexity of , we increased the number of training volumes for that set from 150 to 400. The baseline dataset represents simulations of a transcutaneously scanned simplified model of a blood vessel of constant radius (3 mm) and constant absorption (vessel: , background: ) and reduced scattering coefficient (). To approximate partial volume effects, the absorption coefficients in the ground truth images were Gaussian blurred with a sigma of 0.6 mm. Single slices were simulated using photons for all training sets and photons for the respective test and validation sets and then compounded in a fully scanned volume. Different shapes and poses of the vessel were generated by a random walk with steps defined as
Although most studies assess the performance of a method in the entire image (cf. e.g., Refs. 6, 33, and 34), it must be pointed out that the accuracy of signal quantification is often most relevant in a defined region of interest—such as in vessels or regions that provide a meaningful PA signal. These are typically also the regions, where quantification is particularly challenging due to the strongest signals originating from boundaries with discontinuous tissue properties. To address this important aspect we validated our method, not only on the entire image, but also in the ROI, which we define for our DS as voxels representing a vessel and at the same time having a contrast-to-noise ratio (CNR) of larger than 2, to only include significant signal in the ROI. We define CNR following Walvaert and Rosseel35 in a voxel as
To investigate the robustness of CE-qPAI to noise, we added the following noise models to each dataset. The noise models consist of an additive Gaussian noise term applied on the signal volumes followed by a multiplicative white Gaussian noise term, similar to noise assumptions used in prior work.6,33 We examined three noise levels to compare against the simulation-intrinsic noise case:
The additive and multiplicative noise components follow an estimation of noise components on a custom PA system.30 For each experimental dataset introduced in Table 1 and each noise set, we applied the following validation procedure separately. Following common research practice, we used the training data subset for training of the random forest and the validation data subset to ensure the convergence of the training process, as well as to set suitable parameters for the random forest and ROI, whereas we only evaluated the test data subset to report the final results (as described in Ref. 36). As an error metric, we report the relative fluence estimation error
Figures 3(a)–3(c) show representative examples of the previously unseen 125 simulated test images from the baseline dataset , with their corresponding fluence estimation results. The optical absorption is reconstructed using the fluence estimation. A histogram illustrating absorption estimation accuracy in ROI voxels of is shown in Fig. 3(d) and compared with a static fluence correction approach.
Table 2 summarizes the descriptive statistics of the relative fluence estimation errors for the experiments on absorption estimation using single wavelength PA images. The relative fluence estimation error does not follow a normal distribution due to large outliers especially in complex DS, which is why we report median with interquartile ranges (IQR) for all DS. Even for the most complex dataset with variations of multiple parameters, CE-qPAI yields a median overall relative fluence estimation error below 4%. Errors are higher in the ROI, especially in DS with high variations of absorption.
Descriptive statistics of fluence estimation results. The median and IQR of the relative fluence estimation error er for the six validation DS used for the single wavelength experiments. The median error and IQR are provided (1) for all voxels in the respective test set as well as (2) for the voxels in the ROI only.
Previously proposed qPAI approaches reveal high drops in estimation performance when dealing with noisy data (cf. e.g., Ref. 37). To remedy this, methods have been proposed to incorporate more accurate noise representations into model-based reconstruction algorithms.33,38 When validating the robustness of CE-qPAI to noise, it yields high accuracy even under unrealistically high noise levels of up to 20% (cf. Fig. 4). Regardless of the noise level applied, the highest median errors occur in the ROIs of DS that are characterized by high absorption and inhomogeneous tissue properties.
Multispectral Blood Oxygenation Estimation
The concept of CE cannot only be used to estimate fluence and absorption, but also derived functional parameters such as blood oxygenation. To this end, the estimated absorption in a voxel for multiple wavelengths can be applied to resolve oxygenation via linear spectral unmixing. Alternatively, a regressor can be trained using the CIs labeled with ground truth oxygenation.
To investigate the performance of CE-qPAI for blood oxygenation () estimation, we designed an additional multispectral simulated dataset using the wavelengths 750, 800, and 850 nm. It consists of 240 multispectral training volumes and 11 multispectral test volumes, each featuring homogeneous oxygenation and one vessel with a radius of 2.3 to 4 mm—modeled after a carotid artery.39 For each image slice and at each wavelength, photons were used for simulation. Oxygenation values for the training images were drawn randomly from a uniform distribution . For testing, we simulated 11 multispectral volumes at three wavelengths and 11 blood oxygenation levels (). The optical absorption was adjusted by wavelength and oxygenation, as described by Jacques.25 Hemoglobin concentration was assumed to be .25 The blood volume fraction was set to 0.5% in the background tissue and to 100% in the blood vessels. The reduced scattering coefficient was again set to . We estimated the oxygenation using three methods:
Estimation of local blood oxygen saturation () is one of the main qPAI applications and is only possible with multispectral measurements. As such, the presented approaches were validated together with the baseline method on the dataset . As shown in Fig. 5(a), the estimation results for both methods are in very close agreement with the ground truth. In fact, the median absolute oxygen estimation error was 3.1% with IQR (1.1% and 6.4%) for CE-qPAI and 0.8% with IQR (0.3% and 1.8%) for the fCE-qPAI adaptation. Furthermore, our methodology outperforms a baseline approach based on linear spectral unmixing of the raw signal (as also compared to in Ref. 15). By means of example Fig. 5(b) shows that the linear spectral unmixing of the ROI on the uncorrected signal fails deep inside the ROI, where the fluence varies strongly for different wavelengths. To compensate for this effect when comparing the approach to our method, we validate all methods only on the MIP along the depth axis (as also used in Ref. 41) in Fig. 5(a).
This paper addresses one of the most important challenges related to PA imaging, namely the quantification of optical absorption based on the measured signal. In contrast to all other approaches proposed to qPAI to date (cf. e.g., Refs. 126.96.36.199.188.8.131.52.–12), our method relies on learning the light fluence in a voxel to deduce the corresponding optical absorption. Comprehensive in silico experiments presented in this manuscript show the high potential of this approach to estimate optical absorption as well as derived functional properties, such as oxygenation, even in the presence of high noise.
Although machine learning methods have recently been applied to PAI related problems (cf. e.g., Refs. 4243.–44), these have mainly focused on image reconstruction but not signal quantification. We attribute this to the fact that in vivo training data generation for machine learning-based qPAI is not at all straightforward given the lack of reference methods for estimating optical absorption in depth. Despite recent developments related to hybrid diffusion approximation and Monte Carlo methods,45 fast generation of in silico training data also remains an unsolved challenge. Note in this context that commonly applied methods of data augmentation (i.e., methods that may be used to automatically enlarge training data sets as discussed in Ref. 46) cannot be applied to PA images due to the interdependence of fluence and signal. With our contribution, we have addressed the challenge by introducing the concept of CIs, which allow us to generate one training case from each voxel rather than from each image.
As an important contribution with high potential impact, we adapted CE-qPAI to estimate functional tissue properties from multiwavelength data. Both variants—linear spectral unmixing of the fluence corrected signal, as well as direct estimation of oxygenation from multi wavelength CIs, yielded accurate results that outperformed a baseline approach based on linear spectral unmixing of the raw PA signal. It should be noted that linear spectral unmixing of the signal for estimation is usually performed on a wider range of wavelengths to increase accuracy. However, even this increase in the number of wavelengths cannot fully account for nonlinear fluence effects.3 Combined with the separately established robustness to noise, multiwavelength applications of CE-qPAI are very promising.
In our first prototype implementation of CE-qPAI, we used random forests regressors with standard parameters. It should be noted, however, that fluence estimation from the proposed CI can in principle be performed by any other machine learning method in a straightforward manner. Initial experiments suggest that even better performance can be achieved with convolutional neural networks.47
By relating the measured signals in the neighborhood of to the corresponding fluence contributions we relate the absorbed energy in , to the fluence contribution of to . In this context, it has to be noted that the fluence contribution is only an approximation of the true likelihood that a photon passing has previously passed , because is generated independently of the scene under observation assuming constant background absorption and scattering. Nevertheless due to the generally low variance of scattering in tissue, it serves as a reliable input for the proposed machine learning-based quantification.
A limitation of our study can be seen in the fact that we performed the validation in silico. To apply CE-qPAI in vivo, further research will have to be conducted in two main areas. First, we are working on accurately solving the acoustical inverse problem for specific scanners.48 The method will be integrated into the quantification algorithm to enable quantification of images acquired with common PAI probes such as clinical linear transducers. Second, training data have to be generated as close to reality as possible—considering, for example, imaging artifacts.
In contrast to prior work (cf. e.g., Refs. 6, 7, 33, 49, and 34), our initial validation handles the whole range of near infrared absorption in whole blood at physiological hemoglobin concentrations and demonstrates high robustness to noise. The impact of variations of scattering still needs investigation although these should be small in the near infrared.
Long-term goal of our work is the transfer of CE-qPAI to clinical data. In this context, run-time of the algorithm will play an important role. Although our current implementation can estimate absorption on single slices within a second, this might not be sufficient for interventional clinical estimation of whole tissue volumes and at higher resolutions. An efficient GPU implementation of the time intensive CI generation should enable real-time quantification.
In summary, CE-qPAI is the first machine learning-based approach to quantification of PA signals. The results of this work suggest that quantitative real-time functional PA imaging deep inside tissue is feasible.
Code and Data Availability
The code for the method as well as the experiments was written in C++ and python 2.7 and is partially open source and available at https://phabricator.mitk.org/source/mitk.git. Additional code and all raw and processed data generated in this work are available from the corresponding authors on reasonable request.
The authors have no relevant financial interests in this article and no potential conflicts of interest to disclose.
The authors would like to acknowledge support from the European Union through the ERC starting grant COMBIOSCOPY under the New Horizon Framework Programme grant agreement ERC-2015-StG-37960. We would like to thank the ITCF of the DKFZ for the provision of their computing cluster and C. Feldmann for her support with figure design.
https://doi.org/10.1038/nmeth.3925 1548-7091 Google Scholar
https://doi.org/10.1126/science.1216210 SCIEAS 0036-8075 Google Scholar
https://doi.org/10.1117/12.806788 PSISDG 0277-786X Google Scholar
https://doi.org/10.1364/AO.39.005256 APOPAI 0003-6935 Google Scholar
https://doi.org/10.1117/12.597190 PSISDG 0277-786X Google Scholar
https://doi.org/10.1364/AO.45.001866 APOPAI 0003-6935 Google Scholar
https://doi.org/10.1063/1.2209883 APPLAB 0003-6951 Google Scholar
https://doi.org/10.1088/0031-9155/52/1/010 PHMBA7 0031-9155 Google Scholar
https://doi.org/10.1117/12.2212440 PSISDG 0277-786X Google Scholar
https://doi.org/10.1088/0266-5611/31/6/065005 INPEEY 0266-5611 Google Scholar
https://doi.org/10.1117/1.JBO.17.6.061202 JBOPFO 1083-3668 Google Scholar
https://doi.org/10.1364/JOSAA.25.002347 JOAOD6 0740-3232 Google Scholar
https://doi.org/10.1038/nphoton.2009.157 NPAHBY 1749-4885 Google Scholar
https://doi.org/10.1109/TBME.2013.2283507 IEBEAX 0018-9294 Google Scholar
https://doi.org/10.1038/ncomms12121 NCAOBW 2041-1723 Google Scholar
https://doi.org/10.1109/TMI.2004.843199 ITMID4 0278-0062 Google Scholar
https://doi.org/10.1158/0008-5472.CAN-13-2387 Google Scholar
https://doi.org/10.1117/1.JBO.22.4.041006 JBOPFO 1083-3668 Google Scholar
https://doi.org/10.1117/1.2907157 JBOPFO 1083-3668 Google Scholar
https://doi.org/10.1117/1.2976427 JBOPFO 1083-3668 Google Scholar
https://doi.org/10.1117/1.3469829 JBOPFO 1083-3668 Google Scholar
https://doi.org/10.1038/srep15748 SRCEC3 2045-2322 Google Scholar
https://doi.org/10.1007/s11548-016-1376-5 Google Scholar
https://doi.org/10.1109/34.765655 ITPIDJ 0162-8828 Google Scholar
https://doi.org/10.1088/0031-9155/58/11/R37 PHMBA7 0031-9155 Google Scholar
https://doi.org/10.1016/j.pacs.2014.09.001 Google Scholar
https://doi.org/10.1016/j.media.2005.04.005 Google Scholar
https://doi.org/10.1023/A:1010933404324 MALEEZ 0885-6125 Google Scholar
https://doi.org/10.1111/coin.2004.20.issue-1 Google Scholar
https://doi.org/10.1117/12.2209368 PSISDG 0277-786X Google Scholar
https://doi.org/10.1016/j.pacs.2015.12.001 Google Scholar
https://doi.org/10.1109/TUFFC.2013.2646 ITUCER 0885-3010 Google Scholar
https://doi.org/10.1109/TMI.2013.2280281 ITMID4 0278-0062 Google Scholar
https://doi.org/10.1364/AO.49.003566 APOPAI 0003-6935 Google Scholar
https://doi.org/10.1371/journal.pone.0077089 POLNCL 1932-6203 Google Scholar
https://doi.org/10.1117/12.2209477 PSISDG 0277-786X Google Scholar
https://doi.org/10.1161/01.STR.0000206440.48756.f7 SJCCA7 0039-2499 Google Scholar
https://doi.org/10.1109/79.974727 ISPRE6 1053-5888 Google Scholar
https://doi.org/10.1038/srep05878 SRCEC3 2045-2322 Google Scholar
https://doi.org/10.1117/12.2255098 PSISDG 0277-786X Google Scholar
https://doi.org/10.1117/1.JBO.17.1.010501 JBOPFO 1083-3668 Google Scholar
https://doi.org/10.1109/CVPR.2016.90 Google Scholar
https://doi.org/10.1117/12.2288353 PSISDG 0277-786X Google Scholar
https://doi.org/10.1137/140959705 Google Scholar
Thomas Kirchner received his MSc degree in physics from the University of Heidelberg in 2015. He currently works on his PhD at the Division of Computer Assisted Medical Interventions, German Cancer Research Center (DKFZ), where he does research in computational biophotonics, focusing on real-time multispectral photoacoustics and signal quantification.
Janek Gröhl received his MSc degree in medical informatics from the University of Heidelberg and Heilbronn University of Applied Sciences in 2016. He currently works on his PhD at the Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ) and does research in software engineering and computational biophotonics focusing on signal quantification in photoacoustic imaging.
Lena Maier-Hein received her PhD from the Karlsruhe Institute of Technology in 2009 and conducted her postdoctoral research in the Division of Medical and Biological Informatics, German Cancer Research Center (DKFZ), and in the Hamlyn Centre for Robotics Surgery, Imperial College London. She is leading the Division of Computer Assisted Medical Interventions (CAMI) at the DKFZ. Currently, she is working on multimodal image processing, surgical data science, and computational biophotonics.