Quantitative photoacoustic tomography using forward and adjoint Monte Carlo models of radiance

Abstract. Forward and adjoint Monte Carlo (MC) models of radiance are proposed for use in model-based quantitative photoacoustic tomography. A two-dimensional (2-D) radiance MC model using a harmonic angular basis is introduced and validated against analytic solutions for the radiance in heterogeneous media. A gradient-based optimization scheme is then used to recover 2-D absorption and scattering coefficients distributions from simulated photoacoustic measurements. It is shown that the functional gradients, which are a challenge to compute efficiently using MC models, can be calculated directly from the coefficients of the harmonic angular basis used in the forward and adjoint models. This work establishes a framework for transport-based quantitative photoacoustic tomography that can fully exploit emerging highly parallel computing architectures.


Introduction
Quantitative photoacoustic tomography (PAT) is concerned with recovering quantitatively accurate estimates of chromophore concentration distributions, or related quantities such as optical coefficients or blood oxygenation, from photoacoustic images. 1 The source of contrast in PAT is optical absorption, which is directly related to the tissue constituents. By obtaining PAT images at multiple optical wavelengths, it may be possible to recover chemically specific information about the tissue. However, such a spectroscopic use of PAT images must consider the effect of the spatially and spectrally varying light fluence distribution. As a photoacoustic image is the product of the optical absorption coefficient distribution, which carries information about the tissue constituents, and the optical fluence, which only acts to distort that information, the challenge in quantitative photoacoustic imaging is to remove the effect of the light fluence.
A common approach is to use a model of the unknown fluence and use it to extract the desired optical properties from the measured data. This has been done analytically [2][3][4][5] or numerically, 6,7 often within a minimization framework. [8][9][10][11][12][13][14][15] The majority of this literature uses the diffusion approximation to the radiative transfer equation (RTE) to model the light distribution, which is accurate in highly scattering media. 16 In PAT, the region of interest often lies close to the tissue surface where the diffusion approximation is not accurate. The RTE, on the other hand, is widely considered to be an accurate model of light transport so long as coherent effects are negligible, which is the case here. Finite element discretizations of the RTE have been developed [17][18][19][20][21] and proposed for quantitative PAT reconstructions, 12,15 but due to the need to discretize in angle as well as space, they quickly become computationally intensive and their applicability is limited to small-and medium-scale problems. An alternative is Monte Carlo (MC) modeling, 22,[23][24][25] which is a stochastic technique for modeling light transport that converges to the solution to the RTE. The significant advantage of the MC approach is that it is highly parallelizable, thus it scales well to the large-scale inversions that will be encountered in practice.
MC models of light transport are popular in biomedical optics and have predominantly been applied in the planning of the experimental measurements [26][27][28] and in dosimetric studies for a range of light-based therapies. [29][30][31] Many of the applications are summarized by Zhu et al. 32 One early MC model of light transport, MCML, 22 computes the fluence in three-dimensional (3-D) slab geometry. This model was later extended to simulate spherical inclusions in the tissue, 33 and later to spheroidal and cylindrical 34 inclusions. MC modeling in 3-D heterogeneous media has been shown both for voxelized media, 23 which was later GPU-accelerated, 24 and using a mesh-based geometry. 25,35 Although the RTE is an equation for the radiance, which is a function of angle at every point, the quantity usually calculated by MC models is the fluence rate, which is the radiance integrated over all angles. The reasons are practical: most measurable quantities are related to the fluence rate rather than the radiance, storing just the integrated quantity saves on computational memory, and the estimates for the fluence rate will converge sooner than the underlying estimates for the radiance. In photoacoustics, the measurable signal is related to the fluence (the time-integrated fluence rate) so current MC models can be used in the simulation of photoacoustic signals. However, as will be discussed in Sec. 4, the full angle-dependent radiance is required when tackling the inverse problem of estimating the optical coefficients, specifically the optical scattering. A common approach in estimating the optical coefficients is to use a gradient-based inversion scheme in which functional gradients are used to minimize some objective function. This paper demonstrates the computation of these gradients through the use of forward and adjoint MC models of the radiance.
In this paper, Sec. 2 introduces the inverse problem of quantitative PAT. Sections 3 and 4 present forward and adjoint MC models of the radiance employing a harmonic angular basis. In Sec. 5, it is shown that this choice of basis allows the functional gradients for the inverse problem to be calculated straightforwardly. Inversions for absorption and scattering coefficient distributions are given in Sec. 6.

Quantitative Photoacoustic Tomography
The inverse problem in QPAT can be stated as the minimization E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 6 3 ; 6 1 9 argmin μ a ðxÞ;μ s ðxÞ ϵ½μ a ðxÞ; μ s ðxÞ; (1) where the error functional is given by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 6 3 ; 5 6 8 ϵ ¼ H ¼ μ a ðx; λÞΦðx; λ; μ a ; μ s Þ is the absorbed energy density and is the "data" for this problem. It is related to the photoacoustic image by the Grüneisen parameter, which here is set to 1. Additional regularization terms or terms reflecting prior knowledge may also be added to ϵ. Gradient-based approaches to solving this problem require estimates of the gradients of the error functional with respect to the parameters of interest. Saratoon et al. 12 give expressions for these gradients in terms of the forward and adjoint fields, ϕ and ϕ Ã : In the following two sections, the calculation of the forward radiance ϕðŝÞ and adjoint radiance ϕ Ã ðŝÞ using MC models is shown.

Monte Carlo Modeling of Light Transport
In PAT, the optical and acoustic propagation times are so different that the optical propagation can be considered instantaneous and the time-dependence of the light transport can be neglected. The time-independent RTE is given by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 5 ; 6 3 ; 2 1 8 Pðŝ;ŝ 0 Þϕðx;ŝ 0 Þdŝ 0 ¼ qðx;ŝÞ; where L is the forward operator of the RTE and ϕ is the radiance, μ a and μ s are the absorption and scattering coefficients, respectively, x is position,ŝ 0 andŝ are the original and scattered propagation directions, Pðŝ;ŝ 0 Þ is the scattering phase function, qðx;ŝÞ is a source term, and S n−1 is used to indicate integration over angle in n − 1 dimensions. To obtain approximations to the solutions to this equation, various flavors of MC have been proposed. 36 The approach used here begins with launching a packet of energy, referred to herein as a "photon," from a given position x in an initial directionŝ. After traveling a distance s ¼ Uð½0; 1Þ∕μ s (using the convention s ¼ jsjŝ), where Uð½0; 1Þ is a real uniform random variable on [0, 1], a fraction of the photon's "weight" W½1 − expð−μ a sÞ is deposited in the current voxel, where W is the current weight (or energy) of the photon packet. The photon weight is updated: W←W expð−μ a sÞ.
Scattering into a new directionŝ 0 in two-dimensional (2-D) involves sampling the scattering phase function, which describes the probability of a photon scattering from directionŝ 0 into direction s. The phase function used here was the 2-D Henyey-Greenstein phase function, commonly used in biomedical optics, 37,38 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 6 ; 3 2 6 ; 6 0 9 Pðŝ;ŝ 0 Þ ¼ 1 2π The parameter, g, a property of the medium, is known as the anisotropy factor. Sampling this equation for the scattering angle, θ ¼ arccosðŝ ·ŝ 0 Þ, by solving for θ in the cumulative integral over angle yields E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 7 ; 3 2 6 ; 5 1 9 A new step length, s, is sampled and this process is repeated until the photon weight falls below some threshold value. Carrying out the above computation for many photons and adding the voxel weights will, for a sufficient number of photons, converge on a solution to the RTE. By calculating photon paths through the medium, the MC models presented in the literature [22][23][24]35 do in fact simulate the radiance, but this is typically integrated over angle upon deposition of the weights in the voxels. In order to simulate the radiance, a method of depositing the weight in the voxels without losing the angular information is required.

Monte Carlo Modeling of the Radiance
In order to compute the radiance using an MC model, angular as well as spatial discretization is required. One approach is to use discrete ordinates, whereby the unit circle is divided equally into sectors and the weight deposited in a voxel is also assigned to the relevant angular sector. The memory required will scale linearly with the number of sectors, and will slow convergence of the radiance estimate, compared with the fluence estimate, by a factor inversely related to the number of sectors. Here, a harmonic angular basis was used because a sufficiently diffuse field is dense in such a basis, meaning the field can be represented using relatively few orders. Less memory will, therefore, be required.
In 2-D, the expansion for the radiance in a Fourier basis is 39 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 8 ; 3 2 6 ; 1 9 1 ϕðx; θÞ ¼ 1 2π a 0 ðxÞ þ 1 π X N¼∞ n¼1 a n ðxÞ cosðnθÞ where a n and b n are the coefficients associated with each harmonic and θ ∈ ½−π; π is the angle of the photon direction s relative to the z-direction [i.e., θ ¼ arccosðŝ ·ẑÞ]. (The equivalent expansion in 3-D would be into spherical harmonics. 40 ) Journal of Biomedical Optics 126004-2 December 2016 • Vol. 21 (12) Hochuli et al.: Quantitative photoacoustic tomography using forward and adjoint Monte Carlo models of radiance For a given voxel, the weight is deposited into the relevant Fourier coefficients according to E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 9 ; 6 3 ; 5 0 0 a 0 ¼ E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 0 ; 6 3 ; 4 4 7 a n ¼ dW n p cosðnθ n p Þ; E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 1 ; 6 3 ; 3 6 1 b n ¼ where dW n p is the weight deposited by the n th p photon traversing a voxel and θ n p is the direction of the n th p photon relative toẑ. The radiance MC algorithm, or RMC, was implemented in the Julia programming language. 41

Validation of the Forward Model
Analytical solutions to the RTE are available for the fluence for a range of geometries and source types; 23,42,43 however, there are few analytical solutions for the radiance, particularly in 2-D. The RMC model was compared to one such analytic solution for an infinite, homogeneous 2-D domain illuminated by an isotropic point source. [44][45][46] An isotropic point source was placed at the center of a domain of size 15 mm × 15 mm, large compared to the transport mean free path in order to approximate an infinite domain. The absorption and scattering coefficients were 0.01 and 10 mm −1 , respectively, and the Henyey-Greenstein phase function 47 was used with g set to 0.9. The pixel size was 0.05 mm × 0.05 mm, and five Fourier harmonics were used. Figure 1 shows the good agreement between the analytical and RMC modeled radiance at radial distances of 2 and 3 mm from the source along the horizontal axis.

Adjoint Monte Carlo Model
The adjoint equation to the RTE is given by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 2 ; 3 2 6 ; 4 5 6 where L Ã is the adjoint operator of the RTE and ϕ Ã is the adjoint radiance and q Ã is the adjoint source. This was implemented numerically using the same MC scheme as for the forward RMC model (Sec. 3.1). The principle difference is that the light sources q typically used in PAT are restricted to the boundary, but the adjoint source q Ã will not be, as a consequence of the fact that the "data" in QPAT-the photoacoustic images-is volumetric. The internally distributed sources, q Ã , were simulated by uniformly distributing the initial positions of each photon over a source pixel, with their direction randomly sampled from uniform angular distribution. The weight of each photon launched from a source pixel was scaled by μ a ðrÞ½H meas ðrÞ − Hðr; μ a ; μ s Þ, where r is the position corresponding to the source pixel. This of course can result in photons whose weight is negative. The adjoint RMC model treats a negative weight photon in the same way as one whose weight is positive; the photon's weight decays to zero and rather than the deposition of energy into the computational grid, thus negative weight photons produce a reduction in the energy in each voxel it traverses.

Validation of the Adjoint Model
The adjoint model was validated by checking that it satisfied the condition E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 3 ; 3 2 6 ; 1 3 0 where G and G Ã are the (Green's) operators solving the corresponding forward and adjoint RMC models, and u and v are the angle and position dependent source and detector, i.e., Let Case 1∶ðisotropic source and detectorÞ Case 2∶ðisotropic source; angular detectorÞ Case 3∶ðspatial source; angular detectorÞ where ϕ 1;2;3 and ϕ Ã 1;2;3 are the forward and adjoint radiances from computing Gu 1;2;3 and G Ã v 1;2;3 , respectively. Φ is the fluence or angle-integrated radiance. It can be seen from Eq. (15) that case 1, where a pair of isotropic δ-functions are used for u 1 and v 1 , that we expect the resulting fluence values at their respective positions, Φ 1 ðr d Þ and Φ Ã 1 ðr s Þ, to be equal. This is an intuitive result given the reciprocity of the RTE and the angular independence of the source-detector combination.
Simulations were performed using a 40 mm × 40 mm (101 × 101 pixel) domain, and 10 Fourier harmonics. Each source distribution emitted 10 6 photons. r s was set to be the center of the domain with r d moved along the x-direction across the domain. Comparisons are shown in Fig. 2 for case 1 with an isotropic source and detector, Fig. 3 for case 2 with an isotropic source and anisotropic detector with ρ d ¼ δðr − r s Þ, Θ d ¼ 1 π sin 2 ð2θÞ, and Fig. 4 for case 3 with the same ρ d , Θ d but with istropic Θ s and the distributed Θ s ðrÞ shown in Fig. 4(a). Good agreement was obtained in all cases, showing the the RMC adjoint model is an accurate representation of the RTE adjoint. It can be seen in the comparison of hGu 1;2;3 ; v 1;2;3 i with hu 1;2;3 ; G Ã v 1;2;3 i that some noise is present in the latter. As the overall number of photons simulated in the forward and adjoint simulations was the same, the use of spatially distributed sources in the adjoint case resulted in lower photon density in the domain, thereby reducing SNR. This issue was exacerbated in the case where an angularly dependent detector was used because a significant fraction of photons went undetected.

Functional Gradients
Both the radiance and the adjoint radiance can be expressed as Fourier series as in Eq. (8). By substituting these expressions into Eqs.
a n a Ã m cosðnθÞ cosðmθÞ þ X ∞ n¼1 X ∞ m¼1 a n b Ã m cosðnθÞ sinðmθÞ By orthogonality, all terms for which n ≠ m integrate to zero and Eq. (18) reduces to E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 9 ; 6 3 ; 2 1 6 a n a Ã n cos 2 ðnθÞ E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 0 ; 6 3 ; 1 0 9 This expression for the absorption gradient is computationally straightforward to evaluate due to the fact that it requires simply summing over products of Fourier coefficients already loaded in memory. The second term in Eq. (4) is where θ and θ 0 are the angles between the z-axis andŝ andŝ 0 , respectively. As such, the scattering angle between the previous directionŝ 0 into the new directionŝ is given by ðθ − θ 0 Þ. It is possible to expand cos½lðθ − θ 0 Þ as cosðlθÞ cosðlθ 0 Þ þ sinðlθÞ sinðlθ 0 Þ which in turn allows us to employ orthogonality relationships to simplify the above integrals and write E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 4 ; 3 2 6 ; 1 4 0 a n a Ã n g n þ 1 π X ∞ n¼1 b n b Ã n g n : Substituting this expression into Eq. (4), we can write the full expression for the functional gradient with respect to the scattering coefficient E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 5 ; 6 3 ; 7 1 9 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 6 ; 6 3 ; 6 3 9 The ability to calculate these gradients is the first step to finding a computationally efficient way to solve the full QPAT inversion using an MC model of light transport.

Inversions for Absorption and Scattering
The forward and inverse MC models of radiance described above were used with a gradient-descent (GD) scheme to estimate μ a ðxÞ and μ s ðxÞ from simulated PAT images by minimizing the error functional in Eq. (2). As the adjoint source, q Ã ðx;ŝÞ ¼ μ a ðxÞ½H meas ðxÞ − HðxÞ was independent of angle, photons were launched istropically with the launch position being spread out over the range of a source voxel using a randomly distributed number on the interval [0, 1]. The initial photon weight was scaled according to the source strength with normalization of the output quantity (i.e., radiance, absorbed energy density, and harmonic) being N p . The adjoint source may be negative in some places, so the initial photon weight is negative and weight deposition is also negative. The photon termination condition was, therefore, set to be the absolute value of the photon weight falling below the threshold value. The gradients were calculated using Eqs. (20) and (26). A GD scheme was chosen for the minimization because it is more robust to the MC noise in the functional gradients and error functional than techniques such as L-BFGS that use second-order information. A line-search algorithm presented by Hager and Zhang 48 was used for the reconstruction of μ a . A backtracking line search was implemented for the reconstruction of μ s . This is described in Sec. 6.2. The termination condition used by the optimization was where i is the iteration number. For all the reconstructions, it was assumed that the data H meas was given; no noise was added to the data, but MC noise from the forward simulation of the data was present at about 0.7% (evaluated by taking several runs of the forward model to estimate the average standard deviation over all positions across all model runs). For each inversion, the forward and adjoint RMC simulations used 10 8 photons and 10 Fourier harmonics, and was executed on a Dell 2U R820 32-core server.

Inversion for Absorption Coefficient
The domain used in the estimation of the absorption coefficient consisted of a background absorption coefficient of 0.01 mm −1 with a rectangular inclusion equal to 0.2 mm −1 [shown in Fig. 5(a)], and a background scattering coefficient of 5 mm −1 with a rectangular inclusion equal to 15 mm −1 [shown in Fig. 5(b)]. The anisotropy was a homogeneously distributed value of 0.9. The measured data, H meas , were formed using an MC simulation illuminated by a collimated line source on the boundary at z ¼ 0 mm and on the adjacent boundary at x ¼ 4 mm, consisting of 10 8 photons. The inversion for the absorption coefficient only was performed under the assumption that the scattering coefficient was known and the starting estimate of the absorption coefficient was a homogeneous value of 0.01 mm −1 . The termination condition in Eq. (27) was satisfied after 11 iterations, having taken 4.1 h to run, and is shown in Fig. 5(b) with profiles through the true and reconstructed distributions of μ a shown in Fig. 5(d).
Very good agreement between the true and reconstructed absorption coefficient is observed, with a value of the error function after 11 iterations being 2.9 × 10 −9 . The optimization routine was stopped as the change in the error function on the 12th iteration was below the function tolerance of 10 −9 , indicating convergence. The average error in the estimate of the absorption coefficient μ est a , computed as jμ true a − μ est a j∕μ true a , was 0.2% over the entire domain.

Inversion for the Scattering Coefficient
Inversions for the scattering coefficient were performed using the same domain as above, shown in Figs. 6(a) and 6(b), with the illumination and number of photons in the forward simulation also being the same. Here, it was assumed that the absorption coefficient was known and the starting estimate of the scattering coefficient was equal to the background value of 5 mm −1 .
Two modifications were necessary to achieve convergence in the optimization for the scattering coefficient. First, a custom GD algorithm was used in which a backtracking line search was implemented. A backtracking line search 49 starts with a large candidate step length and progressively reduces the step size while checking for a sufficient decrease in the error functional. The sufficient decrease condition is expressed as where ν was chosen to be 0.2 by inspection because this produced rapid convergence. In order to improve efficiency of the line search, step sizes were bounded between ½10 5 ; 10 9 ; it was found that this range yielded sufficiently large steps to ensure reasonably efficient progress in the minimization. Second, the termination condition in Eq. (27) was relaxed Hochuli et al.: Quantitative photoacoustic tomography using forward and adjoint Monte Carlo models of radiance due to the much slower convergence of the scattering coefficient, and instead required a relative change in the error functional of 10 −5 . This was satisfied after 35 iterations, and is shown in Fig. 6(c) with profiles through the true and reconstructed distributions of μ s as shown in Fig. 6(d).
It can be seen from Figs. 6(c) and 6(d) that the inversion has partly reconstructed the inclusion in the scattering coefficient. The inability to reconstruct edges of the inclusion in the scattering coefficient is expected, given the diffusive nature of the scattering. However, the discrepancy in μ s in the inclusion, evident from Fig. 6(d), suggests premature termination of the optimization. This is due to the fact that the gradient with respect to scattering is small and prone to noise in the functional gradients. This low SNR in the gradients has the impact that that search directions in the optimization routine are often suboptimal, which results in little or no progress of the optimization. The progressive reduction in SNR in the gradient means that nondescent steps are likely and can therefore trigger the termination condition.

Discussion and Conclusions
In this paper, an MC model of the RTE was presented. The model computes the radiance in a Fourier basis in 2-D and is straightforward to extend to 3-D using a spherical harmonics basis. The accuracy of the model was demonstrated by comparing the angle-resolved radiance at two positions in the domain to corresponding appropriate analytic solutions.
Sections 5 and 6 presented the application of the RMC algorithm to estimating the absorption and scattering coefficients from simulated PAT images. In Sec. 6.1, it was observed that the absorption coefficient was estimated with an average error of 0.2% over the domain relative to the true value, when the scattering coefficient is known, and in the presence of 0.7% average noise in the data. This is encouraging, particularly because noise is not only present in H meas but also in the Fourier harmonics computed using the forward and adjoint RMC simulations, which is propagated to the estimates of the functional gradients. Consequently, the search direction in the GD algorithm will always be suboptimal. Furthermore, noise in Hðμ ðlÞ a ; μ s Þ, the estimate of the absorbed energy density at the l'th iteration of the line search, will be also be propagated to the error functional, resulting in a nonsmooth search trajectory for the line search because at every point μ ðlÞ a , the error function will be corrupted by some different noise σ ðlÞ ∶ϵðμ ðlÞ a ; μ s Þ ¼ kH meas − Hðμ a ; μ s Þð1 þ σ ðlÞ Þk 2 . In practice, this did not preclude reconstruction of the absorption coefficient since the calculated gradients remained descent directions despite the noise. Furthermore, the error functional in μ a is sufficiently convex that the addition of some noise does not prevent the linsearch from yielding sufficiently large a step length to allow rapid convergence.
Reconstruction of the scattering coefficent correctly located the scattering perturbation in the simulated image; however, the peak value in the reconstruction was lower than the true value. This is a direct consequence of the fact that the scattering coefficient is related to the absorbed energy distribution only through the optical fluence distribution. Consequently, the SNR in ∂ϵ ∂μ s is typically much less than that for absorption. This causes termination of the algorithm before the peak magnitude of the parameter has been found in the search space.

Disclosures
No conflicts of interest, financial or otherwise, are declared by the authors.