Optical tomography with discretized path integral

Abstract. We present a framework for optical tomography based on a path integral. Instead of directly solving the radiative transport equations, which have been widely used in optical tomography, we use a path integral that has been developed for rendering participating media based on the volume rendering equation in computer graphics. For a discretized two-dimensional layered grid, we develop an algorithm to estimate the extinction coefficients of each voxel with an interior point method. Numerical simulation results are shown to demonstrate that the proposed method works well.


Introduction
Optical tomography [1][2][3][4][5][6][7][8] is known as a safer alternative to x-ray tomography. In general, a tomography system consists of a light source that generates penetrating light and a detector that captures it, which makes it possible to estimate the interior of the object through which the light passes. The most important application is x-ray computed tomography (CT), where x rays are used because of their penetrating property. The balance between the radiation exposure of the human body and the quality of the obtained results has been debated ever since x-ray CT was invented. Therefore, there is an urgent demand for safer medical tomography, such as optical tomography.
Modeling the behavior of light plays an important role in optical tomography. At the mesoscale, in which the wavelength of light is close to the scale of tissue, the radiative transport equation (RTE) is used to describe light scattering [5,9]. At the macroscale [6], the time-independent or time-dependent RTE is often approximated with a diffusion equation.
Similarly, the computer graphics community has used the time-independent RTE and, in contrast to the (surface) rendering equation [10,11], often calls it the volume rendering equation (VRE) [10,12]:

$$(\omega \cdot \nabla) L(x, \omega) = -\sigma_t(x) L(x, \omega) + \sigma_s(x) \int_{S^2} f_p(x, \omega, \omega') L(x, \omega')\, d\omega', \tag{1}$$

where the notation will be introduced in the following sections. The use of the VRE enables us to render volumes of participating media, such as fog, clouds, and fire, through which light penetrates, and to obtain realistic volume rendering images of such scenes [13,14]. The path integral, which can be considered a discrete version of the continuous Feynman path integral [15,16], has recently been employed to solve the VRE efficiently with Monte Carlo integration, such as Metropolis light transport [17,18] or bidirectional path tracing [19].

In this paper, we propose an optical tomography method that uses a path integral as a forward model and solves a nonlinear inverse problem that minimizes the discrepancy between measurements and model predictions in a least-squares sense. To the best of our knowledge, the discretized path integral has not been used in optical tomography before. In our work, we simplify the path integral with some assumptions […][22][23]. We approximate the integral over an infinite number of paths with a sum over a finite number of paths, discretize the continuous medium into voxels of a regular grid, and discretize continuous light paths into discrete ones (i.e., polylines). We deal with anisotropic scattering having a peak in the forward direction, which differs from other discretization methods that use discrete ordinates or spherical harmonics [13,24,25].

In this work, we focus on estimating the spatially varying extinction coefficient $\sigma_t(x)$ at each discretized voxel location of the medium while fixing the scattering properties (e.g., the scattering coefficients $\sigma_s$ and the phase functions $f_p$). By separating the scattering properties from our problem, we formulate optical tomography as an optimization problem with inequality constraints, which is solved by an interior point method.
An interior point method [26] is an iterative method for solving an optimization problem with inequality constraints that describe a feasible region in which the optimal solution must reside. To this end, a series of unconstrained optimization problems is constructed by combining the constraints with the original objective function, and each of them is solved by an ordinary gradient-based (Quasi-Newton) method.
To summarize our contribution, we reformulate the problem of optical tomography by combining a path integral with several simplifying assumptions to model the light transport in participating media. This paper is an extension of our previous conference version [27,28] with additional theoretical background, experiments, and discussions, and is structured as follows. In Sec. 2, we briefly review previous work related to path integrals and optical tomography. In Sec. 3, we describe how to model the light transport in participating media and turn optical tomography into an optimization problem. In Sec. 4, we show how to solve the optimization problems. Section 5 reports some simulation results, and Sec. 6 concludes the paper.

Related Work
In this section, we briefly review related work on optical tomography and path integrals in computer graphics.
Optical tomography [4,5] (also called inverse transport [6,7], inverse scattering [29], or scattering tomography [30,31]) is a problem in medical imaging that uses light sources to reconstruct the optical properties of tissue from measurements (time-dependent or stationary, angular-dependent or independent) at the surface boundary. Analytically solving the RTE [Eq. (1)] with boundary conditions is difficult, however, so approximations such as discrete ordinates and $N$th-order spherical harmonics (the $P_N$ approximation) are often used and solved numerically by, for example, finite element methods (FEM) or finite difference methods. The well-known diffuse approximation (DA) [5,6] is a $P_1$ (thus first-order) approximation that assumes an isotropic phase function. The DA approximates the RTE at a macroscopic scale when scattering is large, absorption is low, and scattering is not highly peaked. Diffuse optical tomography (DOT) is based on the DA and today represents the frontier of optical tomography [32,33], with many clinical applications [34]. The DA, however, often does not hold in realistic participating (scattering) media: absorption may not be small compared to scattering, and the phase functions can be highly peaked in the forward direction, which is often modeled by the Henyey-Greenstein [35], Schlick [36], or Mie and Rayleigh phase functions [10,12,37,38]. Experimental evidence [39] also suggests a highly peaked shape of the phase functions in biological media. DOT works but is still limited; therefore, other methods have also been studied for cases in which the DA does not hold.
Statistical Monte Carlo methods are used for media in which these assumptions do not hold; however, they are computationally intensive and inefficient for solving the forward problem [4,5,6,7,34], i.e., solving the RTE with given parameters. Therefore, Monte Carlo based approaches have been used for estimating spatially constant (not varying) parameters in homogeneous media, such as paper [40,41], clouds [42], liquids [43], plastics [44], or uniform material samples [45]. Another difficulty of Monte Carlo based inverse methods is that an analytical forward model prediction, which is needed to minimize the difference between the prediction and the measurements, is hard to obtain except for very special structures [46,47]. A gradient based least-squares approach has been proposed, but only for spatially constant parameter estimation [40,41,48], while model-free approaches have relied on genetic algorithms [42,44], numerical perturbation [49,50], voting [51], or even simple backprojection [52]. One of the contributions of the current paper is to enable the use of a gradient based optimization approach for estimating spatially varying parameters, which can be extended with many optimization methods.
Similar to optical tomography, modeling light transport plays a very important role in computer graphics. Our own work on optical tomography is inspired by Monte Carlo based statistical methods. In the last two decades, methods based on path integrals [17][18][19][53][54][55] have provided models of light transport for efficient volume rendering. For solving the RTE, a path integral has been used as a forward problem solver [16,56,57] and has also been applied to optical tomography, but under the diffusion assumption [58,59]. Our proposed method is based on a path integral that explicitly expresses the forward model prediction, which is very suitable for solving the inverse problem with gradient based methods. This is an advantage of our method over existing methods because the paths used in the forward model can be generated by either a deterministic or a statistical (Monte Carlo) method. To achieve an efficient forward model, we introduce a simplified layered scattering model that uses a limited number of deterministic paths instead of Monte Carlo simulated ones.

Method: Forward Problem
We deal with the following optical tomography problem [this is a conceptual formulation; the actual problem is shown in Eq. (29)]:

$$\min_{\sigma_t} \sum_{i,j} \left| I_{ij} - P_{ij}(\sigma_t) \right|^2, \tag{2}$$

where $\sigma_t$ is a vector representing the spatial distribution of the extinction coefficients to be estimated. We divide our discussion into two parts: the forward and inverse problems. The forward problem focuses on building a mathematical model $P_{ij}(\sigma_t)$ of the light transport between a light source $i$ and a detector $j$. We will make some assumptions on the light transport and the medium to simplify the forward model. The inverse problem minimizes the difference between the observations $I_{ij}$ of the detector and the forward model to estimate the spatial distribution of the extinction coefficients $\sigma_t$.

Forward Model
In the forward problem, as we mentioned before, we use a path integral to build a mathematical model for the light transport.
Here, we follow the notation developed in the computer graphics literature [17,23,53,60] to introduce the path integral. Sections 3.2 to 3.6 will show the simplified model we propose. Given a space $\mathbb{R}^3$, a light source is located at $x_0 \in \mathbb{R}^3$ and a detector at $x_{M+1} \in \mathbb{R}^3$, and in between them is the participating medium $\nu \subset \mathbb{R}^3$ with boundary $\partial\nu$ and interior volume $\nu_0$. Thus, absorption, scattering, or reflection events happen at $x_1, \ldots, x_M$. The set of all paths of length $M$ is denoted by $\Omega_M$. The path space $\Omega$ is the countable union of the sets $\Omega_M$ of paths of finite length:

$$\Omega = \bigcup_{M=1}^{\infty} \Omega_M. \tag{3}$$

A direction is denoted by $\omega \in S^2$, where $S^2$ is the unit sphere in $\mathbb{R}^3$. A unit vector $\omega_{x_m, x_{m+1}}$ is the direction from vertex $x_m$ to vertex $x_{m+1}$ in a path $x$.
Veach [20] introduced a framework representing the rendering equation in the form of a path integral for scenes without participating media (i.e., no scattering), and later, Pauly et al. [17] extended it to the volume rendering equation with scattering. The amount of light $I$ observed by the detector is given by the path integral

$$I = \int_{\Omega} f(x)\, d\mu(x), \tag{4}$$

which is an integral over the path space. Here, $\mu(x)$ is a measure of path $x$:
$$d\mu(x) = \prod_{m=0}^{M+1} d\mu(x_m), \tag{5}$$

where $d\mu(x_m)$ denotes the differential measure at vertex $x_m$. $f(x)$ is a measurement contribution function defined as follows:

$$f(x) = L_e(x_0, x_1)\, G(x_0, x_1) \left[ \prod_{m=1}^{M} f_f(x_{m-1}, x_m, x_{m+1})\, G(x_m, x_{m+1}) \right] W_e(x_M, x_{M+1}), \tag{6}$$

where $W_e(x_M, x_{M+1})$ is the camera response function, and $L_e(x_0, x_1)$ is the intensity of the light emitted from the light source $x_0$ to vertex $x_1$. $f_f(x_{m-1}, x_m, x_{m+1})$ is a scattering kernel at $x_m$ with respect to the locations of vertices $x_{m-1}$ and $x_{m+1}$:

$$f_f(x_{m-1}, x_m, x_{m+1}) = \begin{cases} f_s(x_{m-1}, x_m, x_{m+1}), & x_m \in \partial\nu \\ \sigma_s(x_m)\, f_p(x_{m-1}, x_m, x_{m+1}), & x_m \in \nu_0. \end{cases} \tag{7}$$

Here, the bidirectional scattering distribution function $f_s(x_{m-1}, x_m, x_{m+1})$ is used for locations on the surface of objects, and the scattering coefficient $\sigma_s(x_m)$ at $x_m$ and the phase function $f_p(x_{m-1}, x_m, x_{m+1})$ are used for those inside the medium. $G(x_m, x_{m+1})$ is a generalized geometric term:
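$$G(x_m, x_{m+1}) = T(x_m, x_{m+1})\, g(x_m, x_{m+1}), \tag{8}$$

where $g(x_m, x_{m+1})$ is a geometric term,

$$g(x_m, x_{m+1}) = \begin{cases} \dfrac{|n_g(x_m) \cdot \omega_{x_m, x_{m+1}}|\, |n_g(x_{m+1}) \cdot \omega_{x_{m+1}, x_m}|}{\|x_m - x_{m+1}\|^2}, & x_m, x_{m+1} \in \partial\nu \\ \dfrac{|n_g(x_m) \cdot \omega_{x_m, x_{m+1}}|}{\|x_m - x_{m+1}\|^2}, & x_m \in \partial\nu,\ x_{m+1} \in \nu_0 \\ \dfrac{|n_g(x_{m+1}) \cdot \omega_{x_{m+1}, x_m}|}{\|x_m - x_{m+1}\|^2}, & x_m \in \nu_0,\ x_{m+1} \in \partial\nu \\ \dfrac{1}{\|x_m - x_{m+1}\|^2}, & x_m, x_{m+1} \in \nu_0, \end{cases} \tag{9}$$

with unit normal $n_g(x_m)$ of the surface at $x_m \in \partial\nu$. $T(x_m, x_{m+1})$ is a transmittance that describes the attenuation when light passes through the medium: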
$$T(x_m, x_{m+1}) = \begin{cases} e^{-\tau(x_m, x_{m+1})}, & \{x_m, x_{m+1}\} \subset \nu_0 \cup \partial\nu \\ 0, & \text{otherwise,} \end{cases} \tag{10}$$

$$\tau(x_m, x_{m+1}) = \int_0^{\|x_{m+1} - x_m\|} \sigma_t\!\left(x_m + s\, \omega_{x_m, x_{m+1}}\right) ds, \tag{11}$$

where $\sigma_t(x)$ is the extinction coefficient at location $x$ (e.g., $\sigma_t(x_m)$ at vertex $x_m$). Putting it all together, we have a path integral in the form of an infinite sum of all possible path contributions [Eq. (12)]. Note that all vertices $\{x_m\}$ depend on a path $k$; different paths have different sets of vertices. In Eq. (12), however, we omit the path index $k$ for simplicity. Later, we will again use $k$ as the path index.

Assumptions on the Path Integral Formulation
As our target is optical tomography, we restrict the model to the interior of the participating medium. To do so, we assume that the light source $x_0$ and detector $x_{M+1}$ are located on the surface, and the other vertices $x_1, x_2, \ldots, x_M$ are inside the medium, that is, $x_0, x_{M+1} \in \partial\nu$ and $x_1, \ldots, x_M \in \nu_0$. Then the transmittance is simplified as

$$T(x_m, x_{m+1}) = e^{-\tau(x_m, x_{m+1})}. \tag{13}$$

Furthermore, we assume that the observations are ideal and the camera response function is the identity, $W_e(x_M, x_{M+1}) = 1$.

Apart from the assumptions above, we rewrite the geometric term and the differential measure. The definitions above use area measures $dA(x_m)$ and volume measures $dV(x_m)$ along with the squared-distance geometric term [17,23,53]; however, steradian measures $d\omega(x_m)$ with the identity geometric term are equivalent and also widely used [10,12,60] [Eq. (14)]. Therefore, we employ the steradian measures and rewrite the geometric term and the measure accordingly [Eqs. (15) and (16)], so that Eq. (12) takes the form of Eq. (17).

Discretization of the Forward Model
For numerical computation, we first discretize the medium into voxels of a regular grid, where each voxel has its own extinction coefficient $\sigma_t[b]$ ($b$ is the index of the voxel), as shown in Fig. 1.
With this voxelization, the paths of light are also divided into segments, as explained below. First, we explain the integral [Eq. (11)] along a single segment $(x_m, x_{m+1})$ of a path $x$. It describes the attenuation of light along the segment due to the extinction coefficients of the voxels involved. Because of the discretization of the medium, Eq. (11) can be written as a sum of voxel-wise multiplications:

$$\tau(x_m, x_{m+1}) = \int_0^{\|x_{m+1} - x_m\|} \sigma_t\!\left(x_m + s\, \omega_{x_m, x_{m+1}}\right) ds = \sum_{b \in B_{x_m, x_{m+1}}} \sigma_t[b]\, d_{x_m, x_{m+1}}[b] = \sigma_t^T d_{x_m, x_{m+1}}. \tag{18}$$

For the second equality, $b$ is the index in a set $B_{x_m, x_{m+1}}$ of all voxels crossed by segment $(x_m, x_{m+1})$, and $d_{x_m, x_{m+1}}[b]$ is the length of the part of the segment $x_m x_{m+1}$ passing through voxel $b$. This is illustrated in Fig. 1(c). The extinction coefficient $\sigma_t$ is now a piecewise constant function because of the voxelization, so the integral turns into a sum (the idea that this integral can be turned into a sum has been discussed before [61], although not in the context of tomography). This simplifies the computation; however, the sum over a set $B_{x_m, x_{m+1}}$ is not preferable in terms of implementation and optimization. We propose here to use a vector representation of both the extinction coefficients and the segment lengths, which is the third equality of the above equation. The first vector $\sigma_t$ stores the values of the extinction coefficients $\sigma_t[b]$ of all voxels. This vector can be generated by serializing the voxels on the grid in a certain order. The second vector $d_{x_m, x_{m+1}}$ contains the values of the lengths $d_{x_m, x_{m+1}}[b]$ for all voxels. We should note that this vector is very sparse: most of the voxels have no intersection with the segment $(x_m, x_{m+1})$. Hence, only a few elements in $d_{x_m, x_{m+1}}$ have nonzero values, and $d_{x_m, x_{m+1}}[b] = 0$ for every voxel $b$ that the segment does not cross.
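The vector form of the optical depth can be illustrated with a minimal sketch. The 2 x 3 grid, the row-by-row serialization, and the segment lengths below are hypothetical values chosen only for illustration; storing the length vector as a dictionary of nonzero entries is one possible way to exploit its sparsity, not necessarily the representation used in our implementation.

```python
import math

def optical_depth(sigma_t, seg_lengths):
    """Return tau = sum_b sigma_t[b] * d[b] over the voxels the segment crosses."""
    return sum(sigma_t[b] * d for b, d in seg_lengths.items())

# Hypothetical 2 x 3 grid serialized row by row, so b = row * 3 + col.
sigma_t = [1.05, 1.05, 1.20,
           1.05, 1.55, 1.05]

# Hypothetical segment that crosses voxels 1 and 4, 0.5 mm in each.
# Only nonzero lengths are stored, mirroring the sparsity of the vector d.
d_seg = {1: 0.5, 4: 0.5}

tau = optical_depth(sigma_t, d_seg)   # 1.05 * 0.5 + 1.55 * 0.5
transmittance = math.exp(-tau)        # attenuation along this segment
```

Voxels that the segment does not cross never enter the sum, which is the practical benefit of the sparse representation.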
This sparsity of the vector facilitates the construction of a whole path $x$ because path segments can be added as follows:

$$D_k = \sum_{m=0}^{M} d_{x_m, x_{m+1}}, \tag{19}$$

where $D_k$ is the vector of a complete path $k$ of length $M+2$; the $b$th element can be interpreted as the length of the part of the path that passes through voxel $b$. This notation simplifies a part of Eq. (17) as follows:

$$\prod_{m=0}^{M} e^{-\tau(x_m, x_{m+1})} = e^{-\sum_{m=0}^{M} \sigma_t^T d_{x_m, x_{m+1}}} = e^{-\sigma_t^T D_k}. \tag{20}$$

Using this notation to rewrite Eq. (17), we obtain Eq. (21), in which each path contributes the product of a factor $H_k$ [Eq. (22)] and an exponential factor. $H_k$ describes the contributions of the scattering coefficients and phase functions, and the exponential factor $e^{-\sigma_t^T D_k}$ represents attenuation due to absorption (and outscattering) over the path.
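As a small check of this construction, the sketch below adds sparse segment vectors into a path vector and confirms that the product of the per-segment transmittances equals a single exponential of the path vector. The grid and the vertical three-segment path are hypothetical, and the function names are ours.

```python
import math
from collections import defaultdict

def path_vector(segments):
    """Sum sparse per-segment length vectors into one path vector D_k."""
    D = defaultdict(float)
    for seg in segments:
        for b, d in seg.items():
            D[b] += d
    return dict(D)

def attenuation(sigma_t, lengths):
    """exp(-sigma_t^T d) for a sparse length vector d."""
    return math.exp(-sum(sigma_t[b] * d for b, d in lengths.items()))

# Hypothetical 2 x 3 grid (b = row * 3 + col) and a vertical path in column 0:
# source -> center of voxel 0 -> center of voxel 3 -> detector.
sigma_t = [1.05, 1.05, 1.20, 1.05, 1.55, 1.05]
segments = [{0: 0.5}, {0: 0.5, 3: 0.5}, {3: 0.5}]

D_k = path_vector(segments)
whole_path = attenuation(sigma_t, D_k)

per_segment = 1.0
for seg in segments:
    per_segment *= attenuation(sigma_t, seg)
```

Because the exponents add, `whole_path` and `per_segment` agree up to rounding, so the attenuation of a complete path needs only one inner product.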

Two-Dimensional Layered Model of Forward Scattering
As a first attempt, we design a two-dimensional (2-D) layered grid instead of a three-dimensional (3-D) one. Since we voxelize the medium into a regular grid, the 2-D medium consists of parallel layers. Hereafter, a 3-D direction $\omega$ between vertices is written as a 2-D direction $\theta$, and a steradian measure $d\omega$ as an angular measure $d\theta$.
As shown in Fig. 2, we assume a particular layered scattering model with the following properties. First, vertices $x_1, \ldots, x_M$ of path $x$ are located at the centers of voxels. The light source $x_0$ is located on the boundary of the top surface of the voxels in the top layer. Similarly, the detector $x_{M+1}$ is located on the boundary of the bottom surface of the voxels in the bottom layer. Second, directions $\theta_{x_0, x_1}$ and $\theta_{x_M, x_{M+1}}$ at the beginning and end of a path are perpendicular to the boundary. This means that scattering begins at $x_1$ and ends at $x_M$. Third, forward scattering happens layer by layer. More specifically, light is scattered at the center of a voxel in a layer and then goes to the center of a voxel in the next (lower) layer. Scattering is assumed to happen every time the light passes through a voxel center. Even if the next voxel is just below the current voxel and the path segment is straight, it is regarded as scattering. Fourth, the scattering coefficient is uniform, $\sigma_s(x) = \sigma_s$. By ignoring paths exiting from the sides of the grid, the number of all possible paths is $N^M$, where $M$ is the number of layers and $N$ is the number of voxels in one layer.
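The path count of the layered model can be verified by brute force on a tiny grid. The following sketch represents a path as one column index per layer, which is an illustrative encoding of our own; the grid size is hypothetical.

```python
from itertools import product

def layered_paths(n_cols, n_layers, source=None, detector=None):
    """Yield paths as tuples of column indices, one voxel center per layer.

    If source (top-layer column) or detector (bottom-layer column) is given,
    the first or last vertex is fixed to that voxel.
    """
    first = range(n_cols) if source is None else [source]
    last = range(n_cols) if detector is None else [detector]
    middle = [range(n_cols)] * (n_layers - 2)
    return product(first, *middle, last)

N, M = 3, 4  # hypothetical tiny grid: 3 voxels per layer, 4 layers
all_paths = list(layered_paths(N, M))                     # N**M paths
fixed = list(layered_paths(N, M, source=0, detector=2))   # N**(M-2) paths
```

Fixing both end voxels leaves only the $M-2$ intermediate layers free, which is why the count drops from $N^M$ to $N^{M-2}$ for a given source and detector pair.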

Approximating the Phase Function with a Gaussian
We use a Gaussian model $f_p(\theta, \sigma^2)$ as an approximation of the phase function:

$$f_p(\theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\theta^2}{2\sigma^2}\right), \tag{23}$$

where the variance $\sigma^2$ controls the scattering property; smaller values of $\sigma^2$ mean stronger forward scattering. This Gaussian approximation is convenient in our case for the following two reasons. First, existing phase function models [10,12,35,36,37,38] are for 3-D scattering, not for 2-D. This means that those functions are normalized for integrals over the unit sphere $S^2$: $\int_{S^2} f_p(\omega)\, d\omega = 1$. Most of the phase functions assume isotropy (rotational symmetry), and hence the function takes the angle $\theta$ as an argument; however, $\int_{-\pi}^{\pi} f_p(\theta)\, d\theta \neq 1$. These functions, therefore, are not adequate for our case.

Fig. 3 Phase functions. (a) Gaussian model: the tallest and narrowest shapes correspond to $\sigma^2 = 0.1$, and the shape becomes shorter and rounder for larger values of $\sigma^2$. (b) Heino's two-dimensional analogs [62] of Henyey-Greenstein's phase function with parameter $g = 0.1, 0.2, \ldots, 1.0$. The tallest and narrowest shapes correspond to $g = 1.0$, and the shape becomes shorter and closer to a hemisphere for smaller values of $g$.
Second, our assumption of layer-wise forward scattering does not allow scattering to happen backwards or sideways, and the Gaussian model is suitable for it. As shown in Fig. 3, the Gaussian model has the form of forward-only scattering (no backwards or sideways scattering) in a reasonable range of $\sigma^2$, and it is almost normalized: $\int_{-\pi/2}^{\pi/2} f_p(\theta, \sigma^2)\, d\theta \approx 1$. Other 2-D phase functions exist that are not forward-only. For example, Heino et al. [62] introduced a 2-D analog of Henyey-Greenstein's phase function [35], shown in Fig. 3. Although the parameters are different, the two functions in Fig. 3 have similar shapes. The most important difference is that Heino's function has backward scattering, but our Gaussian model does not. Scattering that is more realistic than the layer-wise forward scattering introduced here needs Heino's or Henyey-Greenstein's phase function.
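The claim that the Gaussian model is almost normalized over the forward half-plane can be checked numerically. The sketch below assumes the zero-mean Gaussian density in $\theta$ with variance $\sigma^2$; under that assumption the forward mass has the closed form $\mathrm{erf}\big(\pi / (2\sqrt{2\sigma^2})\big)$, which is compared against a midpoint-rule integral. The function names are ours.

```python
import math

def gaussian_phase(theta, var):
    """Zero-mean Gaussian in the angle theta with variance var (assumed form)."""
    return math.exp(-theta * theta / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def forward_mass(var):
    """Closed-form integral of the Gaussian over [-pi/2, pi/2]."""
    return math.erf((math.pi / 2.0) / math.sqrt(2.0 * var))

def forward_mass_numeric(var, n=2000):
    """Midpoint-rule approximation of the same integral."""
    h = math.pi / n
    return h * sum(gaussian_phase(-math.pi / 2.0 + (i + 0.5) * h, var)
                   for i in range(n))

for var in (0.1, 0.2, 0.4, 1.0):
    print(f"sigma^2 = {var}: forward mass = {forward_mass(var):.4f}")
```

For the variances used in the simulations (0.2 and 0.4) the forward mass stays close to one, while it drops noticeably as $\sigma^2$ approaches 1, which indicates where the forward-only approximation starts to degrade.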
We should note one further simplification in our layer-wise forward scattering model. The angle $\theta_m$ in the phase function is usually defined between $\theta_{x_{m-1}, x_m}$ and $\theta_{x_m, x_{m+1}}$, that is, as the change of direction caused by the scattering event. Instead of dealing with such an exact difference of directions, we use the angle between $\theta_{x_m, x_{m+1}}$ and the vertical (downward) direction for efficiency of computation. This assumption enables us to discretize the Gaussian phase function much more easily. Since $f_p(\theta)$ integrates to (approximately) one, such a normalization can be discretized with a sum as follows:

$$\sum_{b \in B} f_p(\theta_b, \sigma^2)\, \Delta\theta_b \approx 1, \tag{24}$$

where $B$ is a set of voxel indices in the next layer $n$, $\theta_b$ is an alternative form of the corresponding $\theta_{x_m, x_{m+1}}$, and $\Delta\theta_b$ is the angle measure as shown in Fig. 4.
The above equation can be considered as the energy distribution from a voxel in one layer to the voxels in the next layer. For a voxel $b$ at direction $\theta_b$, the value of $f_p(\theta_b, \sigma^2)\, \Delta\theta_b$ describes what percentage of the energy will be scattered to this voxel. Figure 5 shows plots of the values corresponding to two phase functions with different parameters. We can see that, due to forward scattering, most of the energy is concentrated in the voxel just below, and a small part goes to the adjacent voxels.
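The energy distribution to the next layer can be sketched as follows. The exact definition of $\Delta\theta_b$ is given only in Fig. 4, so the geometry here is an assumption made for illustration: voxels are unit squares, the next layer's centers lie one unit below, $\theta_b$ is the angle from the vertical to the center of voxel $b$, and $\Delta\theta_b$ is the angle subtended by the horizontal extent of that voxel at the height of its center. The Gaussian form of the phase function is likewise assumed.

```python
import math

def gaussian_phase(theta, var):
    """Zero-mean Gaussian in the angle theta with variance var (assumed form)."""
    return math.exp(-theta * theta / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def layer_weights(src_col, n_cols, var):
    """Weights f_p(theta_b) * dtheta_b from voxel src_col to each voxel b below.

    Assumed geometry: unit voxels, centers one unit apart vertically.
    """
    weights = []
    for b in range(n_cols):
        k = b - src_col                       # horizontal offset in voxels
        theta_b = math.atan(k)                # angle from the vertical
        dtheta_b = math.atan(k + 0.5) - math.atan(k - 0.5)
        weights.append(gaussian_phase(theta_b, var) * dtheta_b)
    return weights

w = layer_weights(src_col=10, n_cols=20, var=0.2)
for b in (8, 9, 10, 11, 12):
    print(b, round(w[b], 4))
```

Under these assumptions the voxel directly below receives most of the energy and the weights fall off symmetrically on both sides. The weights sum only approximately to one, so an implementation may choose to renormalize them.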
The contribution $H_k$ in Eq. (22) now needs to be rewritten so that it deals with the Gaussian phase function and the discretized energy distribution discussed above. First, we reorder the measure [Eqs. (25) and (26)] and then replace the factors with the Gaussian phase function [Eq. (27)].
Note that the factor $dA(x_0)\, \Delta\theta_{x_0, x_1}\, \sigma_s^M$ is common to all paths because we assumed that the grid is uniform so that $dA(x_0)$ is constant, the direction $\theta_{x_0, x_1}$ (or $\omega_{x_0, x_1}$) is perpendicular to the top surface, and $\sigma_s$ is constant.

Observation Model
Suppose the 2-D layered medium is an $M \times N$ grid; it has $M$ layers, each of which is made of $N$ voxels. We now construct an observation model of the light transport between a light source and a detector: light is emitted into each of the voxels in the top layer and captured from each voxel in the bottom layer. More specifically, let $i \in B_1$ and $j \in B_M$ be the voxel indices of the light source and detector locations, respectively. By restricting the light paths to only those connecting $i$ and $j$, the observed light $I_{ij}$ is written as follows:

$$I_{ij} = I_0 \sum_{k=1}^{N_{ij}} H_{ijk}\, e^{-\sigma_t^T D_{ijk}}, \tag{28}$$

where $H_{ijk}$ and $D_{ijk}$ are the same as in Eqs. (27) and (21), respectively, but are restricted to paths connecting $i$ and $j$, and $I_0 = L_e(x_0, x_1)$, assuming the light source to be constant. In the above equation, $k$ indexes the light paths that share the same $i$ and $j$. Due to the layered scattering model in the $N \times M$ grid, the number of different paths between $i$ and $j$ is $N_{ij} = N^{M-2}$. This is, however, too large even for small $N$ and $M$, e.g., $N = M = 10$. Therefore, we exclude paths having small contributions from the computation. This is done by simple thresholding while computing $H_{ijk}$, as shown in Algorithm 1.
This results in generating fewer paths: $N_{ij} \leq N^{M-2}$. For example, there are $N_{ij} = 742$ paths for $N = M = 20$ with $\sigma^2 = 0.4$ when $th = 0.001$, which enables us to reduce the computation cost.
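The listing of Algorithm 1 is not reproduced in this text, so the following is only a hypothetical sketch of thresholded path generation, not the algorithm itself. It extends paths layer by layer and discards a partial path as soon as the running product of the per-transition weights $f_p(\theta_b, \sigma^2)\, \Delta\theta_b$ falls below the threshold. The weight geometry (unit voxels, centers one unit apart) is an assumption, and the factors common to all paths are omitted from the product.

```python
import math

def transition_weight(k, var):
    """f_p(theta) * dtheta for a horizontal offset of k voxels (assumed geometry)."""
    theta = math.atan(k)
    dtheta = math.atan(k + 0.5) - math.atan(k - 0.5)
    return math.exp(-theta * theta / (2.0 * var)) / math.sqrt(2.0 * math.pi * var) * dtheta

def enumerate_paths(i, j, n_cols, n_layers, var, th):
    """Return (columns, contribution) for paths from top voxel i to bottom voxel j.

    A partial path is pruned once its running contribution drops below th.
    """
    paths = []

    def extend(cols, contrib):
        layer = len(cols)
        if layer == n_layers:
            paths.append((tuple(cols), contrib))
            return
        # The last layer must end at the detector voxel j.
        candidates = [j] if layer == n_layers - 1 else range(n_cols)
        for c in candidates:
            w = contrib * transition_weight(c - cols[-1], var)
            if w >= th:
                extend(cols + [c], w)

    extend([i], 1.0)
    return paths

every = enumerate_paths(2, 2, n_cols=5, n_layers=4, var=0.4, th=0.0)
kept = enumerate_paths(2, 2, n_cols=5, n_layers=4, var=0.4, th=0.001)
print(len(every), len(kept))
```

Because the running product can only decrease as a path is extended, pruning a partial path removes all of its continuations at once, which is what keeps the search far below $N^{M-2}$ evaluations. The path counts of this toy setting are not expected to match those reported for the $20 \times 20$ grid.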

Method: Inverse Problem
Next, we propose a method for the inverse problem of the forward model [Eq. (28)] to estimate the extinction coefficients of the 2-D layered model. As we mentioned before, we fix the light paths and assume that the scattering coefficients and the parameters of the Gaussian phase function are uniform and known in advance.

Cost Function
In the $M \times N$ 2-D layered medium described in Sec. 3.6, we assumed a configuration of a light source and detector similar to the left-most one shown in Fig. 6; the light source is located above the medium and the detector is below, and the observed light is $I_{ij}$, where $i, j$ are the voxel indices of the light source and detector locations. By sliding the light source and the detector, we can obtain $N^2$ observations, resulting in the following least-squares problem:

$$\min_{\sigma_t} f_0(\sigma_t), \quad f_0(\sigma_t) = \sum_{i \in B_1} \sum_{j \in B_M} \left| I_{ij} - I_0 \sum_{k=1}^{N_{ij}} H_{ijk}\, e^{-\sigma_t^T D_{ijk}} \right|^2, \tag{29}$$

$$\text{subject to } 0 \preceq \sigma_t \preceq u, \tag{30}$$

where $\preceq$ denotes the generalized inequality, i.e., all elements in the vector must satisfy the inequality. The lower bound 0 comes from the fact that any medium must have positive extinction coefficients, while the upper bound $u$ is used for numerical stability to exclude unrealistic estimates. Furthermore, as shown in Fig. 6, we have four configurations of light sources and detectors by changing their positions. This gives us four different sets of observations $I_{ij}$ and paths $ijk$. These four different sets lead to four objective functions ($f_{T2B}$, $f_{L2R}$, $f_{B2T}$, $f_{R2L}$), as shown in Fig. 6. Since the four objective functions share the same variables $\sigma_t$, we can use all of them at the same time by adding them to form a new single function $f_0$, at the expense of additional (factor of four) computation cost:

$$\min_{\sigma_t} f_0 = f_{T2B} + f_{L2R} + f_{B2T} + f_{R2L} \quad \text{subject to } 0 \preceq \sigma_t \preceq u. \tag{31}$$
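To make the least-squares cost concrete, the sketch below evaluates it for a toy problem. It assumes that each observation is predicted as $I_0 \sum_k H_{ijk}\, e^{-\sigma_t^T D_{ijk}}$; the path factors, the sparse path vectors, and the 2 x 2 grid are hypothetical values, and the function names are ours.

```python
import math

def predict(sigma_t, paths, I0=1.0):
    """Forward prediction I0 * sum_k H_k * exp(-sigma_t^T D_k) for one (i, j) pair."""
    return I0 * sum(H * math.exp(-sum(sigma_t[b] * d for b, d in D.items()))
                    for H, D in paths)

def cost(sigma_t, observations, paths_by_pair, I0=1.0):
    """Sum of squared residuals over all source/detector pairs."""
    return sum((observations[pair] - predict(sigma_t, paths, I0)) ** 2
               for pair, paths in paths_by_pair.items())

# Hypothetical paths on a 2 x 2 grid: each entry is (H, sparse path vector D).
paths_by_pair = {
    (0, 0): [(0.6, {0: 1.0, 2: 1.0})],
    (0, 1): [(0.2, {0: 1.0, 3: 1.2})],
    (1, 1): [(0.6, {1: 1.0, 3: 1.0}), (0.1, {1: 1.0, 2: 0.4, 3: 0.8})],
}

truth = [1.05, 1.20, 1.05, 1.05]
observations = {pair: predict(truth, paths) for pair, paths in paths_by_pair.items()}

guess = [1.0, 1.0, 1.0, 1.0]
print(cost(truth, observations, paths_by_pair), cost(guess, observations, paths_by_pair))
```

The cost vanishes at the ground truth and is positive for other coefficient vectors, and summing such costs over the four source and detector configurations gives the combined objective.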

Optimization Problem with Inequality Constraints
Since the inverse problem [Eq. (31)] is nonlinear, we employ an interior point method [26], an iterative optimization algorithm for problems with constraints. Here, we first review several key points in optimization; then we develop an algorithm to solve Eq. (31) along with the required first- and second-order derivatives of the cost function.

Unconstrained problem: Quasi-Newton
First, we review optimization without constraints, which is used inside the interior point method. The general form of unconstrained optimization is

$$\min_{\sigma_t} f(\sigma_t), \tag{32}$$

where $\sigma_t \in \mathbb{R}^{N \times M}$ is a real vector and $f: \mathbb{R}^{N \times M} \to \mathbb{R}$ is an objective function that is twice continuously differentiable.
To solve it, an iterative procedure begins with an initial guess $\sigma_t^0$ and generates a sequence $\{\sigma_t^k\}_{k=0}^{\infty}$. It stops when the change of solutions is small enough. The information about function $f$ at $\sigma_t^k$, or even at previous estimates $\sigma_t^0, \ldots, \sigma_t^{k-1}$, is used to calculate a direction $p_k$ to move along with a step size $\alpha_k$. A line search is often used to determine the step size by searching along the direction starting from $\sigma_t^k$ to find the $\sigma_t^{k+1}$ with the least value of the objective function:

$$\min_{\alpha_k} f(\sigma_t^k + \alpha_k p_k). \tag{33}$$

Once we find the step size, the estimate is updated as $\sigma_t^{k+1} = \sigma_t^k + \alpha_k p_k$.

Newton's method is well known for its second-order convergence and accuracy. However, when the dimension of the problem is large, calculating the Hessian and its inverse is computationally expensive. Therefore, Quasi-Newton methods are often used, where the inverse Hessian is updated by incremental approximations in order to reduce the computation cost. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) update rules are well known [63]:

$$s = \sigma_t^{k+1} - \sigma_t^k, \tag{34}$$

$$y = \nabla f(\sigma_t^{k+1}) - \nabla f(\sigma_t^k), \tag{35}$$

$$B_{k+1} = \left(I - \frac{s y^T}{y^T s}\right) B_k \left(I - \frac{y s^T}{y^T s}\right) + \frac{s s^T}{y^T s}. \tag{36}$$

When the conditions $y^T s > 0$ and $B_0 \succ 0$ (where $\succ 0$ means positive definite) are satisfied, the BFGS update guarantees the positive definiteness of $B_k$. Algorithm 2 shows the Quasi-Newton method.
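A compact sketch of a Quasi-Newton iteration with the BFGS inverse-Hessian update is given below. It is not a transcription of Algorithm 2: it uses the direction $p = -B \nabla f$, replaces the line search over a fixed range with a backtracking (Armijo) search, and resets $B$ to the identity when the curvature condition fails. The quadratic test function is hypothetical.

```python
import math

def bfgs_minimize(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimize f with BFGS; B approximates the inverse Hessian."""
    n = len(x0)
    eye = [[float(r == c) for c in range(n)] for r in range(n)]
    x, g, B = list(x0), grad(x0), [row[:] for row in eye]
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        # Quasi-Newton direction p = -B g.
        p = [-sum(B[r][c] * g[c] for c in range(n)) for r in range(n)]
        # Backtracking (Armijo) line search; an assumption of this sketch.
        fx, slope, alpha = f(x), sum(gi * pi for gi, pi in zip(g, p)), 1.0
        while alpha > 1e-16 and \
                f([xi + alpha * pi for xi, pi in zip(x, p)]) > fx + 1e-4 * alpha * slope:
            alpha *= 0.5
        x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        ys = sum(a * b for a, b in zip(y, s))
        ss = sum(a * a for a in s)
        yy = sum(a * a for a in y)
        if ys > 1e-10 * math.sqrt(ss * yy):
            # B <- (I - rho s y^T) B (I - rho y s^T) + rho s s^T, expanded.
            rho = 1.0 / ys
            By = [sum(B[r][c] * y[c] for c in range(n)) for r in range(n)]
            yBy = sum(y[r] * By[r] for r in range(n))
            B = [[B[r][c] - rho * (s[r] * By[c] + By[r] * s[c])
                  + (rho * rho * yBy + rho) * s[r] * s[c]
                  for c in range(n)] for r in range(n)]
        else:
            B = [row[:] for row in eye]  # curvature condition failed: reset
        x, g = x_new, g_new
    return x

# Hypothetical test problem with minimizer (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]
x_min = bfgs_minimize(f, grad, [0.0, 0.0])
```

The expanded update avoids forming the two outer matrix products explicitly and relies on $B$ being symmetric, which the BFGS update preserves.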

Constrained problem: interior point
Here we introduce a constrained optimization problem with inequality constraints of the form

$$\min_{\sigma_t} f_0(\sigma_t) \quad \text{subject to } f_i(\sigma_t) \leq 0, \quad i = 1, \ldots, m, \tag{37}$$

where $\sigma_t \in \mathbb{R}^{N \times M}$ is a real vector and $f_0, \ldots, f_m: \mathbb{R}^{N \times M} \to \mathbb{R}$ are twice continuously differentiable. The idea is to approximate it as an unconstrained problem. Using Lagrange multipliers, we can first rewrite Eq. (37) as

$$\min_{\sigma_t} f_0(\sigma_t) + \sum_{i=1}^{m} I(f_i(\sigma_t)), \tag{38}$$

where $I: \mathbb{R} \to \mathbb{R}$ is an indicator function, which keeps the solution inside the feasible region:

$$I(f) = \begin{cases} 0, & f \leq 0 \\ \infty, & f > 0. \end{cases} \tag{39}$$

Equation (38) now has no inequality constraints, but it is not differentiable because of $I$.
The barrier method [26] is an interior point method that introduces a logarithmic barrier function to approximate the indicator function $I$ as follows:

$$\hat{I}(f) = -(1/t) \log(-f), \tag{40}$$

where $t > 0$ is a parameter that adjusts the accuracy of the approximation. The log barrier function goes to infinity rapidly as $f$ approaches 0, while it is close to 0 when $f$ is far away from 0. Since $\hat{I}(f)$ is differentiable, we have

$$\min_{\sigma_t} f_0(\sigma_t) + \sum_{i=1}^{m} -(1/t) \log(-f_i(\sigma_t)), \tag{41}$$

or equivalently,

$$\min_{\sigma_t} t f_0(\sigma_t) - \sum_{i=1}^{m} \log(-f_i(\sigma_t)). \tag{42}$$

Algorithm 2 The Quasi-Newton method with BFGS update rule.

The barrier method solves Eq. (42) iteratively by increasing the parameter $t$. In the limit $t \to \infty$, the above problem coincides with the original problem [Eq. (38)].
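The behavior of the barrier method can be seen on a one-dimensional toy problem: minimize $(x - 3)^2$ subject to $0 \leq x \leq 2$, whose constrained minimizer lies on the boundary at $x = 2$. The sketch below is illustrative only. It uses a damped Newton inner solver instead of Quasi-Newton, starts from an interior point, and stops the outer loop when $m/t < \varepsilon$ with $m = 2$ constraints; the values of $t_{\text{init}}$ and $\mu$ follow our simulation settings, and the remaining choices are assumptions.

```python
import math

def barrier_minimize(x=1.0, u=2.0, target=3.0, t=1.0, mu=1.5, eps=1e-6):
    """Minimize (x - target)^2 subject to 0 <= x <= u with a log barrier."""

    def F(z, t):
        # t * f0(z) plus the log barriers of the two bound constraints.
        return t * (z - target) ** 2 - math.log(z) - math.log(u - z)

    while 2.0 / t >= eps:          # outer loop: m / t with m = 2 constraints
        for _ in range(50):        # inner loop: damped Newton on F(., t)
            g = 2.0 * t * (x - target) - 1.0 / x + 1.0 / (u - x)
            h = 2.0 * t + 1.0 / x ** 2 + 1.0 / (u - x) ** 2
            step = -g / h
            if abs(g * step) < 1e-12:   # squared Newton decrement is small
                break
            alpha = 1.0
            # Halve the step until it is strictly feasible and decreases F.
            while not (0.0 < x + alpha * step < u) or \
                    F(x + alpha * step, t) > F(x, t) + 1e-4 * alpha * g * step:
                alpha *= 0.5
            x += alpha * step
        t *= mu                    # tighten the barrier approximation
    return x

x_star = barrier_minimize()
print(x_star)
```

The iterates approach the boundary from the inside and never touch it, because the logarithmic terms grow without bound there. Each outer iteration is warm-started from the previous solution.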

Algorithm for Solving the Inverse Problem
Algorithm 3 shows our algorithm, which uses a barrier method with Quasi-Newton for solving the inverse problem. We should mention the following parts where we have modified the original algorithm [26].
Warm start: For each inner loop, the Quasi-Newton method needs an initial guess of the inverse Hessian $B_0$. Instead of fixing $B_0$ for every inner loop, we reuse the $B_k$ of the last inner loop to accelerate the convergence (shown in lines 4 and 19 in Algorithm 3).
Checking feasibility: Since the Quasi-Newton method and the line search operate without constraints, the next estimate $\sigma_t^{k+1}$ may violate the constraints; in our case, each element $\sigma_t^{k+1}[b]$ of $\sigma_t^{k+1}$ must be inside $[0, u]$ after the step size has been determined. Therefore, in line 8, we check the feasibility of the estimate $\sigma_t^{k+1}$ for the current step size $\alpha_k$. If it exceeds the boundary of the feasible region, we pull the estimate back into the feasible region by halving the step size. If it is still outside the feasible region, then the step size is halved again. Why do we not just set the step size so that $\sigma_t^{k+1}$ is exactly on the boundary? The reason is the log barrier: if $\sigma_t^{k+1}$ is on the boundary, in other words, if $\sigma_t^{k+1}[b]$ is either 0 or $u$, then $\log(\sigma_t[b])$ or $\log(u - \sigma_t[b])$ becomes infinite, which results in numerical instability. Therefore, the procedure described above is needed.
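The feasibility check can be sketched for a vector estimate as follows. The function name, the cap on the number of halvings, and the example values are hypothetical; the strict inequalities reflect the requirement that the estimate stay off the boundary so that the logarithms remain finite.

```python
def feasible_step(x, p, alpha, upper, max_halvings=60):
    """Halve alpha until every element of x + alpha * p lies strictly in (0, upper)."""
    for _ in range(max_halvings):
        x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
        if all(0.0 < v < upper for v in x_new):
            return alpha, x_new
        alpha *= 0.5
    raise ValueError("no strictly feasible step found")

# Hypothetical estimate close to the upper bound u = 2.0 in its second element.
alpha, x_new = feasible_step([1.0, 1.9], [0.5, 0.5], alpha=1.0, upper=2.0)
print(alpha, x_new)
```

In this example the step size is halved three times before the second element falls back inside the bound.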
Checking for positive definiteness: The BFGS update rules guarantee $B_k$ to be positive definite if $y^T s > 0$ and $B \succ 0$ are satisfied. While the latter is satisfied by giving an appropriate initial guess, the former depends on the updates at each iteration. If it is not satisfied, then the BFGS updates are no longer valid, and we reset the inverse Hessian $B_k$ to a scaled identity [63] at line 16.

Jacobian
Here, we represent the Jacobian of the objective function f 0 in Eq. ( 29).Note that the objective function f 0 in Eq. ( 31) can be derived in the same manner.
We first rewrite the objective function $f_0$ [Eqs. (43) to (45); equations not recoverable from the source] and then give the gradient of $f_0$. To simplify the equation, we use the following notation: . . .
Algorithm 3 Barrier method of interior point with Quasi-Newton solver.
Now, $f_0$ and the gradient can be represented compactly [Eqs. (49) and (50); equations not recoverable from the source], where sum[·] stands for the sum over the elements of the container [Eq. (49)] of vectors, × is the element-wise product, and ⊗ denotes the tensor product [Eq. (52); definition not recoverable from the source].

Numerical Simulations
In this section, we report the results obtained by numerical simulations using the proposed model. The following parameters have been used in Algorithm 3: $t_{\mathrm{init}} = 1.0$, $\mu = 1.5$, $\varepsilon = 10^{-2}$. For the line search, the range for the step size was $\alpha_k \in [0, 100]$. For the initial guess, we used $B = I$ and $\sigma_t^0 = 0$. For the 2-D layered medium, the grid size was set to $N = M = 20$ with square voxels of size 1 mm, i.e., the medium is 20 mm × 20 mm, and $dA = 1$ mm. The values of the extinction coefficients are set between 1.05 and 1.55 mm$^{-1}$, and the upper bound in Eq. (30) is set to $u = 2.0$ mm$^{-1}$. The parameter of the Gaussian phase function is $\sigma^2 = 0.2$ or 0.4, and the scattering coefficient is set to $\sigma_s = 1$ mm$^{-1}$. The threshold for excluding low-contribution paths is $th = 0.001$.
The ground truth and the estimated extinction coefficients are shown in Fig. 7. The matrix plots in the top row of the figure represent five different media [from (a) to (e)] used for the simulation. Each voxel $b$ is shaded in gray according to the value of the extinction coefficient $\sigma_t[b]$, and darker gray represents larger values of $\sigma_t[b]$. The values of $\sigma_t[b]$ are also displayed at each voxel. In the same manner, the middle and bottom rows show the estimated results for the two values of the Gaussian phase function parameter, $\sigma^2 = 0.2$ and 0.4. Figure 8 shows the observations $I_{ij}$ in matrix form, from which the extinction coefficients are estimated. Each element in these plots is an observation $I_{ij}$. Observations with higher values (shown in darker shades of gray in the plots) lie on the diagonal. The observations obtained for $\sigma^2 = 0.4$ appear fainter than those obtained for $\sigma^2 = 0.2$ due to the larger amount of scattering.
The left-most column, Fig. 7(a), shows the simplest case: the medium has almost homogeneous extinction coefficients of value 1.05 (voxels shaded in light gray) except for a few voxels with much higher coefficients of 1.2 (voxels shaded in dark gray), which means that those voxels absorb much more light than the other voxels. The coefficients are estimated reasonably well, as shown in the middle and bottom rows, and the root mean squared error (RMSE) shown in Table 1 is small, with a relative error of 0.0075/1.05 = 0.7% with respect to the background coefficient value. The other media, shown in columns (b) to (e), have more complex distributions of the extinction coefficients. We summarize the quality of the estimated results in terms of RMSE in Table 1. Numbers in the brackets are relative errors of RMSE to the background extinction coefficient value (i.e., 1.05). Computation time is also shown in Table 1. Note that the proposed method is currently implemented in MATLAB and could be accelerated further by using C++.
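The error measures used in Table 1 can be written out as a short sketch. The function names `rmse` and `relative_error` are hypothetical, and the voxel values are assumed to be given as flat sequences; only the RMSE value 0.0075 and the background value 1.05 are taken from the text.

```python
import math


def rmse(estimate, truth):
    """Root mean squared error between two equally long sequences of voxel values."""
    if len(estimate) != len(truth):
        raise ValueError("estimate and truth must have the same length")
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / len(truth))


def relative_error(rmse_value, background=1.05):
    """RMSE relative to the background extinction coefficient."""
    return rmse_value / background


if __name__ == "__main__":
    # Reproduces the relative error quoted for medium (a).
    print(f"{100 * relative_error(0.0075):.1f}%")  # prints 0.7%
```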
The values of the cost function $f_0$ over iterations of the outer loop in Algorithm 3 are shown in Fig. 9 for each medium. These curves show that the proposed method effectively minimizes the original objective function [Eq. (31)] for the five different types of media shown here and probably for other media. Figure 10 demonstrates how the log-barriered cost function $f$ in Algorithm 3 evolves over all iterations of the inner loop; the number of iterations on the horizontal axis accumulates all inner iterations of the Quasi-Newton method. We can see that each inner loop successively minimizes the log-barriered function, and the warm start (reusing the inverse Hessian from the previous outer loop) may help reduce the gap in values between inner loops.

Comparison Results
We compare our method to a standard DOT with FEM (Refs. 64 and 65) using different optimization methods implemented in the Electrical Impedance Tomography and Diffuse Optical Tomography Reconstruction Software (EIDORS; Refs. 64 and 65). The ground truth used in this comparison is shown in the top row of Figs. 11(a) to 11(e). For solving DOT by EIDORS, we used 24 × 24 × 2 = 1152 triangular mesh elements (i.e., each voxel is divided into two triangles), and for the boundary condition, we placed 16 light sources and 16 detectors at equal intervals around the medium. We chose two solvers: the Gauss-Newton (GN) method and the primal-dual (PD) interior point method. We used $\sigma_t^0 = 0$ as the initial guess for both our method and EIDORS.
The results obtained by our method ($\sigma^2 = 0.4$) and by DOT with GN and PD are shown in Fig. 11. The results obtained by the proposed method are shown in the second row and are similar to those in the third row of Fig. 7. The third row in Fig. 11 shows the results for DOT with GN. This kind of blurred result is typical for DOT estimation due to its diffusion approximation. The last row shows the results for DOT with PD, which look better than those obtained for DOT with GN but still tend to overestimate the areas with high coefficient values.
We summarize RMSE values and computation time for each method in Table 2 in the same format as Table 1. The RMSE values of our method are two to five times smaller than those of DOT, which demonstrates that the proposed method can achieve much more accurate results.
The current disadvantage of our method is its high computational cost: it takes up to 1000 times longer than DOT. We plan to reduce the computational cost by optimizing the code using C++ and adopting other solvers.

Conclusion with Discussion
In this paper, we have proposed a path-integral-based approach to optical tomography for multiple scattering in discretized participating media. Assuming the scattering coefficients and phase function are known and uniform, the extinction coefficients at each voxel in a 2-D layered medium are estimated by using an interior point method. Numerical simulation examples are shown to demonstrate that the proposed framework works better than DOT in the simplified experimental setup, while its computational cost needs to be reduced.
There are many directions for further research, including relaxing the assumption of the 2-D layered scattering model to more realistic scattering with other phase functions, using paths generated by Monte Carlo based statistical methods, extending the formulation to a full 3-D scattering model, and solving the issues mentioned below.
Limitations in stability and uniqueness: The current formulation presented in this paper estimates only the extinction coefficients; the scattering coefficients and phase function parameters are assumed to be known and uniform. This is one of the limitations of the proposed method; however, it is a common limitation of optical tomography. It is known that the scattering and absorption coefficients cannot be separated from stationary measurements of light intensity (Ref. 34), and the solutions are not unique. Also, given stationary measurements without angle information, the problem becomes ill-posed (Refs. 6 and 7) and hence not stable. To overcome this limitation, we need to extend the current formulation to handle other measurements that enable stability and uniqueness, such as time-dependent, frequency-dependent, or angle-dependent measurements.
Computational cost: A large part of the computational cost of the proposed method comes from the forward model prediction [Eq. (28)], which appears in the gradient computation [Eq. (7)]. It depends on the number of paths $N_{ij}$; we currently use about 700 paths out of all $20^{18}$ possible paths, and for each path, we need to compute path vectors $D_{ijk}$, $D_{ijk} + D_{ijl}$, and factors $H_{ijk}$. A possible acceleration is the precomputation of these variables, but this would lead to a trade-off with storage cost. Each $D_{ijk}$ has dimensions of 20 × 20 = 400, each pair $ij$ has about 700 vectors $D_{ijk}$, and the number of pairs $ij$ (hence observations) is 20 × 20 = 400. In total, ∼450 MB of memory would be required even if single-precision floating-point numbers were used for storing all $D_{ijk}$. Fortunately, these vectors are necessarily sparse, and we have used sparse matrices to store them. However, the increase will be linear in the number of paths $N_{ij}$ and quadratic with the grid size $\max(N, M)$. Therefore, we plan to consider more efficient implementations.
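The dense storage estimate above follows from a simple product, sketched here. The function name `dense_storage_bytes` and its parameter names are hypothetical; the default values are the ones quoted in the text (a 20 × 20 grid, about 700 paths per pair, 20 × 20 pairs, and 4 bytes per single-precision value).

```python
def dense_storage_bytes(grid_n=20, grid_m=20, paths_per_pair=700,
                        num_sources=20, num_detectors=20, bytes_per_value=4):
    """Bytes needed to store every path vector D_ijk densely."""
    vector_length = grid_n * grid_m          # entries per D_ijk (one per voxel)
    num_pairs = num_sources * num_detectors  # source-detector pairs (i, j)
    return vector_length * paths_per_pair * num_pairs * bytes_per_value


if __name__ == "__main__":
    print(dense_storage_bytes() / 1e6)  # 448.0 (MB), i.e., roughly 450 MB
```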

Fig. 1 Illustration of a discretization example. (a) Voxelization of the medium into a regular grid of size 5 × 5. Voxels are indexed in raster scan order in this example, from left to right, and top to bottom. Each voxel $b$ has extinction coefficient $\sigma_t[b]$. (b) A path segment between vertices $x_1$ and $x_2$. Voxels involved in the segment are shaded. (c) Lengths $d_{12}[b]$ of the involved voxels $b = 2, 3, 8, 9$. Here we denote $d_{12}[b]$ instead of $d_{x_1,x_2}[b]$ for simplicity.

Fig. 2 Proposed two-dimensional layered model of scattering. This example shows path $x$ consisting of vertices $x_1, \ldots, x_M$ located at the centers of voxels in a grid with $M$ parallel layers. $x_0$ is a light source located on the top surface, and $x_{M+1}$ is a detector at the bottom. At each vertex, the light scatters to voxels in the next layer, and possible scattering directions are indicated by arrows.

Fig. 4 An illustration of angle measure $\Delta\theta_b$ for voxel $b$ in the next layer. For the center voxel of the upper layer, voxel $b$ (shaded) in the next layer subtends an angle of $\Delta\theta_b$, which is used for the angle measure in Eq. (24).

Fig. 5 (a) The phase functions with parameter $\sigma^2 = 0.2$ (dashed line) and $\sigma^2 = 0.4$ (solid line). Plot of the value $f_p(\theta_b; \sigma^2)\,\Delta\theta_b$ for each voxel $b$ for (b) $\sigma^2 = 0.2$ and (c) $\sigma^2 = 0.4$. Note that index $b$ is relative to the voxel in the next layer just below the voxel in consideration. The voxel just below is $b = 0$, the voxel on its right side is $b = 1$, and that on the left side is $b = -1$.

Fig. 6 Four configurations of light sources and detectors. From left to right, we call configurations T2B (top-to-bottom), L2R (left-to-right), B2T (bottom-to-top), and R2L (right-to-left), which represent locations of light sources and detectors.

Fig. 7 Numerical simulation results for a grid of size 20 × 20 mm. Darker shades of gray represent larger values (more light is absorbed at the voxel). The bars on the side show extinction coefficient values [mm$^{-1}$] in gray scale. The first row shows ground truth for five different types of media [(a)-(e)] used for the simulation. The second and third rows show estimated results for $\sigma^2 = 0.2$ and $\sigma^2 = 0.4$, respectively, of the Gaussian phase function.

Fig. 9 Original cost function values $f_0$ over iterations of the outer loop of Algorithm 3 with (a) $\sigma^2 = 0.2$ and (b) $\sigma^2 = 0.4$. The horizontal axis shows the number of outer iterations, and the vertical axis represents the log of the original cost function values. Different plots indicate five different types of media [(a)-(e)] used for the simulation.

Fig. 11 Numerical simulation results for a grid of size 24 × 24 mm, comparing our method to diffuse optical tomography (DOT) with two solvers. Darker shades of gray represent larger values (more light is absorbed at the voxel). The bars on the side show extinction coefficient values [mm$^{-1}$] in gray scale. The first row shows the ground truth for five different types of media [(a)-(e)] used for the simulation. The second row shows the estimated results of the proposed method. The third and fourth rows show estimated results for DOT by using Gauss-Newton (GN) and primal-dual (PD) interior point solvers.

Table 1 Root mean squared errors (RMSEs) and computation time for the numerical simulations for five different types of media [(a) to (e)] with a grid size of 20 × 20, for two different Gaussian phase function parameter values. Numbers in the brackets are relative errors of RMSE to the background extinction coefficient value (i.e., 1.05).

Table 2 RMSEs and computation time for the numerical simulations for five different types of media [(a) to (e)] with a grid size of 24 × 24, for the proposed method and diffuse optical tomography (DOT) with two solvers. Numbers in the brackets are relative errors of RMSE to the background extinction coefficient value (i.e., 1.05).