## 1.

## Introduction

Photoacoustic (PA) imaging is a noninvasive imaging modality that provides optical contrast at ultrasonic resolution.^{1}^{–}^{6} Applications of PA imaging range from blood vessel and tumor imaging to biomarkers and gene activity.^{1} It consists of irradiating the tissue with a nanosecond laser pulse, which gets absorbed by the tissue and generates ultrasonic waves through thermoelastic expansion. Ultrasonic detectors placed at the boundary of the domain detect these waves.^{2} The collected data are then used in an inversion framework to recover the distribution of the initial pressure rise.

Several image reconstruction algorithms exist for the reconstruction of PA tomographic images, such as filtered back projection, time reversal, and Fourier transform methods.^{7}^{,}^{8} They require a sufficiently large amount of data for accurate reconstruction and may not be useful for quantitative comparisons.^{9}^{–}^{11} Acquisition of large amounts of data is often linked with long data-acquisition times and/or increased instrumentation cost. Moreover, the setups used for recording the acoustic signals cover an aperture that may not enclose the complete object, resulting in incomplete or limited data scenarios.^{12}^{–}^{14} Obtaining quantitatively accurate reconstructions using limited data has therefore been pursued in the recent past.^{6}^{–}^{8}^{,}^{15} For limited data cases, many iterative image reconstruction algorithms have been proposed that improve the quantitative accuracy of the reconstructed images.^{9}^{–}^{11}^{,}^{15} The direct (noniterative) methods are often not robust to noise, and in noisy data cases, iterative algorithms are typically preferred as they exhibit better stability.^{7}^{,}^{8} Moreover, iterative algorithms often provide inherent regularization in terms of the number of iterations, especially for ill-posed problems like the one at hand. In terms of computation, iterative algorithms typically require matrix–vector computations [$O({n}^{2})$ operations per iteration, with $n$ being the number of unknowns], whereas direct reconstruction algorithms that utilize regularization require matrix–matrix computations [$O({n}^{3})$ operations].
Above all, the memory requirement for the iterative algorithms is at least one order of magnitude lower, thus enabling the solution of large problems.^{7}^{,}^{8} Even though these iterative reconstruction methods are preferred for PA image reconstruction, they often require prohibitively large computational time to converge, especially when the initial guess is far from the expected solution, making them less appealing in real time.^{7}^{,}^{8}

In this work, we address this problem by deploying a vector polynomial extrapolation method, which can accelerate any iterative scheme, thus making iterative reconstruction methods more appealing. The vector polynomial extrapolation approach relies on finding the limit (or antilimit) of the initial solutions of an iterative method. Here, we have used these methods to accelerate two widely used iterative reconstruction methods, namely regularized steepest descent (RSD)^{16} and total variation (TV) regularization.^{17}^{,}^{18} Both numerical and experimental studies involving regular and irregular objects were performed to show the efficacy of the proposed method in comparison to the original iterative methods.

## 2.

## Photoacoustic Image Reconstruction

The PA wave equation can be written as follows:^{6}

## Eq. (1)

$${\nabla}^{2}P(x,t)-\frac{1}{{c}^{2}}\frac{{\partial}^{2}P(x,t)}{\partial {t}^{2}}=\frac{-\beta}{{C}_{p}}\frac{\partial H(x,t)}{\partial t},$$

where $P(x,t)$ is the pressure, $c$ is the speed of sound, $\beta $ is the thermal expansion coefficient, ${C}_{p}$ is the specific heat capacity, and $H(x,t)$ is the heating function. The PA wave propagation model can be written in terms of a linear system of equations, known to provide a simple yet effective way of solving the PA wave equation (given in Refs. 19 and 20). In this approach, the forward model for PA imaging can be written as follows:^{7}^{,}^{19}^{,}^{20}

## Eq. (2)

$$\mathbf{A}x=b,$$

where $\mathbf{A}$ denotes the system matrix, $x$ the initial pressure distribution, and $b$ the measured PA data. The simple linear back-projected (LBP) image ${x}_{\mathrm{bp}}$ can be obtained via^{20}^{,}^{21}

## Eq. (3)

$${x}_{\mathrm{bp}}={\mathbf{A}}^{\mathbf{T}}b,$$

where $\mathbf{T}$ represents the transpose of the matrix. Earlier works have also introduced filtered back projection as well as delay-and-sum,^{22} which are also noniterative (analytic) types of reconstruction methods. Since all of them are noniterative, they are computationally efficient, but they provide only qualitative results and require complete data.^{7} This ${x}_{\mathrm{bp}}$ is typically used as the initial guess (${x}_{0}$) for all iterative methods discussed in this work, which is a standard practice.^{22}

Typically, for limited data cases, a least-squares framework can be utilized along with Tikhonov regularization, resulting in the cost function:^{8}^{,}^{22}

## Eq. (4)

$$\Omega ={\Vert \mathbf{A}x-b\Vert}_{2}^{2}+\alpha {\Vert x\Vert}_{2}^{2},$$

where $\alpha $ denotes the regularization parameter. Minimizing this cost function with respect to $x$ gives the direct solution:

## Eq. (5)

$$x={({\mathbf{A}}^{\mathbf{T}}\mathbf{A}+\alpha \mathbf{I})}^{-1}{\mathbf{A}}^{\mathbf{T}}b.$$

The direct solution is prone to numerical instability, as in most cases $\mathbf{A}$ tends to be sparse and/or contains small values. In these scenarios, it is very common to use an iterative technique to minimize the cost function given in Eq. (4).
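As a concrete illustration, the Tikhonov solution of Eq. (5) can be computed in a few lines. The sizes and values below are toy stand-ins (the actual system matrices in this work are far larger), and the SVD form is shown only as a numerically safer equivalent:

```python
import numpy as np

# Toy stand-ins for the PA system matrix and measured data (illustrative
# sizes only; the real system matrices in this work are much larger)
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
b = rng.standard_normal(50)
alpha = 1e-2                        # regularization parameter

x_bp = A.T @ b                      # LBP image, used as the initial guess x_0

# Direct Tikhonov solution of Eq. (5)
x = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ b)

# Equivalent SVD (filter-factor) form; numerically safer for
# ill-conditioned A, since it avoids forming A^T A explicitly
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((s / (s**2 + alpha)) * (U.T @ b))
print(np.allclose(x, x_svd))
```

The explicit inverse is rarely formed in practice; for the problem sizes in this work, the iterative schemes of Secs. 2.1 and 2.2 are used instead.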

## 2.1.

### Regularized Steepest Descent Method

The simplest of these is the RSD, given in Algorithm 1. First, in step 2, the residual (${r}_{n}$) is computed; then the direction of ascent (${L}_{n}^{\alpha}$) is computed in step 3; and finally, the step size (${k}_{n}^{\alpha}$) is computed in step 4. The solution vector ${x}_{n}$ is updated in step 5. For all methods discussed in this work, the iterations are stopped when there is less than 1% change in the relative residual [for the 40-dB signal-to-noise ratio (SNR) case]. The convergence criterion was varied similarly for the other noise levels.

## Algorithm 1

Algorithm for RSD method.

Input: $\mathbf{A},b,{x}_{0},\alpha $

Output: Solution vector: $x$

1 Repeat 2 to 5 till convergence is satisfied

2 ${r}_{n}=\mathbf{A}{x}_{n}-b$

3 ${L}_{n}^{\alpha}={\mathbf{A}}^{\mathbf{T}}{r}_{n}+\alpha {x}_{n}$

4 ${k}_{n}^{\alpha}=\frac{{\Vert {L}_{n}^{\alpha}\Vert}^{2}}{{\Vert \mathbf{A}{L}_{n}^{\alpha}\Vert}^{2}+\alpha {\Vert {L}_{n}^{\alpha}\Vert}^{2}}$

5 ${x}_{n+1}={x}_{n}-{k}_{n}^{\alpha}{L}_{n}^{\alpha}$

The choice of the regularization parameter ($\alpha $) is critical as it affects the reconstructed image quality (trade-off between the residual and the minimum-norm solution). Usually, $\alpha $ is decreased at every iteration (${\alpha}_{0}>{\alpha}_{1}>{\alpha}_{2}>\dots >{\alpha}_{n}$) for the RSD method, which is also known as an adaptive regularization strategy.^{16}
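Algorithm 1, together with the adaptive regularization strategy, can be sketched as a minimal NumPy routine. The starting value of $\alpha $, the decay factor, and the dense toy matrix below are illustrative assumptions, not values specified in the text:

```python
import numpy as np

def rsd(A, b, x0, alpha0=1e-2, decay=0.9, tol=1e-2, max_iter=500):
    # Regularized steepest descent (Algorithm 1) with the adaptive strategy:
    # alpha is decreased every iteration (alpha_0 > alpha_1 > ...); the decay
    # factor and starting alpha here are illustrative assumptions
    x, alpha, res_prev = x0, alpha0, np.inf
    for _ in range(max_iter):
        r = A @ x - b                                         # step 2: residual
        L = A.T @ r + alpha * x                               # step 3: ascent direction
        k = (L @ L) / ((A @ L) @ (A @ L) + alpha * (L @ L))   # step 4: step size
        x = x - k * L                                         # step 5: update
        res = np.linalg.norm(A @ x - b)
        if abs(res_prev - res) < tol * res_prev:              # e.g., <1% residual change
            break
        res_prev, alpha = res, alpha * decay
    return x

# Toy usage with an LBP-style initial guess (illustrative sizes)
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
x = rsd(A, b, A.T @ b)
```

With the default `tol=1e-2`, the loop stops on the same 1% relative-residual-change criterion used in this work.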

## 2.2.

### Total Variation Regularization Method

TV regularization is another popular method used for solving ill-posed linear inverse problems. In this, the cost function is written as follows:^{17}^{,}^{18}

## Eq. (6)

$$\Omega ={\Vert \mathbf{A}x-b\Vert}_{2}^{2}+\tau {\Vert x\Vert}_{\mathrm{TV}},$$

where $\tau $ denotes the TV regularization parameter. A popular method, the split augmented Lagrangian shrinkage algorithm (SALSA),^{24} was utilized here for the TV regularization; it uses the alternating direction method of multipliers framework given in Ref. 25, and the detailed steps are given in Algorithm 2. Step 2 of the algorithm has a closed-form solution as given below:

## Eq. (7)

$${x}_{k+1}={({\mathbf{A}}^{\mathbf{T}}\mathbf{A}+\lambda I)}^{-1}[{\mathbf{A}}^{\mathbf{T}}b+\lambda ({v}_{k}+{d}_{k})].$$

## Algorithm 2

Algorithm of SALSA.

Input: $\mathbf{A},b,\lambda ,k=0,{v}_{0}$ and ${d}_{0}$

Output: Solution vector: $x$

1 Repeat 2 to 5 till convergence is satisfied

2 ${x}_{k+1}=\mathrm{arg}\text{\hspace{0.17em}}{\mathrm{min}}_{x}\text{\hspace{0.17em}}{\Vert \mathbf{A}x-b\Vert}_{2}^{2}+\lambda {\Vert x-{v}_{k}-{d}_{k}\Vert}_{2}^{2}$

3 ${v}_{k+1}=\mathrm{arg}\text{\hspace{0.17em}}{\mathrm{min}}_{v}\tau \varphi (v)+(\lambda /2){\Vert {x}_{k+1}-v-{d}_{k}\Vert}_{2}^{2}$

4 ${d}_{k+1}={d}_{k}-({x}_{k+1}-{v}_{k+1})$

5 $k=k+1$
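As a runnable sketch, the SALSA iterations can be written as below for the ℓ1-regularized case, where step 3 has an exact closed form [the soft threshold of Eq. (8)]; for the TV functional $\varphi $, step 3 would instead be solved by an inner iterative routine. The toy sizes and parameter values are illustrative assumptions:

```python
import numpy as np

def soft(z, t):
    # Soft-thresholding psi_t(z), the closed-form prox of the l1 norm
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def salsa(A, b, tau, lam, n_iter=200):
    # SALSA (Algorithm 2) for the l1-regularized case; with the TV
    # functional, step 3 needs an inner iterative solver instead of soft()
    n = A.shape[1]
    x = np.zeros(n); v = np.zeros(n); d = np.zeros(n)
    H = A.T @ A + lam * np.eye(n)          # system of the step-2 solve, Eq. (7)
    for _ in range(n_iter):
        x = np.linalg.solve(H, A.T @ b + lam * (v + d))   # step 2, Eq. (7)
        v = soft(x - d, tau / lam)                        # step 3, Eq. (8)
        d = d - (x - v)                                   # step 4
    return x

# Toy sparse-recovery demo (sizes and parameters are illustrative assumptions)
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
x_true = np.zeros(20)
x_true[[2, 7, 11]] = [1.0, -2.0, 1.5]
b = A @ x_true
x_rec = salsa(A, b, tau=1e-4, lam=1.0, n_iter=300)
```

In practice, the step-2 factorization of $({\mathbf{A}}^{\mathbf{T}}\mathbf{A}+\lambda I)$ is computed once and reused across iterations, which is what makes SALSA attractive computationally.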

Step 3 of the algorithm has a solution of the form:

## Eq. (8)

$${v}_{k+1}={\psi}_{\tau /\lambda}({x}_{k+1}-{d}_{k}),$$

where $\psi $ denotes the soft-thresholding function and $\varphi $ denotes the TV functional ${\Vert x\Vert}_{\mathrm{TV}}$. In the case of TV, step 3 does not have a closed-form solution as in Eq. (8) and needs an iterative solution; the convergence can then be guaranteed by Theorem 1 in Ref. 24.

## 2.3.

### Proposed Vector Extrapolation Methods

Iterative methods refine the current iterate and, in an asymptotic sense, the sequence of iterates produced by these algorithms converges to the exact solution. Typically, the required number of iterations is in the range of a few hundred, and ideally, one would like to converge to the solution within a few tens of iterations. This acceleration of convergence can be obtained by fast cyclic iterative algorithms,^{26} which use a vector extrapolation technique. Here, each cycle of the algorithm generates a finite number of consecutive iterates using the algorithms discussed above (Algorithm 1 or Algorithm 2). These are used to obtain an extrapolated estimate of the solution, which is then used as the next guess for the iterative algorithm; this process is repeated until convergence is achieved.^{27}^{,}^{28} Among the available methods, minimal polynomial extrapolation (MPE)^{27} and reduced rank extrapolation (RRE)^{26}^{,}^{28} are utilized in this work; these are known to be the state-of-the-art extrapolation methods.

## 2.3.1.

#### Minimal polynomial extrapolation method

Polynomial extrapolation methods compute the approximation as a weighted sum of the iterates, where the weights are determined by the coefficients of the minimal polynomial $P(\lambda )$ of $\mathbf{A}$ with respect to ${u}_{0}$, where ${u}_{0}$ is defined as follows:

## Eq. (9)

$${u}_{i}={x}_{i+1}-{x}_{i},\text{\hspace{0.17em}\hspace{0.17em}}i=\mathrm{0,1},\dots ,$$

where ${x}_{0},{x}_{1},\dots ,{x}_{n}$ are the initial iterates, i.e., the unique monic polynomial will satisfy the following property:

## Eq. (10)

$$P(\mathbf{A}){u}_{0}=0.$$

Suppose $k$ is the degree of the minimal polynomial $P(\lambda )$, where $k\le N$. Here, $N$ denotes the dimension of the solution vector. Then, the minimal polynomial can be written as

## Eq. (11)

$$P(\lambda )=\sum _{j=0}^{k}{c}_{j}{\lambda}^{j},\text{\hspace{0.17em}\hspace{0.17em}}{c}_{k}=1,$$

where ${c}_{j}$ are unknown coefficients (to be determined). Combining Eqs. (10) and (11) results in

## Eq. (12)

$$\sum _{j=0}^{k}{c}_{j}{\mathbf{A}}^{\mathbf{j}}{u}_{0}=\sum _{j=0}^{k}{c}_{j}{u}_{j}=0,$$

which can be written as follows:

## Eq. (13)

$${\mathbf{U}}_{\mathbf{k}-\mathbf{1}}c=-{u}_{k},$$

where $c={({c}_{0},{c}_{1},\dots ,{c}_{k-1})}^{T}$ and ${c}_{k}=1$. Using $\mathrm{QR}$ factorization of ${\mathbf{U}}_{\mathbf{k}-\mathbf{1}}$, where ${\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}^{*}{\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}={\mathbf{I}}_{\mathbf{k}-\mathbf{1}\times \mathbf{k}-\mathbf{1}}$:

## Eq. (14)

$${\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}{\mathbf{R}}_{\mathbf{k}-\mathbf{1}}c=-{u}_{k}.$$

Multiplying both sides by ${\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}^{*}$, it becomes

## Eq. (15)

$${\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}^{*}{\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}{\mathbf{R}}_{\mathbf{k}-\mathbf{1}}c=-{\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}^{*}{u}_{k}.$$

The solution to the above linear system of equations gives $c$, the unknown coefficients of the minimal polynomial. When $\mathbf{U}$ is rank deficient, Eq. (13) reduces to

## Eq. (16)

$$c=-{\mathbf{U}}_{\mathbf{k}-\mathbf{1}}^{+}{u}_{k},$$

where ${\mathbf{U}}_{\mathbf{k}-\mathbf{1}}^{+}$ represents the pseudoinverse of ${\mathbf{U}}_{\mathbf{k}-\mathbf{1}}$. Suppose the error at every iteration, ${e}_{n}$, is defined as

## Eq. (17)

$${e}_{n}={x}_{n}-s,$$

where $s$ denotes the exact solution of the system and ${x}_{n}$ is the current iterate. From the error-propagation theory for linear systems, it is well known that

## Eq. (18)

$${e}_{n+1}=\mathbf{A}{e}_{n},$$

and hence

## Eq. (19)

$${e}_{n}={\mathbf{A}}^{\mathbf{n}}{e}_{0}.$$

Since $P(\lambda )$ is the minimal polynomial of $\mathbf{A}$ with respect to ${e}_{0}$,

## Eq. (20)

$$P(\mathbf{A}){e}_{0}=\sum _{j=0}^{k}{c}_{j}{\mathbf{A}}^{\mathbf{j}}{e}_{0}=0.$$

Combining Eqs. (19) and (20) leads to

## Eq. (21)

$$\sum _{j=0}^{k}{c}_{j}{\mathbf{A}}^{\mathbf{j}}({x}_{0}-s)=\sum _{j=0}^{k}{c}_{j}({x}_{j}-s)=0.$$

To simplify the above equation, write ${\gamma}_{j}=\frac{{c}_{j}}{\sum _{i=0}^{k}{c}_{i}}$, where $j=\mathrm{0,1},\dots ,k$ with $\sum _{j=0}^{k}{\gamma}_{j}=1$. Multiplying both sides of Eq. (21) by ${(\sum _{j=0}^{k}{c}_{j})}^{-1}$ converts it to

## Eq. (22)

$$\sum _{j=0}^{k}{\gamma}_{j}({x}_{j}-s)=0.$$

Simplifying Eq. (22) and using $\sum _{j=0}^{k}{\gamma}_{j}=1$ leads to

## Eq. (23)

$$\sum _{j=0}^{k}{\gamma}_{j}{x}_{j}-s=0,$$

i.e.,

## Eq. (24)

$$s=\sum _{j=0}^{k}{\gamma}_{j}{x}_{j}.$$

Since ${x}_{j}={x}_{0}+\sum _{i=0}^{j-1}{u}_{i}$, Eq. (24) can be rewritten as follows:

## Eq. (25)

$$s={x}_{0}+\sum _{j=0}^{k-1}{\xi}_{j}{u}_{j},$$

where

## Eq. (26)

$${\xi}_{j}=\sum _{i=j+1}^{k}{\gamma}_{i},$$

which in the matrix form can be written as

## Eq. (27)

$$s={x}_{0}+{\mathbf{U}}_{\mathbf{k}-\mathbf{1}}\xi .$$

In MPE, we define the matrix ${\mathbf{U}}_{\mathbf{k}-\mathbf{1}}\in {\mathbb{R}}^{n\times k}$ as comprising the vectors ${u}_{0},{u}_{1},{u}_{2},\dots ,{u}_{k-1}$, where ${u}_{i}\in {\mathbb{R}}^{n}$. Let $c$ be the least-squares solution of Eq. (13). Start with ${c}_{k}=1$ and compute ${\gamma}_{0},{\gamma}_{1},\dots ,{\gamma}_{k}$ using

## Eq. (28)

$${\gamma}_{j}=\frac{{c}_{j}}{\sum _{i=0}^{k}{c}_{i}},$$

where $j=\mathrm{0,1},\dots ,k$. Thus, the MPE estimate becomes

## Eq. (29)

$${s}_{\mathrm{MPE}}=\sum _{j=0}^{k}{\gamma}_{j}{x}_{j}.$$

The computation of the solution ($s$) thus requires only an estimate of the ${\gamma}_{j}$'s, followed by simple vector–vector operations.
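To make the MPE estimate concrete, the toy sketch below (an illustration, not the reconstruction code) applies the procedure to a small linear fixed-point iteration: it solves the least-squares system of Eq. (13) for the coefficients $c$, normalizes them into the weights $\gamma $, and forms the weighted sum of iterates. For this small problem the extrapolated estimate recovers the fixed point from just $k+2$ iterates:

```python
import numpy as np

# Toy linear fixed-point iteration x_{n+1} = M x_n + b (a stand-in iteration,
# not the PA reconstruction itself), small enough that MPE is exact
rng = np.random.default_rng(0)
N, k = 4, 4                               # dimension and extrapolation order
M = 0.5 * rng.random((N, N)) / N          # contraction, so the iteration converges
b = rng.random(N)
s_exact = np.linalg.solve(np.eye(N) - M, b)   # fixed point of the iteration

x = [np.zeros(N)]                         # iterates x_0, ..., x_{k+1}
for _ in range(k + 1):
    x.append(M @ x[-1] + b)
x = np.array(x)

u = np.diff(x, axis=0)                    # u_i = x_{i+1} - x_i
U = u[:k].T                               # U_{k-1} = [u_0 | ... | u_{k-1}]
c, *_ = np.linalg.lstsq(U, -u[k], rcond=None)  # least-squares solve of Eq. (13)
c = np.append(c, 1.0)                     # monic constraint c_k = 1
gamma = c / c.sum()                       # normalized weights, summing to 1
s_mpe = gamma @ x[:k + 1]                 # MPE estimate: weighted sum of iterates

print(np.allclose(s_mpe, s_exact))
```

Here $k$ equals the (generic) degree of the minimal polynomial, so the estimate is exact; in the reconstruction setting, a much smaller $k$ is used and the extrapolation is applied cyclically instead.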

## 2.3.2.

#### Reduced rank extrapolation method

In RRE, the definition of $\mathbf{U}$ changes to ${\mathbf{U}}_{\mathbf{k}}\in {\mathbb{R}}^{n\times (k+1)}$, comprising the vectors ${u}_{0},{u}_{1},{u}_{2},\dots ,{u}_{k}$, where ${u}_{i}\in {\mathbb{R}}^{n}$. Here, $\gamma $ is obtained by solving ${\mathbf{U}}_{\mathbf{k}}^{*}{\mathbf{U}}_{\mathbf{k}}d=e$, with $e={(1,1,\dots ,1)}^{T}$, and normalizing, $\gamma =d/\sum _{i=0}^{k}{d}_{i}$. Similar to MPE, the RRE estimate becomes

## Eq. (30)

$${s}_{\mathrm{RRE}}=\sum _{j=0}^{k}{\gamma}_{j}{x}_{j}.$$

A unified algorithm for implementing MPE and RRE is given in Algorithm 3.^{29} It is based on the modified QR factorization.

## Algorithm 3

Algorithm listing important steps for implementing MPE and RRE.

Input: ${x}_{n},{x}_{n+1},{x}_{n+2},\dots ,{x}_{n+k+1}$

Output: Solution vector: ${s}_{n,k}$

1 ${u}_{i}=\mathrm{\Delta}{x}_{i}={x}_{i+1}-{x}_{i}$, where $i=n,n+1,\dots ,n+k$

2 ${\mathbf{U}}_{\mathbf{j}}=[{u}_{n}|{u}_{n+1}|\dots |{u}_{n+j}]$, where $j=0,1,\dots $

3 ${\mathbf{U}}_{\mathbf{k}}={\mathbf{Q}}_{\mathbf{k}}{\mathbf{R}}_{\mathbf{k}}$, QR factorization of ${\mathbf{U}}_{\mathbf{k}}$; ${\mathbf{Q}}_{\mathbf{k}}$ is unitary and ${\mathbf{R}}_{\mathbf{k}}$ is upper triangular

4 (MPE)

• ${\mathbf{R}}_{\mathbf{k}-\mathbf{1}}{c}^{\prime}=-{\rho}_{k};{\rho}_{k}={[{r}_{0k},{r}_{1k},\dots ,{r}_{k-1,k}]}^{T},{c}^{\prime}={[{c}_{0},{c}_{1},\dots ,{c}_{k-1}]}^{T}$

• ${c}_{k}=1;\alpha =\sum _{i=0}^{k}{c}_{i}$

• ${\gamma}_{i}={c}_{i}/\alpha $, where $i=\mathrm{0,1},\dots ,k$

(RRE)

• ${\mathbf{R}}_{\mathbf{k}}^{*}{\mathbf{R}}_{\mathbf{k}}d=e,d={[{d}_{0},{d}_{1},\dots ,{d}_{k}]}^{T},e={[1,1,\dots ,1]}^{T}\in {\mathbb{R}}^{k+1}$

• $\lambda ={(\sum _{i=0}^{k}{d}_{i})}^{-1}$

• $\gamma =\lambda d$

(Common steps)

${\zeta}_{0}=1-{\gamma}_{0};{\zeta}_{j}={\zeta}_{j-1}-{\gamma}_{j}$, where $j=1,\dots ,k-1$

${s}_{n,k}={x}_{n}+{\mathbf{Q}}_{\mathbf{k}-\mathbf{1}}({\mathbf{R}}_{\mathbf{k}-\mathbf{1}}\zeta )$

Algorithm 3 is called for $L$ cycles, with $k$ representing the order of extrapolation. First, one of the above algorithms (Algorithm 1 or Algorithm 2) is run for $k$ steps. The generated $k$ iterates are extrapolated to get the solution estimate ($s$). This completes one cycle of the algorithm. The obtained solution becomes the input to Algorithm 1 or Algorithm 2 as the initial guess, and the process is repeated till convergence is obtained. The methods combining vector extrapolation with RSD, via Algorithm 1, will be called minimal polynomial extrapolated regularized steepest descent (MPERSD) and reduced rank extrapolated regularized steepest descent (RRERSD). The ones combining it with TV regularization, via Algorithm 2, will be called minimal polynomial extrapolated total variation (MPETV) and reduced rank extrapolated total variation (RRETV).
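The full cyclic scheme (a few base-iteration steps, then one extrapolation per cycle, as in Algorithm 3) can be sketched as below. This is a simplified NumPy version: the MPE weights come from the least-squares solution of Eq. (13), the RRE weights from the normalized solution of the Gram system in Algorithm 3, and the fixed regularization parameter, toy matrix sizes, and safeguard thresholds are illustrative assumptions:

```python
import numpy as np

def extrapolate(xs, method="RRE"):
    # One extrapolation (Algorithm 3) from the iterates x_n, ..., x_{n+k+1}
    X = np.asarray(xs, dtype=float)
    U = np.diff(X, axis=0).T                  # columns u_i = x_{i+1} - x_i
    if np.linalg.norm(U) < 1e-9:              # iterates have converged; skip
        return X[-1]
    if method == "MPE":
        c, *_ = np.linalg.lstsq(U[:, :-1], -U[:, -1], rcond=None)
        gamma = np.append(c, 1.0)             # monic constraint c_k = 1
    else:                                     # RRE: solve U_k^* U_k d = (1, ..., 1)^T
        gamma, *_ = np.linalg.lstsq(U.T @ U, np.ones(U.shape[1]), rcond=None)
    if abs(gamma.sum()) < 1e-12:              # degenerate weights; skip
        return X[-1]
    gamma = gamma / gamma.sum()               # normalize so the weights sum to 1
    return gamma @ X[:-1]                     # weighted sum of the iterates

def accelerated(step, x0, k=2, cycles=50, method="RRE"):
    # Cyclic acceleration: k + 1 base steps per cycle, then one extrapolation
    x = x0
    for _ in range(cycles):
        xs = [x]
        for _ in range(k + 1):
            xs.append(step(xs[-1]))
        x = extrapolate(xs, method)
    return x

# Demo with the RSD step of Algorithm 1 (fixed alpha here for simplicity;
# sizes and values are toy stand-ins for the PA system)
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
alpha = 1e-3

def rsd_step(x):
    r = A @ x - b
    L = A.T @ r + alpha * x
    ks = (L @ L) / ((A @ L) @ (A @ L) + alpha * (L @ L))
    return x - ks * L

x = accelerated(rsd_step, A.T @ b)            # LBP-style initial guess
x_ref = np.linalg.solve(A.T @ A + alpha * np.eye(10), A.T @ b)
```

The `step` argument can be any base iteration (e.g., one SALSA sweep), which is what makes the acceleration scheme generic.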

## 3.

## Figures of Merit

The figures of merit utilized for effectively comparing the reconstruction methods introduced in this work were Pearson correlation (PC), contrast-to-noise ratio (CNR), and SNR.^{30}

## 3.1.

### Pearson Correlation

PC is defined as follows:

## Eq. (31)

$$\mathrm{PC}({x}^{\mathrm{target}},\tilde{x})=\frac{\mathrm{COV}({x}^{\text{target}},\tilde{x})}{\sigma ({x}^{\text{target}})\sigma (\tilde{x})},$$

where ${x}^{\mathrm{target}}$ denotes the target initial pressure distribution, $\tilde{x}$ denotes the reconstructed one, $\mathrm{COV}$ denotes the covariance, and $\sigma $ denotes the standard deviation.

## 3.2.

### Contrast-to-Noise Ratio

CNR is defined as follows:

## Eq. (32)

$$\mathrm{CNR}=\frac{{\mu}_{\mathrm{RoI}}-{\mu}_{\mathrm{back}}}{{({\delta}_{\mathrm{RoI}}^{2}{a}_{\mathrm{RoI}}+{\delta}_{\mathrm{back}}^{2}{a}_{\mathrm{back}})}^{1/2}},$$

where $\mu $ and ${\delta}^{2}$ denote the mean and variance, the subscripts $\mathrm{RoI}$ and $\mathrm{back}$ refer to the region of interest and the background, and ${a}_{\mathrm{RoI}}$ and ${a}_{\mathrm{back}}$ denote the corresponding area fractions of the image.

## 3.3.

### Signal-to-Noise Ratio

In some cases, especially experimental and *in-vivo* ones, it may not be possible to know the target initial pressure distribution needed to calculate PC and CNR. In these cases, the figure of merit that has been used is the SNR of the reconstructed image, expressed as

## Eq. (33)

$$\mathrm{SNR}\text{\hspace{0.17em}}(\mathrm{dB})=20\times {\mathrm{log}}_{10}\left(\frac{S}{\sigma}\right),$$

where $S$ denotes the peak reconstructed signal and $\sigma $ denotes the standard deviation of the background.
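The three figures of merit can be computed as below. The area-fraction weights in the CNR denominator and the peak-over-background-noise form of the SNR follow common conventions in the PA literature and are assumptions here, as the text does not spell them out:

```python
import numpy as np

def pearson_correlation(x_target, x_recon):
    # Eq. (31): PC = COV(target, recon) / (sigma(target) * sigma(recon))
    return np.corrcoef(x_target.ravel(), x_recon.ravel())[0, 1]

def cnr(recon, roi_mask):
    # Eq. (32); the area weights a_RoI, a_back are taken as pixel-count
    # fractions of the image (a common convention; an assumption here)
    roi, back = recon[roi_mask], recon[~roi_mask]
    a_roi, a_back = roi.size / recon.size, back.size / recon.size
    return (roi.mean() - back.mean()) / np.sqrt(roi.var() * a_roi + back.var() * a_back)

def snr_db(recon, back_mask):
    # SNR of the reconstructed image in dB: peak signal over the standard
    # deviation of a background region (a common convention; an assumption here)
    return 20.0 * np.log10(recon.max() / recon[back_mask].std())

# Toy demo: a square "target" with a slightly scaled, noisy "reconstruction"
rng = np.random.default_rng(0)
target = np.zeros((32, 32))
target[8:16, 8:16] = 1.0
recon = 0.9 * target + 0.01 * rng.standard_normal(target.shape)
roi = target > 0.5
```

Note that PC is invariant to affine scaling of the reconstruction, which is why it is paired with CNR (and the reconstructed dynamic range) in the comparisons below.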

## 4.

## Numerical and Experimental Phantom Studies

The acceleration achieved by the proposed vector extrapolation methods was tested using both numerical and experimental phantom cases. The measurement of the actual initial pressure is challenging in real experiments, which makes it difficult to compare the accuracy of different algorithms. Thus, numerical phantoms were used for comparing the accuracy of the different reconstructions.

The two-dimensional numerical phantoms have a size of $401\times 401$ pixels spanning an imaging region of $20.1\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}\times 20.1\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}$. The schematic diagram showing the imaging domain along with the detector placement is given in Fig. 1. The numerical data were generated on the finer grid ($401\times 401$), and the reconstruction was performed on a coarser grid ($201\times 201$). White Gaussian noise was added to the generated data, resulting in SNR levels ranging from 20 to 60 dB. The open-source toolbox $k$-Wave in MATLAB was used for generating the data.^{31} The computational grid for the wave simulation has a size of $501\times 501\text{\hspace{0.17em}\hspace{0.17em}}\text{pixels}$ ($0.1\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}/\text{pixel}$), and the detectors were placed on a circle of radius 22 mm (Fig. 1). The sampling frequency for the data collection is 20 MHz, with the number of time samples for each detector being 500. The hundred detectors were considered as point detectors having a center frequency of 2.25 MHz and 70% bandwidth. Thus, the system matrix ($\mathbf{A}$) has a dimension of $\mathrm{50,000}\times \mathrm{40,401}$. The speed of sound was assumed to be $1500\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{m}/\mathrm{s}$, and the medium was assumed to be homogeneous with no absorption or dispersion of sound. In all numerical experiments conducted here, 100 cycles of second-order extrapolation were utilized.

Initially, two numerical phantoms with a maximum initial pressure of 1 kPa [blood vessel and Derenzo, given in Figs. 3(a) and 4(a), respectively] were considered to demonstrate the effectiveness of the proposed method. The numerical blood vessel phantom [Fig. 3(a)] consisted of thick and thin vessels mimicking a blood vessel structure. The modified Derenzo phantom [Fig. 4(a)] consisted of circular objects with varying diameters grouped according to size.

## Table 1

Computational time (in s) for the results presented in Figs. 5(a) and 5(b) corresponding to RSD, MPERSD, and RRERSD methods.

| Phantom | Method | SNR = 20 dB | SNR = 40 dB (Figs. 3 and 4) | SNR = 60 dB |
|---|---|---|---|---|
| BV [Figs. 5(a) and 5(b)] | RSD | 3.07 | 89.61 | 1334.22 |
| | MPERSD | 2.47 | 86.16 | 279.77 |
| | RRERSD | 2.54 | 73.02 | 566.48 |
| Derenzo [Figs. 5(a) and 5(b)] | RSD | 9.32 | 115.76 | 1662.89 |
| | MPERSD | 6.08 | 87.61 | 558.53 |
| | RRERSD | 6.26 | 104.66 | 426.65 |

## Table 2

Computational time (in s) for the results presented in Figs. 5(c) and 5(d) corresponding to TV, MPETV, and RRETV methods.

| Phantom | Method | SNR = 20 dB | SNR = 40 dB (Figs. 3 and 4) | SNR = 60 dB |
|---|---|---|---|---|
| BV [Figs. 5(c) and 5(d)] | TV | 6.11 | 5.92 | 39.11 |
| | MPETV | 7.21 | 6.99 | 13.04 |
| | RRETV | 6.99 | 6.97 | 15.79 |
| Derenzo [Figs. 5(c) and 5(d)] | TV | 5.70 | 5.47 | 41.17 |
| | MPETV | 6.93 | 6.92 | 12.78 |
| | RRETV | 6.88 | 9.19 | 15.19 |

The schematic of the experimental setup for the collection of the PA data is shown in Fig. 2(a) [similar to Fig. 1(e) of Ref. 32]. A Q-switched Nd:YAG laser was used to deliver 532-nm wavelength laser pulses of 5-ns duration at a 10-Hz repetition rate. Four right-angle uncoated prisms (PS911, Thorlabs) and one uncoated planoconcave lens L1 (LC1715, Thorlabs) were used to deliver the laser pulses to the sample. The laser energy density on the phantom was $\sim 9\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mJ}/{\mathrm{cm}}^{2}$ ($<20\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mJ}/{\mathrm{cm}}^{2}$, the ANSI safety limit^{33}). A triangular-shaped horse hair phantom was utilized for imaging. The side length of the triangle and the diameter of the hair are $\sim 10$ and 0.15 mm, respectively. The hair phantom was glued to pipette tips adhered to an acrylic slab.^{34} The PA data were collected continuously around the hair phantom over the full 360 deg using a 2.25-MHz flat ultrasound transducer (Olympus NDT, V306-SU) with a 13-mm diameter active area and 70% nominal bandwidth. Another experimental phantom was also used to evaluate the effectiveness of the proposed method. It was circular in shape and made using low-density polyethylene tubes (5-mm inner diameter) filled with black Indian ink. The tubes were placed at 0 and 15 mm from the scanning center and affixed at the bottom of the acrylic slab. The acquired PA signals were first amplified and filtered using a pulse amplifier (Olympus-NDT, 5072PR) and then recorded using a data acquisition (DAQ) card (GaGe, CompuScope 4227) on a single channel with a sampling frequency of 25 MHz inside a desktop computer (Intel Xeon 3.7-GHz 64-bit processor, 16-GB RAM, running the Windows 10 operating system). Synchronization of the data acquisition with the laser illumination was achieved using a sync signal from the laser.
The reconstructed PA imaging region has a size of $40\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}\times 40\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}$ containing $200\times 200\text{\hspace{0.17em}\hspace{0.17em}}\text{pixels}$.

The collected data consist of 2400 A-lines averaged six times, resulting in 400 detected signals acquired by the ultrasound transducer moving continuously around the hair phantom over the full 360 deg, with an acquisition time of 240 s at a rotational speed of 1.5 deg/s. This averaging of A-lines reduced the effect of energy fluctuations across multiple firings of the laser. The ultrasound transducer and the phantom were immersed in water to enable ultrasound coupling. The PA data were acquired at a sampling frequency of 25 MHz (1024 samples) and subsampled at half the rate, 12.5 MHz, to result in 512 time samples. The system matrix has a dimension of $\mathrm{51,200}\times \mathrm{40,000}$ (51,200: 100 detectors, each collecting 512 time samples; 40,000: dimensions of the imaging domain being $200\times 200$). Determining the actual initial pressure rise (target values) in these experiments is not plausible, so only a comparison between the proposed and standard methods is presented here, with SNR being the figure of merit for quantitative comparison. Note that a Linux workstation with 32 cores of Intel Xeon processors at 3.1 GHz with 128 GB RAM was used for all computations performed in this work.

## 5.

## Results and Discussion

The reconstructed initial pressure distribution for the numerical blood vessel phantom [Fig. 3(a)] data having an SNR of 40 dB using the RSD method is shown in Fig. 3(b). The reconstructions using the MPERSD and RRERSD methods are shown in Figs. 3(c) and 3(d), respectively. The LBP result is shown in Fig. 3(e). Even though the required computation time for obtaining the LBP result is only 0.5 s, it can be observed that the LBP performance is poor compared to the other methods in terms of both image quality (streak artifacts are more observable in the LBP image) and reconstructed dynamic range of initial pressure (the maximum value only goes up to 0.1 against the expected 1). The figures of merit, PC and CNR, for these results, along with the results (not shown) for data SNRs of 20 and 60 dB, are shown in Figs. 5(a) and 5(b). The corresponding computational time (in seconds) is given in Table 1. The reconstruction times presented here are for the smallest order of extrapolation (second order). From Table 1, it is evident that the time taken by the proposed extrapolation methods (MPERSD and RRERSD) was always less than that of the standard (RSD) method. From Figs. 5(a) and 5(b), it can be seen that the PC and CNR values are nearly the same for both the standard and proposed methods for SNR values of 40 and 60 dB. For 20 dB, the standard method performs slightly better in terms of PC and CNR. Note that the proposed extrapolation methods generally perform well when the initial iterates are robust to noise; in the 20-dB case, the initial iterates may not be well-regularized solutions, thus causing convergence to a local minimum. In terms of speedup (top rows, corresponding to the blood vessel phantom results of Table 1), for an SNR of 60 dB, it is 4.7 for MPERSD and 2.3 for RRERSD. Clearly, for low-noise cases (SNR of 60 dB), there is a significant reduction in the computation time. For an SNR of 40 dB, the speedup is 1.2 for RRERSD, which shows an improvement over the standard RSD method.
For the case of an SNR of 20 dB, the overall computation time was reduced using the extrapolation methods, but the reduction was not significant.

The reconstructed initial pressure distribution for the blood vessel phantom [Fig. 3(a)] using TV is shown in Fig. 3(f). The reconstructions using MPETV and RRETV are shown in Figs. 3(g) and 3(h). The figures of merit for these are shown in Figs. 5(c) and 5(d), along with the results corresponding to SNR values of 20 and 60 dB. The corresponding computational time (in seconds) is reported in Table 2. From Table 2, it is evident that the time taken by MPETV and RRETV is less than that of the standard TV method for the 60-dB SNR case. From Figs. 5(c) and 5(d), it is evident that the PC and CNR values are similar for all methods discussed in this work. For the case of the SNR being 60 dB, the speedup obtained is 2.9 for MPETV and 2.4 for RRETV.

Similarly, for the Derenzo phantom, the reconstructed initial pressure distribution using the RSD method is shown in Fig. 4(b). The reconstructions for MPERSD and RRERSD are shown in Figs. 4(c) and 4(d), respectively. The computational times reported in Table 1 show that the speedup obtained for the 60-dB SNR case is 2.9 for MPERSD and 3.89 for RRERSD. The same trend of the LBP result having poor performance, as observed in the blood vessel phantom case, was also observed here [the result is presented in Fig. 4(e)]. So, for the rest of the experimental phantom cases, the LBP method was not utilized in the comparison. The reconstruction result pertaining to TV for the Derenzo phantom case is given in Fig. 4(f). The reconstructions for MPETV and RRETV are shown in Figs. 4(g) and 4(h). Similarly, from Table 2, the speedup obtained for 60-dB SNR is 3.2 for MPETV and 2.7 for RRETV. No speedup was achieved by the proposed methods (MPETV and RRETV) for SNR values of 20 and 40 dB in these numerical phantom cases. TV has the property of preserving the features of an image. It finds the solution that has the minimal total variation but is not biased toward a smooth or sharp solution; thus, the disk boundaries are smooth in the case of TV regularization. The RSD is equivalent to solving least squares, which promotes only smooth solutions and does not have any property of minimizing the variation; hence, the disk boundaries have stronger pixel values.

The reconstruction results for the experimental horse hair phantom are shown in Figs. 6(a) and 6(d) for the standard methods, RSD and TV, respectively. The reconstruction results pertaining to the MPERSD and RRERSD methods are shown in Figs. 6(b) and 6(c), respectively. The reconstructions for the MPETV and RRETV methods are shown in Figs. 6(e) and 6(f), respectively. From the results, it is evident that the TV-based methods provide better performance compared to the RSD methods. Table 3 provides the computational time needed for computing these results, and it is evident that the proposed extrapolation methods provide significant speedup: specifically, 1.2 for MPERSD and 1.6 for RRERSD, and 3 for MPETV and 3.1 for RRETV. The extrapolated methods are clearly more efficient than the standard methods in terms of computation and also provide better SNR in the reconstruction results.

## Table 3

Computational time (in s) required for obtaining the results presented in Figs. 6 and 7 corresponding to experimental horse hair and the tube phantom.

| Phantom | RSD | MPERSD | RRERSD | TV | MPETV | RRETV |
|---|---|---|---|---|---|---|
| Horse hair | 150.88 | 127.19 | 94.93 | 70.47 | 23.16 | 22.70 |
| Tube | 287.45 | 192.55 | 153.02 | 54.82 | 29.46 | 30.12 |

The axial resolution calculated from the reconstructed horse hair phantom image (estimated FWHM in Fig. 6) was found to be $301\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$.^{35} This closely matches the theoretically expected spatial resolution ($\sim 333\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$) for a 2.25-MHz transducer. As stated earlier, the diameter of the hair was $150\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$. To resolve such an object, one could use a higher-frequency transducer of 5 MHz;^{36} as the aim of this part of the work is to show the real-time utility of the proposed vector extrapolation methods, this was not attempted.

The reconstruction results for the experimental tube phantom are shown in Figs. 7(a) and 7(d) for the standard methods, RSD and TV, respectively. The reconstruction results pertaining to the MPERSD and RRERSD methods are shown in Figs. 7(b) and 7(c), respectively. The reconstructions for the MPETV and RRETV methods are shown in Figs. 7(e) and 7(f), respectively. Here, as the expected objects are more circular in nature, the improvement in terms of SNR is not significant, with the TV-based methods showing a marginal improvement compared to their counterparts. Table 3 provides the computational time needed for obtaining these results, and it is evident that the proposed extrapolation methods provide significant speedup: specifically, 1.5 for MPERSD and 1.9 for RRERSD, and 1.9 for MPETV and 1.8 for RRETV.

Postprocessing of the reconstructed images using the Hilbert transform can improve the visualization of the initial pressure distributions (currently, they are bipolar, although the expected distribution is unipolar).^{37}^{–}^{39} As the main aim of this work is to show the speedup obtained by vector extrapolation schemes, this postprocessing step was not included, so as to show the raw performance of the proposed methods. The performance of a reconstruction algorithm can also be characterized by the dynamic range of the reconstructed initial pressure [for example, compare Figs. 4(e) and 4(f)]; normalizing these plots using the Hilbert or other transformation methods would make the representations inaccurate.

The work presented here is geared toward accelerating the image reconstruction procedure in the case of limited data. The definition of limited data here is relative to state-of-the-art setups. To be precise, current clinical PA tomography scanners, with advances in instrumentation, acquire large amounts of data, with the number of detectors being as high as 512 and the number of time samples for each detector being 2048.^{40} Note that in this work, the maximum number of detectors utilized was 100, with the number of time samples being 512, thus making the data used here relatively limited.

It is important to note that the vector extrapolation methods presented in this work are known to provide the global optimum, as they estimate the limit (or antilimit) of the initial solutions of an iterative method. Thus, the performance of these methods largely depends on the initial solutions, and regularized solutions (like the ones obtained using TV) can often improve their performance (which is also observed here). Ideally, in noise-free cases, both the extrapolation-based iterative methods and the standard iterative methods should converge to the same solution as long as the stopping criterion is the same for both (which is the case here). As noise-free cases are far from real scenarios, this exercise was not taken up; interested readers can refer to Refs. 26–29 for the detailed mathematical framework.

The most important step in PA tomography is image reconstruction, and developing computationally efficient methods that can be deployed in real time is desirable for translating PA imaging into a clinical imaging modality. Especially in cases of limited data, most reconstruction algorithms in the literature, even though they provide quantitatively more accurate results, are often prohibitively expensive in terms of computation. In this work, we have deployed extrapolation techniques to accelerate two standard reconstruction methods and have shown, using example cases, that these methods can be very effective in reducing the total computation time required, especially in experimental scenarios. It is important to note that the extrapolation techniques presented in this work are universal in nature and can be employed in any iterative reconstruction method. Although the results presented here are limited, as the focus is on showing the utility of these extrapolation methods, one could expect similar speed-ups in other cases as well. The corresponding algorithms, along with the necessary code, are provided as open source for interested users.^{41}

## 6.

## Conclusions

Vector extrapolation methods have been shown to improve any iterative procedure in which the residual error decreases as the iterations progress (i.e., convergence is guaranteed). The application of these extrapolation methods in the framework of two iterative methods commonly used in PA imaging was presented here. Specifically, the initial iterates calculated using RSD or TV were given as input to the vector extrapolation methods, which were shown to provide significant speed-up (as high as 4.7 times), especially in numerical cases. Extrapolation techniques are important for reaching the limit of the iterative sequence without stepping through the large number of iterations typically required by many iterative methods. It was shown, using both numerical and experimental phantom cases, that the extrapolation methods (two such variants were presented in this work) have the potential to speed up the reconstruction algorithm while also improving the reconstructed image quality.

## References

## Biography

**Navchetan Awasthi** received his MTech in computational science from Indian Institute of Science (IISc), Bangalore, in 2016. He also received his BTech in electronics and communication engineering from the National Institute of Technology (NIT), Jalandhar, in 2011. He is currently a PhD student in the Department of Computational and Data Sciences at Indian Institute of Science, Bangalore. His research interests include inverse problems in biomedical optics and medical image reconstruction.

**Sandeep Kumar Kalva** received his master's degree in biomedical engineering from the Indian Institute of Technology (IIT), Hyderabad, India, in 2015. He is currently a PhD student at the School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore. His research area is functional photoacoustic imaging for various clinical applications.

**Manojit Pramanik** received his PhD in biomedical engineering from Washington University in St. Louis, Missouri. Currently, he is an assistant professor at the School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore. His research interests include the development of photoacoustic/thermoacoustic imaging systems, image reconstruction methods, clinical application areas, such as breast cancer imaging, molecular imaging, contrast agent development, and Monte Carlo simulation of light propagation in biological tissue.

**Phaneendra K. Yalavarthy** received his MSc degree in engineering from Indian Institute of Science, Bangalore, and his PhD degree in biomedical computation from Dartmouth College, Hanover, New Hampshire, in 2007. He is an associate professor with the Department of Computational and Data Sciences at Indian Institute of Science, Bangalore. His research interests include medical image computing, medical image analysis, and biomedical optics. He is a senior member of IEEE and serves as an associate editor of IEEE Transactions on Medical Imaging.