Comparison of photoacoustic image reconstruction algorithms using the channelized Hotelling observer

Abstract. We demonstrate the use of task-based image-quality metrics to compare various photoacoustic image-reconstruction algorithms, including a method based on the pseudoinverse of the system matrix, simple backprojection, filtered backprojection, and a method based on the Fourier transform. We use a three-dimensional forward model with a linear transducer array to simulate a photoacoustic imaging system. The reconstructed images correspond with two-dimensional slices of the object and are 128×128  pixels. In order to compare the algorithms, we use channelized Hotelling observers that predict the detection ability of human observers. We use two sets of channels: constant Q and difference of Gaussian spatial frequency channels. We look at three tasks, identification of a point source in a uniform background, identification of a 0.5-mm cube in a uniform background, and identification of a point source in a lumpy background. For the lumpy background task, which is the most realistic of the tasks, the method based on the pseudoinverse performs best according to both sets of channels.


Introduction
Photoacoustic imaging is based on the detection of acoustic waves that are generated by thermoelastic expansion induced by optical absorption of laser light. The majority of photoacoustic imaging systems can be classified as either microscopy systems or computed tomography (CT) systems. 1 In general, the microscopy-based systems do not require a reconstruction step as the lateral resolution is set by either a focused transducer or focused laser illumination. CT systems can be further subdivided into scanning systems (also called synthetic aperture) and array systems. From a mathematical point of view, synthetic apertures and arrays require the same reconstruction algorithm, although arrays tend to have smaller individual transducers so the directivity can be more uniform over the imaging area. Transducer arrays have a speed advantage as they do not need to be mechanically scanned, which can be important in many applications. In addition, the array-based imaging systems have the potential to be integrated with ultrasound imaging systems, and indeed many of the arrays in use in photoacoustic systems are taken from ultrasound systems. [2][3][4] In this paper we consider a linear transducer array that is similar to the ones used in current photoacoustic and ultrasound imaging systems.
There are several reconstruction algorithms used in linear array imaging systems. Filtered backprojection is based on an analytical inversion of the solution to the photoacoustic wave equation. 5 It is theoretically exact for a continuous, infinite linear transducer array, and when the object to be imaged is acoustically homogeneous and has no acoustic absorption. Unfiltered (or simple) backprojection, although not theoretically exact, is numerically equivalent to the delay-and-sum beamformer method employed in ultrasound imaging systems, and hence is very readily implemented in combined ultrasoundphotoacoustic imaging systems. The Fourier method of image reconstruction is also based on inverting the solution to the photoacoustic wave equation, but this inversion is performed in the frequency domain. 6 For sufficiently small datasets or given sufficiently powerful computer hardware, all of these simple reconstruction algorithms can be used in real-time imaging systems. There are more sophisticated reconstruction algorithms that can correct for acoustic inhomogeneity and absorption such as iterative methods. 7 In general, these iterative algorithms produce high-quality images; unfortunately, using current computers they are not capable of real-time imaging.
The minimum-norm least-squares (MNLS) solution to reconstructing a photoacoustic image is given by the pseudoinverse of the system matrix, which relates the initial pressure distribution to the measured pressure wave. In this paper, we calculate the pseudoinverse using a singular-value decomposition (SVD). The pseudoinverse calculation involves dividing by the nonzero singular values of the system matrix. In practice, we have to regularize the problem by throwing out small singular values that would otherwise amplify the noise. Where to set the regularization threshold depends on the specific signal and noise levels. The noise in an imaging system can generally be divided into object-dependent and object-independent noise. In many imaging systems, such as x-ray CT, the object-dependent noise varies strongly across detectors. In photoacoustics, however, the object-dependent noise is relatively uniform across transducer elements (as long as the object is not too close to the transducer). The object-independent noise (such as electronic noise) is also uniform across the transducer elements.
The regularization level still depends on the relative strength of the signal and noise, but it is possible to calculate a few pseudoinverse matrices based on different singular-value truncations, store these in memory, and then decide in real time which one to use based on image quality. Due to the more uniform noise properties, we expect this method to work relatively better in photoacoustics than, say, x-ray CT, which would benefit from a more advanced regularization. Using this method the reconstruction step consists only of a matrix-vector multiplication, which is very fast. In a recent publication, the system matrix of a photoacoustic imaging system was measured experimentally, and the corresponding pseudoinverse was calculated ahead of time and used for real-time imaging. 8 This method has the advantage of including all the relevant acoustic properties in the system matrix, but the disadvantages of requiring the experimental measurement of the system matrix and the inevitable inclusion of system noise in the measurement. In this paper we develop a model for a two-dimensional (2-D) system matrix and use that to calculate the pseudoinverse. We also develop a three-dimensional (3-D) forward model to simulate a photoacoustic imaging system.
A natural question then arises as how best to select an algorithm to optimize image quality. In a previous work, 9 we performed a preliminary comparison between the filtered backprojection and pseudoinverse reconstruction algorithms. In that paper, we looked at various metrics such as resolution, streak length, and ideal observer signal-to-noise ration (SNR) for a point source object; however, these metrics can be misleading. The resolution and streak length tell nothing of the noise properties of the algorithms, and the ideal observer SNR only measures how much information is lost for each algorithm in going from the raw data to the reconstructed image, which does not necessarily correlate with image quality as judged by human observer task performance.
In this paper we demonstrate the use in photoacoustics of so-called task-based metrics of image quality that seek to define image quality based not on physical measures but on the utility of an image for performing a predefined task. Such task-based image-quality assessment has gained widespread attention in nuclear medicine and x-ray imaging modalities. 10 While the gold standard of task-based assessment is human observer studies using the receiver operating characteristic (ROC) methodology, these are tedious and labor-intensive. As an alternative, various model observers have been proposed that seek to emulate human observer performance in well-defined tasks, such as the detection of a known signal in a known or random background. The channelized Hotelling observer (CHO) has been shown to predict human performance particularly well. 10 It entails integrating images over a set of frequency-domain channels that mimic the frequency response of the human visual system and then applying a linear discriminant to the output. To assess the image quality of the reconstruction algorithms in this study, we use the CHO with two sets of channels that have been shown to predict the performance of human observers: constant Q 11 and difference of Gaussian (DOG) channels. 12

System Matrix
The photoacoustic equation can be written as 1 where v s is the speed of sound, pðr; tÞ is the pressure, β is the thermal coefficient of volume expansion, C P is the specific heat at constant pressure, and Hðr; tÞ is the heating function. The Green's function solution for the pressure spectrum can be written as 13p where k ¼ ω∕v s . The geometry we are considering is shown in Fig. 1 and consists of a linear transducer array located on the x axis with a cylindrical lens on its face and consisting of N individual transducers, an object to be imaged in the half-space z 0 > 0, and a light source above the object pointing in the positive y 0 direction. The cylindrical lens rejects acoustic waves that do not originate near the imaging ðx 0 − z 0 Þ plane. We assume uniform light illumination on the imaging plane to avoid modeling photon propagation. With this geometry, we can separate the heating function spectrum intoHðr 0 ; ωÞ ¼ μðr 0 ÞĨðωÞ, where μðr 0 Þ is the absorption coefficient andĨðωÞ is the irradiance spectrum. The average pressure on each individual transducer is the pressure integrated over x and y divided by the area, where the width in the x direction is w. The cylindrical acoustic lens imparts a phase of e −ikd e iky 2 ∕ð2fÞ to the wave, where d is the acoustic path length of the lens (refractive index times total thickness of the lens) and f is the focal length. The lens has a height of h and the transducer has a transfer function denoted byTðωÞ. This gives the measured voltage spectrum as where n denotes each individual transducer and runs from −ðN − 1Þ∕2 to ðN − 1Þ∕2. The middle transducer has n ¼ 0, and the center of the middle transducer is located at x ¼ 0.
The extra z 0 ∕jr − r 0 j is an obliquity factor. In this paper we consider a 32-element transducer with cylindrical lens focal length of 6 mm, height of 3 mm, individual element width of 0.15 mm, and center-to-center element spacing of 0.15 mm. The transducer transfer function is Gaussian with a Fig. 1 Geometry of photoacoustic imaging system. center frequency of 3.5 MHz and 50% bandwidth, and we assume the laser pulse is sufficiently short so that IðωÞTðωÞ ¼ I 0T ðωÞ, where I 0 is a constant. Note that the transducer spacing satisfies the Nyquist sampling criterion. For the forward model, we use the full 3-D expression given by Eq. (3). For the system matrix used to calculate the pseudoinverse, we use a 2-D approximation of Eq. (3) with y ¼ y 0 ¼ 0. For the discrete version, we use the notation g ¼ Hf, where g is a vector representing the voltage measured by the transducers, H is the system matrix, and f is a vector representing the initial pressure distribution of the object being imaged. We use a 128 × 128 pixel basis with a pixel size of 37.5 μm. The temporal sampling rate is 100 ns with 71 samples per transducer, which produces a system matrix of size 2272 × 16384.

Pseudoinverse
We performed an SVD on the system matrix, which was then used to calculate the pseudoinverse. The SVD factors the system matrix into H ¼ USV † , where U and V † are unitary matrices, the dagger represents conjugate transpose, and S is a diagonal matrix consisting of the singular values in decreasing order. The pseudoinverse is then given by H þ ¼ VS þ U † , where S þ is a diagonal matrix consisting of the reciprocal of singular values above a threshold. The thresholding of the singular values serves as a regularization. Note that this method is also referred to as truncated SVD inversion. Figure 2 shows a plot of the magnitude of the singular values. The image, which is an estimation of the initial pressure f, is then given byf ¼ H þ g.

Filtered Backprojection and Simple Backprojection
The filtered backprojection formula 5 can be written as μðx 0 ; z 0 Þ ¼ z 0 X n g n ðτ n Þ Ã Rðf c τ n Þ; where g n ðτ n Þ is the time integrated pressure (along with other constants), the convolution is with respect to τ n ¼ ½ðnw − x 0 Þ 2 þ z 2 0 1∕2 ∕v s , RðuÞ ¼ 4 sincð2uÞ − 2 sinc 2 ðuÞ, sincðuÞ ¼ sinðπuÞ∕ðπuÞ, and f c is approximately the highest temporal frequency in the data. Note that in the Fourier domain, the filter is the Ram-Lak filter (ramp filter with cutoff f c ). In order to construct the filtered backprojection matrix, we used impulses for g n ðτ n Þ that were time-integrated and interpolated to increase the sampling rate to prevent quantization errors. 14 A Hamming window was applied along the transducer elements for apodization. The simple backprojection algorithm is equivalent to the filtered backprojection algorithm except the pressure is not time-integrated or filtered with the Ram-Lak filter. Note that this filtered backprojection formula is built on a foundation developed by Norton in the context of reflection tomography. 15 An alternative algorithm has been developed by Burgholzer et al. but has not been implemented in the present study. 16

Fourier Method
The Fourier transform-based formula 6 can be written as where k 2 ¼ k 2 x þ k 2 z .Ṽðk x ; ωÞ is constructed by taking the real part of the Fourier transform with respect to time, and taking the Fourier transform with respect to x. The image is formed by employing bilinear interpolation and taking the inverse Fourier transform ofμðk x ; k z Þ. Again, a Hamming window is used for apodization. We are ignoring multiplicative constants that would be required for quantitative results, as we did for the backprojection algorithms.

Deconvolution
For the filtered backprojection, simple backprojection, and Fourier based methods we deconvolve the transducer response usingD ðωÞ ¼T Ã ðωÞ where the asterisk indicates complex conjugation and ζ is a regularization parameter that varies from 10 −6 to 10. For small values of ζ the deconvolution is the inverse of the transducer response and noise amplification occurs. When ζ is 10, DðωÞ ≈T Ã ðωÞ, which is a matched filter. The matched filter lowers the noise at the expense of resolution. For all of the reconstructed images we perform envelope detection by taking the absolute value of the analytic function of the image. Figures 3 and 4 show the image reconstruction of a point source and a 0.5-mm cube, respectively, for various singularvalue truncations as well as the backprojection, filtered backprojection, and Fourier-based algorithms with different regularization parameters. Uncorrelated Gaussian noise is added to g before reconstruction to simulate electronic noise. The standard deviation of the noise is set to be equal to the peak signal level divided by 10. There is no background for these images. Figure 5 shows the image reconstruction of a lumpy background 17 for the different algorithms. For each lumpy background image, there is a corresponding image with lumpy background and point source signal. All of the images in Fig. 5 contain the same background; in the next section the number of lumps and their locations are treated as random variables. The lumpy background is only present in a volume represented by a 0.5 mm cube at the center of the object. Uncorrelated Gaussian noise is also added to g before reconstruction, where, again, the standard deviation is set to the peak signal level divided by 10. As mentioned in the Introduction, the CHO has been shown to predict human observer performance for various tasks. The choice of channels depends on the specific task to be performed. In this paper we use two sets of channels: constant Q 11 and DOG channels. 12 Constant Q channels have been shown to work well at predicting human performance when the noise is additive and Gaussian. DOG channels work well for additive Gaussian noise as well as for a lumpy background. The channel profiles are shown in Fig. 6. In order to quantify image quality, we compute the channelized Hotelling SNR given by 11 where the brackets indicate expectation values,f ch ¼ U chf , U ch is a matrix that represents the spatial frequency filtering and summing of the channels, and K ch is the covariance matrix off ch . Note that the covariance matrix is only a 5 × 5 matrix for the constant Q channels and a 3 × 3 matrix for the DOG channels, which means the number of simulations required to make the matrix full rank is greatly reduced as compared to the full image covariance matrix. In this study we use 500 simulations to construct Eq. (7). For the uniform background tasks, uncorrelated Gaussian noise is added to g before reconstruction, as discussed in the previous section. These tasks are referred to as signal known exactly, background known exactly (SKE/ BKE). For the lumpy background 17 task, the number of lumps and their locations are treated as random variables. The number of lumps, L, is modeled as a Gaussian random   variable with a mean and variance of 1250. That number was selected using hLi∕N ¼ 1∕100, where N ¼ 1.25 × 10 5 is the number of discrete object points in a 0.5-mm cube in our simulations. The location of the lumps is modeled as a uniform random variable over the volume represented by a 0.5-mm cube at the center of the object. Each lump is modeled as a point source with a magnitude that is 100 times lower than the point source signal magnitude. Uncorrelated Gaussian noise is also added to g before reconstruction for this task. This task is referred to as SKE in a lumpy background and is a good model of tumor contrast agent imaging where the signal represents uptake of the contrast agent by the tumor and the lumpy background represents contrast agent uptake by the surrounding tissue.
We define the CHO efficiency as where SNR 2 data is given by In Figs. 7 through 9, we show the CHO efficiency for the three tasks. For the SKE/BKE tasks, the efficiency is highest    when the regularization is chosen so that the noise is lowest even though the resolution is worse as seen in Figs. 3 and 4. The performance of the algorithms is similar for the SKE/BKE tasks except that the Fourier-based algorithm performs best for the identification of the cube as determined by the difference of Gaussian channels. However, the SKE/BKE task is generally not a realistic model of actual imaging systems. For the SKE in a lumpy background task, which is more realistic, the method based on the pseudoinverse performs best according to both sets of channels. Also notice that for this task the optimal regularization is not necessarily to choose the lowest possible noise amplification. It is also interesting that for the three tasks studied in this paper, the simple backprojection performs as well as the filtered backprojection.

Conclusion
In this paper, we have used task-based image quality metrics to compare image reconstruction algorithms in photoacoustic tomography. The study was not intended to be the final word on algorithm selection but rather to introduce the use of taskbased assessment and model observers to the field. As the field grows and matures, it could be valuable to deploy such tools for optimizing systems and algorithms for various tasks.
We have shown that for the case of a linear transducer array and an SKE in a lumpy background task, the image reconstruction algorithm based on the pseudoinverse performs best according to the CHOs we look at in this paper. Future work is planned to study the validity of the channels used in this paper by performing human observer experiments. Future work is also planned to study tasks that include signal variability. It should be noted that in this paper we assume uniform light illumination on the imaging plane. Depending on the specific system and task to be performed this may not be a realistic assumption, and in this case the laser irradiation geometry and photon propagation should be included in the forward model.
Another interesting conclusion of this study is that the simple backprojection algorithm performs as well as the filtered backprojection algorithm for the tasks we looked at. Since the simple backprojection algorithm is numerically equivalent to the delayand-sum beamformer reconstruction employed in existing ultrasound imaging systems, combined ultrasound-photoacoustic imaging systems may be able to use this algorithm without loss of performance. This would be advantageous in terms of system design. The type of analysis presented in this paper could be used to determine if this conclusion is valid for the specific task the combined ultrasound-photoacoustic imaging system is being built for.