Estimating blood oxygenation from photoacoustic images: can a simple linear spectroscopic inversion ever work?

Abstract. Linear spectroscopic inversions, in which photoacoustic amplitudes are assumed to be directly proportional to absorption coefficients, are widely used in photoacoustic imaging to estimate blood oxygen saturation because of their simplicity. Unfortunately, they do not account for the spatially varying wavelength-dependence of the light fluence within the tissue, which introduces “spectral coloring,” a potentially significant source of error. However, accurately correcting for spectral coloring is challenging, so we investigated whether there are conditions, e.g., sets of wavelengths, where it is possible to ignore the spectral coloring and still obtain accurate oxygenation measurements using linear inversions. Accurate estimates of oxygenation can be obtained when the wavelengths are chosen to (i) minimize spectral coloring, (ii) avoid ill-conditioning, and (iii) maintain a sufficiently high signal-to-noise ratio (SNR) for the estimates to be meaningful. It is not obvious which wavelengths will satisfy these conditions, and they are very likely to vary for different imaging scenarios, making it difficult to find general rules. Through the use of numerical simulations, we isolated the effect of spectral coloring from sources of experimental error. It was shown that using wavelengths between 500 nm and 1000 nm yields inaccurate estimates of oxygenation and that careful selection of wavelengths in the 620- to 920-nm range can yield more accurate oxygenation values. However, this is only achievable with a good prior estimate of the true oxygenation. Even in this idealized case, it was shown that considerable care must be exercised over the choice of wavelengths when using linear spectroscopic inversions to obtain accurate estimates of blood oxygenation. This suggests that for a particular imaging scenario, obtaining accurate and reliable oxygenation estimates using linear spectroscopic inversions requires careful modeling or experimental studies of that scenario, taking account of the instrumentation, tissue anatomy, likely sO2 range, and image formation process.


Introduction
Blood oxygen saturation, sO 2 , defined as the ratio of oxygenated to total hemoglobin concentration, is an important physiological indicator of tissue function and pathology and therefore has many clinical and preclinical applications. Examples of such applications include locating regions of brain activation, [1][2][3] and characterizing and staging tumors. 4,5 Optical methods for estimating blood oxygenation are also in common use, with pulse-oximetry 6 routinely used to monitor a patient's systemic blood oxygenation, and near-infrared (NIR) spectroscopy 7 and diffuse optical tomography 8 used, for example, for stroke characterization 9 and functional studies of brain activation in neonates. 10 Optical methods are appealing because the absorption spectrum of blood in the "NIR window," where water and blood absorption are low so the light can penetrate the tissue, is strongly dependent on the level of oxygenation. However, purely optical approaches are only able to provide low-resolution images of tissue oxygenation, beyond a superficial region, due to the highly scattering nature of most biological tissues.
Photoacoustic imaging does not suffer from this limitation. The basis of the photoacoustic contrast is optical absorption, so it retains the specificity of optical imaging techniques, but the image resolution can be much higher, as the information about the contrast is carried out of the tissue by ultrasonic waves, which undergo little or no scattering. Photoacoustic imaging can, therefore, provide high-resolution images of subcutaneous vasculature. [11][12][13][14] Now, because the source of photoacoustic contrast is optical absorption, there is a strong temptation to assume that the photoacoustic image is proportional to the absorption coefficient and thus, the photoacoustic spectrum is proportional to the linear sum of the absorption spectra of the absorbers present. If this were the case, then a simple linear inversion of the type used in conventional optical transmission tissue spectroscopy 15 could straightforwardly be employed to recover sO 2 from photoacoustic images acquired at different wavelengths. However, this is not the case because, while the photoacoustic image depends on the optical absorption coefficient for contrast, it is not proportional to it: a photoacoustic image depends on both the absorption coefficient spectrum and the local light fluence spectrum. Since the fluence spectrum in tissue is not generally flat, it is unreasonable to ignore this dependence and assume, without further analysis, that absorption coefficients or related quantities such as blood sO 2 can be obtained from photoacoustic images via a simple linear spectroscopic inversion. To give a practical example, the wavelength dependence of the photoacoustic source at a blood vessel will depend not only on the absorption spectra of the oxyhemoglobin (HbO 2 ) and deoxyhemoglobin (Hb) present in the vessel, but, via the fluence, on the spectra of the absorbers and scatterers in the surrounding illuminated region; for example, the Hb and HbO 2 in the nearby microvasculature, water and lipids in the tissue, and melanin in the skin. This has the effect of distorting the photoacoustic spectrum, commonly known as spectral coloring, which is a well-known confounding factor that compromises sO 2 measurement accuracy, and there is already a large amount of literature describing attempts to ameliorate it. [16][17][18][19][20][21][22][23] Despite this, an increasing number of publications are appearing that do not take spectral coloring into account, and yet assume that the estimates of sO 2 that are obtained are trustworthy. It is particularly concerning that articles are appearing in which these questionable estimates of sO 2 are being used to make judgments in preclinical [24][25][26][27][28][29][30][31][32][33][34][35][36] and even clinical studies. [37][38][39][40][41] This trend has been exacerbated by the appearance of commercially available photoacoustic imaging platforms that appear to provide accurate estimates of sO 2 , even though the justification for these claims and experimental validation is weak at best. It is therefore of paramount importance that the accuracy with which blood sO 2 can be estimated from photoacoustic images without making an explicit correction for the wavelength-dependent fluence is studied carefully. The current paper explores this by investigating whether sO 2 can be accurately estimated in the presence of spectral coloring using linear inversions in a simple numerical phantom, with specific emphasis on examining the impact of the choice of optical wavelength.
The structure of this paper is as follows. Section 2 presents the theory of how blood oxygen saturation is linked to photoacoustic images. Section 3 describes two requirements, which, if satisfied, ensure that the spectral coloring can be safely ignored for the purposes of estimating sO 2 . In Sec. 4, the simple case of a single vessel embedded in a homogeneous absorbing and scattering medium is examined, with a series of modeling studies using a Monte Carlo (MC) model of light transport, to gain insight into the effects of spectral coloring. The necessary requirements for accurate sO 2 estimates are then applied to this situation to determine whether, even in this simple best case scenario, there are choices of wavelengths for which accurate sO 2 estimates are achievable without correcting for spectral coloring.

Estimation of Blood Oxygen Saturation from Photoacoustic Images
Photoacoustic tomography involves the generation of broadband acoustic pulses through rapid deposition of energy from visible or NIR nanosecond pulses of laser light. In tomography mode, the target tissue is flooded with light using wide-field illumination, and the resulting acoustic pulses are recorded at different spatial points on the tissue surface. There are then two steps to the image reconstruction. In the first step, the initial acoustic pressure distribution, p 0 , is estimated from the measured pressure time series, for a number of optical wavelengths, λ. In the second step, tissue properties, such as optical absorption coefficients, or chromophore concentrations, or oxygen saturation, are extracted from the p 0 ðλÞ images. This paper is concerned with the effect of the light fluence distribution in the tissue on the estimation of oxygen saturation, sO 2 , which is step 2, and so, it will be assumed that step 1 has been performed perfectly. This paper is therefore focused on the task of obtaining accurate estimates of sO 2 from a set of perfectly reconstructed images of p 0 ðλÞ. The initial acoustic pressure distribution p 0 is related to the spatial distributions of the light fluence Φ and tissue chromophore concentrations c k by (5) where † is used to indicate the pseudoinverse. [The constant κ has no bearing on the sO 2 estimates as it cancels out in Eq. (4).] Note that the matrix of wavelength-dependent fluence, Φðλ n Þ, is unknown and must be estimated, which, in general, is a nontrivial task.
We might be tempted to assume, in the first instance, that spectral coloring does not exist so the fluence is not wavelength-dependent and Φðλ n Þ can be replaced by the identity matrix. The photoacoustic signal would then be proportional to the absorption coefficient, μ a , and the problem becomes akin to transmission optical spectroscopy, in which the concentrations of Hb and HbO 2 can be calculated straightforwardly from the following simple linear inversion: In order to gain some insight into the effect of the fluence on photoacoustic sO 2 estimation, the two-wavelength case will be examined in detail. The same general principles will apply to the use of multiple wavelengths, but focusing on the two-wavelength case allows a clearer exposition of the essential ideas.

Without Fluence Correction
The effect of fluence can be understood by noting that a change in amplitude at a point in a photoacoustic image, δp 0 , due to a change in absorption coefficient there, δμ a (in this case due to a change in wavelength) is given by E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 7 ; 6 3 ; 4 5 2 δp 0 ¼ κðδμ a Φ þ μ a δΦ þ δμ a δΦÞ; where δΦ is the change in fluence due to the change in absorption. Dividing through by p 0 and using Eq. (1) yields E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 8 ; 6 3 ; 3 9 9 where the small second-order term ðδμ a ∕μ a ÞðδΦ∕ΦÞ has been neglected. When the sO 2 estimation proceeds by ignoring the effect of fluence, it will only be accurate when the relative change in the photoacoustic image is equal to the relative change in the absorption coefficient. Equation (6) can then be used for the inversion. From Eq. (8), it is clear that for this to be the case, i.e., for δp 0 ∕p 0 ≈ δμ a ∕μ a , the following condition must hold.
• Requirement 1: Fluence difference is small. At every point of interest, the relative difference in the fluence at the two wavelengths should be much smaller than the relative change in absorption coefficient: ; t e m p : i n t r a l i n k -; e 0 0 9 ; 6 3 ; 2 2 3 δμ a μ a ≫ δΦ Φ : At first glance, Eq. (9) might suggest that a good way to satisfy requirement 1 would be to choose wavelengths for which jδμ a ∕μ a j is large. However, as will be demonstrated in the example in Sec. 4.2, this would be wrong because the two terms in Eq. (9) are not independent and a large change in absorption coefficient will result in a large change in fluence. It turns out that the best way to satisfy Eq. (9) is to choose wavelengths at which the absorption coefficient is similar, i.e., δμ a is small. While such a choice minimizes spectral coloring, it introduces another problem. When there is only a small change in the absorption coefficient and a consequent small change in fluence, there will be only a small change in the photoacoustic amplitude. It is therefore important to consider the effect of noise, specifically the noise in the difference image δpðxÞ. This leads to requirement 2.
• Requirement 2: Low noise. The noise in the difference image must be sufficiently low that the uncertainty that it gives rise to in the estimates of sO 2 is sufficiently small for the application under consideration.
Requirement 2 can be subdivided into two related requirements.
• Requirement 2a: High sensitivity. The sensitivity of the imaging device must be sufficiently high (i.e., the instrument noise level must be sufficiently low) that the noise in the two images p 0 ðλ 1 Þ and p 0 ðλ 2 Þ, and therefore, the noise in the difference image δp is low enough that the level of the resulting uncertainty in sO 2 due to the noise is sufficiently small.
• Requirement 2b: Good conditioning. The wavelengths must be chosen so that the molar absorption coefficient matrix in Eq. (6) is sufficiently well-conditioned that the level of the uncertainty in sO 2 due to the noise is sufficiently small.
Requirement 2b puts a bound on the condition number of the molar absorption coefficient matrix given the noise level and the accuracy required in the sO 2 estimates. In an extreme case, the molar absorption coefficient matrix will be singular (have no inverse). For example, when the wavelengths have been chosen at two isosbestic points for hemoglobin (e.g., 586 nm and 808 nm; see Fig. 4), δp 0 will be the same for all oxygenation levels, so there is no way to determine, given a measured δp 0 , what the corresponding value of sO 2 should be. There is no sensitivity to sO 2 for this choice of wavelengths. The requirement that the matrix is not singular is another way of saying that δp 0 must map uniquely onto sO 2 . In a less extreme case, but one in which the condition number is large, the inversion will be very sensitive to the noise in the images, so even a low level of noise could lead to uncertainties in the sO 2 estimates that are greater than its full range (0% to 100%), rendering them meaningless. An example of such a case would be where the molar absorption coefficient of Hb is almost proportional to the molar absorption coefficient of HbO 2 at the chosen wavelengths, i.e., α Hb ðλ n Þ ≈ Kα Hb0 2 ðλ n Þ, for the chosen wavelengths λ n and some constant K. This could arise, for example, if the two wavelengths were chosen to lie between 875 and 925 nm, where the spectral characteristics of Hb and HbO 2 are similar (see Fig. 4). In this wavelength range, the measured δp 0 is only weakly dependent on blood oxygenation making accurate sO 2 estimates very challenging in the presence of noise. In Ref. 42, maximizing the condition number was used as the sole criterion for choosing the optimal wavelengths. The risk of considering only condition number (which is to say the sensitivity to noise) when choosing the wavelengths, and failing also to take into account the effect of spectral coloring, is that the recommended wavelengths could well be those at which spectral coloring has the greatest deleterious effect on the accuracy.
In summary, requirement 1 must be satisfied to avoid the estimate of sO 2 suffering from spectral coloring. Requirement 2 must be satisfied to avoid the estimate of sO 2 being dominated Journal of Biomedical Optics 121914-3 December 2019 • Vol. 24 (12) by noise. Both requirements 1 and 2 apply to every image voxel independently and must be satisfied if sO 2 estimates from photoacoustic images are to be considered accurate, when no attempt has been made to correct for the unknown fluence spectrum and a linear inversion [Eq. (6)] used.

With Fluence Correction
There may be situations in which an approximate correction for the spectral coloring can be incorporated into the routine to estimate sO 2 . For example, an estimate of the fluence spectrum from a measurement or a model can be divided out of the measured photoacoustic amplitudes to give a measured photoacoustic spectrum closer to the absorption coefficient spectrum.
To obtain accurate estimates of sO 2 when using a fluence correction, similar conditions to those above apply.
• Requirement 1: The condition in Eq. (9) still applies if the fluence difference δΦ is replaced by δΦ 0 , the corrected fluence difference: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 0 ; 6 3 ; 5 3 5 δΦ 0 ≔ If the fluence correction is accurate at both wavelengths, then δΦ 0 ¼ 0 and this requirement is always satisfied.
• Requirement 2: This requirement still applies, although the relevant matrix in Requirement 2b is now the "fluence-weighted" molar absorption coefficient matrix (the molar absorption coefficient matrix multiplied by the diagonal matrix containing the estimated fluence spectrum). Also, the relevant noise level is now that of the difference between the "fluence-corrected" p 0 images, which will likely be different from the uncorrected case.

Multiwavelength Case
The two-wavelength case was studied above so that the essential requirements for accurate estimation of sO 2 could be clearly identified. When multiple wavelengths are used, equivalent requirements apply: it is still necessary that the fluence variation be minimized between the wavelengths, it is still necessary that the molar absorption coefficient matrix be well-conditioned, and it is still necessary that the noise level in the measurements are sufficiently low.

Example: Estimating Blood Oxygen Saturation in a Single Isolated Vessel
To demonstrate the above general principles, we will describe a simple numerical example consisting of a blood-filled tube immersed in a homogeneous, absorbing and scattering background. The use of an unadorned phantom such as this permits close investigation of the effect of spectral coloring from both inside and outside the tube region, from which insights into the effect may be drawn. As the focus of this paper is on the effect of fluence, it is important to isolate its effect from any other possible causes of error. With experimental data, there may be artifacts caused by the choice of image reconstruction algorithm, limited detection aperture, acoustic band-limiting, positional uncertainty of sources and sensor, sensor directivity, heterogeneous tissue acoustic properties, and so on. The effects of these uncertainties on sO 2 estimates are not considered in this paper, which should, therefore, be taken as a best case scenario. In Sec. 4.2, images obtained at 26 equally spaced wavelengths in the range 500 to 1000 nm were used simultaneously in the inversion for sO 2 . Section 4.3 explores whether the sO 2 estimates could be improved using pairs of these wavelengths rather than all at once, as only two wavelengths are needed to make Eq. (5) invertible, and often the use of just two wavelengths is a practical requirement.
Two cases will be examined in this paper: (1) the effect on estimates of sO 2 ignoring spectral coloring, i.e., setting ΦðλÞ ∝ I , the identity matrix, and (2) the effect on estimates of sO 2 approximately correcting for spectral coloring using an expression for the depth-dependent decay of the form, E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 1 ; 3 2 6 ; 5 9 8 Φðλ; zÞ ¼ exp½−μ fit ðλÞz; (11) where z is the depth from the tissue surface and μ fit ðλÞ is obtained by fitting Eq. (11) to the function of z obtained by taking the projection of p 0 in the lateral directions. The reason for examining this latter case is that such an approximate correction is expected to improve the sO 2 estimates, is simple and easy to implement, and is quite commonly used in one form or another (if only as a form of time-gain compensation to improve image visualization with depth). The former approach will be referred to in shorthand as an "uncorrected" and the latter as "corrected," although, of course, the correction is not perfect.

Digital Phantom
The cubic 5 × 5 × 5 mm domain had cubic 25 μm voxels and index-matched, optically absorbing boundaries. A blood-filled tube, 250 μm in diameter, was embedded in the domain at a depth of 1 mm parallel to the illumination surface. The domain was illuminated with a collimated 2.5 mm diameter column of light (top-hat profile). The MC model of light transport used 10 12 photons, which was determined to be sufficient to ensure sO 2 estimates were not impacted by MC noise (see Sec. 6). The arrangement is shown in Fig. 1.
The optical properties were chosen to be representative of soft biological tissue. 43 The blood within the tube had a total hemoglobin concentration c HbT ¼ 150 g l −1 and oxygen saturation of 90% (c HbO2 ¼ 135 g l −1 and c Hb ¼ 1 5 g l −1 ). Two cases were considered for the background optical properties: (a) no spectral dependence and (b) spectral dependence typical of a capillary bed, modeled as 55% water and 45% blood with a total hemoglobin concentration, c bg HbT ¼ 5.63 g l −1 and 60.7% oxygenation (c HbO2 ¼ 3.42 g l −1 and c Hb ¼ 2.11 g l −1 ). The scattering coefficient throughout the domain followed the relationship 21.3 mm −1 À λ 500 nm Á −1.2 , and the Henyey-Greenstein phase function was used, with an anisotropy factor of g ¼ 0.9 for the background and g ¼ 0.9945 for blood. The initial pressure distribution was calculated using Eq. (1) with κ set to 1 with no loss of generality. Figure 2 shows examples of the initial acoustic pressure and fluence distributions at wavelengths of 580 and 600 nm with a spectrally varying background [case (b)]. Both are maximum intensity projections in the y direction and are normalized by the maximum value of acoustic pressure or fluence, respectively. Spectral coloring is due to the wavelength dependence of the fluence distribution, which can be clearly seen in this example, as the fluence distributions at 580 and 600 nm are notably different, especially in the region of the tube. This difference is due to the absorption coefficient of blood being an order of magnitude greater at 580 nm than at 600 nm, and it would not be surprising if it were to impact on the ability to accurately estimate oxygen saturation using these wavelengths.

Estimates using 26 Wavelengths over
the 500-to 1000-nm Range As a starting point for this investigation, images were obtained at 26 wavelengths evenly spaced at 20-nm intervals between 500 and 1000 nm. These were used simultaneously to estimate sO 2 using Eq. (5) with Φ ∝ I . Both the cases of (a) no spectral dependence in the background and (b) spectral dependence typical of tissue were studied. Figure 3 shows a slice in the x − z plane, halfway along the y axis, of the error in the resulting sO 2 estimates in the tube for both the spectrally flat (upper plot) and spectrally varying (lower plot) backgrounds. (Throughout the paper, error will be calculated as jsO true 2 − sO 2 j.) The estimates in the tube differ voxel-to-voxel, and the histograms show how the distribution of the oxygenation estimates varies with depth. The voxels included in the histograms are those within the tube in the x − y slices at four depths (a, b, c, d), where each x − y slice is orthogonal to the x − z slice shown. The sO mean 2 values are the means of the associated histograms. The global mean sO g:mean 2 is the mean of the sO 2 estimates in all voxels in the tube.
The histograms of the error in the sO 2 estimates show that, even in the case of a spectrally flat background, the mean sO 2 is at best 75% (compared to the true value of 90%) and the accuracy decreases rapidly with depth through the tube. As there is no spectral coloring from the background absorbers in this case, it is due wholly to the presence of the blood within the tube. These results provide a compelling illustration of the adverse influence of spectral coloring. They suggest that even under these highly benign conditions of a spectrally flat background, reasonable accuracy, without any fluence correction, is only possible if the spatial resolution of the imaging instrument is sufficient to resolve those few voxels that lie in the center of the x − y plane within the uppermost region (slice "a"). Any spatial averaging, particularly in the z direction, due to the resolution limitations of a practical imaging instrument, will rapidly produce very significant errors. For the spectrally varying background (Fig. 3 lower plot), the situation is even worse to the extent that accuracy is significantly compromised even for the "best" voxels in the center of the most superficial slice "a." The error is larger because of spectral coloring by the background region as well as the blood within the tube itself.
The results in Fig. 3 also show that there is a tendency to underestimate sO 2 relative to the true value of 90%. This is the result of the inversion being weighted by wavelengths, where the change in fluence between illumination wavelengths is large. It can be seen in Fig. 4, which shows the spectra of the absorption coefficient in the tube and the fluence averaged over the tube volume that the change in fluence is most significant between 500 and 600 nm. (In a separate set of simulations, not shown here, linear inversions only using wavelengths in the 500-to 600-nm range always resulted in the underestimation of sO 2 relative to the true value of 90%, whereas when only wavelengths greater than 600 nm were used, sO 2 was overestimated.) Estimating sO 2 over this wavelength range without accounting for the spectral coloring of the fluence is clearly not likely to give accurate values. To make a first-order correction for the fluence while retaining the simplicity of the linear inversion, the depth-dependent fluence correction, Eq. (11), was applied to the dataset obtained with a spectrally varying background. The resulting estimates are shown in the lower plots in Fig. 3. Comparing the corrected and uncorrected histograms, it is clear that the sO 2 estimates for this case are improved by 20% to 30% and are now comparable to the case with a spectrally flat background. Although the fluence correction provides a clear benefit, the accuracy remains, at best, only comparable to that achieved in the spectrally independent case with acceptable accuracy (∼6%) only achieved for those voxels in the center of the x − y plane in the most superficial slice "a." In summary, the results in this section show that, when using 26 wavelengths in the range 500 to 1000 nm, spectral coloring has a highly deleterious impact on the accuracy of the sO 2 estimates, to the extent that acceptable accuracy is unlikely to be achieved in practice using this wavelength range. However, Fig. 4 indicates that the error is likely to be most significant between 500 and 600 nm, where the wavelength dependence of the fluence is greatest. This suggests that, in contrast to conventional optical transmission spectroscopy, there may be an advantage to selecting wavelengths where the absorption of blood varies slowly with wavelength (which here is also where the absorption coefficient is low) because this will minimize spectral coloring. For this reason, in the next section, consideration is given as to whether more accurate estimates could be obtained by selecting wavelengths beyond 600 nm, where blood absorption and its spectral dependence is relatively low. 44

Optimal Wavelength Pairs in the 500-to 1000-nm Range
In general, when estimating some quantity from experimental measurements, it is often reasonably assumed that the more data available the better the estimate will be. 45 The implicit intuition is that the effect on the estimate of the errors in the individual instances of the data will be symmetrically distributed about the of the corrected data, which represents sO 2 estimates averaged over all voxels enclosed within the entire tube, is shown to the right of the histograms. Fig. 4 Plot of absorption spectrum in the tube (dashed blue) for 90% oxygenated blood and average fluence spectrum (green) in the tube. Also shown are the spectra for fully oxygenated blood (red) and deoxygenated blood (black). Fluence simulated using MC with 10 12 photons, collimated, illuminated 2.5 mm in diameter at wavelengths in 500-to 1000-nm range, and averaged over tube volume.
Journal of Biomedical Optics 121914-6 December 2019 • Vol. 24 (12) true value and so "cancel out," resulting in an estimate whose accuracy increases with the amount of data used. Here, while using more wavelengths may improve the SNR and may improve the conditioning of the multiwavelength inversion, the effect of spectral coloring will remain and could counteract both these advantages. Unfortunately, the coloring that the fluence gives the photoacoustic spectra is not guaranteed to cancel out, and so using images at more wavelengths does not guarantee a better estimate of sO 2 when spectral coloring is not corrected for. Following this reasoning, using fewer wavelengths can potentially give more accurate estimates, but only if the spectral coloring at the wavelengths chosen is low. This will be examined in this section, but rather than investigating all possible combinations of wavelengths, pairs of wavelengths will be investigated. This has two advantages: first, the practical advantage that it is a manageable number of sets of wavelengths, and second, the advantage that the simplicity of using two wavelengths allows insights to be gleaned, which might be harder to see were larger groups of wavelengths used.
In order to demonstrate that the wavelengths that give the best estimates of sO 2 are those that satisfy requirement (1), sO 2 was estimated using Eq. (5) for all possible pairs of the 26 wavelengths in the 500-to 1000-nm range, i.e., 26 C 2 ¼ 325 wavelength pairs. Three cases were examined: (a) spectrally flat background and no fluence correction (Φ ∝ I), (b) spectrally varying background and no fluence correction, and (c) spectrally varying background and the depth-dependent correction [Eq. (11)]. Figure 5 shows histograms of the sO 2 estimates for these three cases, for just the pairs of wavelengths that gave a mean sO 2 closest to the true value. The voxels that satisfy the condition in Eq. (9) that the relative fluence change be much smaller than the relative absorption change (here interpreted as 2 orders of magnitude smaller) are highlighted in red.
Comparing Figs. 3 and 5, it is evident that these wavelength pairs provide significantly improved accuracy compared to using all 26 wavelengths across the 500-to 1000-nm wavelength range. The global mean sO 2 values in all cases are within 2% of the true value of 90% in Fig. 5, whereas the global mean sO 2 values achieved when using the 500-to 1000-nm spectral range are subject to an error of almost 30% (Fig. 3). When the background is spectrally flat, Fig. 5(a) shows that sO 2 is accurately estimated (to within 0.5%) and that estimates made under the condition that the relative change in the absorption coefficient far exceeds the accompanying relative change in fluence are more accurate. These estimates correspond to regions at the surface of the tube and at the center of the x − y slice. When the background has some spectral dependence, sO 2 estimates are less accurate with errors of 1% to 2%, and much higher variance, as shown Fig. 5(b). It can be observed that there are no voxels that satisfy Eq. (9) in Fig. 5(b) due to spectral coloring being significant at 700/820 nm. Figure 5(c) shows that the use of a depth-dependent fluence correction shifts estimates to a global mean value of 88.8% in the tube, and again the values closest to the true value are those that satisfy Eq. (9).
Given that in most practical settings the ratio ðδΦ∕ΦÞ∕ ðδμ a ∕μ a Þ is unknown, and therefore cannot be used directly as an indicator of the accuracy of the sO 2 estimate, it is desirable to obtain a result that is of more general applicability. One could, for example, determine the wavelength pairs ðλ 1 ; λ 2 Þ that yield the most accurate sO 2 estimates on average over the tube, thereby providing some indication of which wavelengths are best-suited to recovering sO 2 . Figure 6 illustrates this by showing the error in oxygenation, averaged over the tube, obtained at a range of wavelengths λ 1 and λ 2 between 500 and 1000 nm when the background was spectrally varying. Four different tube sO 2 values were considered. The bottom-left triangles show estimates obtained without the depth-dependent fluence correction, whereas the top-right triangles contain values obtained when the data were corrected using Eq. (11).
Several observations emerge from the results in Fig. 6. First, it is noticeable that there are two distinct spectral regions "a" and "b," indicated by white-colored pixels, where the error is particularly high (>100%). The error is large for region "a" because it encompasses the spectral range of 500 to 600 nm, where blood absorption is strongly wavelength-dependent, resulting in significant spectral coloring. Region "b" represents the spectral range of 820 to 900 nm and exhibits poor accuracy for a different reason, namely, poor conditioning of the molar absorption coefficient matrix as the spectra of Hb and HbO 2 vary almost in proportion, i.e., the ratio α Hb ∕α HbO 2 is approximately constant in Fig. 5 Histograms of sO 2 estimates in each voxel within the tube for (a) spectrally flat with mean sO 2 of 90.3%, (b) spectrally varying background with mean sO 2 of 91.6%, and (c) spectrally varying with depthdependent fluence correction with mean sO 2 of 88.8% obtained using linear inversion at 620/920 nm, 700/820 nm, and 700/820 nm, respectively; sO 2 estimates for which the ratio ðδΦ∕ΦÞ∕ðδμ a ∕μ a Þ < 0.01 shown in red and ðδΦ∕ΦÞ∕ðδμ a ∕μ a Þ ≥ 0.01 in black. Notes: vertical line indicates true oxygenation of 90%; there are no sO 2 estimates for which ðδΦ∕ΦÞ∕ðδμ a ∕μ a Þ < 0.01 for the spectrally varying background (uncorrected) due to severity of spectral coloring of fluence; note the scale change on vertical axis between flat and various backgrounds. These wavelength pairs yielded the most accurate sO 2 estimates of all pairs examined. sO g:mean 2 represents the mean of the sO 2 estimates all voxels enclosed within the entire tube and is termed the global mean sO 2 .
Journal of Biomedical Optics 121914-7 December 2019 • Vol. 24 (12) this wavelength range. Second, if the wavelengths that give the most accurate estimate of sO 2 are selected, a depth-dependent fluence correction may not provide a noticeable improvement over no correction in the accuracy of the sO 2 estimates. This is consistent with Figs. 5(b) and 5(c) and is likely to be due to the fact that the spectral coloring is less significant when wavelengths are well-chosen. The exception to this is where one wavelength lies in the 500-to 580-nm range. In this case, the application of the depth-dependent fluence correction provides a benefit as spectral coloring is more severe in this wavelength range. This may be important in applications that necessitate the use of shorter wavelengths, for example, in situations where high SNR is required or the excitation wavelength range of the laser source is limited. Figure 6 also shows that in each case inversions, whether with or without the fluence correction, yield the highest accuracy at wavelengths between 620 and 1000 nm (excluding region "b"), with few wavelength pairs outside this range producing accurate sO 2 values. Moreover, the number of wavelength pairs within this wavelength range yielding an error between 0% and 100% (the range of the color bar in Fig. 6) does not vary significantly for the four different tube sO 2 values. However, the number of wavelength pairs yielding reasonable accuracy (an error <10%) is strongly dependent on the value of sO tube 2 and decreases with decreasing sO tube 2 . This can be seen most clearly by comparing the maps for sO tube 2 ¼ 70% and sO tube 2 ¼ 100%. These show that, while almost any pair of wavelengths between 620 and 1000 nm is likely to yield reasonable accuracy for sO tube 2 ¼ 100%, the same does not apply at sO tube 2 ¼ 70%, where there are significantly fewer wavelengths yielding accurate estimates. This is because the wavelength dependence of blood absorption increases with decreasing sO 2 for wavelengths below the isosbestic point at 800 nm.
At low sO tube 2 values, spectral coloring, therefore, increases for the majority of wavelengths in the 620-to 1000-nm range, so there are fewer wavelengths that provide the low levels of spectral coloring required for accurate estimates. Moreover, at low sO 2 values, those wavelength pairs that do yield accurate In each figure, the bottom-left triangle represents sO 2 estimates made with a spectrally varying background and the top-right triangle represent sO 2 estimates made with a spectrally varying background using a depth-dependent fluence correction. The white pixels represent estimates with an error >100%. In the regions marked "a," this error is due to spectral coloring; in the regions marked "b," it is due to poor conditioning.  (12) estimates become increasingly sparsely distributed in wavelength space. Under the conditions of low sO tube 2 , the combination of these factors makes it very challenging to select the optimal wavelengths since they depend critically on the tube sO 2 , which is generally unknown a priori in practice. (Interestingly, additional simulations, not shown, showed that, while the optimal wavelengths were dependent on the tube sO 2 , they remained relatively independent of changes in the background sO 2 . This may be due to the relatively low background absorption in this simulation.) In summary, the results in this section suggest that, unlike the 500-to 600-nm wavelength range, where spectral coloring severely compromises accuracy under almost all conditions, it may be possible to achieve reasonable accuracy in some circumstances by using longer wavelengths between 620 and 1000 nm. For example, for high sO tube 2 values (>90%), a modest level of accuracy may be achieved by selecting any pair of wavelengths in this range. This relaxed wavelength selection criterion also has the important practical advantage that accurate estimates are not then contingent upon accurate a priori knowledge of sO tube 2 , just that it is around 90% or higher. This could be useful, e.g., when measuring sO 2 in an artery, where the physiological range of sO 2 is quite narrow. However, for lower sO tube 2 (e.g., 70%), wavelength selection becomes more problematic as there are only relatively few sparsely distributed wavelength pairs that provide reasonable accuracy and identifying these wavelengths requires a priori knowledge of sO tube 2 , which is generally unavailable in vivo. Even our seemingly encouraging conclusion that blind optimal wavelength selection within the 600-to 1000-nm range is feasible when sO tube 2 is high (>90%) may be unduly optimistic since it is based on the assumption of noise-free data. In practice, any imaging system will introduce noise, which may make certain wavelengths, where the absorption coefficient is low, impractical to use as the SNR would be too low, as discussed in Sec. 4.4.

Signal-to-Noise Ratio
The suggestion above to select wavelengths above 620 nm, where blood absorption changes less with wavelength than in the 500-to 600-nm range, means SNR is sacrificed in favor of the accuracy of the oxygenation estimates. This trade-off is made evident by considering that we require that the change in the initial pressure, p 0 , as a result of the change in wavelength from λ 1 to λ 2 , exceeds the minimum detectable change in p 0 , the "image noise equivalent pressure": jδp 0 j > η, or E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 2 ; 6 3 ; 2 5 0 jδμ a j > assuming any system-dependent scaling has been calibrated out, and the wavelengths have already been chosen so that δp 0 ≈ ΓΦδμ a . (The ffiffi ffi 2 p factor comes from assuming that p 0 ðλ 1 Þ and p 0 ðλ 2 Þ are uncorrelated and have equal SNR such that their variance is additive. 46 The inequality in Eq. (12) was applied to the data in Fig. 6 in order to illustrate the effect that taking SNR into account can have on the wavelength selection. This was performed for 325 wavelength pairs in the 500-to 1000-nm range and typical values for the parameters in Eq. (12) were used: η ¼ 0.2 kPa and Γ ¼ 0.15. 47 The fluence was computed using the MCgenerated data used in the inversion and was scaled by an incident fluence of 20 mJ cm −2 . Using these parameters yielded mean image SNRs in the range 26.5 to 45.7 dB, depending on wavelength and the value of the tube sO 2 . These SNRs are significantly higher than those of high quality in vivo photoacoustic images of superficial blood vessels 48,49 and therefore on the optimistic side of what can be achieved in practice. The data in Fig. 6 were then replotted in Fig. 7 but masking the pairs of wavelengths for which the above inequality in Eq. (12) is not satisfied. In other words, those wavelengths for which the change in initial pressure from λ 1 to λ 2 was less than η.
For all the oxygenation levels studied, the number of wavelengths that yield accurate estimates in the presence of noise is considerably reduced compared to the noise-free case (shown in Fig. 6). For the remaining wavelength pairs, the depth-dependent fluence correction can provide an improvement in accuracy, particularly for low values of sO tube 2 . This is because for low values of sO tube 2 , most of the wavelength pairs that satisfy Eq. (12) comprise at least one wavelength that lies within the spectral range of 500 to 620 nm, where blood absorption is high and varies rapidly with wavelength. The resulting spectral coloring can, therefore, be mitigated by the fluence correction.
In terms of optimal wavelength selection, there are several spectral regions that yield reasonable accuracy. For sO tube 2 ¼ 100% and sO tube 2 ¼ 90%, the spectral ranges defined by regions "a" and "b" in Fig. 7 provide accurate estimates with errors predominantly below ∼10%. However, the number of wavelength pairs in these regions decreases with decreasing sO tube 2 to the extent that these regions disappear entirely for sO tube 2 ¼ 70%. This is because δμ a becomes progressively smaller with decreasing sO 2 for wavelengths that lie in these spectral ranges until it no longer satisfies Eq. (12). Perhaps, a more useful spectral range is that defined by region "c," where one wavelength is 600 nm and the other must lie between 800 and 1000 nm. Any wavelength pair lying in region "c" will not only satisfy Eq. (12) but will also yield reasonable accuracy for all the sO tube 2 values represented in Fig. 7. This is because blood absorption at 600 nm is high so δμ a arising from the wavelength change λ 1 ¼ 600 nm to λ 2 > 800 nm is large enough to satisfy Eq. (12) but not so high as to introduce excessive spectral coloring. However, although accuracy appears to be reasonable for these wavelength pairs, it does decrease significantly with reducing sO tube 2 . For example, for the spectral range defined by region "c" and for sO tube 2 ¼ 100%, if λ 2 ¼ 600 nm, then the error varies from 2.8% to 8.1% depending on the choice of λ 1 . By contrast for sO tube 2 ¼ 70%, the corresponding error varies from 7.7% to 18% for region "c." The wavelength pair that yielded the best accuracy across all four sO 2 values in Fig. 7 was found to be 600∕800 nm with an error ranging from 9.8% for sO tube 2 ¼ 70% to 3.6% for sO tube 2 ¼ 100%. Although this appears to suggest that reasonable accuracy may be possible using the 600-∕800-nm wavelength pair, this should be treated with caution because the spectral region for one of the wavelengths is narrow, and a higher noise level or a small mismatch between the model and an experimental scenario could easily shift or eliminate it. In this study, optimistic assumptions of perfect image reconstruction and a homogeneous background have been made, so there are no artifacts, and the typical SNR achieved in vivo is often lower than used here. The latter will have the effect of reducing the number of wavelengths that satisfy Eq. (12) and for those remaining wavelengths that do satisfy it, they will be biased to shorter wavelengths, where spectral coloring is greater and accuracy reduced. Furthermore, the specific wavelength pairs that do Journal of Biomedical Optics 121914-9 December 2019 • Vol. 24 (12) yield accurate estimates will be increasingly dependent on the sO tube 2 values, which are unknown, making it hard to identify those wavelength pairs with confidence.
In summary, in the previous section it was observed that it may be possible to achieve accurate estimates for sO 2 values by selecting wavelength pairs in the 600-to 1000-nm range. However, noise has the negative impact of not only significantly reducing the number of wavelength pairs that provide accurate estimates but also making the specific values of these wavelengths increasingly dependent on the sO tube 2 . Indeed, it is the latter that arguably presents the greatest challenge since sO tube 2 is typically unknown a priori, determining it being the point of the measurement, so there is no certain way of identifying those few wavelength pairs that yield accurate estimates.

Summary and Conclusions
In this paper, we investigated whether blood oxygenation can be accurately estimated from multiwavelength photoacoustic images when using a simple linear inversion, which makes the simplifying assumption that the fluence does not change with wavelength, and spectral coloring is therefore ignored. The resulting linear inversions were tested using a digital phantom of a tube in a homogeneous background under the assumption that the acoustic properties and Grüneisen parameter within the domain were constant and that the acoustic inversion used to reconstruct p 0 provided an exact image without artifacts, distortion, or other errors.
These simulations demonstrated that spectral coloring distorts the internal fluence distribution and can introduce significant inaccuracies in the estimation of blood oxygenation. It was found that using a set of illumination wavelengths covering the NIR/visible spectrum (500 to 1000 nm) yields highly inaccurate blood oxygenation estimates. This is due to strong spectral coloring arising from the large changes in fluence that occur between 500 and 600 nm due to the strong wavelength dependence of hemoglobin absorption. Improved accuracy in sO 2 Fig. 7 Matrix of average error in oxygenation for 26 wavelength pair combinations in 500-to 1000-nm range for different values of oxygenation in the tube, sO tube 2 . In each figure, the bottom-left triangle represents sO 2 estimates made with a spectrally varying background and the top-right triangle represents sO 2 estimates made with a spectrally varying background using a depth-dependent fluence correction. The white pixels represent estimates with an error >100%. In the regions marked "a," this error is due to spectral coloring; in the regions marked "b," it is due to poor conditioning. Light blue regions indicate wavelengths that do not satisfy the condition in Eq. Journal of Biomedical Optics 121914-10 December 2019 • Vol. 24 (12) estimates was obtained when the absorption coefficient of blood changed little between illumination wavelengths resulting in reduced spectral coloring, which is generally achieved at wavelengths in the range 620 to 1000 nm. However, at these wavelengths, blood absorption is relatively low, so, in practice, this may result in the change in photoacoustic amplitude being below the noise floor, again resulting in inaccurate sO 2 estimation. There is, therefore, a "spectral coloring dilemma" in photoacoustic imaging when a fluence correction is not applied, a fundamental trade-off between accuracy and SNR, which limits the practical applicability of a linear spectroscopic inversion even for the simple tube geometry studied here.
Having said that, the tube study (see the results in Fig. 7) also showed that there may be pairs of wavelengths for which both the spectral coloring is low and there is sufficient SNR to obtain a fairly accurate sO 2 estimate. Unsurprisingly, the accuracy was improved when a fluence correction, although a partial, only depth-dependent, correction was used. (Of course, a complete and accurate fluence correction, were it available, could be used to remove the problem of spectral coloring altogether.) Nevertheless, the difficulty in making this observation practically useful is that the wavelengths that satisfy these requirements are by no means obvious. Here, they were found by testing hundreds of wavelength pairs using a model that, by definition in this phantom study, was accurate. In an in vivo context, an accurate description of the tissue anatomy and physiological status is rarely available a priori, and a simplified model will need to be used to compute the suitable wavelengths, bringing with it inevitable uncertainty.
It should also be noted that this study made a number of optimistic assumptions: First, and perhaps most notably, the images, p 0 ðλÞ, contained no reconstruction artifacts. In practice, a photoacoustic image is not an exact depiction of the initial pressure distribution in the tissue due to instrumentation related factors, such as detector bandwidth limitations and limited detection aperture: the latter being a particularly significant source of error with the linear or planar detection arrays commonly used in clinical and preclinical photoacoustic imaging. Second, the spectral coloring was modest due to the very simple absorber (one tube), its superficial location (1 mm depth), and the limited sO 2 range investigated. Third, the SNR was higher than typically found in in vivo images. This was, therefore, a study of a best-case scenario. Despite this, there were relatively few cases for which reasonable sO 2 estimates were obtained. This suggests caution should be exercised when estimating blood oxygen saturation with a linear inversion in vivo, where the situation is likely to be considerably more complex and noisy.
In summary, the aim of this work was to investigate whether there are circumstances in which a linear spectroscopic inversion, which implicitly does not account for spectral coloring, can provide accurate in vivo sO 2 estimates. On a narrow reading, the conclusion is that there are, or could be, by choosing the right pair of wavelengths. The challenge will be finding these wavelengths with confidence. This will likely require carefully controlled experimental phantom studies or numerical simulations that fully account for the imaging instrumentation and image reconstruction process and the geometry and physiological status of anatomical region of interest. Given the multitude of confounding factors, it is our conclusion that a simple linear inversion should be applied in vivo with very considerable caution and only following a careful and comprehensive validation.
By highlighting the challenges presented when trying to estimate blood oxygenation using photoacoustic imaging, the findings of this study will help inform the practical application of spectroscopic photoacoustic techniques to the in vivo measurement of blood sO 2 . This is becoming ever more important as photoacoustic imaging is translated to practical biomedical applications in the clinical and life sciences and commercial systems are increasingly available. There are a number of ways in which light distributions in scattering media can be modeled. The radiative transfer equation (RTE) is accurate, but the requirement for spatial and angular discretization quickly leads to excessive computational demands, especially for 3D domains. The diffusion approximation to the RTE, while computationally efficient, does not accurately model the fluence close to the surface, a region of interest for photoacoustics, or in nonturbid media. Here, a MC model of light transport was used. MC models estimate solutions to the RTE by summing a large number of "photon" random walks. They are highly parallelizable as the photons do not interact, and therefore, the computational effort scales well with domain size. The code used here was MCX, 50 which has been validated extensively and is GPU-accelerated, offering relatively short computation times even for a large number of photons required for convergence. A method to determine the number of photons required for convergence is presented below.

Level of Uncertainty in sO 2 Estimates Due to MC Noise
Although MC models are considered the "gold standard" (Ref. 51), it is important to recognize that due to the stochastic nature of simulations, the estimate of the fluence in each voxel will exhibit some variance. As the number of photons increases, the variance will reduce, and the fluence estimate will converge to the solution given by the RTE. For these simulations, it is necessary to use sufficient photons that the variance (the "MC noise") falls below an acceptable level. To assess what level this needs to be, we examined the effect of the MC noise on two-wavelength sO 2 estimates calculated from initial pressure distributions generated using the MC models. The standard deviation in oxygenation at a particular voxel, σ sO 2 , can be estimated from the standard deviation in the initial acoustic pressure simulated for that voxel, σ p 0 ðλ i Þ , by assuming that the initial pressure at the two wavelengths, p 0 ðλ 1 Þ and p 0 ðλ 2 Þ, is uncorrelated 52 and using the standard formula: where the partial derivatives of sO 2 with respect to initial pressure can be computed from Eqs. (4) and (5). The standard deviation σ p 0 ðλ i Þ was estimated from multiple simulations in the same domain with the same number of photons, by reseeding the random number generator to ensure different photon trajectories from simulation to simulation. Running the simulation multiple times at a pair of wavelengths allowed σ sO 2 to be calculated. The criterion used to ensure a sufficient number of photons was that Journal of Biomedical Optics 121914-11 December 2019 • Vol. 24 (12) the spatially varying standard deviation σ sO 2 was everywhere much less than the precision required in the oxygenation estimates. Performing the analysis above for the tube phantom using 10 model runs at 500 and 800 nm (chosen due to the significant difference in the absorption of blood at these wavelengths) yielded relative uncertainties of <0.16% in the initial pressure in the tube. Using Eq. (13) to propagate these errors to sO 2 estimates made using p 0 ðλÞ averaged over the 10 model runs produced a maximum σ sO 2 value in the tube of 0.34%, indicating that uncertainties in sO 2 estimates due to MC noise are negligible. Note that this was corroborated by directly computing the standard deviation in the sO 2 estimates themselves.

Disclosures
The authors have no relevant financial interests in this article and no potential conflicts of interest to disclose.