High-speed hyperspectral imaging enabled by compressed sensing in time domain

Abstract. Hyperspectral imaging (HSI) is a powerful tool widely used for various scientific and industrial applications due to its ability to provide rich spatiospectral information. However, in exchange for multiplex spectral information, its image acquisition rate is lower than that of conventional imaging, with up to a few colors. In particular, HSI in the infrared region and using nonlinear optical processes is impractically slow because the three-dimensional (3D) data cube must be scanned in a point-by-point manner. In this study, we demonstrate a framework to improve the spectral image acquisition rate of HSI by integrating time-domain HSI and compressed sensing. Specifically, we simulated broadband coherent Raman imaging at a record high frame rate of 25 frames per second (fps) with 100  pixels  ×  100  pixels, which is 10  ×   faster than that of previous work, based on an experimentally feasible sampling scheme utilizing 3D Lissajous scanning.


Introduction
Hyperspectral imaging (HSI) is a powerful tool widely used in a diverse range of applications, such as remote sensing, 1,2 food testing, 3 pharmaceutical testing, 4 forensic testing, 5 and disease diagnosis, 6 as it can acquire spatially resolved spectral information. In HSI, hyperspectral images are acquired by obtaining the light intensities in the spatiospectral three-dimensional (3D) ðx; y; λÞ data cube via various methods, such as whisk broom sensing (point-scanning to record spectral information of each pixel with a diffraction grating and a linear sensor array), 7 push broom sensing [line-scanning to record a one-dimensional (1D) spectral image with a two-dimensional (2D) camera], 8 staring sensing (spectral filtering of incoming light to record the images of each spectral band with a 2D camera), 9 snapshot spectral imaging (a nonscanning method to record the remapped 3D data cube with a 2D camera), 10 and compressed sensing (CS)-based coded aperture snapshot spectral imaging (CASSI) 11,12 as shown in Figs. 1(a)-1(c). Among them, the staring method is most widely used in modern HSI applications because it is compatible with on-the-fly spectral imaging, as its image acquisition is easily performed with a 2D camera after spectral filtering by a tunable filter.
However, the staring method often suffers from low light transmittance and therefore limited signal-to-noise ratio (SNR) especially in the infrared (IR) region due to the lack of sensitive image sensors. Also the wide-field illumination of the staring method is an impediment to nonlinear optical measurements, which require high optical intensity at the sample to produce a detectable signal. Similarly, CASSI cannot be applied to nonlinear HSI due to its wide-field illumination. To overcome these issues, it has been demonstrated that scanning in the spatiotemporal space (x; y; τ), or time-domain HSI, provides a higher SNR by virtue of its multiplex advantage (also known as Fellgett's advantage) compared to direct measurements in the spectral domain. Unfortunately, the major drawback of timedomain HSI is its low image acquisition rate, primarily due to slow optical scanners for multidimensional spectral image acquisition. Even with high-speed optical scanning using a resonant scanner, only up to 2.4 frames per second (fps) with a pixel resolution of 100 pixels × 100 pixels has been realized. 13 To boost the image acquisition speed, one possible approach is to combine time-domain HSI with CS 14,15 [ Fig. 1(b)], which provides a technique to recover a complete data set from fewer measurements than what is required by the Nyquist-Shannon sampling theorem based on two assumptions: (i) the signal from the target object is sparse in a suitable domain and (ii) the measurement process is incoherent; the sampling intervals along the τ axis are randomized in the case of Fourier-transformationbased measurements like time-domain HSI. Although CS has been demonstrated for time-domain HSI, the methods implemented in the previous demonstrations are not applicable to high-speed hyperspectral image acquisition because the requisite sampling randomization was implemented during postmeasurement processing 16 or was paired with a lower camera frame rate. 17 In this paper, we report a CS-powered method to achieve high-speed time-domain HSI. Specifically, we used broadband (200 to 1600 cm −1 ) Fourier-transform coherent anti-Stokes Raman scattering (FT-CARS), a type of nonlinear Raman process, 13,18,19 as a platform to evaluate this method. We sampled time-domain interferograms sparsely by utilizing the 3D Lissajous scanning method 20 to boost the hyperspectral image acquisition rate and then reconstructed hyperspectral images by solving an inverse problem regularized with spatiospectral total variation (SSTV). 21 We demonstrated hyperspectral image acquisition at a frame rate of 25 fps with a pixel resolution of 100 pixels × 100 pixels under experimentally feasible conditions, which is 10 times higher than that of the previous record. 13 We compared the original sinusoidal Lissajous scanning and the triangular Lissajous scanning for CS-based high-speed time-domain HSI in terms of noise robustness and achievable compression ratios, showing that Raman hyperspectral images were reconstructed with reasonable quality in both methods. Because the present scheme is applicable to many realistic Raman or HSI methods and these simulations are based on existing experimental demonstrations that used commercially available equipment, our method offers a powerful platform for the video-rate visualization of rapid chemical processes in various systems, such as inhomogeneous chemical reactions and living cells. Figure 2(a) schematically shows the optical setup of our simulated CS-based FT-CARS imaging system. The light source is a femtosecond laser with a repetition rate of 80 MHz. In FT-CARS measurements, molecular vibrations are coherently excited by a pump pulse and interrogated by a time-delayed probe pulse. 22 Pump-probe pulse pairs are generated by a Michaelson interferometer, where the optical path length of one of the arms is modulated by a resonant scanner. The incident beams are then spatially scanned by the two galvanometric scanners for interrogating different spatial points of the sample. A long-pass and short-pass filter pair, placed before and after the sample, respectively, isolates the blueshifted CARS signals. Just as in previously demonstrated FT-CARS measurements, 22 the CARS signal is acquired as a function of the pump-probe delay, which is Fourier-transformed to obtain the Raman spectrum as a function of Raman shift or wavenumber.

Methods
In the present scheme, the (x; y; τ) data cube was scanned in a 3D Lissajous trajectory 20 such that the sampling interval along the τ axis became pseudorandom at each spatial point [ Fig. 2 To this end, there are several options for scanning functions to generate the 3D Lissajous scanning trajectory, such as triangular and sinusoidal functions, especially for the xy scanning. Each of them has advantages and disadvantages in the context of CS recovery. As discussed in the previous research, 20 triangular functions are more advantageous than sinusoidal functions in terms of the uniformity of the number of sampled points in the xy plane, as shown in Fig. 2(c). On the other hand, sinusoidal functions realize shorter pixel dwell time than triangular functions because sinusoidally oscillating scanners, such as a resonant scanner and a microelectrochemical system scanner, can operate at higher frequencies. Because shorter pixel dwell time leads to a more randomized sampling pattern along the τ axis at each pixel, as shown in the caption of Fig. 1(b), more reliable CS recovery is expected. 14 We compared the performances of triangular and sinusoidal scanning functions in the following numerical simulations.
The procedure to reconstruct the hyperspectral image from the sparsely sampled data is introduced as follows. First, we define N x ; N y , and N b as the dimensions of the hyperspectral image, N ¼ N x N y N b as the full size of data points in the spectral domain, and M as the total number of sampled points in the time domain. Let u ¼ ½u T 1,1 ; Á Á Á ; u T N x ;N y T ∈ C N be the hyperspectral image made by stacking the spectrum of each spatial position on top of one another and f ¼ ½f T 1,1 ; Á Á Á ; f T N x ;N y T ∈ R M be a stack of coarsely measured time-domain interferograms at each spatial position. A hyperspectral image u was reconstructed from the coarse time-domain signals f obtained with Lissajous scanning by solving the following optimization problem with a regularization term called the anisotropic SSTV: 21 where W ∈ C M×N is a measurement matrix and λ ∈ R is a hyperparameter. D x , D y , and D b are differential operators with respect to the x axis, the y axis, and the frequency axis, respectively, which act on each spectrum where ðu p;q Þ b ðb ¼ 1; Á Á Á ; N b Þ represents the bth band of the spectra u p;q . W is a block-diagonal matrix containing inverse nonuniform discrete Fourier transform (NDFT) matrices for each pixel. The number of measurement points at each pixel is made uniform at 300 by picking up measurement points randomly or padding with zeros so that each diagonal element of W has the same shape, which significantly accelerates the multiplication of W and u. The value of hyperparameter λ was determined so that the value of the root-mean-squared error (RMSE) between the ground-truth time-domain signals and the pixel-wise inverse Fourier transform of the estimated spectral image is minimized (Fig. 3). It is important to note that the optimal value of λ can be estimated by cross-validation even if we do not know the ground truth. Equation (1) was solved with the primal-dual splitting (PDS) algorithm. 23,24 The PDS algorithm can solve optimization problems in the following form: min u ffðuÞ þ gðuÞ þ hðGuÞg; where G is a matrix and f; g; and h are the convex functions. f is differentiable, whereas g and h do not need to be differentiable. The algorithm is given by where γ 1 ; γ 2 > 0 are step sizes, a proximity operator of a function f is defined as prox γf ðxÞ ¼ arg min y ½fðyÞ þ ð1∕2γÞkx − yk 2 2 , and a proximity operator of a conjugate function of f is defined as prox γf Ã ðxÞ ¼ x − γprox f∕γ ðx∕γÞ.

Results
As a proof-of-concept demonstration, we simulated hyperspectral image reconstruction with a simple artificial hyperspectral  image, shown in Fig. 4(a). We first determined the sinusoidal Lissajous scanning trajectory by three sinusoidal functions with scanning frequencies of f x ¼ 12,705 Hz, f y ¼ 12,505 Hz, and f τ ¼ 12,520 Hz, and the triangular Lissajous trajectory by two triangular functions with f x ¼ 1005 Hz and f y ¼ 1060 Hz and a sinusoidal function with f τ ¼ 12,520 Hz, with a measurement duration of 40 ms so that the trajectory covered a large number of spatial positions and never repeated itself and the restricted isometry constant of the measurement matrix was small. This measurement duration was 10 times shorter than the corresponding raster scan measurement. We assumed that one kind of chemical with five Raman vibrational modes was spread in a droplet-like concentration distribution shown in Fig. 4(a). Then we sampled corresponding time-domain interferograms in the generated Lissajous trajectory. Figures 4(b)-4(d) show the results of the hyperspectral image reconstruction with and without CS. Simple pixel-wise NDFT of undersampled data failed to recover the spectra due to aliasing, as shown in Fig. 4(b), whereas five input peaks were recovered in the spectral image reconstructed via CS [Figs. 4(c) and 4(d)].
To quantitatively evaluate the reconstruction performance, we reconstructed spectral images from the sparsely sampled time-domain interferograms with different noise levels, different compression ratios, and different scanning schemes (faster sinusoidal and slower triangular). The reconstruction performances were evaluated by the RMSE between the ground-truth time-domain interferograms and the pixel-wise inverse Fourier transform of the estimated spectral image. First, we investigated the measurement-duration dependence of the reconstruction performance in the sinusoidal and triangular scanning schemes [ Fig. 5(a)]. To simulate the reconstruction with different measurement durations, the measurement duration was simply changed with the same scanning frequencies in the sinusoidal case, whereas the scanning frequencies were modified so that the number of samples at each pixel became more uniform in the triangular case. These simulation conditions are consistent with the realistic experimental situation, where the frequency of the resonant scanner cannot be significantly changed, while that of the galvanometric scanner can flexibly be changed within a certain frequency range. As seen in Fig. 5(a), the sinusoidal scanning provides lower RMSE values with different measurement durations than that of the triangular scanning, presumably because the sinusoidal Lissajous trajectory has more randomized sampling intervals along the τ axis at each pixel.  To further evaluate the performance of our method in practical applications, we also demonstrated the reconstruction of a biological hyperspectral image in a semiexperimental manner using a Raman spectral image of a Euglena gracilis cell taken by a spontaneous Raman microscope (Renishaw inVia). A 31 pixel × 33 pixel Raman hyperspectral image of a wild-type E. gracilis cell was obtained using 74 mW of 532 nm excitation, a 50× objective, and a per-pixel measurement duration of 300 ms. The image was denoised by truncated singular value decomposition using the top-10 components, and then spectral background was removed using the rolling-ball filter method. 25 We generated the time-domain interferograms by inverse Fourier transforming the spectrum at each pixel after resizing it to 100 pixels × 100 pixels and applying a window function. The sinusoidal Lissajous sampling proposed above was used with a measurement duration of 60 ms. Figures 6(a)-6(j) show typical intensity maps and spectra of the reconstructed spectral image and the Fourier transform of the generated full-size timedomain interferograms. Although the spatial resolution is degraded due to the use of total variation regularization, the same molecular distributions between the reconstructed [Figs. 6(a), 6(c), and 6(e)] and ground-truth [Figs. 6(b), 6(d), and 6(f)] spectral images and peak distributions among their corresponding spectra [Figs. 6(g)-6(i)] are evident. Figure 6(k) shows the performance of CS-based time-domain HSI for different measurement durations.

Discussion
It is worthwhile to compare our scheme with other CS-based HSI methods that operate in the frequency domain. 20,26 To date, the frequency-domain approach has used random masks 26 or Lissajous scanning 20 to randomly sample frequency or spatial components; these reconstructions are only reliable when the spectra are nonsparse because CS works well under the condition that signal sampling occurs in a nonsparse domain. 27 On the other hand, in our time-domain approach, spectra composed of sharp peaks are preferred because the energy of such sharp peaks is delocalized in the time domain. Thus FT-CARS spectroscopy is a suitable platform for CS because its spectra have a sparse nature originating from the narrow molecular lines spread in a broad spectral range and the absence of background signals such as fluorescence. In summary, we have numerically demonstrated CS-based time-domain HSI with a spectral image acquisition rate of 10× higher than that of conventional raster scanning. By combining the Lissajous-scanning-based sampling method and spectral image reconstruction by SSTV, reliable Raman spectral image reconstruction was realized for a simple synthetic bead image and a cell image taken by a spontaneous Raman microscope. We compared different scanning methods, faster sinusoidal scanning, and slower triangular scanning and showed that both schemes are feasible for acquiring Raman hyperspectral images at a video rate (with a measurement duration of 30 ms). The present method requires only slight modifications to the existing FT-CARS imaging setup and is extendable to other time-domain HSI methods, such as FT-IR and FT-two-photon excitation. 28