Temporal compressive super-resolution microscopy at frame rate of 1200 frames per second and spatial resolution of 100 nm

Abstract. Various super-resolution microscopy techniques have been presented to explore fine structures of biological specimens. However, the super-resolution capability is often achieved at the expense of imaging speed, because of either point scanning or multiframe computation. This trade-off between spatial resolution and imaging speed seriously hampers the observation of high-speed dynamics of fine structures. To overcome it, here we propose and demonstrate a temporal compressive super-resolution microscopy (TCSRM) technique. This technique merges an enhanced temporal compressive microscopy with a deep-learning-based super-resolution image reconstruction, where the former improves the imaging speed and the latter realizes the resolution enhancement. The high-speed super-resolution imaging ability of TCSRM, with a frame rate of 1200 frames per second (fps) and a spatial resolution of 100 nm, is experimentally demonstrated by capturing flowing fluorescent beads in a microfluidic chip. Given its high-speed super-resolution imaging performance, TCSRM provides a desirable tool for studying high-speed dynamical behaviors in fine structures, especially in the biomedical field.


Introduction
Exploring fine structures and their dynamics beyond the optical diffraction limit is an urgent requirement in many research fields, especially in biology and medicine. To date, various super-resolution microscopy techniques have been developed to surpass the optical diffraction limit. For example, stimulated emission depletion microscopy (STED) improved the resolution by shrinking the point spread function (PSF) with nonlinear stimulated emission depletion based on confocal microscopy.1 Single molecule localization microscopy (SMLM), including photoactivated localization microscopy (PALM)2 and stochastic optical reconstruction microscopy (STORM),3 achieved a higher resolution by localizing single fluorescent molecules with sparse fluorescence activation instead of recording the fluorescence distribution. Structured illumination microscopy (SIM) obtained a super-resolution image by loading normally inaccessible high spatial frequency information into the recorded images by the moiré effect.4 Super-resolution optical fluctuation imaging (SOFI) utilized random temporal signal fluctuations of single emitters to achieve background-free super-resolution microscopy based on high-order statistics.5 In addition, some novel microscopy techniques are emerging by combining multiple super-resolution imaging methods. For example, methods combining STED and SMLM, such as minimal STED6 or minimal photon fluxes (MINFLUX),7 realized better resolution and less fluorophore bleaching. A STED-SIM method achieved 30 nm resolution and single-molecule sensitivity by utilizing STED to provide nonlinear modulation for SIM.8 A SIM-based point localization estimator (SIMPLE) method obtained simultaneous particle localization with twofold precision by using phase-shifted sinusoidal wave patterns as nanometric rulers.9 Super-resolution microscopy, as a powerful imaging tool, has boosted the development of biomedicine, and numerous discoveries have been reported,10 such as centrosome structure and function,11,12 nuclear and chromatin organization,13,14 and mitochondrial membrane protein organization.15
It should be noted that all the techniques mentioned above acquire the super-resolution ability at the expense of imaging speed, because of either point scanning or multiframe computation. Thus, the imaging speed is inevitably limited, which greatly affects the observation of high-speed dynamics of fine structures. Recently, a single-image super-resolution (SISR) technique was proposed to overcome the limited imaging speed by extracting a super-resolution image from one recorded image, which allowed the super-resolution imaging speed to reach the frame rate of a camera. Many deep-learning-based algorithms with neural networks have accelerated the development of SISR due to their outstanding image processing ability. For example, Wang et al.16 employed a generative adversarial network (GAN) to realize cross-modality super-resolution from confocal microscopy images to STED images or from total internal reflection fluorescence (TIRF) images to SIM images. Chen et al.17 proposed a novel network combining a super-resolution network and a signal-enhancement network to transfer wide-field images to SMLM images. Qiao et al.18 developed a deep Fourier channel attention network (DFCAN) for super-resolution imaging by leveraging the frequency content difference across distinct features to learn precise hierarchical representations of high-frequency information in diverse biological structures. SISR thus improves the super-resolution imaging speed by avoiding point scanning and multiframe computation, but the imaging speed is still restricted by the frame rate of the camera.
To further improve the super-resolution imaging speed beyond the frame rate limit of a camera, we propose and demonstrate a novel temporal compressive super-resolution microscopy technique, termed TCSRM, which combines an enhanced temporal compressive microscopy and a deep-learning-based image reconstruction. Here, the purpose of the enhanced temporal compressive microscopy is to improve the imaging speed by reconstructing multiple images from one compressed image, and the deep-learning-based image reconstruction seeks to achieve super-resolution without a reduction in the imaging speed. The high-speed super-resolution imaging ability of TCSRM is verified in theory and experiment, and the experimental result shows that TCSRM reaches a frame rate of 1200 frames per second (fps) and a spatial resolution of 100 nm based on a 200 fps CMOS camera and a 100× objective lens. TCSRM can provide an effective tool for capturing the high-speed dynamics of fine structures and will have promising applications in the biomedical field.

Theoretical Model
As an inherent feature, natural dynamic scenes have sparsity in some transform domains. Thus, the spatiotemporal information of a dynamic scene can be recovered from a compressed sampling based on compressive sensing theory.19,20 Moreover, the spatial distributions at adjacent moments have continuity. Therefore, the spatial distribution at a moment can provide the reference information for the dynamic scene.21,22 Based on these premises, we propose an enhanced temporal compressive microscopy to capture the high-speed dynamic scene, which combines the spatiotemporal compressive information and the transient spatial information. The imaging model is shown in Fig. 1(a). The original dynamic scene $D(x, y, t)$ is first transferred into a diffraction-limited dynamic scene $B(x, y, t)$ after passing through an optical microscope. This process can be treated as the convolution with the PSF of the microscope, and is expressed as

$$B(x, y, t) = HD(x, y, t) = D(x, y, t) * \mathrm{PSF}(x, y), \qquad (1)$$

where $H$ is the diffraction limitation operator, and $\mathrm{PSF}(x, y)$ is the PSF of the microscope. The diffraction-limited dynamic scene is then synchronously sampled by two channels: a compressive sampling (CS) channel and a transient sampling (TS) channel.
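The convolution in Eq. (1) can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions: the PSF is approximated by a normalized Gaussian (a real objective has an Airy-like PSF), the function names `gaussian_psf` and `diffraction_limit`, the grid size, and the 6-pixel FWHM are all hypothetical, and the FFT-based convolution is circular.

```python
import numpy as np

def gaussian_psf(size, fwhm_px):
    """Normalized 2D Gaussian used here as a stand-in for the microscope PSF (assumption)."""
    sigma = fwhm_px / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def diffraction_limit(scene, psf):
    """Apply B(x, y, t) = D(x, y, t) * PSF(x, y) to every frame of a (T, H, W) scene."""
    # Circular convolution via the FFT; ifftshift moves the PSF centre to index (0, 0).
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(np.fft.fft2(scene, axes=(-2, -1)) * otf, axes=(-2, -1)))

# Toy scene: one point emitter moving one pixel per frame.
T, N = 4, 64
D = np.zeros((T, N, N))
for t in range(T):
    D[t, N // 2, 20 + t] = 1.0

B = diffraction_limit(D, gaussian_psf(N, fwhm_px=6.0))
```

Each frame of `B` is a blurred copy of the corresponding frame of `D`, with the total intensity preserved because the PSF sums to one.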
The CS channel is utilized to collect all the spatiotemporal information of the diffraction-limited dynamic scene, while the TS channel is used to acquire the spatial information at a moment of this dynamic scene. In the imaging model, $R$ is the merging estimation operator, and $V_x$ and $V_y$ are the components of the motion vector $V$ in the horizontal and vertical directions. Based on compressed sensing theory, one can get the estimation of the dynamic scene by solving the constrained optimization problem of Eq. (5), where $\lambda$ is the channel weight factor that is determined by the light flux ratio between the CS and TS channels, $\rho$ is the regularization factor of the sparsity constraint, and $\Phi$ is the sparse transform operator. For simplicity, the indices in Eq. (5) are omitted.
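The two sampling channels can be sketched as follows. This is a minimal illustration that assumes the forward model commonly used in temporal compressive imaging (each frame multiplied by its own coding pattern, then summed over the exposure); the random binary patterns, the array sizes, and the names `cs_channel` and `ts_channel` are hypothetical and are not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def cs_channel(B, masks):
    """Compressive sampling: encode each frame with its own mask, then integrate over time."""
    return np.sum(masks * B, axis=0)

def ts_channel(B, t_ref):
    """Transient sampling: record a single frame of the scene as the reference image."""
    return B[t_ref].copy()

T, H, W = 6, 32, 32                      # six frames per compressed image
B = rng.random((T, H, W))                # placeholder diffraction-limited scene
masks = rng.integers(0, 2, size=(T, H, W)).astype(float)  # assumed random binary patterns

M_cs = cs_channel(B, masks)              # one (H, W) snapshot holding six encoded frames
M_ref = ts_channel(B, t_ref=0)           # one (H, W) reference frame
```

The snapshot `M_cs` has the size of a single frame although it carries information from all six frames, which is why the reconstruction is underdetermined and benefits from the extra reference frame `M_ref`.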
To solve this constrained optimization problem, Eq. (5) is split into three subproblems, involving a compressed sensing recovery problem, a motion estimation problem, and an image super-resolution problem. The three subproblems are solved step by step by alternating iteration with constraints,23 which can be expressed as the three steps of Eqs. (6)-(8), yielding $B_{\mathrm{cr}}$ (Step 1), $B_{\mathrm{fr}}^{(n)}$ (Step 2), and $D_{\mathrm{r}}^{(m)}$ (Step 3), where $B_{\mathrm{cr}}$ is the coarse reconstruction result of the diffraction-limited dynamic scene based on the compressed image $M_{\mathrm{cs}}$ from the CS channel, $B_{\mathrm{fr}}$ is the fine reconstruction result combining the spatiotemporal information from the CS and TS channels, $D_{\mathrm{r}}$ is the recovered super-resolution dynamic scene, $\rho_1$ and $\rho_2$ are the regularization parameters in each subproblem, $\Phi_1$ and $\Phi_2$ are the sparse transform operators in each subproblem, $F$ is the motion vector estimation operator, $S$ is the SISR operator, and $n$ and $m$ are the iteration numbers in Eqs. (7) and (8), respectively. Using a gradient descent method,24 Eqs. (6)-(8) can be iteratively calculated by considering the balance among super-resolution mapping, measurement constraint, and sparsity constraint.
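As a rough illustration of the measurement-constraint part of such an iteration, the sketch below runs plain gradient descent on a two-channel least-squares loss. It is not the reconstruction of Eqs. (6)-(8): the sparsity priors, the motion estimation, and the super-resolution mapping are omitted, and the loss, the mask-and-sum forward model, and all names (`forward`, `loss`, `gradient_step`, `lam`) are assumptions made for this example.

```python
import numpy as np

def forward(B, masks):
    """Assumed CS forward model: mask each frame and sum over time."""
    return np.sum(masks * B, axis=0)

def loss(B, masks, M_cs, M_ref, t_ref, lam):
    """Least-squares misfit to the compressed image plus a weighted misfit to the reference."""
    r_cs = forward(B, masks) - M_cs
    r_ts = B[t_ref] - M_ref
    return 0.5 * np.sum(r_cs**2) + 0.5 * lam * np.sum(r_ts**2)

def gradient_step(B, masks, M_cs, M_ref, t_ref, lam, step):
    """One gradient descent update on the loss above."""
    grad = masks * (forward(B, masks) - M_cs)   # adjoint of the CS operator applied to the residual
    grad[t_ref] += lam * (B[t_ref] - M_ref)     # pull the reference frame toward M_ref
    return B - step * grad

rng = np.random.default_rng(1)
T, H, W = 6, 16, 16
truth = rng.random((T, H, W))
masks = rng.integers(0, 2, size=(T, H, W)).astype(float)
M_cs, M_ref, t_ref, lam = forward(truth, masks), truth[0], 0, 1.0

B = np.zeros_like(truth)
step = 1.0 / (T + lam)        # the gradient is Lipschitz with constant at most T + lam
losses = []
for _ in range(50):
    losses.append(loss(B, masks, M_cs, M_ref, t_ref, lam))
    B = gradient_step(B, masks, M_cs, M_ref, t_ref, lam, step)
```

With this step size the loss cannot increase from one iteration to the next. Because the problem is underdetermined, fitting the measurements alone does not recover `truth`; that is the role of the priors and the reference-guided estimation described in the text.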
The image reconstruction framework of TCSRM is shown in Fig. 1(b). The compressed image in the CS channel is first recovered to a diffraction-limited dynamic scene $B_{\mathrm{cr}}$ with a plug-and-play (PnP) algorithm embedded with multiple denoisers, including total variation (TV),19 FFDNet,25 and FastDVDnet.26 Here, various priors of images and videos are utilized. The reconstruction from the compressed image can only obtain the coarse spatiotemporal information of the dynamic scene due to the insufficient sampling. The result is further processed by a fine reconstruction, containing motion estimation, merging estimation, and scene correction modules. The motion estimation is conducted to extract the dynamic features. In this process, the motion vector $V$ is determined by a sparse motion estimation algorithm based on block-matching27 and Lucas-Kanade optical flow.28 Then, the reference image $M_{\mathrm{ref}}$ from the TS channel and the motion vector $V$ are fused in the merging estimation module to provide an estimation for the dynamic scene by combining the spatiotemporal information from the two channels. The errors in the motion estimation and merging estimation are compensated in the scene correction module by optimizing the details of the dynamic scene based on the PnP algorithm with the measurement and prior constraints. The three modules are conducted iteratively to acquire the final dynamic scene with fine details, which satisfies the measurement constraints, motion estimation, and prior constraints simultaneously. The reconstructed dynamic scene is further processed by a super-resolution reconstruction module. Here, a pretrained DFCAN is utilized to handle the task, which is a residual network with Fourier channel attention blocks. Exploiting the power spectrum characteristics of distinct feature maps in the Fourier domain, DFCAN can bridge the low-resolution and high-resolution image spaces precisely.18 A forward estimation is used to calculate the error $E$ between the reconstruction results and the actual measurements, involving the compressed image and the reference image. The error is calculated in each iteration. Once the error reaches the preset threshold, the desired super-resolution dynamic scene is obtained.
It is worth mentioning that TCSRM is rather different from the simple concatenation of temporal compressive imaging and DFCAN. The recovered images from temporal compressive imaging have different features compared with natural images, which is due to the information loss during compressive acquisition and the imperfect optimization during image reconstruction. Moreover, the DFCAN is trained by using natural image pairs with low and high resolution. The mismatch in image features makes it difficult to obtain acceptable results because of the generalization problem in end-to-end networks. However, TCSRM utilizes the additional reference frame to recover the images with higher accuracy, which decreases the mismatch in image features between recovered images and natural images. In addition, the global iterations of compressive image reconstruction and super-resolution processing are conducted in TCSRM to optimize the final super-resolution images with corresponding forward estimation. In this way, the super-resolution ability of DFCAN can be fully utilized, alleviating the generalization problem.
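The motion estimation and merging estimation modules can be pictured with a toy example. The sketch below is a simplified stand-in, not the sparse block-matching and Lucas-Kanade pipeline of the text: it assumes a single rigid translation with integer pixel shifts, uses an exhaustive sum-of-absolute-differences search, and warps the reference image with `np.roll`. The helper names `blob`, `block_match`, and `merge_estimate` are hypothetical.

```python
import numpy as np

def blob(size, cy, cx, sigma=2.0):
    """Synthetic Gaussian spot standing in for a bead image."""
    yy, xx = np.mgrid[0:size, 0:size]
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma**2))

def block_match(ref_block, frame, top, left, search=4):
    """Return the (dy, dx) shift within +/-search that minimizes the sum of absolute differences."""
    h, w = ref_block.shape
    best, best_shift = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > frame.shape[0] or x + w > frame.shape[1]:
                continue  # candidate block falls outside the frame
            sad = np.abs(frame[y:y + h, x:x + w] - ref_block).sum()
            if sad < best:
                best, best_shift = sad, (dy, dx)
    return best_shift

def merge_estimate(M_ref, shifts):
    """Build a scene estimate by translating the reference image with one (dy, dx) per frame."""
    return np.stack([np.roll(M_ref, shift, axis=(0, 1)) for shift in shifts])

M_ref = blob(32, 16, 10)            # sharp reference frame (TS channel)
coarse = blob(32, 16, 13)           # a later frame from the coarse reconstruction
V = block_match(M_ref[12:21, 6:15], coarse, top=12, left=6)
B_est = merge_estimate(M_ref, [(0, 0), V])
```

Here the sharp reference supplies the spatial detail and the motion vector `V` supplies the dynamics, which is the division of labor between the two channels described above.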

Simulation Result
In order to verify the feasibility of TCSRM, we design a dynamic scene with high-speed moving nanorings for simulation. In the simulation, the diameter of the rings is 750 nm, and the width of the rings (full width at half-maximum, FWHM) is 96 nm. The three nanorings move in different ways. The top one moves right with a constant velocity of 97.5 nm/frame, the middle one moves right with an initial velocity of 32.5 nm/frame and a rightward acceleration of 3.25 nm/frame², and the bottom one moves along a curve with an initial velocity of 91.91 nm/frame directed at 45 deg to the horizontal and a rightward acceleration of 3.25 nm/frame². The fluorescence wavelength of the nanorings is 560 nm, and the numerical aperture (NA) of the objective lens is 1.5. The dynamic scene contains 36 images with the size of 512 × 512, which is utilized as the ground truth (GT). After passing through a microscope, the diffraction-limited dynamic scene is individually sampled by the CS and TS channels. Six compressed images with a size of 128 × 128 are acquired in the CS channel. Thus, the data compression ratio is 6, which means that one compressed image contains the information of six original images. Meanwhile, six reference images with a size of 256 × 256 are recorded in the TS channel, which means that one of every six original images is selected to provide a reference for the image reconstruction in the CS channel. The images from the two channels are then processed by the reconstruction algorithm in Fig. 1(b) to recover the original dynamic scene. One compressed image is shown in Fig. 2(a), together with the corresponding reference image. As can be seen, the compressed image shows an obvious blur due to spatial coding and temporal integration, while the reference image shows a clear profile of the nanorings. The reconstructed images by TCSRM are shown in Fig. 2(b), together with the GT images. Similarly, the reconstructed images have a clear spatial profile. The width of the rings is much smaller than that in the reference image, and it is close to that in the GT images. The average peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the recovered images by TCSRM are 27.87 dB and 0.98, respectively. The motion traces of the nanorings extracted from TCSRM are given in Fig. 2(c), together with the motion traces from GT. The motion traces of TCSRM agree well with those of GT, which demonstrates that TCSRM can recover a dynamic scene containing objects with various displacement patterns with high temporal accuracy. To quantitatively characterize the effect of super-resolution, the radial intensity distributions of the nanorings in the reference, GT, and TCSRM images are extracted, as shown in Fig. 2(d). The width of the nanorings is decreased from 251 nm in the reference image to 102 nm in the TCSRM images, which is close to the 96 nm in the GT images. The whole video of the high-speed moving nanorings is provided in Video 1.
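The three motion patterns of the simulated nanorings can be written out explicitly. The sketch below assumes the continuous-time kinematic relation $x_k = v_0 k + a k^2/2$ evaluated at integer frame indices $k$, and it takes the vertical component of the bottom ring as positive; both choices are assumptions, since the text does not specify the discretization or the sign convention.

```python
import numpy as np

frames = np.arange(36)                      # 36 frames in the simulated scene
a = 3.25                                    # nm/frame^2, rightward acceleration
v45 = 91.91 * np.cos(np.deg2rad(45.0))      # nm/frame, each component of the 45-degree velocity

# Columns are (x, y) displacements in nm relative to the first frame.
top = np.stack([97.5 * frames, np.zeros(36)], axis=1)
middle = np.stack([32.5 * frames + 0.5 * a * frames**2, np.zeros(36)], axis=1)
bottom = np.stack([v45 * frames + 0.5 * a * frames**2, v45 * frames], axis=1)
```

Under these assumptions the 45-degree initial velocity of 91.91 nm/frame corresponds to about 65 nm/frame along each axis, that is, twice the initial velocity of the middle ring.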

Experimental Design
The experimental arrangement of TCSRM is shown in Fig. 3. A continuous-wave laser with a wavelength of 532 nm (Laser Quantum, Torus 532) is used as the excitation source. The laser beam is expanded by a beam expander, reflected by a dichroic mirror, and then focused into the microchannel of a customized glass microfluidic chip on the sample stage with an objective lens (Olympus, UPlanApo, Oil, 100×, NA 1.5). The depth and width of the microchannel are 10 and 120 μm, respectively. Fluorescent beads (Thermo Fisher, F8800) with diameters of about 100 nm and an emission wavelength of 560 nm are dispersed in distilled water and then injected into the microchannel by an injection pump (MesoBioSys, MS-102P) with an adjustable flow rate. The fluorescence signal passes through the dichroic mirror and is then divided into two components by a beam splitter: one is imaged on a digital micromirror device (DMD, Texas Instruments, DLP6500) for spatiotemporal encoding and then recorded by a camera CMOS1 (Andor, Zyla 5.5); the other is directly recorded by a camera CMOS2 (Andor, Zyla 5.5). The DMD has an array of 1920 × 1080 micromirrors with a size of 7.56 μm. Here, CMOS1 is utilized to acquire compressed images of the dynamic scene, and CMOS2 is used to acquire transient images at some specific moments of the dynamic scene. A field programmable gate array (FPGA) device provides the trigger signals to synchronize the cameras and the DMD accurately. The time sequences of these devices are shown in the inset of Fig. 3. The frame rates of both cameras are set as 200 fps, while the refresh rate of the DMD is set as 1200 Hz. The exposure times of CMOS1 and CMOS2 are set to be 4.9 and 0.7 ms, respectively. In each exposure, the compressed image by CMOS1 contains the information of the dynamic scene with six spatial encodings, and the reference image by CMOS2 only records the transient information at a moment of the dynamic scene. In the reconstruction for the experimental data, the regularization parameters $\rho_1$ and $\rho_2$ are both set as 0.07, and the iteration numbers for coarse reconstruction, fine reconstruction, and forward estimation are 224, 64, and 4, respectively. The pixel number of the reconstructed images is 1024 × 2048.
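The timing budget implied by these settings can be checked with a few lines of arithmetic. The values are taken from the text; the variable names are only illustrative.

```python
camera_fps = 200.0        # frame rate of CMOS1 and CMOS2
dmd_rate_hz = 1200.0      # DMD refresh rate
exposure_cs_ms = 4.9      # CMOS1 exposure time
exposure_ts_ms = 0.7      # CMOS2 exposure time

frame_period_ms = 1e3 / camera_fps                   # time available per camera frame
pattern_period_ms = 1e3 / dmd_rate_hz                # duration of one DMD pattern
patterns_per_frame = dmd_rate_hz / camera_fps        # spatial encodings per compressed image
effective_fps = camera_fps * patterns_per_frame      # frame rate of the reconstructed video
```

Six patterns per camera frame give the reconstructed frame rate of 1200 fps. The CMOS1 exposure fits within the 5 ms frame period, and the CMOS2 exposure is shorter than one pattern period of about 0.83 ms.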

Experimental Result
The experimental result of a flowing fluorescent bead in a microfluidic chip is shown in Fig. 4, and the whole video is provided in Video 2. One selected compressed image and the corresponding reference image are given in Fig. 4(a), and the six images reconstructed by TCSRM are shown in Fig. 4(b). The size of the bead in the TCSRM images is clearly smaller than that in the reference image, and the moving trajectory can be clearly distinguished. To show the resolution improvement, the intensity distributions of the bead along the horizontal and vertical directions in the reference image and the first frame of the TCSRM images are extracted and given in Figs. 4(c) and 4(d). The sizes of the bead (FWHM) in the horizontal and vertical directions for the reference image are 264 and 237 nm, and those in the TCSRM image are 118 and 93 nm, respectively. Thus, the resolution is improved by a factor of about 2.2. That is to say, TCSRM has the high-speed super-resolution ability with a frame rate of 1200 fps and a spatial resolution of about 100 nm, which surpasses conventional microscopy. The difference in the sizes in the two directions is due to the high-speed movement of the bead in the horizontal direction, which stretches the bead in this direction during the image reconstruction. According to the measurement of TCSRM, the average speed of the bead is about 0.39 mm/s, which is close to the flowing speed of the water of 0.42 mm/s. Here, the flowing speed is calculated based on the flux of the water and the size of the microchannel. Moreover, the bead does not move along a straight line, which may result from turbulent flow29 or Brownian motion.30 By measuring the speeds of the fluorescent beads at different locations with TCSRM, the flowing speed distribution of the microchannel can also be extracted.
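An FWHM value of the kind quoted above can be estimated from a sampled intensity profile as sketched below. This is one common approach (half-maximum crossings found by linear interpolation after subtracting the minimum as baseline), not necessarily the procedure used for Fig. 4; the function name `fwhm` and the synthetic Gaussian test profile are assumptions for illustration.

```python
import numpy as np

def fwhm(x, profile):
    """Full width at half-maximum of a single-peaked profile, using linear interpolation."""
    p = np.asarray(profile, dtype=float) - np.min(profile)
    half = p.max() / 2.0
    above = np.where(p >= half)[0]
    i0, i1 = above[0], above[-1]
    # Interpolate the half-maximum crossing on each flank of the peak.
    left = x[i0] if i0 == 0 else np.interp(half, [p[i0 - 1], p[i0]], [x[i0 - 1], x[i0]])
    right = x[i1] if i1 == len(p) - 1 else np.interp(half, [p[i1 + 1], p[i1]], [x[i1 + 1], x[i1]])
    return right - left

x = np.linspace(-500.0, 500.0, 1001)                 # nm, 1 nm sampling
sigma = 100.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))   # Gaussian with a 100 nm FWHM
width = fwhm(x, np.exp(-x**2 / (2.0 * sigma**2)))

# Improvement factors from the FWHM values reported in the text.
gain_horizontal = 264.0 / 118.0   # about 2.24
gain_vertical = 237.0 / 93.0      # about 2.55
```

The horizontal ratio corresponds to the factor of about 2.2 quoted in the text, while the vertical direction, which is less affected by motion stretching, gives a somewhat larger ratio.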

Discussion and Conclusion
TCSRM is a lossy imaging technique because of the spatial encoding, which reduces the image quality. To improve the image quality, one way is to increase the sampling rate in hardware, such as by using multiple CS channels, and the other way is to develop a more advanced image reconstruction algorithm in software, such as a hybrid super-resolution algorithm. The reference image in the TS channel provides detailed spatial information for the image reconstruction in the CS channel, and therefore the exposure time of CMOS2 should be as short as possible under the condition of ensuring a high enough signal-to-noise ratio. In general, the maximum exposure time of CMOS2 should be shorter than the exposure time of CMOS1 divided by the compression ratio. Additionally, in order to obtain the effect of super-resolution, the data compression ratio in the CS channel cannot be too large, and a value of around 10 is appropriate. An important application of TCSRM is biomedical imaging. Compared with other wide-field super-resolution imaging techniques, such as SIM and SISR, TCSRM has lower light flux due to the two-channel sampling: the light field in the CS channel is spatially modulated in amplitude, while that in the TS channel is detected only within a very short time. An end-to-end deep-learning super-resolution algorithm is utilized in TCSRM, which is limited in generalization. Transfer learning31 can be used to reduce the required training datasets. Meanwhile, self-supervised networks, such as GAN32 and deep image prior,33 may be adopted to improve the generalization of TCSRM.
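As a worked check of the exposure-time guideline with the experimental settings (CMOS1 exposure of 4.9 ms, six encodings per compressed image, CMOS2 exposure of 0.7 ms), one can write the following, where the symbols $t_{\mathrm{TS}}$, $t_{\mathrm{CS}}$, and $N_c$ are notation introduced only for this example:

```latex
t_{\mathrm{TS}} < \frac{t_{\mathrm{CS}}}{N_c}
              = \frac{4.9\ \mathrm{ms}}{6}
              \approx 0.82\ \mathrm{ms},
\qquad
t_{\mathrm{TS}} = 0.7\ \mathrm{ms} < 0.82\ \mathrm{ms}.
```

The chosen CMOS2 exposure therefore satisfies the guideline.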
In conclusion, we have developed a high-speed super-resolution microscopy technique, TCSRM, by combining an enhanced temporal compressive microscopy and a deep-learning-based image reconstruction. The enhanced temporal compressive microscopy realizes the high-speed imaging, and the deep-learning-based image reconstruction obtains a resolution beyond the optical diffraction limit. Both the theoretical and experimental results verify the high-speed super-resolution imaging ability of TCSRM, and an imaging performance with a frame rate of 1200 fps and a spatial resolution of 100 nm is experimentally obtained. TCSRM provides a powerful tool for the observation of high-speed dynamics of fine structures, especially in the hydromechanics and biomedical fields, such as microflow velocity measurement,34 organelle interactions,35 intracellular transports,36 and neural dynamics.37 In addition, the framework of TCSRM can also offer guidance for achieving higher imaging speed and spatial resolution in holography,38 coherent diffraction imaging,39 and fringe projection profilometry.40

He et al.: Temporal compressive super-resolution microscopy at frame rate of 1200 frames per second… Advanced Photonics 026003-2 Mar/Apr 2023 • Vol. 5(2)

Fig. 2 Simulation result of moving nanorings by TCSRM. (a) Compressed image and reference image measured by two channels in TCSRM. (b) GT and TCSRM images for six consecutive frames. The moving trajectories of the nanorings are labeled with green lines. (c) Motion traces of the three nanorings in the whole scene from GT (lines) and reconstructed result by TCSRM (circles, squares, and rhombuses). (d) Radial intensity distributions of the nanorings along the white line in the reference, GT, and TCSRM images (Video 1, mp4, 845 KB [URL: https://doi.org/10.1117/1.AP.5.2.026003.s1]).

Fig. 4 Experimental result of flowing fluorescent bead in microchannel by TCSRM. (a) Compressed and reference images recorded by two cameras. (b) Reconstructed images by TCSRM. The trajectory of the moving bead is marked with white dashed lines. (c) and (d) Intensity distributions of the fluorescent bead along the horizontal and vertical directions in the reference image and the first frame in TCSRM images (Video 2, mp4, 89.7 KB [URL: https://doi.org/10.1117/1.AP.5.2.026003.s2]).