Hybrid reconstruction of the physical model with the deep learning that improves structured illumination microscopy

Abstract. Structured illumination microscopy (SIM) has been widely used in live-cell superresolution (SR) imaging. However, conventional physical model-based SIM SR reconstruction algorithms are prone to artifacts in handling raw images with low signal-to-noise ratios (SNRs). Deep-learning (DL)-based methods can address this challenge but may lead to degradation and hallucinations. By combining the physical inversion model with a total deep variation (TDV) regularization, we propose a hybrid restoration method (TDV-SIM) that outperforms conventional or DL methods in suppressing artifacts and hallucinations while maintaining resolutions. We demonstrate the performance superiority of TDV-SIM in restoring actin filaments, endoplasmic reticulum, and mitochondrial cristae from extremely low SNR raw images. Thus TDV-SIM represents the ideal method for prolonged live-cell SR imaging with minimal exposure and photodamage. Overall, TDV-SIM proves the power of integrating model-based reconstruction methods with DL ones, possibly leading to the rapid exploration of similar strategies in high-fidelity reconstructions of other microscopy methods.


Introduction
Superresolution (SR) fluorescence microscopy provides nanoscale resolution for studying subcellular structures and biological processes. [1][2][3][4][5][6][7] However, SR imaging requires a higher light dose than conventional microscopy, and the resulting phototoxicity and photobleaching severely limit its application in live-cell imaging. 8 Structured illumination microscopy (SIM) demonstrates a higher photon efficiency than other SR microscopy methods. [9][10][11] In particular, two-dimensional (2D)-SIM can double the resolution beyond the light diffraction limit using nine sequentially acquired images, making it useful for live-cell SR imaging. However, live-cell SR-SIM imaging still suffers from phototoxicity and photobleaching, and image restoration is an ill-posed inverse problem. 9,10,[12][13][14][15] Therefore, for raw images with a low signal-to-noise ratio (SNR) caused by short exposure or excessive photobleaching, the conventional Wiener-based reconstruction method is prone to artifacts. [16][17][18][19] Various physical model-based restoration methods have been developed to suppress SIM artifacts, such as total variation regularization, 19 notch filtering, 16,18 high-fidelity (HiFi)-SIM, 20 and joint space and frequency reconstruction-SIM. 21 Using spatiotemporal continuity as prior knowledge, we have developed an iterative restoration method based on the Hessian regularization term (Hessian-SIM) that suppresses artifacts due to the amplification of random noise. 18 However, other artifacts persist, such as hammer-stroke and honeycomb artifacts due to the out-of-focus background 17,18,20 and artifacts due to illumination scattering, 22 which cannot be suppressed completely by model-based methods. Deep neural networks can approximate arbitrary functions with arbitrarily small errors and extract high-dimensional features from low-resolution and low-quality images. 23 Therefore, researchers have proposed end-to-end deep-learning (DL)-based reconstruction algorithms to suppress different artifacts indiscriminately in low SNR raw SIM images. [24][25][26] However, DL-based methods may suffer from hallucinations 27 and generally reduced resolution. For example, current DL methods often incorrectly predict mitochondrial cristae structures in live cells.
To combine the advantages of both approaches, we sought to balance the reconstruction fidelity of traditional methods against the artifact suppression of DL methods. However, if these two parts are combined into one objective function and optimized simultaneously, the network must compute partial derivatives 28 with respect to the input images rather than the usual network weights. Therefore, we used the total deep variation (TDV) network as a regularizer in the reconstruction objective function. By combining the physical SIM reconstruction procedure with the TDV regularizer, 28 we propose a hybrid restoration method (TDV-SIM) that suppresses artifacts and maintains resolution simultaneously. When processing images of different cellular structures, TDV-SIM retains the actual signals better than the pure DL methods while removing artifacts more effectively than the model-based methods.

Principle and Parameter Selection of TDV-SIM
For SIM imaging, the sample is excited by sinusoidal illuminations with different pattern orientations and phases. The raw images contain low- and high-frequency information, which needs to be separated and reassembled in SIM reconstruction. 16,18,19 We transformed the SIM reconstruction into an optimization problem and constructed an objective function [Eq. (1)], where f is the target image to be estimated, g is the inverse Fourier transform of the high- and low-frequency information separated from the SIM raw data, and λ is the weight parameter of the regularization term R(f). By optimizing the objective function with the gradient descent algorithm [Eq. (2)], where η is the step size, TDV-SIM can reconstruct SR-SIM images that preserve the high-frequency information more faithfully than pure DL-based methods and suppress artifacts more effectively than pure model-based methods. The entire reconstruction pipeline is shown in Fig. 1(a), where f_0 is the initial SIM image obtained by Wiener deconvolution and f_T is the final reconstruction after T iterations. The computation pipeline of ∇R(f) is shown in Fig. 1(b). Compared to the ground truth (GT) image of actin filaments [averages of multiple Wiener-processed images, Fig. 1 Fig. 1(e)], artifacts may not be suppressed entirely if λ (or T) is too small; in contrast, if λ (or T) is too large with a fixed T of 25 (or a λ of 2.5), genuine signals may be removed incorrectly. Thus we set λ to 2.5 and T to 25.
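The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The objective is assumed to take the form J(f) = (1/2)‖f − g‖² + λR(f), which gives the update f_{t+1} = f_t − η[(f_t − g) + λ∇R(f_t)]. The trained TDV network is replaced here by a simple quadratic smoothness regularizer so that the example runs on its own, and the step size `eta = 0.04` and all function names are hypothetical. Only λ = 2.5 and T = 25 come from the text.

```python
import numpy as np


def smoothness(f):
    """Stand-in regularizer R(f): half the sum of squared forward differences
    (periodic boundaries). ASSUMPTION: replaces the trained TDV network."""
    dy = np.roll(f, -1, axis=0) - f
    dx = np.roll(f, -1, axis=1) - f
    return 0.5 * float(np.sum(dy ** 2 + dx ** 2))


def grad_smoothness(f):
    """Analytic gradient of `smoothness`: the negative discrete Laplacian."""
    neighbours = (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
                  + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1))
    return 4.0 * f - neighbours


def objective(f, g, reg, lam):
    """Assumed objective: data fidelity plus weighted regularizer."""
    return 0.5 * float(np.sum((f - g) ** 2)) + lam * reg(f)


def reconstruct(g, f0, grad_reg, lam=2.5, T=25, eta=0.04):
    """Run T gradient-descent steps starting from the initial image f0.

    g        : image assembled from the separated frequency components
    f0       : initial estimate (a Wiener reconstruction in the paper)
    grad_reg : callable returning the gradient of the regularizer at f
    eta      : step size (hypothetical value, chosen for stability here)
    """
    f = np.asarray(f0, dtype=np.float64).copy()
    for _ in range(T):
        f = f - eta * ((f - g) + lam * grad_reg(f))
    return f
```

In this sketch, swapping `grad_smoothness` for a function that evaluates a trained network's gradient with respect to its input would be the only change needed to move toward a learned regularizer; how the TDV gradient is actually computed is shown in Fig. 1(b) of the paper and is not reproduced here.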
To label mitochondria, COS-7 cells were incubated with 250 nM MitoTracker Green FM (Thermo Fisher Scientific, M7514) in Hank's balanced salt solution (Thermo Fisher Scientific, 14025076) containing Ca2+ and Mg2+ at 37°C for 15 min, followed by washing three times before 2D-SIM imaging. To label actin, COS-7 cells were transfected with Lifeact-enhanced green fluorescent protein (EGFP). The transfections were performed using Lipofectamine 2000 (Thermo Fisher Scientific, 11668019) according to the manufacturer's instructions. After transfection, the cells were plated on precoated coverslips. Live cells were imaged in a complete cell culture medium containing no phenol red in a 37°C live-cell imaging system. To label the endoplasmic reticulum (ER), COS-7 cells were transfected with EGFP-Lys-Asp-Glu-Leu. The transfections were performed using Lipofectamine 3000 (Thermo Fisher Scientific, L3000015) according to the manufacturer's instructions. After transfection, the cells were cultured for 20 to 28 h before the experiments. Live cells were imaged in a complete cell culture medium containing no phenol red in a 37°C live-cell imaging system. The cells were tested for mycoplasma contamination before use.

Image Acquisition, Preprocessing, and Training
The same SIM settings as in Hessian-SIM 18 were used. To obtain low SNR raw images and the corresponding GT images for training the neural network, we imaged the specimen with SIM. We recorded 20 images for each illumination pattern and then changed the phase and orientation of the pattern. We repeated the cycle nine times, corresponding to three orientations multiplied by three phases, thus obtaining 180 raw images. Then we divided the raw images into 20 groups, with each group containing nine illumination patterns of three phases and three orientations. After removing the fluorescent background, we obtained 20 SR images with artifacts using Wiener deconvolution. Finally, we mimicked the artifact-free GT by averaging the 20 SR images. We imaged ∼20 cells, and the images were preprocessed to obtain pairs of raw data and GT images at each time point. Next, we divided these image pairs into a training set, a validation set, and a test set; then, we applied random cropping, quarter-turn rotation, and horizontal/vertical flipping to further enrich the training data set. We trained the TDV-SIM using an Adam optimizer, with the learning rate set to 10^−4. For actin, we adopted the mean square error (MSE) loss function, computed over all pixels of the image, where W and H represent the image width and height, respectively. For mitochondria and ER, a combination of the MSE loss and the SSIM loss was used, where k is a scalar weight that balances the relative contributions of the SSIM and MSE losses and is set to 0.1 throughout this paper.
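The grouping, GT averaging, and augmentation steps described above can be illustrated as follows. This sketch rests on several assumptions that are not stated in the text: the 180 raw frames are stored in acquisition order (20 repeats of the first pattern, then 20 repeats of the second, and so on); the SR image is sampled on a grid `scale` times finer than the raw frames (2 is assumed); and all function names are hypothetical. The Wiener reconstruction and background removal are not shown.

```python
import numpy as np


def group_raw_frames(raw, n_patterns=9, n_repeats=20):
    """Regroup frames acquired pattern-by-pattern into per-repeat SIM groups.

    raw     : array of shape (n_patterns * n_repeats, H, W)
    returns : array of shape (n_repeats, n_patterns, H, W); each group holds
              one frame for every orientation/phase combination.
    """
    h, w = raw.shape[-2:]
    return raw.reshape(n_patterns, n_repeats, h, w).transpose(1, 0, 2, 3)


def average_ground_truth(sr_stack):
    """Average the per-group SR reconstructions to mimic an artifact-free GT."""
    return sr_stack.mean(axis=0)


def augment(raw_group, gt, crop, rng, scale=2):
    """Apply the same random crop, quarter-turn rotation, and flips to a raw
    group of shape (9, H, W) and its GT of shape (scale*H, scale*W)."""
    h, w = raw_group.shape[-2:]
    y = int(rng.integers(0, h - crop + 1))
    x = int(rng.integers(0, w - crop + 1))
    raw_c = raw_group[..., y:y + crop, x:x + crop]
    gt_c = gt[scale * y:scale * (y + crop), scale * x:scale * (x + crop)]

    k = int(rng.integers(0, 4))  # number of 90-degree turns
    raw_c = np.rot90(raw_c, k, axes=(-2, -1))
    gt_c = np.rot90(gt_c, k, axes=(-2, -1))

    if rng.random() < 0.5:  # horizontal flip
        raw_c, gt_c = np.flip(raw_c, axis=-1), np.flip(gt_c, axis=-1)
    if rng.random() < 0.5:  # vertical flip
        raw_c, gt_c = np.flip(raw_c, axis=-2), np.flip(gt_c, axis=-2)
    return raw_c, gt_c


def mse_loss(x, y):
    """Mean square error over all W x H pixels."""
    return float(np.mean((x - y) ** 2))
```

The combined MSE and SSIM loss used for mitochondria and ER is left out because its exact form is not recoverable from the text; only the weight k = 0.1 is given.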

Calculation of Assessment Metrics
To avoid the influence of different methods on the dynamic range of the inferred SR images, we first normalized the SR images. We then used the PSNR, SSIM, and normalized root MSE (NRMSE) to evaluate the similarity between the reconstructed image and the GT, calculated as PSNR(X, Y), SSIM(X, Y), and NRMSE(X, Y), where W and H represent the image width and height, respectively, and X and Y represent the reconstruction result and the GT image, respectively. MAX_I is the maximum possible pixel value of the image and equals 2^B − 1 when the image is represented with linear pulse-code modulation of B bits (e.g., MAX_I equals 255 for an 8-bit image). μ_X and μ_Y represent the averages of X and Y, σ_X² and σ_Y² represent the variances of X and Y, and σ_XY represents the covariance of X and Y. c_1 and c_2 are small positive constants that stabilize each term: c_1 = (0.01L)^2 and c_2 = (0.03L)^2, where L is the dynamic range of the pixel values. Artifacts often emerged in regions of minor signals, such as the meshed region within actin filaments. Therefore, benchmarked against the GT, we selected these regions to calculate their variances.
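For readers who want to reproduce these metrics, the following sketch uses their standard textbook definitions with the symbols defined above. Several choices are assumptions rather than details from the paper: normalization is taken to be min-max scaling to [0, 1]; NRMSE is normalized by the intensity range of the GT; and SSIM is computed once over the whole image instead of over local windows. The function names are hypothetical.

```python
import numpy as np


def normalize(img):
    """Min-max scale an image to [0, 1] (ASSUMED normalization scheme)."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


def psnr(x, y, max_i=1.0):
    """PSNR in dB: 10 * log10(MAX_I^2 / MSE)."""
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_i ** 2 / mse))


def nrmse(x, y):
    """Root MSE divided by the GT intensity range (ASSUMED normalizer).
    Undefined for a constant GT image."""
    rmse = np.sqrt(np.mean((x - y) ** 2))
    return float(rmse / (y.max() - y.min()))


def ssim_global(x, y, dynamic_range=1.0):
    """Single-window SSIM using global means, variances, and covariance."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

Windowed SSIM implementations, which average the same expression over local patches, generally give different values from this global version, so numbers from this sketch should not be compared directly with those reported in the paper.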

TDV-SIM Excels in Restoring Regular Structures Imaged with a Low SNR
We compared TDV-SIM with other reconstruction methods, including physical-model-based (Wiener deconvolution, 11 HiFi-SIM, and Hessian-SIM) and pure DL-based methods [skip-layer connecting U-Nets (scU-Net) 24

TDV-SIM Enables Better Reconstruction of Intricate Structures Prone to Photobleaching
Photobleaching constitutes a major problem of fluorescence SR imaging, continuously reducing image SNR and compromising the quality of reconstructed images, especially when resolving nonstereotypical structures such as mitochondrial cristae. 30 Therefore, we benchmarked the performance of TDV-SIM in resolving mitochondrial cristae dynamics over a prolonged time in live cells [Fig. 3(a)]. During the 20 s recording, the fluorescence intensity of MitoTracker decreased by ∼30% due to photobleaching [Fig. 3(b)]. In the beginning, model-based methods could reconstruct high-quality intricate mitochondrial cristae, which were gradually corrupted with artifacts due to photobleaching [ Figs

TDV-SIM Enables Better Reconstruction of Actin Filaments under Nonlinear SIM
Compared with conventional linear SIM, nonlinear (NL) SIM achieves a higher lateral resolution of up to ∼60 nm, 9 but it suffers from reconstruction artifacts, especially with low SNR raw data. By combining the NL-SIM physical model with the TDV regularization term, we propose TDV-NL-SIM. We benchmarked the performance of TDV-NL-SIM against Wiener deconvolution, Hessian-NL-SIM, and DFCAN on actin filaments within the BioSR data set 25

Discussion
For traditional reconstruction methods such as Hessian-SIM, the denoising effect is limited for images with a low SNR. In contrast, the pure DL method directly fits the SR image from raw images, and the fitting process is a black box. Therefore, reconstruction fidelity depends entirely on the fitting ability of the network and how well it matches the sample. For the proposed TDV-SIM, the SR information is extracted from raw images by the conventional frequency-extracting process and then integrated into the TDV network for artifact suppression. By combining the advantages of conventional physical model-based algorithms with those of DL-based algorithms, TDV-SIM outperforms existing reconstruction methods in removing artifacts associated with regions of low SNR while retaining sharp-contrast intricate structures. For example, TDV-SIM reconstructions of actin filaments and ER show an 80.1% decrease in background artifacts compared with Hessian-SIM and a 24.3% increase in signal fidelity compared with DFCAN. Indeed, all current DL-based reconstruction methods generate blurred mitochondrial cristae structures, 24,25 highlighting the difficulty of pure data-driven methods in predicting irregular and complicated structures that change constantly. Under such circumstances, incorporating physical constraints about the image formation process becomes critical, as we show here. Therefore, TDV-SIM has significant advantages over pure DL methods for samples with intricate and dynamic structures. However, the current TDV-SIM has limitations. On the one hand, as with the conventional restoration algorithms from which it is derived, good reconstruction results depend on choosing suitable parameters. Through comparative experiments, we set the hyperparameters λ and T to 2.5 and 25, respectively, which are applicable in most cases. However, we may need to introduce adaptive mechanisms to achieve an optimized adjustment step in the future.
On the other hand, the current neural network regularization term cannot be applied across different specimens and imaging modalities. Future exploration of other regularization terms that are more generally applicable to different samples may further improve the adaptability and robustness of our method. In addition, TDV-SIM aims to recover the real signal from the noisy raw images. In the second-order spectrum of NL-SIM, excess noise causes signals in the reconstructed SR image to appear discontinuous, even with TDV-SIM. However, TDV-SIM does not produce hallucinated signals as the pure DL method does.
Starting from a hybrid angle, TDV-SIM presents a novel solution for high-resolution and HiFi SR-SIM reconstruction from low SNR images. By enabling reduced photon dosage and phototoxicity, improved imaging speed, and extended imaging duration, TDV-SIM will be crucial for SR imaging of subcellular structure dynamics in live cells.
Jianyong Wang received his bachelor's degree in mechanical design, manufacturing, and automation from the University of Electronic Science and Technology of China in 2019 and his master's degree in software engineering from the School of Software and Microelectronics of Peking University in 2022. His research interest is super-resolution structured illumination microscopy.
Junchao Fan received his bachelor's and PhD degrees in engineering from Huazhong University of Science and Technology in 2014 and 2020, respectively. He is an associate professor at Chongqing University of Posts and Telecommunications. His research is focused on image processing and reconstruction algorithms for computational imaging.
Bo Zhou received his bachelor's degree in mechanical design, manufacturing, and automation from Central South University in 2017 and his master's degree in software engineering from the School of Software and Microelectronics, Peking University in 2020. Currently, he is a PhD student at Cell Secretion and Metabolism Laboratory, Institute of Molecular Medicine, Peking University. His research interests are the reconstruction algorithms of super-resolution fluorescence microscopy.
Xiaoshuai Huang received his bachelor's degree in science from Wuhan University in 2013 and his PhD from Peking University in 2018. He is an assistant professor at Peking University. From 2018 to 2020, he was trained as a postdoctoral research fellow at Peking University. His research is focused on super-resolution microscopy and cell biology.
Liangyi Chen is a Boya Professor at Peking University. He majored in biomedical engineering as an undergraduate at Xi'an JiaoTong University and as a PhD student at Huazhong University of Science and Technology. His lab focused on developing state-of-the-art imaging techniques, including ultrasensitive Hessian structured illumination microscopy, superresolution fluorescence-assisted diffraction computational tomography, sparse deconvolution enabled mathematical superresolution, and fast high-resolution miniature two-photon microscopy for brain imaging in freely behaving mice. He is also a guest professor at Université PSL and École Normale Supérieure.