Physics-constrained deep-inverse point spread function model: toward non-line-of-sight imaging reconstruction

Abstract. Non-line-of-sight (NLOS) imaging has emerged as a prominent technique for reconstructing obscured objects from images that undergo multiple diffuse reflections. This imaging method has garnered significant attention in diverse domains, including remote sensing, rescue operations, and intelligent driving, due to its wide-ranging potential applications. Nevertheless, accurately modeling the incident light direction, which carries energy and is captured by the detector amidst random diffuse reflection directions, poses a considerable challenge. This challenge hinders the acquisition of precise forward and inverse physical models for NLOS imaging, which are crucial for achieving high-quality reconstructions. In this study, we propose a point spread function (PSF) model for the NLOS imaging system utilizing ray tracing with random angles. Furthermore, we introduce a reconstruction method, termed the physics-constrained inverse network (PCIN), which establishes an accurate PSF model and inverse physical model by leveraging the interplay between PSF constraints and the optimization of a convolutional neural network. The PCIN approach initializes the parameters randomly, guided by the constraints of the forward PSF model, thereby obviating the need for extensive training data sets, as required by traditional deep-learning methods. Through alternating iteration and gradient descent algorithms, we iteratively optimize the diffuse reflection angles in the PSF model and the neural network parameters. The results demonstrate that PCIN achieves efficient data utilization by not necessitating a large number of actual ground data groups. Moreover, the experimental findings confirm that the proposed method effectively restores the hidden object features with high accuracy.


Introduction
The light in the NLOS system originates from a pulsed laser or other sources illuminating the diffusely reflective relay surface (a rough wall, rock face, etc.); the diffused light is then incident on an object out of the line of sight and is scattered back to the relay surface. The NLOS reconstruction problem is an inverse mathematical problem, aimed at recovering the hidden scene from the detected signal. Several challenges exist in NLOS imaging reconstruction. First, NLOS is an ill-posed problem characterized by a very low signal-to-noise ratio (SNR), resulting from environmental noise and high light loss along the scattered propagation path, rendering high-quality reconstruction challenging. Second, although the forward physical process is clearly understood, the physical model lacks clarity in handling multiple diffuse reflections in the NLOS system, making it difficult to obtain accurate values such as the direction of diffused light and energy attenuation. Furthermore, the inverse process is extremely complex. As a result, it is challenging to derive simple mathematical expressions directly, making high-quality image recovery for the NLOS system through a physical model difficult.
The research on NLOS imaging dates back to 2009, when Kirmani15 proposed a framework utilizing time-of-flight camera imagery and transient reasoning to reveal scene properties inaccessible to traditional computer vision. Building on Kirmani's work, Velten16 successfully recovered the three-dimensional shape of objects hidden around corners, combining time-of-flight techniques with computational reconstruction algorithms. Subsequently, O'Toole17 introduced a confocal NLOS imaging system. The confocal system, in contrast to traditional nonconfocal NLOS systems, facilitates finding a closed-form solution to the NLOS problem and yields higher-quality image reconstructions. Expanding on the confocal system, researchers have developed methods such as light-cone transformation,18 directional light-cone transformation,19 and virtual wavefronts20 for NLOS image restoration. However, these confocal methods employ the time-of-flight approach with time-resolving detectors (such as SPADs). As a result, the system requires data capture via scanning. To ensure clarity in NLOS image reconstruction, this approach necessitates scanning numerous points, often exceeding a measurement time of 10 min. Consequently, the prolonged data acquisition renders these methods unsuitable for real-time NLOS imaging applications.
With the development of machine learning and neural networks, researchers have proposed data-driven algorithms for NLOS image reconstruction. Chen et al.21 introduced a trainable architecture that maps diffuse indirect reflections to scene reflectance, relying solely on synthetic training data. To overcome the long scan time associated with traditional systems, Metzler22 employed a plane-array complementary metal-oxide-semiconductor (CMOS) detector to capture speckle images within a second. However, data acquisition in NLOS imaging remains cumbersome, and there is currently a scarcity of real large-scale data sets. The synthetic images used for training are based on the assumption that the relay wall is a standard Lambertian surface. In reality, however, the wall often deviates from a standard Lambertian surface, not conforming to isotropic theory.
The point spread function (PSF) is a core concept in image reconstruction.24-30 By understanding both the PSF and the image produced by the optical system, information about the object's surface can be retrieved through deconvolution.
This technique has widespread applications in various fields, including astronomy,31,32 microscopy,24,33 and medical imaging,34,35

o = F^{-1}(I / Φ),  (1)

where F^{-1} represents the inverse Fourier transformation, o is the object information detected by the camera, and I and Φ are the Fourier transforms of the image and the PSF matrix, respectively. Establishing a PSF model is crucial for image reconstruction in scattering and diffuse reflection systems. Faber35 developed a PSF model for weakly scattering media within an optical coherence tomography system, enabling the quantitative measurement of attenuation coefficients. By manipulating a specific single PSF, Xie et al.36 achieved depth-resolved imaging of thin scattering media, extending beyond the original depth of field. In the context of NLOS imaging systems, Pei et al.37 calculated the PSF employing a Gaussian-shaped laser pulse and the Poisson noise of a time-resolved camera. However, this model had limitations in accurately reflecting the NLOS scattered propagation process.
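As a concrete illustration of Eq. (1), the following NumPy sketch performs FFT-based deconvolution; the small eps term is our addition to stabilize the spectral division where the PSF spectrum is near zero (a Wiener-style regularization), not part of the original formula.

```python
import numpy as np

def fft_deconvolve(image, psf, eps=1e-3):
    """Recover the object via Eq. (1), o = F^{-1}(I / Phi).

    eps is a Tikhonov-style regularizer that prevents division by
    near-zero values of the PSF spectrum.
    """
    I = np.fft.fft2(image)
    # ifftshift moves the PSF peak to the array origin before the FFT.
    Phi = np.fft.fft2(np.fft.ifftshift(psf), s=image.shape)
    o = np.fft.ifft2(I * np.conj(Phi) / (np.abs(Phi) ** 2 + eps))
    return np.real(o)

# Toy check: blur a point source with a Gaussian PSF, then deconvolve.
x, y = np.meshgrid(np.arange(64), np.arange(64))
psf = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()
obj = np.zeros((64, 64))
obj[20, 40] = 1.0
img = np.real(np.fft.ifft2(np.fft.fft2(obj) * np.fft.fft2(np.fft.ifftshift(psf))))
rec = fft_deconvolve(img, psf)
print(np.unravel_index(rec.argmax(), rec.shape))  # peak recovered at (20, 40)
```

In practice the regularization strength would be matched to the noise level rather than fixed at 1e-3.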
This work introduces a novel NLOS imaging recovery model that addresses these limitations, incorporating advancements in both the physical model and the computational reconstruction algorithm. We developed an accurate forward PSF model using ray tracing for the NLOS system, offering a physical constraint for an untrained neural network. Contrary to previous methods that assumed perfectly isotropic reflectance, our proposed method takes into account the randomness of actual reflection angles on the relay wall. Furthermore, our method does not necessitate training data sets to ascertain the neural network parameters, setting it apart from conventional deep-learning-based approaches. Instead, it starts from randomly initialized neural network parameters, constrained by the forward physical model and the speckle image, and iteratively employs the gradient descent algorithm to estimate parameters and establish the mapping relationship. Experimental data were employed to validate the proposed NLOS image reconstruction algorithm. Specifically, we make the following contributions:

1. We proposed an advanced PSF model for the NLOS process, starting from the optical system's wavefront aberration, calculating the PSF at the optical pupil through reverse tracing, and enhancing the PSF model by employing the forward propagation process to simulate the light's exit angle at the optical pupil. This approach provides a more comprehensive representation of the NLOS imaging system by capturing the intricate light-scattering process and yielding a more precise estimate of the PSF.
2. We developed a physics-constrained inverse network (PCIN) for NLOS imaging reconstruction. This method synergizes the advantages of both physical models and deep-learning techniques, providing a potent tool for reconstructing images of obscured objects from captured speckle images. The proposed method utilizes a physics-based model to guide the inverse network, ensuring that the reconstructed image aligns with the physical characteristics of the NLOS imaging system.
3. We performed experimental validation of the proposed method for NLOS imaging reconstruction. Through experimentation, the efficacy of the proposed method in identifying objects of various shapes was demonstrated.

PSF Model for NLOS Imaging
The experimental setup for the NLOS imaging system is presented in Fig. 1(a). A laser, emitting at a wavelength of 632.8 nm, is expanded and collimated by a lens group before illuminating the wall. The light scatters toward the hidden object, reflects back to the relay surface, and is captured as a speckle image by a CMOS camera. In the speckle image, each pixel corresponds to a point on the original object via the PSF matrix, as depicted in Fig. 1(b), where o represents the hidden object, φ is the PSF matrix of the NLOS imaging system, and i represents the captured speckle image. To elucidate this relationship further, Fig. 1(c) presents a simple example of a hidden object and its corresponding speckle image as captured by the camera. This experimental arrangement and the associated data lay the groundwork for assessing the efficacy of the proposed NLOS image reconstruction method.
In this context, the relay wall of the NLOS imaging system can be conceptualized as a mirror exhibiting aberrations. This conceptualization allows for the determination of the NLOS imaging system's PSF through wavefront aberration analysis. Specifically, the PSF of a coherent optical system is expressed as

PSF(x, y) = |∬ P(u, v) exp[−i2π(ux + vy)] du dv|^2,  (2)

where P(u, v) is the pupil function of the system; u, v are the spatial frequencies, which can be represented as u = x_0/(λd), v = y_0/(λd) by the pupil rectangular coordinates x_0, y_0; and d is the distance from the image plane to the exit pupil. The pupil function is

P = A(x_0, y_0) exp[ikW(x_0, y_0)],  x_0^2 + y_0^2 ≤ r_0^2,  (3)

and zero elsewhere. The amplitude component of the pupil function, denoted as A(x_0, y_0), is a function of the pupil shape, and r_0 is the radius of the exit pupil. The parameter k equals 2π/λ, where λ represents the wavelength of the light source, and W(x_0, y_0) represents the wave aberration of the optical system.
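This pupil-to-PSF relationship can be sketched numerically by sampling the pupil on a normalized grid and taking the squared magnitude of its Fourier transform; the grid size, normalized pupil radius, and the zero-aberration default below are illustrative assumptions.

```python
import numpy as np

def pupil_psf(n=256, r0=0.4, wavelength=632.8e-9, W=None):
    """Intensity PSF from the pupil function P = A * exp(i k W).

    A is 1 inside a circular pupil of normalized radius r0 and 0
    outside; W is the wave aberration map (meters). The PSF is the
    normalized squared magnitude of the Fourier transform of P.
    """
    k = 2 * np.pi / wavelength
    x0, y0 = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
    A = (x0 ** 2 + y0 ** 2 <= r0 ** 2).astype(float)
    W = np.zeros((n, n)) if W is None else W
    P = A * np.exp(1j * k * W)
    # Shift so the pupil center sits at the FFT origin, then recenter.
    h = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(P)))
    psf = np.abs(h) ** 2
    return psf / psf.sum()

# An aberration-free pupil yields an Airy-like pattern centered on axis;
# a nonzero W (e.g., a defocus map) would spread and distort the PSF.
psf = pupil_psf()
print(np.unravel_index(psf.argmax(), psf.shape))  # (128, 128)
```

Passing a wave-aberration map W estimated by reverse tracing would give the aberrated PSF used in the reconstruction.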
The scattered light-propagation diagram is shown in Fig. 2. The diffuse reflection wall is considered as the combination of both specular reflection and diffuse reflection. Under the ideal imaging condition, the diffuse reflector is akin to a fully specular reflector. Consequently, the imaging lens group and the diffuse reflection wall form an optical system in which reverse tracing is performed from the detector to obtain the position and radius of the exit pupil of the NLOS optical system. The relay wall is then regarded as an optical element with aberrations, and the optical source in this system can be considered as a laser diffusely reflected by the relay surface. Here, we employ an improved diffuse reflection model proposed by Wolff et al.,38 in which the diffused surface is represented by microfacets arranged in V-grooves, distributed over various orientations. The diffusely reflected radiance is formulated as a combination of the reflection radiance from microfacets, which accounts for masking and shadowing, and the reflection radiance due to interreflections,

L_r(θ_r, θ_i, φ_r − φ_i; σ) = (ρ/π) E_0 cos θ_i [A + B max(0, cos(φ_r − φ_i)) sin α tan β],  (4)

where A = 1 − 0.5σ^2/(σ^2 + 0.33), B = 0.45σ^2/(σ^2 + 0.09), σ is the standard deviation of the Gaussian distribution as a measure of surface roughness, α = max[θ_r, θ_i], β = min[θ_r, θ_i], and ρ is the diffuse albedo as defined by Lambert's law. Thus, the radiance from the relay wall to the hidden object for the first diffuse reflection is derived as

L_1(θ_r, φ_r) = (ρ/π) L_0 cos θ_i1 [A + B max(0, cos(φ_r − φ_i1)) sin α tan β],  (5)

where L_0 is the radiance of the laser after beam expansion, and φ_i1 and θ_i1 are the azimuth and incident angles of the incident light, respectively. For the NLOS system in this study, these three parameters are constant. The radiance from a measured object is given as

L_2 = O · L_r(θ_r2, θ_i2, φ_r2 − φ_i2; σ),  (6)

where O is the hidden object, and the speckle image captured by the CMOS detector is

i = o ⊗ φ + n.  (7)

For an optical system, the image captured by the detector can also be expressed in the frequency domain by

I = O · Φ + N.  (8)

Equation (8) represents the expression with noise, depicted in Fig. 1(b), in the frequency domain. Here, Φ signifies the PSF matrix of the NLOS system, and N symbolizes the system's noise. Within the NLOS system, the predominant noises include photon noise, represented by Gaussian noise, and background noise, which appears as a peak and a uniform offset in the speckle image.22 To mitigate the impact of these noises on image reconstruction, regularization techniques in deep learning and data preprocessing strategies are implemented.
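A minimal sketch of the rough-surface radiance term with the A and B roughness factors quoted above follows; the function signature and the cos θ_i irradiance factor are our assumptions about the form of the model, and the incident radiance is passed in as L_in.

```python
import numpy as np

def diffuse_radiance(theta_r, theta_i, dphi, sigma, rho=1.0, L_in=1.0):
    """Rough-surface diffuse radiance with the quoted A, B terms
    (an Oren-Nayar-style microfacet model).

    theta_r, theta_i: reflection and incidence angles (radians)
    dphi: azimuth difference phi_r - phi_i (radians)
    sigma: surface roughness (std. dev. of facet slopes)
    rho: diffuse albedo; L_in: incident radiance
    """
    s2 = sigma ** 2
    A = 1.0 - 0.5 * s2 / (s2 + 0.33)
    B = 0.45 * s2 / (s2 + 0.09)
    alpha = max(theta_r, theta_i)
    beta = min(theta_r, theta_i)
    return (rho / np.pi) * L_in * np.cos(theta_i) * (
        A + B * max(0.0, np.cos(dphi)) * np.sin(alpha) * np.tan(beta)
    )

# sigma -> 0 reduces to the Lambertian case (rho/pi) * cos(theta_i).
L = diffuse_radiance(0.3, 0.2, 0.5, sigma=0.0)
print(abs(L - np.cos(0.2) / np.pi) < 1e-12)  # True
```

The limiting case shows why a perfectly smooth wall recovers Lambert's law, while σ > 0 adds the view-dependent roughness correction.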
Retrieval of the hidden object image in the NLOS system relies on solving Eqs. (3), (7), and (8) simultaneously. The precision of the PSF matrix, especially the exact exit angle of diffuse reflection at the exit pupil, is critical for the success of this retrieval process. If the exit angle is ascertainable, the system's wave aberration can be deduced through reverse tracing, facilitating the derivation of the NLOS system's PSF. However, modeling the relay surface accurately becomes challenging with multiple diffuse reflections, thereby complicating the attainment of the precise angle θ_i2 of the diffused light.

Physics-Constrained Inverse Network
A critical challenge in NLOS imaging is that specifics such as the size and location of the hidden object are unknown, and the PSF of the optical system varies with position and field of view. Consequently, accurately modeling the PSF of the NLOS system solely on the basis of physical theory is not feasible. Deep-learning methods have been explored for computational imaging; in this approach, object reconstruction is achieved through the solution of an optimization problem. The convolutional neural network (CNN), as one of the methods in deep learning, has been widely used in super-resolution imaging,39-41 lensless imaging,42,43 imaging estimation through scattering media,44,45 etc. The utilization of a CNN is deemed effective in modeling the PSF matrix for NLOS imaging, as highlighted in the preceding study.37 However, being a primarily data-driven technique, a CNN relies significantly on the volume of measured data, and its accuracy in modeling the PSF matrix depends on the amount of available ground-truth data. Additionally, this approach lacks constraints from physical models, leading to the neglect of a priori information. Conversely, while the forward physical process of the PSF in NLOS is clear, directly obtaining the angle parameters in the model is challenging due to multiple diffuse reflections. In this paper, we propose a method that combines the advantages of a neural network and a physical model: it integrates the PSF model into a traditional CNN architecture and introduces a neural network that enables NLOS reconstruction without the need for data training. We name this approach the physics-constrained inverse network (PCIN), with its workflow illustrated in Fig. 3.
To initiate the process of NLOS image recovery using the PCIN algorithm, the speckle image is input into the CNN with random initial weights, along with the angles in the forward model. The procedure for NLOS image recovery utilizing the PCIN algorithm unfolds as follows:

1. The speckle image obtained from measurement is fed into the CNN with randomly initialized weights p_0. The output of the CNN is taken as the initial reconstruction of the NLOS speckle image. The initial reconstruction is then incorporated into the physical model with a randomly initialized diffuse angle θ_0. In this process, the forward physical model generates a speckle image corresponding to the initial reconstructed image. The loss is calculated between the measured speckle image and the one obtained from the physical model to optimize the initial physical parameters. This iterative process is repeated to derive the final physical parameter θ_n. The detailed optimization process is illustrated in the right flow of Fig. 3, with the network structure kept unchanged.
2. The optimized physical parameter replaces its initial value in the network. The optimized physical parameter and the reconstructed image from Step 1 are fed into the network. The forward physical model is used to obtain the speckle image corresponding to the current reconstructed image, and the loss between the measured speckle image and the speckle image obtained from the physical model is calculated to optimize the network parameters. These steps are repeated iteratively to obtain the final network parameters p_n. The optimization process is illustrated in the left flow of Fig. 3, with the physical parameter remaining unchanged.
3. These two steps are alternated, optimizing both the network weights and physical model parameters until the final network output, as shown in Fig. 3, is obtained.
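The alternating scheme in Steps 1-3 can be illustrated with a toy one-dimensional stand-in, where a Gaussian kernel width plays the role of the unknown diffuse angle and plain gradient descent on the reconstruction stands in for the CNN update; all names, sizes, and step sizes below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def blur(o, theta, x):
    """Forward physical model stand-in: Gaussian kernel of width theta."""
    k = np.exp(-x ** 2 / (2 * theta ** 2))
    k /= k.sum()
    return np.real(np.fft.ifft(np.fft.fft(o) * np.fft.fft(np.fft.ifftshift(k))))

n = 128
x = np.linspace(-4, 4, n)
o_true = (np.abs(x) < 0.5).astype(float)   # hidden object (stand-in)
y = blur(o_true, 0.8, x)                   # measured speckle (stand-in)

o, theta = np.zeros(n), 0.3                # random-style initializations
loss0 = np.sum(y ** 2)
for outer in range(20):
    # Step 1: optimize the physical parameter with the reconstruction fixed
    # (finite-difference gradient on the data-fit loss).
    for _ in range(50):
        d = 1e-4
        g = (np.sum((blur(o, theta + d, x) - y) ** 2)
             - np.sum((blur(o, theta - d, x) - y) ** 2)) / (2 * d)
        theta = max(0.05, theta - 0.01 * g)   # clamp keeps the kernel valid
    # Step 2: optimize the reconstruction with the parameter fixed;
    # the gradient of ||K o - y||^2 is K^T (K o - y), and K is symmetric here.
    for _ in range(50):
        r = blur(o, theta, x) - y
        o -= 0.5 * blur(r, theta, x)

loss = np.sum((blur(o, theta, x) - y) ** 2)
print(float(loss), float(loss0))
```

In PCIN the inner reconstruction update is performed by the U-Net rather than raw gradient descent, which also supplies the implicit regularization this toy lacks.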
The proposed method leverages the CNN's robust modeling capabilities to construct an inverse physical model neural network, representing the inverse physical processes of NLOS. Contrary to traditional deep-learning-based approaches, this method does not necessitate extensive training data sets to establish the parameters of this neural network. Instead, it employs a gradient descent algorithm with alternating iterations to optimize both the neural network parameters and the unknown parameters in the forward model. This optimization is constrained by the forward physical model and the measured speckle image, enabling the estimation of parameters in both the neural network and the forward model, and ultimately deriving the mapping relations. Therefore, the reconstruction of the NLOS system can be retrieved by solving the optimization problem

(P, θ) = argmin_{P,θ} ‖Î − I‖_2^2 + TV(Ô),  (9)

where TV stands for total-variation regularization, Ô denotes the reconstructed image obtained by the CNN, and Î denotes the speckle image calculated from the forward PSF physical model and the reconstruction Ô. Upon obtaining the optimized weights P and diffuse angle θ, the NLOS recovered image is estimated and output as the final layer of the PCIN. Considering both resolution and calculation speed, the size of the measured image is selected as 512 pixels × 512 pixels. The network is implemented in PyTorch. The well-known U-Net architecture is employed for our CNN, utilizing the Adam optimizer with a learning rate of 0.01. All computations were executed on an NVIDIA RTX 3090 GPU to guarantee computational efficiency and accuracy.
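For reference, the total-variation regularizer appearing in the objective can be computed as follows; the anisotropic form shown is an assumption, since the paper does not specify which TV variant is used.

```python
import numpy as np

def total_variation(img):
    """Anisotropic total variation: sum of absolute differences
    between neighboring pixels along rows and along columns."""
    return (np.abs(np.diff(img, axis=0)).sum()
            + np.abs(np.diff(img, axis=1)).sum())

# A constant image has zero TV, while edges and noise increase it,
# so penalizing TV favors piecewise-smooth reconstructions.
print(total_variation(np.ones((8, 8))))  # 0.0
print(total_variation(np.eye(4)) > 0)    # True
```

In the PCIN loss this term is added to the data-fit term and back-propagated through the network output like any other differentiable penalty (using absolute-value subgradients).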

Results
In this section, we present the experimental validation of the proposed method. The experimental setup, shown in Fig. 4, employs a laser with a wavelength of 632.8 nm and an optical power of 5 mW as the light source. A lens group expands the beam, increasing the collimated beam diameter to 3 mm, thereby illuminating the hidden object. The Dyhana 4040 CMOS camera, with a sensitive area of 36.9 mm × 36.9 mm and a field of view of 40 deg, is chosen as the detection device; it is capable of capturing information from the NLOS system after three diffuse reflections. To adequately capture the hidden object's information without overexposure, the camera is set to capture images with a 40 ms exposure time.
To evaluate the effectiveness of our proposed methodology, four letters were chosen for imaging experiments. Specifically, the light source was positioned 1.2 m from the relay wall with an incident angle of 15 deg, and the hidden object was placed 1 m from both the camera and the relay wall.
The reconstruction results for different exposure times and different postures of the selected hidden objects are presented in Figs. 5 and 6, illustrating the capability of the proposed PCIN method to reconstruct the shape of hidden objects from diffused images. It is noteworthy that with camera exposure times shorter than 20 ms, the algorithm is generally unable to complete the reconstruction due to insufficient information capture within such a brief period. As the exposure time increases, there is a corresponding enhancement in the accuracy and detail of the reconstructed image. At an exposure time of 40 ms, the detailed features of the object are essentially reconstructed. However, increased exposure time inevitably introduces more noise into the system, manifesting as poorer reconstruction quality at the image edges.
In the initial run, the network required more than 4000 iterations and took several minutes to achieve satisfactory results. Encouragingly, for subsequent runs involving the same object type, the optimization was reduced to approximately 1800 iterations by leveraging the previous run's optimized results as input. Figure 6 displays the NLOS imaging reconstruction results following posture changes. The results indicate that the proposed method can precisely recover fine features and accurately determine the position and posture of hidden objects.
For further validation of the algorithm's performance, we chose more complex subjects, including cartoon images and Chinese characters, as hidden objects. Similarly, the detector's exposure time varied from 10 to 40 ms. The inversion results of the algorithm are shown in Fig. 7. With the increased complexity of the object, shorter exposure times (10 to 20 ms) prove inadequate for reconstructing the hidden object. This suggests that with complex hidden objects, shorter exposure times fail to capture sufficient effective information. When the exposure time is increased to 30 ms, the algorithm can essentially reconstruct the approximate shape of the object under examination. At an exposure time of 40 ms, the algorithm fully reconstructs the shape of hidden objects, capturing relatively fine features as well, demonstrating its efficacy in reconstructing complex objects.
To evaluate the algorithm's adaptability to diffusely reflecting walls of various shapes, we fabricated concave, convex, and wavy diffusely reflecting walls using highly flexible white foam. The camera exposure time was set to 40 ms to capture more information about the hidden objects. The reconstructed images, as illustrated in Fig. 8, reveal that surface variations of the diffusely reflecting walls lead to differences in the reconstruction of the same object. Nevertheless, the reconstruction was generally successful. The wavy surface resulted in the poorest reconstruction outcome. This is attributed to the creases of the wavy surface acting as light traps, causing the light to undergo multiple diffuse reflections. Consequently, the quantity of light carrying information about the target object that enters the detector is diminished, reducing the precision and detail of the reconstructed images.
The NLOS reconstruction can be seen as a phase-retrieval (PR) problem. We compared the proposed PCIN method with the alternating minimization PR algorithm (Alt-Min) from Ref. 46 and the traditional CNN method from Ref. 22. The training data set for the CNN is synthesized by a physical model. For better reconstruction results, the exposure time is chosen as 40 ms. The comparative results in Fig. 9 demonstrate that both the PCIN- and CNN-based methods outperform traditional PR methods in terms of reconstruction quality within the same exposure time. Subsequently, the incidence angle was adjusted to 10 deg. Notably, the CNN model used the parameters trained in the previous step, rather than undergoing retraining. The reconstructed images in Fig. 10 illustrate the limited universality of the traditional CNN, highlighting its inapplicability in changing external environments. This implies that both the PR algorithm and our proposed PCIN algorithm excel in reconstructing NLOS images amid external environmental changes, whereas the deep-learning approach necessitates generating new training data and retraining for each new scene.
The reconstruction results under a 20 ms exposure time are shown in Fig. 11. Under conditions of poor SNR, the PR algorithm is ineffective in reconstructing NLOS imaging results. Conversely, both the CNN and the proposed PCIN method demonstrate strong noise robustness.
The PR algorithm was initialized with a spectral initializer and used default parameters.47 To optimize for the best reconstruction, several minutes were used for PR, and 38 h were used to train the CNN's network parameters. Under low-exposure conditions, the number of optimization iterations for each algorithm increases, yet the overall time remains relatively consistent. From the above analysis, it is evident that both the PR and PCIN methods do not require extensive data and exhibit superior adaptability to various scenes compared with the CNN method. Regarding runtime, PR and the proposed model operate on a similar scale, with both taking approximately several minutes. While there are variations in the results depending on the objects, these variations are not markedly significant. However, at low SNRs, PR fails to reconstruct hidden objects, while both PCIN and CNN exhibit robust noise resistance.

Discussion and Conclusion
In this study, we present a novel theoretical framework for NLOS imaging based on a PSF physical model. The proposed approach incorporates wave aberration theory and reverse tracing to determine the pupil and obtain the PSF model of the NLOS system. Additionally, we introduce an innovative inverse network framework, embedding a physics-constrained neural network, to optimize unknown parameters in the physical model via neural network iteration. This method achieves precise reconstruction outcomes through mutual feedback between the neural network and the physical model. Although the method involves an iterative process that can make reconstruction time-consuming, it eliminates the need for paired data sets during training, resulting in substantial time savings in data preparation.
Experimental validation on NLOS imaging data confirms the method's success in reconstructing hidden objects from a single measured speckle image with a 40 ms exposure time using a traditional CMOS detector.The combination of the PSF model and deep learning demonstrates potential for NLOS imaging in complex environments, such as rescue operations and field exploration, and represents a significant advancement towards high-resolution NLOS imaging.
In summary, this method fundamentally optimizes the traditional physical model using deep-learning techniques. In the NLOS system, due to the random nature of diffuse reflections, neither the forward nor the inverse model can be obtained precisely. Specifically, the emergence angle at the optical pupil in the forward tracing cannot be determined, which prevents the direct establishment of the PSF matrix in the inverse model. However, by integrating deep learning with the physical model, the emergence angle in the physical model can be optimized, enabling the reconstruction of an image from a speckle image without training data or ground truth. Our proposed algorithm achieves performance that neither the physical model nor deep learning alone can attain. This makes it well suited for scenarios such as hostage rescue or intelligent driving in complex environments, where extensive real measurement data for training are not available. Unlike deep-learning methods that rely on specific scenes, our algorithm can be applied to a wide range of scenes and scenarios. Moreover, our model more accurately mirrors the real NLOS propagation process than the traditional, simplified NLOS physical model. Nonetheless, the model still has some limitations, such as deviations in reconstruction accuracy, particularly at the edges, and a long running time. Our future work will focus on enhancing the algorithm's performance, accuracy, and running speed to enable real-time rapid NLOS reconstruction.

Disclosures
The authors declare no conflicts of interest.

Fig. 1
Fig. 1 The NLOS system and reconstruction principle. (a) A confocal NLOS imaging system with a CMOS camera to capture the image. (b) The imaging equation in an optical system with the PSF and (c) the propagation process from object to image in the NLOS system.

Fig. 2
Fig. 2 Light path in the NLOS system. (a) Wavefront propagation process of diffuse reflection and (b) definition of diffuse reflection parameters.

Fig. 3
Fig. 3 Flowchart of the PCIN algorithm for NLOS imaging reconstruction. The speckle image captured by the camera is put into the CNN, and PCIN iteratively updates the parameters in the CNN using the loss function constructed from the speckle image and the forward physical model. The optimized parameters are utilized to obtain a high-quality reconstructed image.

Fig. 4
Fig. 4 Back and front of the experimental scene. Light passes from the laser to the collimator, to the wall, to the hidden object, and finally to the camera.

Fig. 5
Fig. 5 Comparison of the reconstructed images at various exposure times using the proposed PCIN method. (a) Speckle images at different exposure times captured by the camera. (b) Ground truth. (c) Reconstructed images at different exposure times.

Fig. 6
Fig. 6 Comparison of the reconstructed images at various exposure times using the proposed PCIN method. (a) Speckle images at different exposure times captured by the camera. (b) Ground truth. (c) Reconstructed images at different exposure times.

Fig. 7
Fig. 7 Comparison of the reconstructed cartoon images and Chinese characters at various exposure times using the proposed PCIN method. (a) Speckle images at different exposure times captured by the camera. (b) Ground truth. (c) Reconstructed images at different exposure times.

Fig. 8
Fig. 8 Comparison of the reconstructed images with convex, concave, and wavy walls.

Fig. 9
Fig. 9 Comparison of the reconstructed images of PR, CNN, and PCIN methods at 40 ms exposure time.

Fig. 10
Fig. 10 Comparison of the reconstructed images of PR, CNN, and PCIN methods after a 10 deg change in image plane inclination.

Fig. 11
Fig. 11 Comparison of the reconstructed images of PR, CNN, and PCIN methods at a 20 ms exposure time.