1. Introduction

The recent resurgence of analog optical information processing has been spurred by advancements in artificial intelligence (AI), especially deep-learning-based inference methods.1–9 These advances in data-driven learning methods have also benefited optical hardware engineering, giving rise to new computing architectures such as diffractive deep neural networks (D2NNs), which exploit the passive interaction of light with spatially engineered surfaces to perform visual information processing. D2NNs, also referred to as diffractive optical networks, diffractive networks, or diffractive processors, have emerged as powerful all-optical processors9,10 capable of completing various visual computing tasks at the speed of light propagation through thin, passive optical devices; examples of such tasks include image classification,11–13 information encryption,14–17 and quantitative phase imaging (QPI),18,19 among others.20–24 Diffractive optical networks comprise a set of spatially engineered surfaces, the transmission (and/or reflection) profiles of which are optimized using machine-learning techniques. After their digital optimization (a one-time effort), these diffractive surfaces are fabricated and assembled in 3D to form an all-optical visual processor that axially extends at most a few hundred wavelengths. Our earlier work10,25 demonstrated that a spatially coherent D2NN can perform arbitrary complex-valued linear transformations between a pair of arbitrary input and output apertures if its design has a sufficient number (N) of optimized diffractive features, i.e., N ≥ Ni·No, where Ni and No represent the space-bandwidth products of the input and output apertures, respectively. In other words, Ni and No represent the size of the desired complex-valued linear transformation that can be all-optically performed by an optimized D2NN.
For a phase-only diffractive network, i.e., one where only the phase profile of each diffractive layer is trainable, the sufficient condition becomes N ≥ 2·Ni·No due to the reduced degrees of freedom within the diffractive volume. Similar conclusions can be reached for a diffractive network that operates under spatially incoherent illumination: Rahman et al.26 demonstrated that a phase-only diffractive network with a sufficiently large N can be optimized to perform an arbitrary nonnegative linear transformation of optical intensity. However, encoding information with spatially incoherent light inherently confines both the input and output to nonnegative values, as they are represented by intensity patterns at the input and output apertures of a D2NN. To process complex-valued data with spatially incoherent light, other optical approaches were also developed;1,27–29 however, these earlier systems are limited to one-dimensional (1D) optical inputs and do not cover arbitrary input and output apertures, limiting their functionality and processing throughput. An extension of these earlier 1D input approaches introduced the processing of 2D incoherent source arrays using relatively bulky and demanding optical projection systems that are hard to operate at the diffraction limit of light.30,31 Here, we demonstrate the processing of complex-valued data with compact diffractive optical networks under spatially incoherent illumination. We show that a compact, spatially incoherent diffractive network can perform any arbitrary complex-valued linear transformation on complex-valued input data with negligible error if the number of optimizable diffractive features is above a threshold dictated by the product of the input and output space-bandwidth products, which are determined by both the spatial extent and the pixel size of the input and output apertures.
To represent complex-valued spatial information using spatially incoherent illumination, we preprocessed the input information by mapping the complex-valued data to a real and nonnegative, optical intensity-based representation at the input field of view (FOV) of the diffractive network. We term this mapping the “mosaicking” operation, indicating the utilization of multiple intensity pixels at the input FOV to represent one complex-valued input data point. Similarly, we used a postprocessing step that maps the output FOV intensity patterns back to the complex number domain, which we termed the “demosaicking” operation. Through these mosaicking/demosaicking operations, we show that a spatially incoherent D2NN can be optimized to perform an arbitrary complex-valued linear transformation between its input and output apertures while providing optical information encryption. The presented spatially incoherent visual information processor, with its universality and thin form factor, shows significant promise for image encryption and computational imaging applications under natural light.

2. Results

Figure 1(a) outlines a spatially incoherent D2NN architecture to synthesize an arbitrary complex-valued linear transformation (A) such that o = A·i, where i is the complex-valued input vector, o is the corresponding target output vector, and A is the complex-valued transformation matrix. The mosaicking process involves finding the nonnegative (optical intensity-based) representation of each complex-valued element of i using P nonnegative values; here, P complex bases, B_p [see Fig. 1(c)], are used for the intensity-based encoding of complex numbers. Based on this representation, the 2D input aperture of a spatially incoherent D2NN will have nonnegative (optical intensity) values, denoted as i_I, representing the input information under spatially incoherent illumination. The output intensity distribution, denoted as o_I, undergoes a demosaicking process where a complex number is synthesized from the intensity values of P output pixels, yielding the complex output vector ô such that ô ≈ o.
In our analyses, we used P = 3, except in Fig. S5 in the Supplementary Material, where results with P = 4 are shown for comparison. We chose the basis complex numbers as B_p = exp(j·2πp/P), p = 0, 1, …, P − 1, such that the set of bases is closed under multiplication, and the product of any two of the bases in the set is also a basis; for example, for P = 3 we have B_1·B_2 = B_0. Based on this representation of information, with i = Σ_p B_p·i_p and o = Σ_r B_r·o_r, where the vectors i_p and o_r have real and nonnegative entries, we can decompose any arbitrarily selected complex-valued transformation matrix A into P matrices (A_0, A_1, A_2) with real, nonnegative entries such that

A = B_0·A_0 + B_1·A_1 + B_2·A_2.  (1)

For a given complex-valued input i = Σ_q B_q·i_q, the corresponding target output vector can be written as

o = A·i = Σ_p Σ_q (B_p·B_q)·(A_p·i_q),  (2)

i.e., using the closure of the bases under multiplication,

o_r = Σ_{(p+q) mod 3 = r} A_p·i_q,  r = 0, 1, 2.  (3)

For the stacked intensity vectors

i_I = [i_0; i_1; i_2],  o_I = [o_0; o_1; o_2],  (4)

a similar analysis yields

o_I = A_I·i_I,  A_I = [A_0 A_2 A_1; A_1 A_0 A_2; A_2 A_1 A_0].  (5)

Based on these equations, one can conclude that to all-optically implement an arbitrary complex-valued transformation A using a spatially incoherent D2NN, the layers of the D2NN need to be optimized to perform an intensity linear transformation such that o_I = A_I·i_I. The entire system, upon convergence, performs the predefined complex-valued linear transformation on any given input data using spatially incoherent light, based on Eqs. (2) and (4). In the following sections, we numerically explore the number of optimizable diffractive features (N) needed for an accurate approximation of A_I using a spatially incoherent D2NN.

2.1. Complex-Valued Linear Transformations through Spatially Incoherent Diffractive Networks

We numerically demonstrated the capability of diffractive optical processors to universally perform any arbitrarily chosen complex-valued linear transformation with spatially incoherent light. Throughout the paper, we used Ni = No = 16. To visually represent the data, we rearranged the 16-element vectors into 4 × 4 arrays of complex numbers, hereafter referred to as the “complex image.” We arbitrarily selected a desired complex-valued transformation, A, as shown in Fig. 1(b).
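Before turning to the diffractive designs, the mosaicking, demosaicking, and block-matrix construction described above can be verified numerically. The sketch below assumes P = 3 with the bases chosen as the cube roots of unity (a set closed under multiplication); all function and variable names are illustrative rather than taken from the paper's implementation:

```python
import numpy as np

P = 3
BASES = np.exp(2j * np.pi * np.arange(P) / P)  # cube roots of unity

def mosaick(c):
    """Map complex values to nonnegative coefficients b with c = sum_p b[p]*BASES[p].
    Each value is represented using the two bases bounding its 120-deg phase sector."""
    c = np.atleast_1d(np.asarray(c, dtype=complex))
    b = np.zeros(c.shape + (P,))
    for p in range(P):
        q = (p + 1) % P
        # Solve c = b_p*B_p + b_q*B_q as a 2x2 real linear system.
        M = np.array([[BASES[p].real, BASES[q].real],
                      [BASES[p].imag, BASES[q].imag]])
        coef = np.linalg.solve(M, np.stack([c.real.ravel(), c.imag.ravel()]))
        sector = (coef > -1e-12).all(axis=0)  # values lying in this sector
        flat = b.reshape(-1, P)
        flat[sector, p], flat[sector, q] = coef[0, sector], coef[1, sector]
    return np.maximum(b, 0.0)  # clip tiny negatives from round-off

def demosaick(b):
    """Synthesize complex values from nonnegative basis coefficients."""
    return np.asarray(b) @ BASES

def intensity_matrix(A):
    """Build the nonnegative block matrix A_I from a complex matrix A, using the
    circulant arrangement implied by B_p*B_q = B_((p+q) mod 3)."""
    Ap = mosaick(A)  # shape (No, Ni, 3), nonnegative
    return np.block([[Ap[..., (r - q) % P] for q in range(P)] for r in range(P)])

# A randomly chosen complex transformation acting on a mosaicked input:
rng = np.random.default_rng(0)
A = rng.uniform(size=(16, 16)) * np.exp(2j * np.pi * rng.uniform(size=(16, 16)))
i_vec = rng.normal(size=16) + 1j * rng.normal(size=16)
i_I = mosaick(i_vec).T.reshape(-1)       # stacked [i_0; i_1; i_2]
o_I = intensity_matrix(A) @ i_I          # all intensities stay nonnegative
o_hat = demosaick(o_I.reshape(P, -1).T)  # demosaick back to complex values
```

Because the basis set is closed under multiplication, demosaicking the output of the nonnegative block transformation reproduces the complex product A·i exactly, even though the intermediate representation contains only nonnegative intensities.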
To explore the number of diffractive features needed, we trained nine models with varying values of N and evaluated the mean-squared error (MSE) between the numerically measured all-optical intensity transformation (Â_I) and the target, A_I (see Fig. 2). Our results, summarized in Fig. 2, highlight that with a sufficient number of optimizable diffractive features, dictated by the product of the input and output space-bandwidth products, our system achieves a negligible approximation error with respect to the target A_I. In Fig. 2(c), we also visualize the resulting all-optical intensity transformation Â_I compared to the ground truth A_I. In essence, this comparison reveals the spatially varying incoherent point spread functions (PSFs) of our diffractive system optimized using deep learning; a negligible MSE between Â_I and A_I shows that the resulting spatially varying incoherent PSFs match the target set of PSFs dictated by A_I. We also evaluated the numerical accuracy of our complex-valued transformation in an end-to-end manner, as illustrated in Fig. 2(d). For this numerical test, we sequentially set each entry of the input vector i to the basis value B_0, evaluated the corresponding complex output ô, and stacked these outputs to form Â_{B_0}, where the subscript indicates that the measurement was evaluated using the complex impulse along the basis B_0 as input. Then, we repeated this process for the other two bases to obtain Â_{B_1} and Â_{B_2}, and stacked these matrices as a block matrix Â_B, shown in Fig. 2(d). Each row of the images of Â_B in Fig. 2(d) represents one of these complex output vectors, while the corresponding target vectors are presented in the same figure. The small magnitude of the error shown in Fig. 2(d) illustrates the success of this spatially incoherent D2NN model in accurately approximating the complex-valued linear transformation, implemented for an arbitrarily selected A.

2.2. Complex Number-Based Image Encryption Using Spatially Incoherent Diffractive Networks

In this section, we demonstrate a complex number-based image encryption–decryption scheme using a spatially incoherent D2NN.
In the first scheme, shown in Fig. 1(d), the message is encoded into a complex image, employing either amplitude and phase encoding or real and imaginary part encoding. Then, a digital lock encrypts the image by applying a linear transformation (A) to conceal the original message within the image. At the optical receiver, the encrypted message is deciphered by an optimized incoherent D2NN that all-optically implements the inverse transformation A^(-1). In an alternative scheme, depicted in Fig. 1(e), the key and lock are switched, i.e., the spatially incoherent D2NN is used to encrypt the message with a complex-valued A, while the decryption step involves digital inversion using A^(-1). For our analysis, we used the letters “U,” “C,” “L,” and “A” as sample messages: “U” and “C” are used in amplitude-phase-based encoding (Fig. 3), whereas “L” and “A” are used for real-imaginary-based encoding of information (Fig. S1 in the Supplementary Material), forming complex number-based images. To accurately model the spatially incoherent propagation26 of light through the D2NN, we averaged the output intensities over a large number of randomly generated 2D phase profiles at the input (see Sec. 4 for details). In Fig. 3(a), we show the results corresponding to digital encryption and optical diffractive decryption, i.e., the system shown in Fig. 1(d). The digitally encrypted complex information and its intensity representation are shown in Fig. 3(a). The optically decrypted output (through the spatially incoherent D2NN) and its intensity-based representation are shown in the same figure, together with the resulting error maps, which reveal a very small degree of error. This agreement between the recovered and the ground-truth messages in both the intensity and complex-valued domains confirms the accuracy of the diffractive decryption process through an optimized spatially incoherent D2NN. Figure 3(b) shows the successful performance of the sister scheme [Fig.
1(e)], which involves diffractive encryption through a spatially incoherent D2NN and digital decryption, also revealing a negligible amount of error in both the intensity and the complex-valued domains. As reported in Fig. S1 in the Supplementary Material, we also conducted a numerical experiment using the letters “L” and “A,” encoded using the real and imaginary parts of the message. The visualizations are arranged the same way as in Fig. 3; for both schemes depicted in Figs. 1(d) and 1(e), the degree of error between the recovered and the original messages is negligible, affirming the success of the real and imaginary part-based encoding method. To assess the approximation errors when the number of diffractive features is smaller, we compared the decryption performance of three models with different numbers of diffractive features/neurons (N) for the same setup outlined in Fig. S1(a) in the Supplementary Material. The results are summarized in Fig. S2 in the Supplementary Material: for the models with smaller N, the decryption quality is compromised, exhibiting a larger per-pixel absolute error. This error is substantially reduced for the largest N, where the decrypted images display significantly enhanced contrast and reduced noise levels. To further evaluate the efficacy of our encryption method, we analyzed the complex image entropy, examining the real and imaginary components separately (see Sec. 4 for details). The original image, the diffractive-encrypted output, and the digitally encrypted output, along with the corresponding image entropies, are shown in Fig. S3(a) in the Supplementary Material for two complex image examples. We repeated this analysis for a set of 1000 complex images, with the resulting entropy distributions reported in Fig. S3(b) in the Supplementary Material. These results demonstrate that the entropy of the encrypted images is statistically higher than that of the original images.
This increase in entropy signifies a heightened level of randomness within the encrypted images, thereby validating the effectiveness of our encryption process. In addition, the entropy distributions of the diffractive-encrypted images show excellent agreement with those of the corresponding digitally encrypted images, further demonstrating the success of our spatially incoherent optical encryption scheme.

2.3. Different Mosaicking and Demosaicking Schemes in a Spatially Incoherent D2NN

How we assign the elements of the vectors i_I and o_I to the pixels at the input and output FOVs of the diffractive network does not affect the final accuracy of the image/message reconstruction. For example, we can arrange the FOVs in such a manner that the components corresponding to a basis are assigned to neighboring pixels in two adjacent rows, as shown in Fig. S4(a) in the Supplementary Material; in an alternative implementation, the assignment/mapping can be completely arbitrary, which is equivalent to applying a random permutation operation to the input and output vectors (see Sec. 4). When compared to each other, these two mosaicking and demosaicking schemes show negligible differences in the error of the final reconstruction of the letters “U,” “C,” “L,” and “A,” as shown in Fig. S4(b) in the Supplementary Material. These results underscore that the specific arrangement of the mosaicking/demosaicking schemes at the input and output FOVs does not impact the performance of the incoherent D2NN system.

3. Discussion and Conclusion

In this article, we employed a data-free, PSF-based optimization method (see Sec. 4),26 since we can determine the nonnegative intensity transformation A_I from the target complex-valued transformation A based on the mosaicking and demosaicking schemes; the columns of A_I represent the desired spatially varying PSFs of the D2NN.
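Since the columns of A_I are the desired spatially varying PSFs, they can in principle be read out by probing the system with unit intensity impulses, one input pixel at a time; a minimal sketch, with a nonnegative matrix standing in for the optimized diffractive processor (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_out = 48, 48  # P = 3 with 16 complex input/output elements
A_I = rng.uniform(size=(n_out, n_in))  # target nonnegative intensity transform

def diffractive_system(i_I):
    """Stand-in for the optimized diffractive processor: a linear,
    nonnegative intensity-to-intensity mapping."""
    return A_I @ i_I

def measure_psfs(system, n_in):
    """Recover the all-optical intensity transformation column by column:
    each unit impulse at an input pixel yields one spatially varying PSF."""
    cols = []
    for k in range(n_in):
        e_k = np.zeros(n_in)
        e_k[k] = 1.0
        cols.append(system(e_k))
    return np.stack(cols, axis=1)

A_I_hat = measure_psfs(diffractive_system, n_in)
```

For a truly linear intensity system, the stacked impulse responses reproduce the transformation matrix exactly, which is what makes the data-free, PSF-based training target well defined.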
The advantage of this data-free, learning-based optimization approach is that the computationally demanding simulation of spatially incoherent wave propagation, which requires averaging over a large number of random phase patterns, is not required during the training. Coherent propagation is appropriate for simulating the spatially varying PSFs, point by point, since a point emitter at the input aperture coherently interferes with itself during optical diffraction within a D2NN; this approach makes the training time much shorter. On the other hand, this approach necessitates prior knowledge of A_I, which might not always be available, e.g., for tasks such as data classification. An alternative to this data-free, PSF-based optimization approach is to train the diffractive network in an end-to-end manner, using a data-driven direct training approach.26 This strategy proceeds by minimizing the differences between the outputs and the targets over a large number of randomly generated examples, thereby learning the spatially varying PSFs implicitly from numerous input-target intensity patterns corresponding to the desired task, instead of learning from an explicitly predetermined A_I. This direct approach, however, requires a longer training time, necessitating the simulation of incoherent propagation for each training sample in a large data set. In our presented approach, the choice of P is not restricted to P = 3, which we used throughout the main text. As another example of encoding, we show the image encryption results with P = 4 in Fig. S5 in the Supplementary Material, where the four bases are B_p = exp(j·2πp/4). The reconstructed “U,” “C,” “L,” and “A” letters are also reported in the same figure, confirming that, given sufficient degrees of freedom, the linear transformation performances are similar to each other. However, compared to P = 3, this choice necessitates 4/3 times more pixels on both the diffractive network input and output FOVs, reducing the throughput (or spatial density) of the complex-valued linear transformations that can be performed using a spatially incoherent D2NN.
Accordingly, more diffractive features and a larger number of independent degrees of freedom (by 16/9-fold) are required within the diffractive volume to achieve an output performance level comparable to a design with P = 3. Note that while P = 3 is sufficient to reconstruct the original complex-valued images regardless of the image complexity, the redundancy provided by larger P values might offer increased resilience against noise at the cost of reducing the image-processing throughput (per input aperture area). Our framework offers several flexibilities in implementation, which could be useful for different applications. First, the flexibility to arbitrarily permute the input and output pixels following different mosaicking and demosaicking schemes (as introduced earlier in Sec. 2) could enhance the security of optical information transmission. An unauthorized user would not be able to either spam or hack valuable information that is transferred optically without specific knowledge of the mosaicking and demosaicking schemes, thus ensuring the security of this scheme. Note that this enhancement in security is achieved without adding complexity to the system, by simply permuting the assignment of data elements to the pixels of the input and output devices, e.g., spatial light modulators (SLMs) and complementary metal-oxide-semiconductor (CMOS) detector arrays. Second, the flexibility in choosing P, as discussed above, could be useful in adding an extra layer of security against unauthorized access, albeit with a trade-off in system throughput that comes with larger P. Furthermore, we can use different sets of bases for the mosaicking and demosaicking operations by applying offset phase angles θ_in and θ_out, respectively, to the original bases B_p. This results in a set of modified/encrypted bases: B_p·exp(j·θ_in) for mosaicking and B_p·exp(j·θ_out) for demosaicking. This flexibility in representation further enhances the security of the system.
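As a toy illustration of these offset bases (the offset value and basis set below are illustrative assumptions), demosaicking the same nonnegative coefficients with the wrong basis set returns only a phase-rotated, i.e., incorrect, message:

```python
import numpy as np

P = 3
B = np.exp(2j * np.pi * np.arange(P) / P)  # public bases
theta = 0.7                                 # secret phase offset (illustrative)
B_secret = B * np.exp(1j * theta)           # modified/encrypted basis set

rng = np.random.default_rng(3)
b = rng.uniform(size=(16, P))  # nonnegative coefficients observed as intensities

message = b @ B_secret   # what the intensity coefficients actually encode
eavesdropped = b @ B     # demosaicking with the public (wrong) bases

# The eavesdropper recovers only a uniformly phase-rotated message:
# eavesdropped == message * exp(-1j*theta)
```

A uniform phase rotation scrambles amplitude-phase and real-imaginary encodings alike, so intercepting the raw intensities is not enough without knowledge of the offset angles.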
Regarding image encryption-related applications, we demonstrated two approaches [Figs. 1(d) and 1(e)] that utilize a D2NN for encryption or decryption. However, it is also possible to deploy a pair of diffractive systems in tandem, with one performing the matrix operation A for encryption and the other performing the inverse operation A^(-1) for decryption. Furthermore, potential extensions of our work could explore a harmonized integration of polarization state controls32 and wavelength multiplexing33 to build a multifaceted, fortified encryption platform. In addition to increasing the data throughput, these additional degrees of freedom, enabled by different illumination wavelengths and polarization states, would further enhance the security of a diffractive processor-based system. In this work, we focused on the numerical analysis of the presented concept. However, we should note that various D2NNs designed using deep-learning-based approaches have been experimentally validated over different parts of the electromagnetic spectrum, e.g., from terahertz (THz)9,14 to near-infrared (NIR)15 and visible wavelengths,24 showing good agreement between the numerical and experimental results. To address some of the experimental challenges associated with fabrication errors and mechanical misalignments, a “vaccination” strategy34,35 has been devised. This approach enhances the robustness of diffractive optical designs by incorporating such aberrations/imperfections as random variables during the training phase, thereby preparing the system to better withstand and adapt to the uncertainties inherent in real-world experimental conditions. Although spatially coherent light is more suitable for complex-valued information processing in laboratory settings, the use of spatially incoherent light offers various practical advantages. For example, speckle noise, which is inevitable in coherent systems, can be suppressed by using partially or fully incoherent illumination.
An additional benefit of spatially incoherent designs is the range of viable illumination sources that can be used: instead of specialized coherent sources, a spatially incoherent system can work with standard light-emitting diodes (LEDs), or even under natural light, which is important for some applications of diffractive information processing. To conclude, we demonstrated the capability of spatially incoherent diffractive networks to perform arbitrary complex-valued linear transformations. By incorporating various forms of mosaicking and demosaicking operations, we paved the way for a wider array of applications by leveraging incoherent D2NNs for complex-valued data processing. We also showcased potential applications of these spatially incoherent D2NNs for complex number-based image encryption and decryption, highlighting the security benefits arising from the system’s flexibility. Our exploration marks a significant stride toward enhanced versatility and robustness in optical information processing with spatially incoherent diffractive systems that can work under natural light.

4. Appendix: Methods

4.1. Linear Transformation Matrix

In this paper, we use Ni = No = 16, so that A is a 16 × 16 complex-valued matrix; see Fig. 1(b). To generate A, we randomly sample the amplitude of each element from a uniform distribution and the phases uniformly from [0, 2π). For the encryption application, to ensure that the result of inversion is not sensitive to small errors, we performed QR factorization and used the resulting unitary factor as A, which has a condition number of one.36

4.2. Real-Valued Nonnegative Representation of Complex Numbers

Following Eq. (4), the complex-valued input and target vectors i and o are represented by the corresponding real and nonnegative intensity vectors i_I and o_I, where i_I = [i_0; i_1; i_2] and o_I = [o_0; o_1; o_2]. The desired all-optical intensity transformation A_I between i_I and o_I is derived from the target complex-valued linear transformation A following Eqs. (1) and (5).
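The QR-based lock of Sec. 4.1 and the digital-lock/optical-key pairing of Fig. 1(d) can be checked end to end numerically; in the sketch below an exact matrix inverse stands in for the diffractive decryption, and the message encoding and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Amplitude-phase encoding of a 4x4 "message" into a complex image
amplitude = rng.uniform(0.5, 1.0, size=(4, 4))
phase = rng.uniform(-np.pi, np.pi, size=(4, 4))
message = (amplitude * np.exp(1j * phase)).ravel()

# Digital lock: the unitary factor of a QR decomposition has condition
# number 1, so the inversion step is not sensitive to small errors.
G = rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))
A, _ = np.linalg.qr(G)

encrypted = A @ message             # digital encryption (scheme of Fig. 1(d))
decrypted = A.conj().T @ encrypted  # key: A^(-1) = A^H for a unitary lock
```

Using the unitary factor means the decryption key is simply the conjugate transpose, and errors introduced by the optical implementation are not amplified during inversion.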
We should note that deriving A_I from A requires mapping each complex element c to its real and nonnegative representation based on the complex bases, i.e., c = Σ_p b_p·B_p with b_p ≥ 0. To define a unique mapping, we follow an algorithm29 that imposes additional constraints: b_p = 0 if the angle between c and B_p is greater than 2π/P, which can be tested through the product c·B_p*, where B_p* represents the complex conjugate of B_p. The same constraints were also used while mapping the complex input vectors i to the real and nonnegative intensity vectors i_I.

4.3. Mosaicking and Demosaicking Schemes

For the mosaicking (demosaicking) assignment of each element of i_I (o_I) to one of the P·Ni (P·No) pixels of the 2D input (output) FOV, the arrangement can be regular, e.g., in a row-major order as shown in Fig. S4(a) in the Supplementary Material, “Regular mosaicking.” Alternatively, the pixel assignment on the input (output) FOV can follow any arbitrary mapping defined by a permutation matrix Π_in (Π_out) operating on the input (output) vector; see Fig. S4(a) in the Supplementary Material, “Arbitrary mosaicking.” For such cases, when ordered in a row-major format, the intensities on the input (output) FOV can be written as Π_in·i_I (Π_out·o_I). Accordingly, such an arbitrary arrangement of pixels was accounted for by redefining the all-optical intensity transformation as Π_out·A_I·Π_in^T, where the transpose of a permutation matrix is its inverse.

4.4. Spatially Incoherent Light Propagation through a D2NN

The 1D vector i_I is rearranged into a 2D distribution of intensity at the input FOV of the D2NN. To numerically model the spatially incoherent propagation of the input intensity distribution through the D2NN, we coherently propagated the optical field sqrt(i_I)·exp(j·φ) through the trainable diffractive surfaces to the output plane, where φ is a random 2D phase distribution, i.e., each value of φ is drawn uniformly from [0, 2π). If we denote the coherent field propagation operator as D{·} (see Sec.
4.5), then the instantaneous output intensity is |D{sqrt(i_I)·exp(j·φ)}|², and the time-averaged output intensity for spatially incoherent light can be written as

o_I = E_φ[ |D{sqrt(i_I)·exp(j·φ)}|² ].  (7)

The average output intensity can be approximately calculated by repeating the coherent wave propagation M times, each time with a different random phase distribution φ_m, and averaging the resulting output intensities,

o_I ≈ (1/M)·Σ_{m=1}^{M} |D{sqrt(i_I)·exp(j·φ_m)}|².  (8)

We used a large M for estimating the incoherent output intensity corresponding to any arbitrary input intensity i_I. Note that when only one pixel at the input aperture is activated, with all other input pixels being inactive with zero intensity, as is the case while evaluating the spatially varying PSFs, the application of Eq. (8) becomes redundant, although one could still use it. In this scenario, all the light diffracted from a single point source is mutually coherent. Consequently, for the purposes of evaluating the spatially varying PSFs of the system, as elaborated in Sec. 4.7, employing a coherent propagation model for each point emitter at the input aperture is accurate and provides a faster solution.

4.5. Coherent Propagation of Optical Fields

The propagation of spatially coherent light patterns through a diffractive processor, denoted by D{·}, involves a series of interactions with consecutive diffractive surfaces, interleaved by wave propagation through the free space separating these surfaces. We assume that these modulations are introduced by phase-only diffractive surfaces, i.e., the field amplitude remains unchanged during the light–matter interaction. Specifically, we assume that a diffractive surface alters the incident optical field u(x, y) in a localized manner according to the optimized phase values of its diffractive features, resulting in the phase-modulated field u(x, y)·exp(j·θ(x, y)). The diffractive surfaces are coupled by free-space propagation, allowing the light to travel from one surface to the next.
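Returning to the incoherent model of Sec. 4.4, the random-phase averaging of Eq. (8) can be sketched with a generic complex matrix standing in for the coherent propagation operator; the Monte Carlo estimate converges to the intensity mapping given by the element-wise squared magnitudes of the operator (the matrix size and sample count below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # stand-in for D{.}
i_in = rng.uniform(size=n)                                   # input intensity pattern

def incoherent_output(T, i_in, n_avg=20000):
    """Estimate the time-averaged output intensity by averaging coherent
    propagations over independent random input phase patterns, as in Eq. (8)."""
    amp = np.sqrt(i_in)
    acc = np.zeros(T.shape[0])
    for _ in range(n_avg):
        phi = rng.uniform(0.0, 2.0 * np.pi, size=i_in.shape)
        acc += np.abs(T @ (amp * np.exp(1j * phi))) ** 2
    return acc / n_avg

estimate = incoherent_output(T, i_in)

# For spatially incoherent light, the cross terms average out, so the exact
# result is an intensity-to-intensity map through the element-wise |T|^2:
exact = (np.abs(T) ** 2) @ i_in
```

The residual error of the estimate shrinks as 1/sqrt(M), which is why a large number of random phase patterns is needed for an accurate incoherent simulation, and why the data-free PSF approach (which needs only coherent point-source propagations) trains much faster.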
We used the angular spectrum method to simulate the free-space propagation,37

u(x, y; z + d) = F^(-1){ F{u(x, y; z)}·H(fx, fy; d) },

where F is the 2D Fourier transform and F^(-1) is its inverse operation. H(fx, fy; d) is the free-space transfer function corresponding to the propagation distance d; for wavelength λ,

H(fx, fy; d) = exp( j·2π·d·sqrt(1/λ² − fx² − fy²) ) if fx² + fy² ≤ 1/λ², and 0 otherwise.

The fields were discretized with a lateral sampling interval of λ/2 to accommodate all the propagating modes and were sufficiently zero-padded to remove aliasing artifacts.38

4.6. Diffractive Network Architecture

We modeled the diffractive surfaces by their laterally discretized heights h, which correspond to phase delays θ = 2π·(n − 1)·h/λ, where n is the refractive index of the diffractive material. The connectivity between consecutive diffractive layers9 was kept equal across the diffractive designs with varying N by scaling the axial separation between the layers with the lateral width of each diffractive layer. The number of diffractive layers was kept the same throughout the paper.

4.7. Training and Evaluation of Spatially Incoherent Diffractive Processors

To perform an arbitrary complex-valued linear transformation with a diffractive processor, we used the PSF-based, data-free design approach, where the diffractive features were optimized so that the all-optical intensity transformation of the diffractive processor achieves Â_I ≈ A_I. To evaluate Â_I, we used intensity vectors e_k, where e_k[k′] = 1 if k′ = k and 0 otherwise; in other words, the e_k are unit impulses located at different input pixels. We simulated the all-optical output intensity vectors corresponding to these unit impulses and stacked them column by column to form the raw estimate of Â_I. Finally, we compensated for the optical diffraction efficiency-related scale mismatch through multiplication by a scalar, chosen to minimize the mean-squared error between the scaled estimate Â_I and the target A_I; this MSE also served as the loss function to be minimized during training. The height of the diffractive features at each layer was constrained between zero and a maximum value h_max by employing a latent variable; the relationship between the constrained height and the latent variable was defined through a smooth, bounded mapping, where h_max corresponds to a differential phase modulation of 2π.
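The free-space propagation step of Sec. 4.5 can be sketched as follows; the 2× zero-padding factor and the Gaussian test field are illustrative choices, not the paper's exact settings:

```python
import numpy as np

def angular_spectrum(field, wavelength, dx, d):
    """Propagate a 2D complex field a distance d using the angular spectrum
    method. Evanescent components (fx^2 + fy^2 > 1/wavelength^2) are discarded."""
    n = field.shape[0]
    pad = n // 2                       # zero-pad to suppress wrap-around aliasing
    u = np.pad(field, pad)
    fx = np.fft.fftfreq(u.shape[0], d=dx)
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.where(arg >= 0.0, np.exp(1j * kz * d), 0.0)  # free-space transfer function
    u = np.fft.ifft2(np.fft.fft2(u) * H)
    return u[pad:pad + n, pad:pad + n]

# Example: propagate a Gaussian beam sampled at wavelength/2
wl, dx, n = 1.0, 0.5, 64
x = (np.arange(n) - n / 2) * dx
X, Y = np.meshgrid(x, x, indexing="ij")
u0 = np.exp(-(X**2 + Y**2) / 3.0**2).astype(complex)
u1 = angular_spectrum(u0, wl, dx, 20.0)
```

Sampling at λ/2 places the grid's Nyquist frequency at the propagating/evanescent boundary 1/λ, so all propagating modes are represented, consistent with the discretization described above.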
The latent variables were initialized randomly from the standard normal distribution N(0, 1). The optimization of the diffractive layers was carried out using the AdamW optimizer39 for 12,000 iterations. The model state corresponding to the minimum of the MSE values evaluated after every 400 iterations was selected for the final evaluation. The models were implemented and trained using PyTorch (v1.12.1)40 with Compute Unified Device Architecture (CUDA) version 12.2. Training and testing were done on GeForce RTX 3090 graphics processing units (GPUs) in workstations with 256 GB of random-access memory (RAM) and an Intel Core i9 central processing unit (CPU). The training time of the models varied with their size; for example, the model used in Figs. 2(b) and 2(c) took around 1 h for 12,000 iterations. Inference for each input vector, including the averaging over M random phase patterns, takes around 30 s. To visualize the all-optical transformation error in Fig. 2, we used the error matrix |Â_I − A_I|, where |·| denotes an element-wise absolute value operation. To evaluate the error of the complex linear transformation, we applied demosaicking to the columns of Â_I to form the block matrix Â_B; here, the subscript B indicates that Â_B is measured by applying the basis impulses as input and stacking the corresponding demosaicked (complex-valued) output vectors. Accordingly, the complex transformation error was visualized as |Â_B − A_B|, where |·| again represents an element-wise absolute value operation.

4.8. Entropy Evaluation

For the evaluation of the image encryption strength, we computed the entropy separately for the real and imaginary parts of a complex image as

S = −Σ_v p(v)·log₂ p(v),

where p(v) is the probability (normalized histogram count) of a pixel value v, calculated over the pixels of either the real or the imaginary part of the complex image. For the histograms presented in Fig.
S3(b) in the Supplementary Material, the data set was adapted from the Extended MNIST (EMNIST) data set.41 For the creation of the input complex images, we randomly selected two distinct images from the EMNIST data set, using one as the real part and the other as the imaginary part of the complex image. To ensure compatibility with the input dimensionality, these images were bilinearly downsampled to a resolution of 4 × 4 pixels. We randomly formed a set of 1000 such complex images to compile the histograms presented in Fig. S3(b) in the Supplementary Material.

Code and Data Availability

The data and methods required to assess the conclusions drawn in this study are included within the main text and supplementary information files. The optimization of the machine-learning models used in this research was conducted using the publicly available PyTorch library. Additional data can be requested from the corresponding author.

Acknowledgments

The Ozcan Research Group at UCLA acknowledges the support of the U.S. Department of Energy (DOE), Office of Basic Energy Sciences, Division of Materials Sciences and Engineering under Award # DE-SC0023088.

References

J. W. Goodman, A. R. Dias and L. M. Woody,
“Fully parallel, high-speed incoherent optical method for performing discrete Fourier transforms,” Opt. Lett. 2(1), 1–3 (1978). https://doi.org/10.1364/OL.2.000001
H. Kwon et al., “Nonlocal metasurfaces for optical signal processing,” Phys. Rev. Lett. 121(17), 173004 (2018). https://doi.org/10.1103/PhysRevLett.121.173004
R. Hamerly et al.,
“Large-scale optical neural networks based on photoelectric multiplication,”
Phys. Rev. X, 9
(2), 021032 https://doi.org/10.1103/PhysRevX.9.021032 PRXHAE 2160-3308
(2019).
Google Scholar
A. Silva et al.,
“Performing mathematical operations with metamaterials,”
Science, 343
(6167), 160
–163 https://doi.org/10.1126/science.1242818 SCIEAS 0036-8075
(2014).
Google Scholar
B. J. Shastri et al.,
“Photonics for artificial intelligence and neuromorphic computing,”
Nat. Photonics, 15
(2), 102
–114 https://doi.org/10.1038/s41566-020-00754-y NPAHBY 1749-4885
(2021).
Google Scholar
G. Wetzstein et al.,
“Inference in artificial intelligence with deep optics and photonics,”
Nature, 588
(7836), 39
–47 https://doi.org/10.1038/s41586-020-2973-6
(2020).
Google Scholar
X. Xu et al.,
“11 TOPS photonic convolutional accelerator for optical neural networks,”
Nature, 589
(7840), 44
–51 https://doi.org/10.1038/s41586-020-03063-0
(2021).
Google Scholar
S. M. Kamali et al.,
“Angle-multiplexed metasurfaces: encoding independent wavefronts in a single metasurface under different illumination angles,”
Phys. Rev. X, 7
(4), 041056 https://doi.org/10.1103/PhysRevX.7.041056 PRXHAE 2160-3308
(2017).
Google Scholar
X. Lin et al.,
“All-optical machine learning using diffractive deep neural networks,”
Science, 361
(6406), 1004
–1008 https://doi.org/10.1126/science.aat8084 SCIEAS 0036-8075
(2018).
Google Scholar
O. Kulce et al.,
“All-optical synthesis of an arbitrary linear transformation using diffractive surfaces,”
Light Sci. Appl., 10
(1), 196 https://doi.org/10.1038/s41377-021-00623-5
(2021).
Google Scholar
M. S. S. Rahman et al.,
“Ensemble learning of diffractive optical networks,”
Light Sci. Appl., 10
(1), 14 https://doi.org/10.1038/s41377-020-00446-w
(2021).
Google Scholar
J. Li et al.,
“Class-specific differential detection in diffractive optical neural networks improves inference accuracy,”
Adv. Photonics, 1
(4), 046001 https://doi.org/10.1117/1.AP.1.4.046001 AOPAC7 1943-8206
(2019).
Google Scholar
M. S. S. Rahman and A. Ozcan,
“Time-lapse image classification using a diffractive neural network,”
(2022). Google Scholar
B. Bai et al.,
“To image, or not to image: class-specific diffractive cameras with all-optical erasure of undesired objects,”
eLight, 2
(1), 14 https://doi.org/10.1186/s43593-022-00021-3
(2022).
Google Scholar
B. Bai et al.,
“Data-class-specific all-optical transformations and encryption,”
Adv. Mater., 35
(31), 2212091 https://doi.org/10.1002/adma.202212091 ADVMEW 0935-9648
(2023).
Google Scholar
Y. Gao et al.,
“Multiple-image encryption and hiding with an optical diffractive neural network,”
Opt. Commun., 463 125476 https://doi.org/10.1016/j.optcom.2020.125476 OPCOB8 0030-4018
(2020).
Google Scholar
Y. Su et al.,
“Optical image conversion and encryption based on structured light illumination and a diffractive neural network,”
Appl. Opt., 62
(23), 6131
–6139 https://doi.org/10.1364/AO.495542 APOPAI 0003-6935
(2023).
Google Scholar
D. Mengu and A. Ozcan,
“All-optical phase recovery: diffractive computing for quantitative phase imaging,”
Adv. Opt. Mater., 10
(15), 2200281 https://doi.org/10.1002/adom.202200281 2195-1071
(2022).
Google Scholar
C.-Y. Shen et al.,
“Multispectral quantitative phase imaging using a diffractive optical network,”
Adv. Intell. Syst., 5
(11), 2300300 https://doi.org/10.1002/aisy.202300300
(2023).
Google Scholar
T. Yan et al.,
“Fourier-space diffractive deep neural network,”
Phys. Rev. Lett., 123
(2), 023901 https://doi.org/10.1103/PhysRevLett.123.023901 PRLTAO 0031-9007
(2019).
Google Scholar
B. Bai et al.,
“Pyramid diffractive optical networks for unidirectional magnification and demagnification,”
(2023). Google Scholar
E. Goi, S. Schoenhardt and M. Gu,
“Direct retrieval of Zernike-based pupil functions using integrated diffractive deep neural networks,”
Nat. Commun., 13
(1), 7531 https://doi.org/10.1038/s41467-022-35349-4 NCAOBW 2041-1723
(2022).
Google Scholar
Z. Huang et al.,
“All-optical signal processing of vortex beams with diffractive deep neural networks,”
Phys. Rev. Appl., 15
(1), 014037 https://doi.org/10.1103/PhysRevApplied.15.014037 PRAHB2 2331-7019
(2021).
Google Scholar
X. Luo et al.,
“Metasurface-enabled on-chip multiplexed diffractive neural networks in the visible,”
Light Sci. Appl., 11
(1), 158 https://doi.org/10.1038/s41377-022-00844-2
(2022).
Google Scholar
O. Kulce et al.,
“All-optical information-processing capacity of diffractive surfaces,”
Light Sci. Appl., 10
(1), 25 https://doi.org/10.1038/s41377-020-00439-9
(2021).
Google Scholar
M. S. S. Rahman et al.,
“Universal linear intensity transformations using spatially incoherent diffractive processors,”
Light Sci. Appl., 12
(1), 195 https://doi.org/10.1038/s41377-023-01234-y
(2023).
Google Scholar
B. H. Soffer et al.,
“Programmable real-time incoherent matrix multiplier for optical processing,”
Appl. Opt., 25
(14), 2295
–2305 https://doi.org/10.1364/AO.25.002295 APOPAI 0003-6935
(1986).
Google Scholar
W. Swindell,
“A noncoherent optical analog image processor,”
Appl. Opt., 9
(11), 2459
–2469 https://doi.org/10.1364/AO.9.002459 APOPAI 0003-6935
(1970).
Google Scholar
J. W. Goodman and L. M. Woody,
“Method for performing complex-valued linear operations on complex-valued data using incoherent light,”
Appl. Opt., 16
(10), 2611
–2612 https://doi.org/10.1364/AO.16.002611 APOPAI 0003-6935
(1977).
Google Scholar
W. Schneider and W. Fink,
“Incoherent optical matrix multiplication,”
Opt. Acta Int. J. Opt., 22
(11), 879
–889 https://doi.org/10.1080/713818991
(1975).
Google Scholar
A. R. Dias,
“Incoherent optical matrix-matrix multiplier,”
(1981). Google Scholar
Y. Li et al.,
“Universal polarization transformations: spatial programming of polarization scattering matrices using a deep learning-designed diffractive polarization transformer,”
Adv. Mater., 35
(51), 2303395 https://doi.org/10.1002/adma.202303395 ADVMEW 0935-9648
(2023).
Google Scholar
J. Li et al.,
“Massively parallel universal linear transformations using a wavelength-multiplexed diffractive optical network,”
Adv. Photonics, 5
(1), 016003 https://doi.org/10.1117/1.AP.5.1.016003 AOPAC7 1943-8206
(2023).
Google Scholar
D. Mengu et al.,
“Misalignment resilient diffractive optical networks,”
Nanophotonics, 9
(13), 4207
–4219 https://doi.org/10.1515/nanoph-2020-0291
(2020).
Google Scholar
D. Mengu, Y. Rivenson and A. Ozcan,
“Scale-, shift-, and rotation-invariant diffractive optical networks,”
ACS Photonics, 8
(1), 324
–334 https://doi.org/10.1021/acsphotonics.0c01583
(2021).
Google Scholar
R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press(
(2012). Google Scholar
J. W. Goodman, Introduction to Fourier Optics, W. H. Freeman(
(2005). Google Scholar
T. Kozacki and K. Falaggis,
“Angular spectrum-based wave-propagation method with compact space bandwidth for large propagation distances,”
Opt. Lett., 40
(14), 3420
–3423 https://doi.org/10.1364/OL.40.003420 OPLEDP 0146-9592
(2015).
Google Scholar
I. Loshchilov and F. Hutter,
“Decoupled weight decay regularization,”
(2019). Google Scholar
A. Paszke et al.,
“PyTorch: an imperative style, high-performance deep learning library,”
in Adv. Neural Inf. Process. Syst.,
(2019). Google Scholar
G. Cohen et al.,
“EMNIST: extending MNIST to handwritten letters,”
in Int. Joint Conf. Neural Netw. (IJCNN),
2921
–2926
(2017). https://doi.org/10.1109/IJCNN.2017.7966217 Google Scholar
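As a concrete illustration of the entropy evaluation described in Sec. 4.8, the following sketch computes the entropy separately for the real and imaginary parts of a complex image from a normalized pixel-value histogram. It assumes the standard Shannon entropy with a base-2 logarithm; the 256-bin histogram, function names, and example image are illustrative assumptions, not taken from the text:

```python
import numpy as np

def entropy_part(values, bins=256):
    """Shannon entropy of a normalized pixel-value histogram.

    Assumes S = -sum_p P(p) * log2 P(p), where P(p) is the
    normalized histogram count for pixel value p.
    """
    counts, _ = np.histogram(values, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]  # by convention, 0 * log(0) contributes 0
    return -np.sum(p * np.log2(p))

def complex_image_entropy(img):
    """Entropy computed separately for the real and imaginary parts."""
    return entropy_part(img.real.ravel()), entropy_part(img.imag.ravel())

# Example: a complex image whose real and imaginary parts are
# independent random patterns (standing in for two EMNIST images).
rng = np.random.default_rng(0)
img = rng.random((28, 28)) + 1j * rng.random((28, 28))
s_re, s_im = complex_image_entropy(img)
```

With 256 bins, each per-part entropy is bounded by 8 bits; a stronger encryption scheme pushes the output histograms toward this uniform-distribution limit.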