Lensless coded-aperture imaging with separable Doubly-Toeplitz masks

Abstract. In certain imaging applications, conventional lens technology is constrained by the lack of materials which can effectively focus the radiation within a reasonable weight and volume. One solution is to use coded apertures—opaque plates perforated with multiple pinhole-like openings. If the openings are arranged in an appropriate pattern, then the images can be decoded and a clear image computed. Recently, computational imaging and the search for a means of producing programmable software-defined optics have revived interest in coded apertures. The former state-of-the-art masks, modified uniformly redundant arrays (MURAs), are effective for compact objects against uniform backgrounds, but have substantial drawbacks for extended scenes: (1) MURAs present an inherently ill-posed inversion problem that is unmanageable for large images, and (2) they are susceptible to diffraction: a diffracted MURA is no longer a MURA. We present a new class of coded apertures, separable Doubly-Toeplitz masks, which are efficiently decodable even for very large images—orders of magnitude faster than MURAs, and which remain decodable when diffracted. We implemented the masks using programmable spatial-light-modulators. Imaging experiments confirmed the effectiveness of separable Doubly-Toeplitz masks—images collected in natural light of extended outdoor scenes are rendered clearly.


Introduction
Coded-aperture imaging was introduced by Dicke 1 and Ables 2 in the 1960s, and the field developed rapidly for astronomical x-ray and gamma-ray imaging.Coded-aperture imagers extend the pinhole camera concept, which requires no lenses, has unlimited depth of focus, and can image radiation of any wavelength.By placing a large number of pinholes in a common aperture plane, the light-gathering capability is greatly increased.Improved sensitivity comes at the price of having the focal plane record a multiplex of overlapping images, requiring algorithmic reconstruction to render a clear image.Thus, coded-aperture imaging is a subset of the general field of computational imaging.Some limitations of a random pinhole distribution, primarily numerical instability due to the existence of zeros in the inverse-mask Fourier-domain transfer function, were mitigated with the development of coded apertures with uniformly redundant arrays (URAs) 3 and modified URAs (MURAs) 4 of pinholes.By placing the pinholes in prime-number-based patterns determined by sampling theory, it was thought that the near-delta-function system point-spread functions (PSFs) could be achieved (after deconvolving the distribution with appropriate filter functions), greatly improving the image reconstruction.
Use of coded-aperture imaging for general terrestrial applications poses two challenges.First, strictly speaking, the information-theoretic advantages of URAs and MURAs apply only for radiation wavelengths short enough that diffraction is negligible, since a diffracted MURA is no longer a MURA.Thus, coded-aperture imaging was applied mainly for x-ray and gamma-ray astronomies.Second, MURAs have limited effectiveness for extended scenes in general, regardless of radiation wavelength, due to the fundamentally ill-posed nature of reconstructing an extended scene from a finite-sized coded-aperture image.Thus, exploiting the potential of coded-aperture imaging requires addressing both diffraction and ill-posed reconstruction with very fast and efficient decoding methods.The work presented in this paper shows how to accomplish that goal.

Objective: Lensless Imaging of Extended Scenes with Natural Illumination
In the astronomical applications for which coded aperture imaging was originally developed, the scene typically contains a discrete bright object in a largely dark field.A key assumption behind MURA encoding is that "all [nonzero] object points contribute a complete cyclic version of the basic [aperture] pattern," 3 and the imaging geometry is designed to ensure that the complete aperture "shadows" are collected on the focal-plane array.However, terrestrial applications involve nonzero scene elements extending to and past the edge of the effective field-of-view: these scene elements will cast only partial aperture shadows on the focal plane.This effectively causes more object points to contribute to the image than there are image pixels, creating an inherently ill-posed problem for reconstructing the "true" scene.
To address the ill-posed imaging problem, we developed a new class of coded aperture masks which provide compelling advantages over MURAs.The new class of masks, which we call separable Doubly-Toeplitz apertures, provides images that are orders of magnitude faster to decode and are less affected by the ill-posed nature of the imaging of extended scenes-their transfer functions have far fewer zero eigenvalues than do those of MURA masks.Unlike MURAs, separable Doubly-Toeplitz apertures retain their properties when diffracted, making them much more robust for broadband imaging than MURAs.
This paper addresses the optics and mathematics of the new class of coded apertures.
The method has also been validated with real-world imaging of extended scenes in natural light.Such validations are absolutely necessary-much of the published work based on simulations alone presents overly optimistic predictions of coded-aperture performance.Thus, we provide a demonstration of coded-aperture imaging in visible light acquired with off-the-shelf hardware.This work also uncovered several optical phenomena that are minor nuisances for lensed imaging, but causes significant performance degradation for coded apertures, and would not have been uncovered by modeling and simulation alone.A future publication will provide in-depth exposition of the practical optical challenges for developing software-defined optics.

Former State-of-the-Art: Fourier Decoding and Modified Uniformly Redundant Array Masks
The coded-aperture mask is a flat plane parallel to an imaging array.In this configuration, the image I is just a correlation function between the object O and the mask M: or in terms of matrix elements The goal of decoding is to use knowledge of M to recover the object scene O from the encoded image I.
The prior state-of-the-art in coded-aperture imaging was to select binary codes which could be inverted via forward correlation with slightly modified versions of themselves.Specifically, mask codes M were selected for which the masks are simply a collection of binary (0% or 100% transparent) elements and for which the inverse matrix G, such that was easily derivable from M itself.For example, for URA 3 masks, URAs and MURAs satisfy Eq. ( 3) under periodic boundary conditions.In practice, diffraction is not always negligible.An example of an ideal mask is shown in Fig. 1, along with a diffracted mask image.In this case, the MURA was lithographically deposited as a pattern on a glass slide, so that it was truly binary, either passing or blocking light.However, the collected mask image shows effects of scattering and diffraction, compromising the self-inverse property.In addition, as is explained below, even ideal MURAs have limited utility for extended scenes.
Because diffracted MURAs do not obey Eq. ( 3), inversion of the transfer function is required.The most straightforward approach for inversion of Eq. ( 1) is by Fourier transform.Our initial investigations using spatial-light-modulator (SLM)defined masks showed noisy low-contrast recovered images.We successively eliminated the causes such as diffraction, polarization, and nonuniform SLM-element throughput.Finally, we traced the performance limitations to the illposed mathematics inherent in imaging extended scenes.The larger the mask, the more ill-posed the inversion, thus requiring regularization.
Initially, we implemented regularized decoding by introducing a real, positive function c 2 ðkÞ into the Fouriertransform-based decoding and computing an estimated image from This is a special case of the Tikhonov 5 regularization method used in image deblurring 6 -c 2 ðkÞ plays the dual roles of regularizing the inversion and controlling the effects of noise.With this regularization, recognizable decodings of laboratory images were achieved with both lithographically produced and programmable-SLM-based masks.
2 New Approach to Ill-Posed Coded-Aperture Decoding: Theory Even with regularized Fourier inversion, decoded images of extended scenes were usually unrecognizable.We traced this to the Fourier method itself, which has the limitation of implicitly assuming cyclic or periodic behavior of the object O.While the cyclic assumption is acceptable for compact objects against dark backgrounds, it is inappropriate for extended scenes viewed through large aperture masks.
To relax the assumption of cyclic boundaries, we recast the decoding problem in the most general terms.The correlation of a filter M with an object scene O to yield an image I in Eq. ( 1) can also be expressed as a simple matrix multiplication: where I1 and O1 are the one-dimensional (1-D) vectors created by concatenating the rows of I and O: where C I and C O are the numbers of columns in I and O, respectively.The matrix M 0 0 has rows that are shifted copies of the first row, which is constructed from the filter M by embedding each row of M into a segment of length C O , padded with zeros for elements The size of the matrix M 0 0 grows as the sizes of the image and the filter mask grow.If the image has dimensions r I × c I and the filter mask has dimensions r F × c F , then the sampled object is of size The relationship between image size, filter size, and the extent of the object area sampled illustrates the inherently illposed nature of any imaging system.Since the PSF of a realoptical system is never a delta-function, there are always more object points contributing to an image than there are points in the image itself.For conventional imaging systems, where the PSF may be very compact, it is possible to clip or "throw-away" points near a reconstructed image's edges and to retain only those which represent an approximately wellposed inversion.If significant diffraction is present, then the clipping strategy is inadequate-hence, the emphasis is on the diffraction limit in conventional optics.
For coded-aperture imaging, the filter area (and the spread function) is directly proportional to the light-gathering power of the system.Because of this proportionality of spread to light-gathering, it is not possible to use a clipping strategy to select well-posed portions of the recovered image, except in the trivial case of a single-pinhole aperture.
Thus coded-aperture imaging must address inherently illposed encoding and decoding.
While the forward correlation can be done exactly using Eq. ( 6), the inversion (estimating O from the measurements I) presents a seriously ill-posed 7 and computationally challenging task.The general case is commonly solved via the regularized normal equations: where 1 r O ×c O is the identity matrix with each dimension measuring The operations count for Eq. ( 9) is on the order of 8 For 1 megapixel images, this yields 10 18 operations, requiring on the order of 10 6 s to 12 days on a teraflops processor.Thus, computations on large images using generalized masks are usually impractical.Further, the matrix M 0 0 is so large that the accumulation of round-off errors will likely swamp the inversion even if it could be accomplished in a manageable time.The only exception would be if the encoding mask is a single-pixel pinhole aperture-a well-posed but trivial case.

Separable Doubly-Toeplitz Masks
We have developed a method to specify finite-sized masks that greatly reduce the degree to which the imaging is illposed and render the inversions of Eq. ( 9) tractable.If the mask M can be expressed as an outer product of two 1-D vectors A and B, with lengths N A and N B , respectively, then M is an N A × N B -dimensional matrix of the form Mði; jÞ ¼ AðiÞBðjÞ; (12)   where AðiÞ is the i'th element of vector A, and BðjÞ is the j'th element of vector B. Equation ( 9) can now be written simply as a product of much smaller two-dimensional (2-D) matrices: The matrices M A and M B are Toeplitz, with the elements shifted by 1 pixel on each successive row: The vector M B is similarly constructed from the elements of the vector B.
The difficulty of inversion using Eq. ( 13) is vastly less than for solving Eq. ( 9).While this is still an ill-posed problem, the sizes of matrices M A and M B are much smaller than M 0 0 in Eq. ( 9).With the matrix dimensions summarized in Table 1, the inversions require an operations count on the order of For image sizes on the order of 1 megapixel, this yields on the order of 2 × 10 9 operations, requiring just milliseconds on a teraflops processor.Thus, unlike arbitrary general masks or diffracted MURAs, separable Doubly-Toeplitz masks can be inverted in a usefully short time.
To our knowledge, this Doubly-Toeplitz strategy for specifying coded apertures has not been previously applied to the spatial imaging problem, though there are hints of 1-D Toeplitz masks in the temporal motion-deblurring work of Agrawal and Raskar. 9,10Diffraction Effects on Doubly-Toeplitz Masks Having identified an approach to reduce the impact of ill-posed inversions, the investigations next turned to mitigating the effects of diffraction.Simulations verified that the Doubly-Toeplitz property is preserved in a diffracted mask.Simulations also provide a means of synthesizing decoding matrices representing diffracted mask images for use when good laboratory-collected mask images are not available.
Diffraction simulations are particularly straightforward for a collection of identical rectangular apertures placed at points on a regular-rectangular-grid SLM.Let I O and I P denote the intensity fields at the object and image planes, respectively, and let E O denote the complex optical field at the object plane.When the illumination is incoherent, the diffracted intensity in a pixel at location ði; jÞ on the image plane is where and The parameters in Eqs. ( 16)-(18) are ða X ; a Y Þ = aperture opening sizes, ðp X ; p Y Þ = aperture pitch, >ða X ; a Y Þ, ðΔx P ; Δy P Þ = image-plane pixel pitch, A, B = vectors of weights of mask elements in the x and y directions, R = range from aperture mask to object plane, f P = range from aperture mask to image plane, and λ = wavelength of light.
Diffraction is a significant contributor to extended-scene coded-aperture imaging in visible light.Our experimental investigations were conducted with a bandpass filter centered on 550 nm, and a coded aperture implemented by varying the throughput of elements of a transmissive SLM.The SLM had elements with dimensions of approximately 24 μm × 31 μm on a 36-μm pitch.The total area of active pixels measured 1024 × 768 pixels, corresponding to a maximum aperture of approximately 27 mm × 36 mm.The distance from the aperture to the focal plane was typically 65 mm.With these parameter values, the Fresnel number for the coded aperture optics ranged from under 0.02 (with a single SLM element used as a pinhole) to over 40,000.Because the objective of coded-aperture imaging is to capture the advantages of pinhole imaging while maximizing light-gathering capability, many SLM elements were set to the "on" state, so that our experiments were conducted predominantly at Fresnel numbers between 150 and 40,000.We simulated the Fresnel diffraction with a custom-written numerical code developed for coded-aperture imaging and validated by comparison with laboratory results using both programmable and fixed apertures.Results of one of these experiments are shown in Fig. 2. Within the experimental uncertainties caused by random variations of the transmissivity of the SLM elements and the finite resolution of the focal plane, the agreement is very good.In future papers, we plan to present the effects of diffraction, noise, and variable transmissivity on the quality of coded-aperture images via objective and subjective metrics such as peak signal-tonoise ratio and structural similarity.
Examination of Eq. ( 16) shows that, if the raw aperture mask is an outer product of two 1-D vectors A and B, then the effects of the diffracted mask can still be expressed as an operation of a pair of Toeplitz matrices.Thus, a separable Doubly-Toeplitz aperture mask will yield a separable Doubly-Toeplitz encoding even when there is significant diffraction.If the object is an incoherent point-like source, then its image will be decomposable into A and B vectors which can be used to synthesize decoding masks.
To verify the diffraction model, we conducted a series of experiments comparing simulations with image data, obtaining excellent agreement between model and data.Diffracted coded apertures were also modeled and compared with "collected masks"-images of a point-source viewed through the spatial-coding masks.We verified that the collected masks decompose into the outer products of diffracted 1-D functions, in accord with Eq. ( 16).The mask shown in Fig. 3 is the basic "workhorse" mask used in the investigations presented in this paper.Three scaled versions of this mask were used: "02" is 2.2-mm wide, "04" is 4.4mm wide, and "08" is 8.8-mm wide.The "02" mask spans 62 elements of our SLM, with a minimum feature size of two SLM elements.The SLM elements had a nominal 35.5-μm pitch.The pitches were identical along the axes of the SLM, but the aperture openings were asymmetric-24-μm high and 31-μm wide, accounting for the asymmetry of the diffraction pattern.The 31-element base mask is adapted from the one published by Agrawal and Raskar 10 for temporal encoding and decoding to compensate for motion blur.In a future publication, we will report on methods for efficient searches for masks optimized specifically for 2-D spatial coded-aperture imaging.Figure 3 also demonstrates that the diffraction increases the size of the effective encoding masks, increasing the illposedness of the inverse problem.This makes separable Doubly-Toeplitz masks even more valuable for cases in which fine imaging resolution is required.

Solution: Method
In systems, noise will always arise from effects such as photon counting or read noise, so that the actual inverse problem includes a noise term N. The matrix I δ denotes the image containing noise effects which are not encoded by the aperture mask.In addition, there is the illposed nature of the problem embodied in the nonsquare matrices M A and M B .Both the noise and the ill-posedness require a regularization method.We begin by multiplying Eq. ( 13) by the transposes of the mask matrices, obtaining the normal equation: The matrix products ðM A T M A Þ and ðM B T M B Þ are square and symmetric, but are not invertible, due to degenerated zero eigenvalues.The degree of degeneracy increases with the size of the filter masks-there are (r O − r I ) zero eigenvalues for M A , and (c O − c I ) zero eigenvalues for M B .
[Compare this with solving Eq. ( 9) for a MURA or generic mask, which is orders of magnitude more degenerate, having (r O c O − r I c I ) zero eigenvalues.] To relieve the degeneracies, we introduce regularization parameters α A and α B , so that  The reconstructed images initially show patterns at their edges within a distance of one mask width (or height) of the border.We noted similar artifacts in the encoded images of uniform (all white) scenes, which led us to a method by which the artifacts could be normalized out of the reconstructed images.The normalization is accomplished by encoding and decoding a uniform all-white object W to create a matrix W Est that includes the spurious brightness differences accrued as the coded-aperture mask straddles the edge of the image area: O Est is divided by W Est , element-by-element, to obtain a normalized object estimate O Norm : The normalization can also be expressed as multiplication by two normalization matrices D A and D B : This is because the all-white matrix W can be expressed as an outer product of two 1-D vectors of constant value, which in turn allows the normalization estimate to be decomposed into an outer product of two 1-D matrices: The normalization matrices D A and D B each have identical columns, given by The decoded and normalized object estimates are then

Improvement via Landweber Iteration
The results can be further improved by iterating 11 from the solution in Eq. ( 27).We settled on the method of Landweber, 12 which iterates using a regularization constant ε L : where P A and P B are the positive-definite symmetric matrices specified to control the convergence.While classic Landweber implementation uses P A ¼ P B ¼ 1 (the identity matrix), our implementation uses where S A and S B are the inverses: and The starting estimate O ð0Þ is estimated from Eq. ( 27), then iterated using with Equation ( 31) is actually a modified Landweber method 13 that is guaranteed to converge as long as 0 < ε 2 L < 2. However, the convergent solution is not guaranteed to be correct-the results may be noise dominated unless the iterations stop at a suitable point.We apply an a posteriori criterion based on the residuals after n iterations: with norm kr ðnÞ k 2 ¼ X i;j ½R ðnÞ ði; jÞ 2 : (34) The algorithm stops the iterations when the residuals from Eq. (34) begin to increase.
6 Encoding and Decoding Real-Extended Imageries The refined methods of coding and decoding made it possible to image extended outdoor scenes in natural light.We used these images to further improve and debug the processing.
One experimental setup is shown in Fig. 4. The camera was situated in a commercial high-rise building in Honolulu, Hawaii, approximately 45 m above the ground, looking east toward the city skyline.The SLM was a Holoeye HEO 0017 LC-SLM with a 1024 × 768-element active area.This was paired with a Kodak RMV 16000 digital camera.The RMV 16000 was chosen because its focal plane area of 36 mm × 24 mm is close in size to the 33 mm × 25 mm dimensions of the Holoeye SLM and because the RMV pixel pitch of 7.4 μm allows oversampling of the nominal 35.5-μm pitch of the SLM elements, reducing sampling as a source of error.The SLM was placed 6.5 cm in front of the camera's focal plane.
In addition to the programmable SLM, the camera had a green-bandpass filter to reject optical wavelengths that are not adequately blocked by the "off" pixels of the SLM.We also developed methods to deal with "bad" or "dead" pixels, inhomogeneous focal plane response, fixed-pattern noise, and with stray light reflecting inside the camera.Most of the effects are minor nuisances for conventional lens-assisted imaging, but cause significant degradation of coded-aperture decoding quality.This is because the decoding algorithms attempt to invert the image, artifacts and all, to construct an object scene.Careful reduction of all nonencoded-light is required to achieve satisfactory results.

Encoding and Decoding Real-Extended Imagery: Results
The details of the various artifacts and the associated mitigation algorithms will be presented in a future publication.
The basic steps to precondition the images then to decode them are outlined in Fig. 5.The denoising step can be either single-frame denoising or denoising by adding multiple raw images before decoding.Stray-light interference limited the effectiveness of single-frame denoising.This was mitigated via a 20-frame average, with the raw images averaged together before application of flat, dark, and defect corrections.

Multimask Image Stacking
Our investigations showed that the regularization artifacts differ from mask to mask.Thus, reconstruction could be improved by using multiple images collected with a sequence of different masks.We tested this approach using four 90-deg rotations of the same basic mask.The four images were decoded separately and stacked to provide an improved image.Results from simply averaging the four resulting decoded images are compared with single-mask imagery in Fig. 6 and also compared with an image acquired with a conventional lensed camera.

Conclusions
Our goal was to improve the performance of completely lensless imagers, especially for imaging extended scenes in natural light.We have demonstrated a novel approach, separable Doubly-Toeplitz masks, for specifying coded-aperture masks that allow efficient decoding and have shown the feasibility of using programmable SLMs to implement the new class of masks.This work presents a fundamental change in coded-aperture imaging and lays the foundation for future developments in which we will quantify the performance of the separable Doubly-Toeplitz masks for a greater variety of scenes with both programmable and fixed masks, exploring the limitations of the undistorted field-of-view achievable, as well as extending the applicability to bands beyond the visible spectrum.In particular, it has been suggested that the diffraction could limit the performance of coded apertures for infrared imaging. 14The work presented here shows that the diffraction can be readily included in the decoding algorithms for separable Doubly-Toeplitz masks.More serious limiters, especially for the long-wave infrared, are likely to be the influence of nonencoded information from thermal selfemission of the camera body and drifts in the nonuniformity corrections for uncooled thermal-camera focal planes.Future work will also explore how the separable-Doubly-Toeplitz masks can be combined with refractive elements, extending the super-resolution capabilities studied under the DARPA LACOSTE program. 15he project has also identified other implementation-specific difficulties that need to be addressed in using SLMs for lensless imaging.Since it is still early in the development of the technology, rapid progress can be made.Once the devicespecific sources of noise and error are mitigated, the next biggest improvement will come from developing a computationally efficient means for identifying mask candidates.Rather than generate random binary codes, then exhaustively test each one, a more-efficient approach is likely to be a genetic algorithm which evolves mask codes from the best currently known starting points.
Since Doubly-Toeplitz masks are continuous, having no isolated open sections, self-supporting nonprogrammable masks can be built in to the fuselages and carapaces of small unmanned vehicles with three-dimensional printing methods, reducing the time and cost of fielding new systems.If fabricated from material that blocks x-ray and gammaray radiations, then this would in turn enable extremely compact noncontact imaging in these wavelengths for medicine, defense, and homeland security.

Fig. 1
Fig. 1 Example of a MURA: (a) ideal binary mask, and (b) image showing effects of diffraction.The collected mask image is no longer binary and exhibits numerous diffraction features.

Fig. 2
Fig.2Images of a point source viewed through a single-large square aperture synthesized by setting a 100-pixel-square area of SLM pixels to the "on" state.Vertical cuts through the images illustrate the diffraction features.The agreement is very good, despite noise and the element-to-element variability of the SLM.

Fig. 3
Fig. 3 (a) Binary mask commanded on the SLM, and (b) the image of a point source viewed through the mask, showing Fresnel diffraction features.

Fig. 4
Fig. 4 Equipment for imaging outdoor scenes.This view looks from downtown Honolulu eastward toward Diamondhead mountain.

Fig. 5
Fig.5Processing flow for coded-aperture image acquisition and decoding.Raw coded-aperture imagery is unrecognizable-processing yields a clear image.Data conditioning reduces numerous sources of unencoded noise and interference, so that the decoding algorithms can compute a recognizable image.

Fig. 6
Fig. 6 Final results (b) of processing real imagery with multiple masks (four 90-deg rotations of the "02" mask), versus using a single mask (a), showing reduction of decoding "ghost" artifacts.(c) The reference photo of the scene acquired with a conventional camera.(The clouds are different because the images were acquired at different times.)

Table 1
Relationship of dimensions of the collected image, the encoding filter mask, and the object area contributing to the image.
)where 1 R O is an identity matrix with dimensions R O , and 1 C O is an identity matrix with dimensions C O .Equation (20) now multiplies O by invertible matrices, so that a noisy estimate of O can be derived from the image: