Review of snapshot spectral imaging technologies

Abstract. Within the field of spectral imaging, the vast majority of instruments used are scanning devices. Recently, several snapshot spectral imaging systems have become commercially available, providing new functionality for users and opening up the field to a wide array of new applications. A comprehensive survey of the available snapshot technologies is provided, and an attempt has been made to show how the new capabilities of snapshot approaches can be fully utilized.


Introduction
Spectral imaging sensors sample the spectral irradiance I(x, y, λ) of a scene and thus collect a three-dimensional (3-D) dataset typically called a datacube (see Fig. 1). Since datacubes are of a higher dimensionality than the two-dimensional (2-D) detector arrays currently available, system designers must resort to either measuring time-sequential 2-D slices of the cube or simultaneously measuring all elements of the datacube by dividing it into multiple 2-D elements that can be recombined into a cube in postprocessing. These two techniques are described here as scanning and snapshot.
The use of imaging spectrometers was rare before the arrival of 2-D CCD arrays in the 1980s, but steadily grew as detector technology advanced. Over the following 30 years, better optical designs, improved electronics, and advanced manufacturing have all contributed to improving performance by over an order of magnitude. But the underlying optical technology has not really changed. Modified forms of the classic Czerny-Turner, Offner, and Michelson spectrometer layouts remain standard. Snapshot spectral imagers, on the other hand, use optical designs that differ greatly from these standard forms in order to provide a boost in light collection capacity by up to three orders of magnitude. In the discussion below, we provide what we believe is the first overview of snapshot spectral imaging implementations. After providing background and definitions of terms, we present a historical survey of the field and summarize each individual measurement technique. The variety of instruments available can be a source of confusion, so we use our direct experience with a number of these technologies [computed tomography imaging spectrometer (CTIS), coded aperture snapshot spectral imager (CASSI), multiaperture filtered camera (MAFC), image mapping spectrometry (IMS), snapshot hyperspectral imaging Fourier transform (SHIFT) spectrometer, and multispectral Sagnac interferometer (MSI), each described in Sec. 4 below] to provide comparisons among them, listing some of their advantages and disadvantages.

Definitions and Background
The field of spectral imaging is plagued with inconsistent use of terminology, beginning with the field's name itself. One often finds spectral imaging, imaging spectrometry (or imaging spectroscopy), hyperspectral imaging, and multispectral imaging used almost interchangeably. Some authors make a distinction between systems with few versus many spectral bands (spectral imaging versus imaging spectrometry), or with contiguous versus spaced spectral bands (hyperspectral versus multispectral imaging). In the discussion below, we use spectral imaging to refer simply to any measurement attempting to obtain an I(x, y, λ) datacube of a scene, in which the spectral dimension is sampled by more than three elements. In addition, we use the term snapshot as a synonym for nonscanning, i.e., systems in which the entire dataset is obtained during a single detector integration period. Thus, while snapshot systems can often offer much higher light collection efficiency than equivalent scanning instruments, snapshot by itself does not mean high throughput if the system architecture includes spatial and/or spectral filters. When describing a scene as dynamic or static, rather than specifying the rate of change in absolute units for each case, we simply mean to say that a dynamic scene is one that shows significant spatial and/or spectral change during the measurement period of the instrument, whether that period is a microsecond or an hour. Since snapshot does not by itself imply fast, a dynamic scene can blur the image obtained using either a snapshot or a scanning device, the difference being that whereas motion induces blur in a snapshot system, in a scanning system, it induces artifacts. In principle, blurring and artifacts are on a similar footing, but in practice one finds that artifacts prove more difficult to correct in postprocessing.
When describing the various instrument architectures, "pixel" can be used to describe an element of the 2-D detector array or a single spatial location in the datacube (i.e., a vector describing the spectrum at that location). While some authors have tried introducing "spaxel" (spatial element) to describe the latter, 1 this terminology has not caught on, so we simply use "pixel" when describing a spatial location whose spectrum is not of interest, and "point spectrum" when it is. While many authors refer to the spectral elements of a datacube as bands, we use "channel" to refer to individual spectral elements and reserve the use of bands for broad spectral regions [such as the visible band or the longwave IR (LWIR) band]. It is useful to have a terminology for a single horizontal plane of the datacube (the image taken over a single spectral channel), so we refer to this as a "channel image". A single element of the datacube is referred to as a "voxel" (volume element). When describing the dimensions of the datacube, we use N_x, N_y, and N_w as the number of sample elements along the (x, y) spatial and λ spectral axes, respectively, so that the total number of datacube elements is given by N_x N_y N_w. We try to avoid describing the datacube in terms of resolution, since many authors use this to refer to the number of sampling elements, while others refer to it as the number of resolvable elements. At the Nyquist sampling rate, these two differ by a factor of two, but if one always refers to the number of samples, then the meaning is clear. One can also note that it is problematic to talk about a single value for resolution when discussing computational sensors, since the number of resolvable elements for these systems varies with the scene: some scenes are easier to resolve than others.
For time-resolved (video imaging) systems, the data dimensions assume the form (N_x, N_y, N_w, N_t), where N_t is the number of frames captured during a video sequence. We refer to this dataset as a "hypercube".
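The terminology above maps directly onto array indexing. The following is a minimal sketch, assuming a datacube stored as nested Python lists indexed as cube[w][y][x] (a storage order chosen here only for illustration) and hypothetical dimension values. It shows how a voxel, a point spectrum, and a channel image are each extracted, and how the datacube and hypercube sizes follow from N_x, N_y, N_w, and N_t.

```python
# Hypothetical datacube dimensions (not taken from any instrument in the text).
N_X, N_Y, N_W, N_T = 4, 3, 5, 2

# Fill the cube with values that encode their own (w, y, x) coordinates,
# so that each extracted element is easy to recognize.
cube = [[[100 * w + 10 * y + x for x in range(N_X)]
         for y in range(N_Y)]
        for w in range(N_W)]


def voxel(cube, x, y, w):
    """A single element of the datacube."""
    return cube[w][y][x]


def point_spectrum(cube, x, y):
    """The spectrum (N_w samples) at one spatial location."""
    return [plane[y][x] for plane in cube]


def channel_image(cube, w):
    """The N_y-by-N_x image taken over a single spectral channel."""
    return cube[w]


n_voxels = N_X * N_Y * N_W      # total number of datacube elements
n_hypercube = n_voxels * N_T    # elements in an N_t-frame video sequence
```

Counting samples in this way, rather than resolvable elements, keeps the meaning of the dimensions unambiguous.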
The amount of light collected by a given instrument is an important quantity, so we often refer to a given sensor's throughput or efficiency. Whereas the former refers specifically to the AΩ product (or "étendue"), the latter is a ratio of the sensor's throughput with respect to a reference sensor that can collect light from the entire datacube during the full measurement period and that also has ideal quantum efficiency. Whereas the ideal instrument for any given application always has an efficiency of 1, its throughput varies with different applications.
Also, many authors make a distinction between spectrometer, spectrograph, and spectroscope, with distinctions varying among researchers. We make no distinction here and generally stick to using spectrometer, except where this clashes with a given field's nomenclature.

Historical Overview
As in so much of optics, one can trace the beginnings of spectral imaging back into the nineteenth century, where one finds the astronomer P. J. C. Janssen using a double-slit monochromator to view the solar corona. 2,3 The double-slit monochromator (at the time termed a spectroscope, or in this case a spectrohelioscope) was the only means of obtaining narrow-band spectra, and an image was obtained by spinning the device rapidly while viewing the exit slit with the eye. By adjusting the relative position of the exit slit with respect to the prism dispersion, one could thus view the same scene at different wavelengths. Although an important 4 and clever setup, it was regarded by other researchers as clumsy. 5 Some three decades later, Fabry and Perot developed their interferometric filter, which for the first time allowed astronomers to both view a full scene over a narrow spectral band and tune the filter wavelength. 6-8 The tunable filter thus represented an important advance, giving scientists access to information that was previously difficult to obtain. This allowed them to build (x, y, λ) representations of the object in view, albeit laboriously.
An additional advantage of the Fabry-Perot interferometer was that it allowed a much higher light throughput than the double-slit monochromator, enabling users to view dimmer objects. This opened up a number of discoveries, but, as a spectral imager, it still suffered from two problems that simple cameras do not: motion artifacts and poor light collection efficiency. These two issues have plagued the field ever since. In order to overcome these problems, astronomers began looking for nonscanning instruments that could obtain the full 3-D dataset in a single measurement period: snapshot spectral imagers. The first published example of a snapshot spectral imaging system was not far behind. Bowen developed his image slicer 9 by placing a series of tilted mirrors to slice the image into thin strips and then translate each strip to form a single long slit. Walraven later took this concept and created a design that was easier to manufacture, using only a thick glass plate (with a 45 deg angle cut into one end) cemented to a wedge-cut prism. 10 The resulting device is shown in Fig. 2. During this development, in 1958 Kapany introduced the concept of placing a coherent fiber bundle at the image plane and then reformatting the fiber output into a long thin line for easy adaptation to one or more slit spectrometers. 17 But it appears not to have been implemented until 1980, when such a device was developed for the Canada-France-Hawaii telescope (CFHT) on Mauna Kea. 18 Adaptations by other astronomers soon followed. 19,20 A third snapshot spectral imaging technique was developed by Courtes in 1960, in which a lenslet array is used to create an array of demagnified pupils. 21,22 These pupils fill only a small portion of the available space at the image, so with the proper geometry one can reimage them through a disperser to fill in the unused space with dispersed spectra.
This concept was adapted to the CFHT and data began to be collected in 1987, producing datacubes with N_x × N_y = 271 spatial elements and up to N_w = 2200 spectral samples. 23 These three techniques for snapshot spectral imaging (image slicing, fiber reformatting, and lenslet-based pupil array dispersion) are now widely described in astronomy as integral field spectroscopy (IFS), so we label the three techniques here as IFS-M, IFS-F, and IFS-L. The integral spectroscopy nomenclature appears to originate from the dissertation work of Chabbal, 24 but the first publication in which one can find it used in the title is Ref. 25.
Outside of astronomy, one finds similar uses of slit spectrometers and tunable filter cameras, but the vast majority of uses did not require imaging. These include measurements such as mapping the spectrum of atomic emission/absorption lines, measuring the chemical composition of the atmosphere, 26 and measuring the quantity of physical compounds. 27 Outside of astronomy, the coupling of imaging with spectrometry did not take hold until much later, with the beginning of airborne remote sensing. Here spectral imaging was first used for agricultural assessment and management around 1966 (Refs. 28 and 29) and received a large boost with the launch of the Landsat remote sensing satellite in 1972. 30 With the launch of Landsat also came the development of field-portable spectral imaging instruments needed to calibrate Landsat's measurements. As spectral imaging became more widely used, researchers faced the same obstacles of scanning artifacts and poor light throughput as those faced by the astronomers, so that they too began to explore new methods, leading to a variety of new instruments. As single-pixel detectors gave way to linear and then 2-D detector arrays, system design options expanded, allowing for new approaches. The first of these new approaches derived from the natural idea of using multiple beamsplitters, in which the beam is split into independent spectral channels, with each channel directed to an independent camera. While this was a common choice for imaging in three spectral bands, especially for wideband measurements (such as visible plus near-IR), it did not take hold for more than four spectral channels. Hindrances included the difficulty of making large beamsplitters of high-enough quality and the limited ability to reduce the bulk and weight of the resulting system. With the increasing availability of thin-film filters, another natural choice involved using an array of filters coupled to an array of lenses. 
This, too, did not progress far, perhaps because of the difficulty of making large arrays of lenses with sufficient quality and correction for parallax effects. (Ref. 31 comments that the first good thin-film filters became available in the late 1930s. So we can expect that they did not become commercially available until the 1940s or 1950s.) With the realization of advanced computers, the option of a computational sensing 32 approach became feasible. The first of these new approaches was a computational sensor later named the CTIS. This device used a 2-D disperser to project a spectrally dispersed scene directly onto a detector array, allowing the spectral and spatial elements to multiplex. (One can experience the exact same thing by donning one of the many diffraction grating glasses that are given out at baseball games and fireworks displays.) The resulting data on the detector array are equivalent to tomographic projections of the datacube taken at multiple view angles, so that tomographic reconstruction techniques can be used to estimate the 3-D datacube from the set of 2-D projections. A compressive sensing approach 33 to snapshot spectral imaging was also developed, promising to allow snapshot measurement of datacubes with more voxels than there are pixels on the detector array. 34 As of this writing, there is still a significant amount of research that needs to be done with these computational techniques, as they have not yet shown performance that can compete with their noncomputational counterparts. 35,36

Scanning Imaging Spectrometer Architectures
In parallel with the development of snapshot methods, scanning techniques for spectral imaging also advanced. The development of linear digital sensor arrays allowed researchers to collect light from a number of voxels simultaneously.
This involved imaging a linear spatial region through a sequential set of spectral filters or, more commonly, by collecting light over a full set of spectral channels while imaging one spatial location and then scanning in two dimensions over the full field of view (a point scanning spectrometer). However, unless used with very wide spectral channels (such as on the Landsat satellites), the poor light collection of these devices often made it difficult to use these instruments outside the lab.
Once 2-D detector arrays became available in the 1980s, these were rapidly adopted in imaging spectrometers, providing a large boost in light collection capacity. 37 For the first time, researchers had available systems that could collect light emitted by thousands (and eventually millions) of voxels simultaneously. As array detectors advanced in size and performance, instrument designers took advantage of the new capability by increasing spatial and spectral resolution. Using a configuration in which the 2-D detector array is mapped to a spectral plus one-dimensional spatial (x, λ) dataset, known as a pushbroom spectrometer, made possible the first imaging spectrometers without moving or electrically tuned parts. If the sensor is placed on a steadily moving platform such as a high-flying aircraft or an Earth-observing satellite, or if the object moves on a constant-motion conveyor, the second spatial dimension of the cube is obtained simply by scene motion across the instrument's entrance slit. The removal of moving parts greatly improved the robustness of these devices and reduced their overall size, allowing for imaging spectrometry to develop into a mainstream technology for Earth-observing remote sensing. Other architectures are also available for scanning systems, as summarized below. A good review of these architectures and their relative SNRs is Ref. 38.

Point Scanning Spectrometer
The input spectrum is dispersed across a line of detector elements, allowing for very fast readout rates. The scene is scanned across the instrument's input aperture with the use of two galvo mirrors (or just one if the instrument platform is itself moving), allowing for collection of a full 3-D datacube.

Pushbroom Spectrometer
The input aperture is a long slit whose image is dispersed across a 2-D detector array, so that all points along a line in the scene are sampled simultaneously. To fill out the spatial dimension orthogonal to the slit, the scene is scanned across the entrance aperture. This can take the form of objects moving along a conveyor belt, the ground moving underneath an airborne or spaceborne platform, or the scene scanned across the entrance slit by a galvo mirror.

Tunable Filter Camera
A tunable filter camera uses either a mechanically exchanged filter set (such as a filter wheel) or a tunable filter, such as a mechanically tuned Fabry-Perot etalon, 42,43 a liquid-crystal tunable filter (LCTF), 44 or an acousto-optic tunable filter (AOTF). 45 Response/switching times of the various approaches range from ∼1 s for the filter wheel, to 50 to 500 ms for the LCTF and mechanically tuned Fabry-Perot, and to 10 to 50 μs for the AOTF.
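The switching times quoted above set a floor on how quickly a tunable filter camera can step through a datacube. The following is a rough sketch under stated assumptions: one exposure per spectral channel, one filter switch per channel, and no overlap between switching and integration. The channel count and exposure time are hypothetical; the switching times are the upper ends of the ranges given above.

```python
# Switching times in seconds (upper ends of the ranges quoted in the text).
SWITCH_TIME_S = {
    "filter wheel": 1.0,   # ~1 s
    "LCTF": 0.5,           # 50 to 500 ms
    "AOTF": 50e-6,         # 10 to 50 microseconds
}


def cube_acquisition_time(n_channels, exposure_s, switch_s):
    """Time to record all channels when each costs one exposure plus one switch."""
    return n_channels * (exposure_s + switch_s)


def switching_overhead(exposure_s, switch_s):
    """Fraction of the acquisition time spent switching rather than integrating."""
    return switch_s / (exposure_s + switch_s)


# Hypothetical measurement: 30 spectral channels at 10 ms exposure each.
N_CHANNELS, EXPOSURE_S = 30, 0.010
times = {name: cube_acquisition_time(N_CHANNELS, EXPOSURE_S, t)
         for name, t in SWITCH_TIME_S.items()}
overheads = {name: switching_overhead(EXPOSURE_S, t)
             for name, t in SWITCH_TIME_S.items()}
```

Under these assumptions the filter wheel and LCTF spend most of the acquisition switching, while for the AOTF the switching time is negligible next to the exposure.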

Imaging Fourier Transform Spectrometer
An imaging Fourier transform spectrometer scans one mirror of a Michelson interferometer in order to obtain measurements at multiple optical path difference (OPD) values, the Fourier-domain equivalent of a tunable filter camera. 46,47 A more recent alternative method here is the birefringent Fourier-transform imaging spectrometer developed by Harvey and Fletcher-Holmes, which has the advantage of being less vibration sensitive due to its common path layout. 48
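To illustrate how a spectrum is recovered from measurements at multiple OPD values, the following sketch simulates the interferogram seen by a single pixel and takes its discrete Fourier transform. It assumes an ideal two-beam interferometer, uniformly spaced OPD samples, and a synthetic source made of two monochromatic lines; the function names, line positions, and intensities are illustrative choices, not values from any instrument discussed here.

```python
import cmath
import math


def interferogram(lines, n_samples, opd_step):
    """Ideal two-beam interferogram sampled at OPD values n * opd_step.

    `lines` is a list of (wavenumber, intensity) pairs; each monochromatic
    line contributes intensity * (1 + cos(2*pi*wavenumber*OPD)).
    """
    return [sum(a * (1.0 + math.cos(2.0 * math.pi * s * n * opd_step))
                for s, a in lines)
            for n in range(n_samples)]


def spectrum_from_interferogram(samples):
    """Magnitude of the DFT of the mean-removed interferogram (positive bins)."""
    n = len(samples)
    mean = sum(samples) / n
    ac = [v - mean for v in samples]  # remove the constant (unmodulated) term
    return [abs(sum(ac[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n)))
            for k in range(n // 2)]


# Two synthetic lines placed exactly on DFT bins 8 and 20 of a 64-sample scan.
N, STEP = 64, 1.0
LINES = [(8 / 64, 1.0), (20 / 64, 0.5)]
spec = spectrum_from_interferogram(interferogram(LINES, N, STEP))
peaks = sorted(range(len(spec)), key=lambda k: spec[k], reverse=True)[:2]
```

In an imaging instrument the same transform would be applied independently to the interferogram recorded at every pixel of the detector array.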

Computed Tomography Hyperspectral Imaging Spectrometer
This is a scanning device closely related to the CTIS snapshot technique mentioned above and has the advantage of having a conventional disperser design and of being able to collect many more projection angles so that the reconstructed datacube has fewer artifacts. Its main disadvantage is that the detector is not used efficiently in comparison to alternative methods. 49
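The notion of a tomographic projection of the datacube, shared by CTIS and this scanning relative, can be sketched as a shear-and-sum operation: each channel image is shifted in proportion to its channel index and the shifted images are added on the detector. This is a deliberately simplified model, assuming integer-pixel shifts along one spatial axis only and ignoring diffraction efficiency and optical blur; the function name and the cube[w][y][x] storage order are illustrative assumptions.

```python
def sheared_projection(cube, shift_per_channel):
    """Project cube[w][y][x] onto a 2-D image after shifting channel w
    by w * shift_per_channel columns (a shear in the x-wavelength plane)."""
    n_w, n_y, n_x = len(cube), len(cube[0]), len(cube[0][0])
    max_shift = abs(shift_per_channel) * (n_w - 1)
    width = n_x + max_shift
    # Offset so that negative shears still land on non-negative columns.
    offset = max_shift if shift_per_channel < 0 else 0
    image = [[0.0] * width for _ in range(n_y)]
    for w in range(n_w):
        dx = offset + w * shift_per_channel
        for y in range(n_y):
            for x in range(n_x):
                image[y][x + dx] += cube[w][y][x]
    return image


# Tiny example cube: 3 channels, 1 row, 2 columns.
example_cube = [[[1, 2]], [[10, 20]], [[100, 200]]]
undispersed = sheared_projection(example_cube, 0)    # all channels overlap
sheared_pos = sheared_projection(example_cube, 1)    # one "view angle"
sheared_neg = sheared_projection(example_cube, -1)   # the opposite view angle
```

Each shear value plays the role of one view angle; a reconstruction algorithm would then estimate the cube from a set of such projections, and collecting more of them is what allows the scanning variant to reduce artifacts.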

Coded Aperture Line-Imaging Spectrometer
Although coded aperture spectrometry began as a method of scanning a coded aperture across the entrance slit of a conventional dispersive spectrometer, in Refs. 50 and 51 it was adapted to modern 2-D detector arrays, allowing for improved SNR at the cost of using larger pixel count detector arrays.
Each of these architectures uses passive measurement methods, in which no control is required over the illumination in order to resolve the incident light spectrum. Several of these techniques can be used in reverse to produce spectrally encoded illumination systems. In this approach, the object is illuminated with a well-defined spectral pattern, such that the imaging side no longer needs to have spectral resolution capability. For example, one can illuminate a scene with a broadband light source transmitted through a tunable filter and time the image acquisition to coincide with steps in the filter wavelength to produce a datacube of light emitted or reflected by the scene.
Finally, when comparing scanning and snapshot devices, we can note that the division between the two is not as black and white as one might expect. For example, designers have produced sensor architectures that mix both snapshot and scanning techniques, so that the number of scans required to gather a complete dataset is significantly reduced. The earliest example of this of which we are aware (although it seems likely that astronomers had tried this well before this time), is a patent by Busch, 52 where the author illustrates a method for coupling multiple optical fibers such that each fiber is mapped to its own entrance slit within a dispersive spectrometer's field of view. More recently, we can find examples such as Chakrabarti et al., who describe a grating spectrometer in which the entrance slit is actually four separate slits simultaneously imaged by the system. 53 The respective slits are spaced apart such that the dispersed spectra do not overlap at the image plane. This setup can be used to improve light collection by a factor of four, at the expense of either increasing the detector size or reducing the spectral resolution by a factor of four. Ocean Optics' SpectroCam is another example of a mixed approach, in which a spinning filter disk is combined with a pixel-level spectral filter array (more detail on the latter is given in Sec. 4.8) to improve the speed of multispectral image acquisition.
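The multislit trade-off described above is simple bookkeeping, sketched here under the assumption that spectral resolution scales with the number of detector columns available to each slit's spectrum. The function name and the detector sizes are hypothetical.

```python
def multislit_tradeoff(n_slits, detector_columns, channels_single_slit):
    """Options for fitting n_slits non-overlapping spectra on one detector.

    Returns the relative light collection, the detector width needed to keep
    the single-slit spectral sampling, and the spectral sampling per slit
    if the detector width is held fixed instead.
    """
    return {
        "light_collection_gain": n_slits,
        "columns_needed_at_full_sampling": n_slits * channels_single_slit,
        "channels_at_fixed_detector": detector_columns // n_slits,
    }


# Hypothetical case: four slits on a detector 1024 columns wide.
four_slit = multislit_tradeoff(4, 1024, 1024)
```

With four slits, one either needs a detector four times wider or accepts one quarter of the spectral samples per slit, in exchange for four times the light collection.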
In addition, the fact that a given instrument is snapshot does not in itself imply that the device is fast. Scanning devices can have very short measurement times, and snapshot devices can potentially have very long ones. The essential difference is that snapshot devices collect data during a single detector integration period, and whether this is short or long depends on the application. For large-format snapshot spectral imagers in particular, the frame readout rate can also be rather long in comparison to the exposure time, so that a video sequence (or hypercube) can be time-aliased due to poor sampling if the two rates are not forced to be better matched.

Snapshot Spectral Imaging Technologies
Before attempting a comparison of existing methods for performing snapshot imaging spectrometry, we go through the menagerie of instrument architectures and attempt to summarize the basic measurement principles of each one in turn. Previous surveys (such as Refs. 54 and 55) have focused on instruments developed for astronomy, while we attempt here to discuss all of the techniques of which we are aware. In order to describe the various instruments, there is an inevitable flood of acronyms. A selection of these is summarized in Table 1, together with the section discussing each, the figure illustrating a typical system layout, and the first reference to the technique.

Integral Field Spectrometry with Faceted Mirrors (IFS-M, 1938)
In astronomy, the most common approaches to snapshot imaging spectrometry are the integral field techniques (based on mirror arrays, fiber arrays, and lenslet arrays: IFS-M, IFS-F, and IFS-L), so called because each individual measurement of a datacube voxel results from integrating over a region of the field (the object). IFS-M was the first snapshot spectral imaging technology to emerge, beginning with Bowen's image slicer. As originally conceived, this device was both difficult to manufacture (due to precision alignment of an array of small mirrors) and offered only a modest gain of 5× over existing slit spectrometers. In 1972, Walraven found a way to modify the design to use a prism-coupled plane parallel plate in place of the mirror array. 10,16 This made the device much easier to align and assemble, but it was still not very widely used, partly because its use was primarily limited to slow beams with f-numbers >30. 65 It was not until the "3-D" instrument was completed on the CFH telescope that image-slicer-type IFS could point to results that were difficult to obtain with existing instruments but that were readily obtained with the 3-D: performing spectroscopy and mapping the spatial distributions of dim extended objects (such as nebulae or distant galaxies). 15 Figure 3 shows a view of the image slicer subsystem for an IFS-M. The 3-D instrument design, however, had several limitations: in order to keep the pupils separated from one another, the image slicer's facet tilts had to be large, thus inducing some defocus at the ends of the facet. Furthermore, it was difficult to optimize the volume to increase the number of facets and therefore the number of spatial elements in the datacube. In 1997, Content showed that allowing the microfacets to have curvature greatly reduces these constraints, allowing the designer to reduce the facet tilts and to place the pupils closer together, an approach he named "advanced imaging slicing." 66,67 While adding curvature eases system design constraints, it substantially complicates manufacture of the slicing mirror, and researchers began a period of investigating how to improve manufacturing techniques. 68 Because of its all-mirror approach, the IFS-M technique is well suited to measurements in the IR. It was also known that although scatter induced by manufacturing artifacts, and by diffraction at facet edges, was a serious problem at visible wavelengths, it was much less so in the near-IR and shortwave IR, and the image slicing method has been shown to excel in these spectral bands. 69 As confidence in manufacturing techniques increased, Content later introduced the concept of microslicing (or IFS-μ), a technique that combines design elements of IFS-M with IFS-L. This enabled one to measure many more spatial elements in the datacube, at the expense of reduced spectral sampling. 70 The basic idea of microslicing is to use the same slicing mirror as IFS-M, but with larger facets. The slicer allows the various strips in the image to be physically separated, and each is then passed through an anamorphic relay, such that one axis is stretched. This gives some extra space so that further down the optical path, the stretched image is relayed through an anamorphic lens array, which simultaneously unstretches and samples the image. The result is a format in which the various spatial elements of the image take the form of narrow strips that are sufficiently separated at the detector array that they can be spectrally dispersed without adjacent strips overlapping. The concept requires a mechanically elaborate design, but promises to achieve an impressive datacube size of 1200 × 1200 × 600 or 1500 × 1500 × 200. 70

Integral Field Spectrometry with Coherent Fiber Bundles (IFS-F, 1958)
With the invention of coherent fiber bundles, it was quickly realized that one can image through the circular bundle on one end and squeeze the other end into a long thin line (creating what Kapany describes as a light funnel) to fit into the long entrance slit of a dispersive spectrometer. 17 It was quite some time after this, however, that a working instrument was deployed. Instead, one first finds examples in the literature of systems that manipulate single fibers to positions within the field of view, for example, placing one fiber at each of a number of stars within the image. 71 Rather than spectral imaging per se, this is better termed multiobject spectroscopy. Allowing true spectral imaging through a fiber bundle required first overcoming several hurdles, the first of which was manufacturability. The process of reformatting the exit face of the fiber bundle into a long slit generally produced a lot of broken fibers, and replacing them is laborious. Another drawback with the use of fibers was that it could be quite difficult to couple light efficiently into them, so that a significant amount of light was lost at the incident face of the fiber. Moreover, if the cladding of the fibers is insufficiently thick, the fibers show significant crosstalk, which quickly degrades the measurement quality. Finally, earlier fiber-based systems were also restricted by the limited spectral range transmitted by available fibers. Improvements in manufacturing and assembly techniques have steadily reduced each of these problems. The use of multimode fibers increased coupling efficiency, 72 as did the use of lenslets registered to each fiber so that the light is focused onto the fiber cores and not in regions with poor coupling efficiency. (Lee et al. mention that these fiber-coupling lenslets should be used at f/4 to f/6. 73) In addition, multimode fibers are also more robust, so that it became easier to format them into a line without breaking.
With these and other advances, fiber-based IFS systems were successfully deployed on telescopes. 74 Figure 4 shows an example layout for an IFS-F system. A drawback with the fiber-based approach, it was learned, was that the light emerging from the exit face of the fiber was always faster (lower f-number) than it was on the input face, a phenomenon that was termed focal ratio degradation. 19,69,75 Another phenomenon that was discovered when operating fiber-based systems at high spectral resolution was modal noise. 76,77 In the midst of all of these developments, it was realized that the fiber-based approach also allowed astronomers to do away with the free-space dispersive spectrometers that have been the mainstay of so much of astronomical spectroscopy. If one can make use of components developed in the photonic industry, which has optimized miniaturization and manufacturing efficiency, then it should be possible to minimize instrument size and cost in modern astronomical telescopes. 78 Since photonic components are designed primarily for single-mode inputs, this approach first required the development of a device that can convert a single multimode input into a series of single-mode outputs, which has been called a photonic lantern. 79 One of the great advantages of this integrated photonic spectrograph approach 80 is that it also makes it possible to use high-precision optical Bragg filters to filter out unwanted light from the atmosphere (often called OH suppression). 81 In parallel with the developments in astronomy, advances in the technology also made it possible to build a commercial spectral imager based on the same concepts. The first nonastronomical instrument of which we are aware is that of Matsuoka et al., which delivered datacubes of 10 × 10 × 151 dimensions at 0.2 fps. 82 This was followed by systems developed by other research groups.
83-87

Integral Field Spectroscopy with Lenslet Arrays (IFS-L, 1960)
The first discussion of using lenslet arrays for integral field spectroscopy appears to be that of Courtes in 1960, in which he proposes to use a lenslet array placed at the telescope's focal plane. Such a configuration generates an array of pupil images, each mapped to a different field position. 21,88 (The first review of this concept in English appears to be Meaburn's in 1975, in what he calls an insect-eye Fabry-Perot spectrograph. 89) This is the basis of the lenslet-based integral field approach. The lenslet array, placed near the image plane, creates a series of pupil images, one for each lenslet. Since the lenslets are at or near an image plane, each pupil image comprises all of the light integrated across the field positions corresponding to the spatial extent of the lenslet. The advantage here is that one can allow the lenslets to create faster beams (with lower f-number) than the original input, so that void space is created between the pupil images. One can then take advantage of this space by dispersing the light across it, allowing for detection of the spectrum. Figure 5 shows an example layout for an IFS-L system. The modern form of the IFS-L was first presented by Courtes in 1980, 22,23 but the first published data from an instrument did not follow until 1988, when Courtes et al. presented a system providing datacube dimensions of 44 × 35 × 580. 90 As the concept spread, a number of other astronomers began creating designs for different telescopes. 73,91,92 Borrowing from terminology used in fiber-based integral field spectrometry, one difficulty with the lenslet approach is focal ratio degradation. The beam behind the lenslet array must have a smaller f-number than the beam in front of it, placing higher étendue requirements on the backend optics. One way of mitigating this issue is to use pinholes in place of or in tandem with the lenslet array.
93 The tradeoff in doing this, of course, is that by spatial filtering one is reducing the system's light throughput. In fact, if one replaces the lenslets with pinholes (so that one is sampling field positions rather than integrating across them), then the light throughput of the system becomes no better than a scanning approach.
While the IFS-L technique began in astronomy, its success brought it outside notice, and it was eventually adapted to other spectral imaging applications, with the first publication demonstrating a system achieving datacube dimensions of 180 × 180 × 20, measured at 30 fps and f∕1.8 using a 1280 × 1024 CCD. 94,95

Multispectral Beamsplitting (MSBS, 1978)
The idea of using multiple beamsplitters for color imaging has been around for quite some time. 56 In this setup, three cemented beamsplitter cubes split incident light into three color bands, with each band observed by independent cameras [see Figs. 6(a) and 6(b)]. 96 While one can change the beamsplitter designs to adjust the measured spectral bands, it is not easy to divide the incident light into more than four beams without compromising the system performance. (Murakami et al., for example, limit themselves to four beamsplitters and attempt to use filters to increase the number of spectral channels. 97) Thus, four spectral channels appear to be the practical limit of this approach. A closely related method is to use thin-film filters instead of the bulkier beamsplitter cubes/prisms to split the light, 98 but this approach is still probably limited to about five or six spectral channels due to space limitations and cumulative transmission losses through successive filters. The space limitation can be overcome by using a single stack of tilted spectral filters operating in double-pass, which allows for the entire set of spectral images to be collected on a single detector array. 99,100 (This is an approach we have previously termed filter stack spectral decomposition. 101) Although more compact than the previous methods, since the filters are now operating in double-pass mode, transmission losses are doubled as well, so this method is limited to $N_w < 6$.
A fifth implementation is to perform spectral splitting with a volume holographic element (VHE). Matchett et al. have shown that they can manufacture a VHE to split an incident beam in three, with each of the three spectrally filtered beams reflected at different angles. 102 The VHE has the advantage of being a compact element with good reflection efficiency over a reasonable range of field angles. But it appears to be difficult to design the VHE for more than three channels. For example, Matchett et al. divided the system pupil in four, using a different VHE for each of the four sections, in order to produce a system that can measure 12 spectral channels. Matchett et al. state that this system has achieved 60 to 75% throughput across the visible spectrum.

Computed Tomography Imaging Spectrometry (CTIS, 1991)
As with every other snapshot spectral imaging technology, CTIS can be regarded as a generalization of a scanning approach, in this case a slit spectrometer. If one opens wide the slit of a standard slit spectrometer, spectral resolution suffers in that spatial and spectral variations across the width of the slit become mixed at the detector. However, if instead of a linear disperser one uses a 2-D dispersion pattern, then the mixing of spatial and spectral data can be made to vary at different positions on the detector. This allows tomographic reconstruction techniques to be used to estimate the datacube from its multiple projections at different view angles. Figure 7 shows the CTIS system layout. The CTIS concept was invented by Okamoto and Yamaguchi 57 in 1991 and independently by Bulygin and Vishnyakov 103 in 1991/1992, and was soon further developed by Descour, who also discovered CTIS's missing cone problem. 35,104,105 The instrument was further developed by using a custom-designed kinoform disperser and for use in the IR bands. 106 The first high-resolution CTIS, however, was not available until 2001, providing a 203 × 203 × 55 datacube on a 2048 × 2048 CCD camera. 107 Although the CTIS layout is almost invariably shown using a transmissive disperser, Johnson et al. successfully demonstrated a reflective design in 2005. 108 A major advantage of the CTIS approach is that the system layout can be made quite compact, but a major disadvantage has been the difficulty in manufacturing the kinoform dispersing elements. Moreover, since its inception, CTIS has had to deal with problems surrounding its computational complexity, calibration difficulty, and measurement artifacts. These form a common theme among many computational sensors, and the gap they create between ideal measurement and field measurements forms the difference between a research instrument and a commercializable one. While CTIS has shown a lot of progress on bridging this gap, it has not shown the ability to achieve a performance level sufficient for widespread use.

Multiaperture Filtered Camera (MAFC, 1994)
An MAFC uses an array of imaging elements, such as an array of cameras or a monolithic lenslet array, with a different filter placed at each element in order to collect portions of the full spectral band (see Fig. 8). The first MAFC implementation of which we are aware is the Fourier transform spectrometer approach by Hirai et al. 58 Surprisingly, it was not until 2004 that we found an implementation like that shown in Fig. 8(a), by Shogenji et al., 109 after which we find other research groups following the same approach. 110,111 This layout uses an array of lenses coregistered to an array of spectral filters, with the entire set coupled to a monolithic detector array. (Note that the SHIFT system described in Sec. 4.12 uses a similar, but filterless, Fourier-domain approach.) Another approach was first suggested by Levoy 113 and implemented by Horstmeyer et al. 114 This involves adapting a light field camera with a pupil plane filter: a lenslet array is placed at the objective lens' image plane, so that the detector array lies at a pupil plane (as imaged by the lenslets). The image behind each lenslet is an image of the filter array, modulated by the scene's average spectral distribution across the lenslet. While more complex and less compact than the Shogenji design, this second layout has the distinct advantage of being able to use a variety of objective lenses, so that zooming, refocusing, and changing focal lengths are easier to achieve. Mitchell and Stone developed a similar technique in which a linear variable filter is placed at the lenslet plane. 115 While the MAFC is arguably the most conceptually simple approach to multispectral imaging, it does place some requirements on the scene's light distribution in order to work well. For finite-conjugate imaging, it requires that the object irradiance is reasonably uniform in angle, so that the individual filters and lenslets sample approximately the same relative light distribution as do all of their counterparts. Specular objects are thus problematic at finite conjugates, and angular variations in irradiance will be mistakenly measured as spectral variations.

Tunable Echelle Imager (TEI, 2000)
The tunable echelle imager (TEI) can be considered as a modification of an echelle spectrometer to allow imaging. To make this possible, a Fabry-Perot etalon is placed into the optical train so that the input spectrum is sampled by the Fabry-Perot's periodic transmission pattern. 59,116 This produces gaps in the spatial pattern of the dispersed spectrum, allowing one to fill the gaps with a 2-D image (see Fig. 9). The light transmitted by the etalon is passed into a cross-disperser (for example, a grating whose dispersion is out of the plane of the page as shown in Fig. 9) and then into an in-plane disperser. The result is a characteristic 2-D echelle dispersion pattern, where the pattern is no longer composed of continuous stripes, but rather a series of individual images, each one of which is a monochromatic slice of the datacube (i.e., an image at an individual spectral channel). Under the assumption that the spectrum is smooth (i.e., bandlimited to the sampling rate of the instrument), this achieves a snapshot measurement of the datacube. However, the main tradeoff is that the system throughput is quite low: not only does the etalon reflect most of the input light, but the crossed-grating format is also inefficient. Moreover, for cases in which the object's spectrum does not satisfy the bandlimit assumptions, the measurements are prone to severe aliasing unless scanning is used to measure the gaps in the spectral data.
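The throughput penalty of the etalon can be illustrated with a minimal sketch. It assumes an ideal lossless Fabry-Perot described by the standard Airy transmission function and a spectrally flat input; the mirror reflectance of 0.9 and the function name `airy_transmission` are illustrative assumptions, not parameters of the instruments cited above.

```python
import numpy as np

def airy_transmission(phase, reflectance):
    """Transmission of an ideal lossless etalon versus round-trip phase (radians)."""
    finesse_coeff = 4.0 * reflectance / (1.0 - reflectance) ** 2
    return 1.0 / (1.0 + finesse_coeff * np.sin(phase / 2.0) ** 2)

R = 0.9  # assumed mirror reflectance (hypothetical value)

# Sample exactly one free spectral range of round-trip phase.
phase = np.linspace(0.0, 2.0 * np.pi, 100_000, endpoint=False)
T = airy_transmission(phase, R)

# Fraction of a spectrally flat input that the etalon transmits.
mean_T = T.mean()
# Analytic average of the Airy function over one period.
closed_form = (1.0 - R) / (1.0 + R)

print(f"mean transmission = {mean_T:.4f}, closed form = {closed_form:.4f}")
```

Under these assumptions the transmission peaks reach unity, but the average over a free spectral range is only (1 - R)/(1 + R), which is why a high-finesse etalon rejects most of a broadband input.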

Spectrally Resolving Detector Arrays (SRDA, 2001)
With the development of Bayer filter array cameras in the late 1970s, it became possible to produce single-pixel-level spectral filtering. 117 The generalization from color imaging to multispectral imaging by increasing the number of filters is a small step (see Fig. 10), and there have been numerous such proposals. 60,97,118-126 The resulting instruments are extremely compact, since all of the spectral filtering is performed at the detection layer, but for several reasons this method has not been widely accepted in the spectral imaging community. The primary reasons are undoubtedly that manufacturing these pixel-level filters is difficult and that each pattern is highly specific, so that one cannot easily adjust the system in order to change spectral range or resolution. The fact that the multispectral approach will generally want to use detector arrays with higher pixel counts exacerbates the manufacturability problem. But another drawback is that, as with any filtering technique, an increase in spectral resolution produces a corresponding loss in light throughput. Although Bayer-type filter-array approaches are compact, convenient, and robust to perturbations such as temperature changes and vibration, they do have the disadvantage of requiring that the image is spatially bandlimited to the Nyquist limit of the filter array (a limit that is typically several times stricter than the Nyquist limit of the underlying pixel array). Without satisfying this assumption, the resulting reconstructed spectral images may show substantial aliasing effects, in which spatial variations in the scene will couple into erroneous spectral variations in the measured datacube. These effects can be minimized by defocusing the image in order to satisfy the bandlimit constraint.
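The coupling of spatial detail into apparent spectral variation can be shown with a toy example. The sketch below assumes a hypothetical 2 × 2 filter tile (four bands) and a spectrally flat scene whose columns alternate bright and dark at the pixel Nyquist limit; the filter passbands themselves are omitted because the scene has the same value in every band, so an ideal instrument would report identical band images.

```python
import numpy as np

k = 2  # assumed mosaic period: a 2 x 2 filter tile, i.e., 4 bands

# Spectrally flat scene with detail at the pixel Nyquist limit:
# columns alternate bright/dark, identically in every band.
scene = np.tile([1.0, 0.0], (8, 4))  # shape (8, 8)

# Filter position (i, j) within the tile only sees every k-th pixel.
band_planes = {(i, j): scene[i::k, j::k] for i in range(k) for j in range(k)}
band_means = {pos: plane.mean() for pos, plane in band_planes.items()}

for pos, mean in band_means.items():
    print(pos, mean)
```

Bands at even column offsets see only the bright columns and bands at odd offsets see only the dark ones, so the flat spectrum is reported as strongly band dependent. Blurring the scene to the mosaic's Nyquist limit (here, a period of at least four pixels) removes the effect.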
The spatial/spectral filter-based approach is one approach toward adapting detector arrays to spectral resolving capability. A number of other approaches are also under development, which do not incorporate filters and thus have the potential for an increased detection efficiency. Although development on spectrally resolved detector arrays with more than three spectral bands has been underway for over 40 years, 127,128 doing this for more than two spectral channels in snapshot detection mode has only been pursued recently. The first steps in this direction involved dual-band focal plane arrays [such as midwave IR (MWIR)/LWIR FPAs], 129,130 but more recently it has involved elements such as cavity-enhanced multispectral photodetectors, 131 elements composed of sandwiched electrodes and multiple detection layers, 132 multilayer quantum-well infrared photodetectors (QWIPs), 133 and transverse field detectors. 134 The cavity-enhanced multispectral photodetector is designed by sandwiching several thin layers of amorphous silicon (used as the detection layers) in a resonance-enhanced cavity. 135 Sun et al. 131 and Wang et al. 136 report on using this approach to measure two narrow spectral bands, one centered at 632 nm and another at 728 nm. Parrein et al. follow a closely related approach in which the detection element consists of layers of thin films sandwiched with multiple transparent collection electrodes. 132 This measurement method combines the use of wavelength-dependent absorption depth with interference filters to create a stack of sensors having strong wavelength-dependent signal collection. The implementation of Ref. 132 so far allows only three spectral channels to be resolved per pixel, but the approach shows promise to allow resolution of more spectral channels. Multilayer QWIPs are an alternative approach that has seen considerable research. 130 Mitra et al., for example, present an IR detector consisting of a stack of multiple quantum well absorbers coupled through a diffractive resonant cavity. 133 So far, this technique has been limited to three spectral channels, though the concept is generalizable to more.
Transverse field detection is a concept recently developed by Longoni et al., 134,137 allowing for depth-resolved detection of absorbed photons within the detection layer. Sensor electrodes spaced along the surface of the detector array are biased to different voltages in order to generate the transverse electric fields needed for each electrode to preferentially collect photocarriers produced at different depths. While this trades off spatial resolution for depth resolution, it provides a flexible method for depth-resolved detection.
In general, for multispectral detection (>3 spectral channels), each of the filterless approaches is still under development, and thus considerable work remains before they can be deployed for practical use.

Image-Replicating Imaging Spectrometer (IRIS, 2003)
Lyot invented his tunable filter in 1938 based on the idea of using polarizers to turn the wavelength dependence of retardation in thick waveplates into a wavelength dependence in transmission. 138 Although the instrument was refined by others to use a different layout 139 and to allow wider fields of view, 140,141 it could never measure more than one spectral channel at once. In 2003, Harvey and Fletcher-Holmes described a generalization of Lyot's filter in which the polarizers are replaced with Wollaston beamsplitting polarizers. 61 By splitting each incident beam in two, this technique allows one to view a second spectral channel in parallel. By incorporating $N$ Wollaston polarizers into the system, one can view $2^N$ scenes simultaneously. The resulting layout, for a setup using three Wollaston polarizers, is shown in Fig. 11. The IRIS approach is an elegant solution that makes a highly efficient use of the detector array pixels. So far, the IRIS approach has only been shown to operate with up to eight spectral bands, 142 and it seems likely that the difficulty of obtaining large-format Wollaston polarizers that have sufficient birefringence and can correct for polarization-dependent chromatic aberrations may limit this approach to about 16 spectral channels. 143

Coded Aperture Snapshot Spectral Imager (CASSI, 2007)
CASSI was the first spectral imager attempting to take advantage of compressive sensing theory for snapshot measurement.
Compressive sensing developed out of the work of Emmanuel Candes, Terence Tao, and David Donoho, typically involving the use of $L_1$-norm reconstruction techniques to reconstruct data that would be termed insufficiently sampled by the Nyquist limit. The phrase is not intended to refer to a broad category of reconstruction algorithms (such as computed tomography) that can sometimes be said to permit compressive measurement.
The concept for CASSI developed from a generalization of coded aperture spectrometry. 34 Coded aperture spectrometers replace the entrance slit of a dispersive spectrometer with a much wider field stop, inside which is inserted a binary-coded mask (typically encoding an S-matrix pattern or a row-doubled Hadamard matrix, 144 see Fig. 12). This mask attempts to create a transmission pattern at each column within the slit such that each column's transmission code is orthogonal to that of every other column. This follows directly from the properties of Hadamard matrices that each column of the matrix is orthogonal to every other column. The encoded light, transmitted by the coded mask within the field stop, is then passed through a standard spectrometer back-end (i.e., collimating lens, disperser, reimaging lens, and detector array). Because the columns of the coded mask are orthogonalizable, when they are smeared together by the disperser and multiplexed on the detector array, they can be demultiplexed during postprocessing. The resulting setup allows the system to collect light over a wide aperture without sacrificing the spectral resolution that one would lose by opening wide the slit of a standard slit spectrometer. The tradeoff is a factor of two in light loss at the coded mask, and in some noise enhancement due to the signal processing.
The theory of orthogonal codes only requires that the light is uniformly distributed in one axis; the other axis can be used for imaging. This is analogous to a slit spectrometer, which can image across its entrance slit. Using an anamorphic objective lens, one can achieve this by imaging the entrance pupil onto the field stop in one axis, while imaging the object onto the field stop along the orthogonal axis. Although one can consult references 145, 146, or 147 for further details on coded aperture spectral imaging, none of these sources mention the requirements for anamorphic front optics needed to achieve line-imaging with snapshot coded aperture spectrometry.
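The multiplexing and demultiplexing steps can be sketched numerically. The example below is a simplified stand-in for the optical system: it treats the coded mask as a cyclic-free S-matrix acting on a short vector of unknown intensities and ignores dispersion, noise, and the anamorphic optics. The helper names `sylvester_hadamard` and `s_matrix`, the size n = 7, and the random test signal are all illustrative assumptions.

```python
import numpy as np

def sylvester_hadamard(order):
    """Hadamard matrix of the given order (must be a power of two)."""
    H = np.array([[1]])
    while H.shape[0] < order:
        H = np.block([[H, H], [H, -H]])
    return H

def s_matrix(n):
    """Binary (0/1) S-matrix of size n, where n + 1 is a power of two."""
    H = sylvester_hadamard(n + 1)
    # Drop the all-ones first row and column; map -1 to open (1), +1 to closed (0).
    return (H[1:, 1:] == -1).astype(float)

n = 7
S = s_matrix(n)

# Closed-form inverse of an S-matrix: (2 / (n + 1)) * (2 S^T - J).
S_inv = (2.0 / (n + 1)) * (2.0 * S.T - np.ones((n, n)))

rng = np.random.default_rng(0)
signal = rng.uniform(0.0, 1.0, size=n)  # hypothetical unknown intensities
measured = S @ signal                    # each reading sums (n + 1) / 2 elements
recovered = S_inv @ measured

print(np.allclose(recovered, signal))
```

Each mask row is open over (n + 1)/2 of its n elements, which corresponds to the roughly factor-of-two light loss noted above, while every detector reading still collects light from about half of the aperture rather than from a single slit position.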
Compressive sensing allows one to take a similar procedure and apply it to snapshot spectral imaging, measuring $(x, y, \lambda)$ in a snapshot and not just $(x, \lambda)$. The primary differences from the slit imaging case are that the aperture code is no longer an orthogonal matrix but a random binary matrix and that the reconstruction algorithm becomes much more complex. The generalization proceeds as follows. If one replaces the anamorphic objective lens with a standard one, and images the object directly onto the coded aperture mask, then the irradiance projected onto the detector array after passing through the disperser will be a mix of spatial and spectral elements of the datacube (see Fig. 12). The spatial-spectral projection at the detector array is modulated by the binary mask in such a way that each wavelength of the datacube experiences a shifted modulation code. If this code satisfies the requirements of compressive sensing, then this is all one needs in order to use compressive sensing reconstruction algorithms to estimate the object datacube. 34 The resulting system layout is not only extremely compact, but also uses only a modest size detector array, so it is capable of imaging at high frame rates. Wagadarikar et al. showed the ability to capture 248 × 248 × 33 datacubes at 30 fps, though postprocessing of the raw data to produce a hypercube (video datacube sequence) consumed many hours of computer time. 148 More recent algorithms are much faster. 36

Figure caption (bottom panel): The pattern on the detector array due to imaging a coded aperture mask through a disperser, for an object that emits only three wavelengths (the wavelengths used in the example image here are the shortest, middle, and longest wavelengths detected by the system).

Though compressive sensing holds out great promise for future instrument development, designers have not yet succeeded in creating an architecture that replicates its basic requirements well. In rough form, one can summarize the requirements as follows. The most basic feature is that the measured object has to be compressible in some space. For example, for an image represented in a wavelet space, it is well known that one can almost always use fewer coefficients than the number of pixels one wishes to reconstruct in the pixel domain; this is a compressible space for a typical image. If one were to measure the object such that each measurement value was the projection of the object onto a basis function in this compressive space, then one would need far fewer measurements than if the system measured all of a datacube's voxels directly. Unfortunately, one generally cannot design an optical system such that the object can be measured in the compressive space directly (such as measuring an object's wavelet coefficients). Compressive sensing, however, provides an alternative that is almost as good. Measurement vectors (we avoid calling them basis functions because they do not satisfy the general definition of a basis) that are composed of columns within a random incoherent measurement matrix have been shown to replicate the properties of measuring in a compressive basis to very high probability. Using this type of measurement matrix, however, comes with some additional requirements.
First, the measurement vectors and the compressible basis functions must be mutually incoherent, which means that any element in one cannot be expressed as a sparse linear combination of elements from the other. 149 One can think of this in rough terms as having measurement vectors highly spread out when expressed in the basis vectors of the chosen compressible space, and vice versa. Also, the measurement vectors must satisfy isotropy, which means that they should have unit variance and are uncorrelated. 150 Orthogonal matrices are one example of systems that satisfy this property.
Once a measurement is completed, the user applies a reconstruction algorithm to estimate the object datacube. The typical procedure for this is for the user to choose the compressible basis in which to work (that is, one must know a priori a basis in which the object is compressible) and apply an algorithm to estimate both the magnitudes of the coefficients and the set of measurement vectors comprising the space described by the product of the sensing and signal compression matrices. The algorithm estimates the coefficients and basis vectors by choosing an object representation that best approximates the actual measurement while penalizing representations that have a larger number of coefficients (i.e., are less sparse). For CASSI, a common choice of basis has been total variation space in the x-y dimensions (i.e., the object's gradient image is assumed to be highly compressible). While it is possible to adapt the reconstruction algorithm to search for an optimal compressible basis (so that the user need not know this a priori), this greatly burdens an already computationally intensive problem.
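The measurement and reconstruction steps can be sketched on a toy problem. The code below is not the algorithm used in the cited CASSI work: it substitutes plain iterative soft thresholding (ISTA) with an $L_1$ penalty applied directly to the voxel values, i.e., it assumes sparsity in the voxel domain rather than in total variation space. The forward model assumes a one-pixel dispersion shift per spectral channel, and the array sizes, penalty weight, and function names are illustrative assumptions.

```python
import numpy as np

def cassi_forward(cube, mask):
    """Mask each spectral slice, shift it by its channel index, and sum."""
    ny, nx, nl = cube.shape
    det = np.zeros((ny, nx + nl - 1))
    for k in range(nl):
        det[:, k:k + nx] += mask * cube[:, :, k]
    return det

def cassi_adjoint(det, mask, nl):
    """Adjoint of cassi_forward: undo each shift and re-apply the mask."""
    ny, nx = mask.shape
    cube = np.empty((ny, nx, nl))
    for k in range(nl):
        cube[:, :, k] = mask * det[:, k:k + nx]
    return cube

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(cube, det, mask, lam):
    resid = cassi_forward(cube, mask) - det
    return 0.5 * np.sum(resid ** 2) + lam * np.sum(np.abs(cube))

def ista(det, mask, nl, lam=0.01, n_iter=200):
    cube = np.zeros(mask.shape + (nl,))
    step = 1.0 / nl  # safe step: the squared operator norm is at most nl
    history = [objective(cube, det, mask, lam)]
    for _ in range(n_iter):
        grad = cassi_adjoint(cassi_forward(cube, mask) - det, mask, nl)
        cube = soft_threshold(cube - step * grad, step * lam)
        history.append(objective(cube, det, mask, lam))
    return cube, history

rng = np.random.default_rng(1)
ny, nx, nl = 8, 8, 4
mask = (rng.random((ny, nx)) < 0.5).astype(float)  # random binary code

# Hypothetical sparse object: a few isolated emitters in the datacube.
truth = np.zeros((ny, nx, nl))
truth[rng.integers(0, ny, 6), rng.integers(0, nx, 6), rng.integers(0, nl, 6)] = 1.0

det = cassi_forward(truth, mask)          # single snapshot: 8 x 11 pixels
estimate, history = ista(det, mask, nl)   # 8 x 8 x 4 estimate from 88 readings
print(history[0], history[-1])
```

The objective combines the data misfit with a penalty on the number and size of nonzero coefficients, which is the tradeoff described above. Voxels blocked by the mask contribute nothing to the measurement, so this toy version cannot recover them; practical reconstructions rely on a sparsifying basis that spreads each coefficient across many voxels.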
The mathematics underlying this measurement approach has seen a rapid advance in the last decade, but implementing its requirements in hardware has been challenging. One of the main obstacles is that, in order to obtain sufficient compression, the feature sizes used to create coded projections are near the scale of the optical resolution. Not only does this mean that characterizing the system measurement matrix requires a great deal of care, but since the compressive sensing reconstruction methods are currently also sensitive to perturbations of the system matrix, the system is prone to artifacts. Because of these issues, subsequent implementations of CASSI have used linear scanning, such that a 640 × 480 × 53 datacube was reconstructed from a set of 24 frames (for a full collection time of about 2 s). 151 In comparison with equivalent scanning instruments, this final result is disappointing, as even the authors admitted (see the concluding remarks of Ref. 36), so it appears that considerable work remains before we can take advantage of the compressive sensing paradigm.

Image Mapping Spectrometry (IMS, 2009)
Image slicing, as accomplished by the IFS-M technique discussed in Sec. 4.1, is best suited for measurements with low spatial and high spectral resolution. For many applications such as microscopy, however, spatial sampling is the more important quantity, and spectral sampling with only 10 to 40 elements is more common. This makes the IFS-M an impractical approach in this field. While the microslicing implementation (IFS-μ) is capable of achieving a much higher spatial sampling, this comes at the cost of a serious increase in system design complexity. An alternative approach is IMS. Like IFS-M, a microfaceted mirror is placed at an image plane. Unlike image slicing, however, many of the mirror facets share the same tilt angle, so that multiple slices of the image are mapped to each individual pupil. The resulting pattern, as seen by the detector array, resembles that of seeing a scene through a picket fence. If there are nine individual pupils in the system, then the spaces between the fence's slats are 1/9th the slat widths (see Fig. 13). One can see only thin slices of the scene, but there are nine facet images on the detector array, each representing the image shifted by 1/9th of a slat width relative to the others. Assembling all nine subimages thus allows one to replicate the original scene. The advantage of obtaining these facet images is that one has separated the elements of the scene enough so that there is space to allow dispersion. By allowing each pupil to be shared among many mirror facets, the system design becomes much more compact and allows for higher spatial resolution.
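The picket-fence mapping and its inversion amount to a strided interleave. The sketch below assumes nine pupils, one-pixel-tall mirror facets, and no dispersion; the array sizes and variable names are illustrative only.

```python
import numpy as np

n_pupils = 9
rng = np.random.default_rng(2)
scene = rng.random((36, 20))  # hypothetical image at the mapper; one row per facet

# Facet r is tilted toward pupil r % n_pupils, so each pupil collects
# every ninth strip of the scene (the view through the picket fence).
subimages = [scene[p::n_pupils, :] for p in range(n_pupils)]

# Reassembly: interleave the strips back into their original rows.
rebuilt = np.empty_like(scene)
for p, sub in enumerate(subimages):
    rebuilt[p::n_pupils, :] = sub

print(np.array_equal(rebuilt, scene))
```

In the instrument, the eight empty rows between consecutive strips of each subimage are where the dispersed spectrum of that strip is recorded.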
The first IMS instrument (called an ISS at the time) provided a 100 × 100 × 25 datacube using a large-format CCD array, 62 but this was later improved to 350 × 350 × 46. 152 As with image slicing (IFS-M), the primary drawback of the IMS is the need for very high precision for cutting the image mapper, which is the central element of the system. Current ultraprecision lathes have advanced to the point where it is possible to make these elements on monolithic substrates, though considerable care is involved.

Snapshot Hyperspectral Imaging Fourier Transform Spectrometer (SHIFT, 2010)
The SHIFT spectrometer 63,153 performs its spectral measurement in the time domain and acquires image information using a division of aperture approach. Conceptually, the idea is an extension of an earlier formulation, the multiple-image Fourier transform spectrometer (MIFTS) developed by Hirai in 1994. 58 However, while the original MIFTS was based on a Michelson interferometer and lens array, the SHIFT spectrometer is based on a pair of birefringent Nomarski prisms behind a lenslet array. As depicted in Fig. 14, an N × M lenslet array images a scene through two linear polarizers surrounding a pair of Nomarski prisms. Thus, N × M subimages are formed on a detector array. Rotating the prisms by a small angle δ relative to the detector array enables each one of the subimages to be exposed to a different OPD. Therefore, a 3-D interferogram cube can be assembled by sequentially extracting each one of the subimages. Fourier transformation, along the OPD axis of the interferogram cube, enables reconstruction of the 3-D datacube. This prism-based design allows for a reduced system volume and an improved robustness to vibration. As with the original MIFTS system, the SHIFT can be considered as the Fourier transform analog of the MAFC, and it offers many of the same advantages, such as compactness. Unlike the MAFC, however, it also offers continuously sampled spectra and is more easily fabricated due to its use of birefringent prisms. On the other hand, it also shares the MAFC's disadvantage of suffering from parallax effects.
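The assembly of the interferogram cube and the transform along its OPD axis can be sketched as follows. The example assumes an 8 × 8 lenslet array, uniformly spaced one-sided OPD samples, perfectly registered subimages, and a hypothetical scene whose two halves each emit at a single wavenumber that falls exactly on an FFT bin.

```python
import numpy as np

n_rows, n_cols = 8, 8        # assumed lenslet array size (N x M)
n_opd = n_rows * n_cols      # one OPD sample per subimage
ny, nx = 4, 4                # pixels per subimage (hypothetical)

# Hypothetical scene: left half emits at wavenumber bin 5, right half at bin 12.
bin_map = np.full((ny, nx), 5)
bin_map[:, nx // 2:] = 12

# Subimage n records the scene modulated by 1 + cos(2*pi*sigma*OPD_n).
n = np.arange(n_opd)
interferogram = 1.0 + np.cos(2.0 * np.pi * bin_map[..., None] * n / n_opd)

# Remove the DC term, then Fourier transform along the OPD axis.
ac = interferogram - interferogram.mean(axis=-1, keepdims=True)
datacube = np.abs(np.fft.rfft(ac, axis=-1))  # shape (ny, nx, n_opd // 2 + 1)

peak_bin = datacube.argmax(axis=-1)
print(peak_bin)
```

The number of spectral samples is set by the number of lenslets, so spectral resolution trades directly against the number of pixels available to each subimage.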

Multispectral Sagnac Interferometer (MSI, 2010)
The MSI 64 is an extension of channeled imaging polarimetry 154 to imaging spectroscopy. The idea was conceptually demonstrated using the MSI depicted in Fig. 15. In this interferometer, incident light is divided by a beamsplitter into two counter-propagating components. The component that was initially reflected by the beamsplitter begins its propagation in the −z direction, where it gets diffracted away from the optical axis by grating $G_2$. Reflection off of mirrors $M_2$ and $M_1$ guides the beam to grating $G_1$, where the beam is diffracted in the opposite direction. The converse path is taken by the component that was initially transmitted by the beamsplitter, thus allowing both beams to exit the interferometer collimated and dispersed laterally. The lateral dispersion induced by the gratings produces a lateral shear S between the emerging wavefronts, where S is linearly dependent on the free-space wavelength λ. Additionally, gratings $G_1$ and $G_2$ are multiple order diffractive structures; i.e., the blaze contains deep grooves that impose more than one wave of OPD. When these high-order sheared beams converge through an objective lens and onto a detector array, a 2-D spatial interference pattern is generated. The spatial frequency of this interference is directly proportional to the diffraction order and, therefore, directly related to the given order's spectral transmission. These interference fringes are measured by the detector array as a superposition of coincident amplitude-modulated spatial carrier frequencies. A 2-D Fourier transformation can be taken of the raw data to window and filter the amplitude-modulated channels in the frequency domain. Inverse Fourier transformation yields the 2-D spatial data that correspond to the unique spectral passbands generated by the gratings.
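The window-and-filter demodulation can be sketched on synthetic data. The example assumes two hypothetical band images that are smooth and positive, carriers that lie along the x axis at FFT bins 10 and 20, unit fringe visibility, and a rectangular frequency window; none of these values come from the instrument described above.

```python
import numpy as np

ny, nx = 16, 64
y, x = np.mgrid[0:ny, 0:nx]

# Two hypothetical band images, smooth and strictly positive.
band_a = 1.0 + 0.5 * np.cos(2 * np.pi * x / nx) * np.cos(2 * np.pi * y / ny)
band_b = 2.0 + 0.5 * np.cos(2 * np.pi * 2 * x / nx)

# Each band is amplitude modulated onto its own spatial carrier along x.
carrier_a, carrier_b = 10, 20  # assumed carrier bins
raw = (band_a * (1 + np.cos(2 * np.pi * carrier_a * x / nx))
       + band_b * (1 + np.cos(2 * np.pi * carrier_b * x / nx)))

def demodulate(frame, carrier_bin, half_width=3):
    """Window one carrier sideband in the 2-D spectrum and invert it."""
    spectrum = np.fft.fft2(frame)
    fx = np.fft.fftfreq(frame.shape[1]) * frame.shape[1]  # integer bins along x
    window = np.abs(fx - carrier_bin) <= half_width
    # The windowed sideband carries half the band amplitude, hence the factor 2.
    return 2.0 * np.abs(np.fft.ifft2(spectrum * window[None, :]))

rec_a = demodulate(raw, carrier_a)
rec_b = demodulate(raw, carrier_b)
print(np.allclose(rec_a, band_a), np.allclose(rec_b, band_b))
```

Recovery is exact here only because each band image is bandlimited to less than the carrier spacing; scene detail finer than the window width would leak between channels.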
This system is essentially a multispectral approach in which many unique spectral slices are measured simultaneously on coincident interference fields. Its advantages include inherent spatial coregistration between the bands while offering simple postprocessing. However, its disadvantages lie in its implementation, namely, the spectral bands must correspond to the grating's orders. Additionally, only one dimension in the Fourier space can be used to modulate spatial and spectral information. Therefore, more work must be done to make this technique a viable competitor to any of the other methods mentioned here.

Technology Comparisons
There are many ways to compare the various snapshot implementations, such as compactness, speed, manufacturability, ease of use, light efficiency, and cost. And while these are all important, different system designers have different opinions about each of these factors, so that any discussion can quickly devolve into an argument. In an attempt to avoid explicitly taking sides, we have opted to compare the various technologies on a more fundamental level: the efficiency with which they make use of their detector elements. Snapshot spectral imagers generally make use of large detector arrays and can push the limits of existing detector technology, so that their efficiency in using detectors correlates closely with other important issues such as compactness, speed, and cost. Allington-Smith 1 has previously termed this metric the specific information density $Q$: the product of the optical efficiency $\eta$ (i.e., average optical transmission times the detector quantum efficiency) with what can be called the detector utilization $\zeta$. The utilization is the number of Nyquist-resolved elements $R$ in the imaging spectrometer datacube divided by the number of detection elements $M$ (pixels) required to Nyquist-sample those voxels. Here $R = R_x R_y R_w$, where $R_x$, $R_y$, $R_w$ denote the datacube resolution elements in the $x$, $y$, and $\lambda$ directions. We modify the definition of $\zeta$ slightly from that of Allington-Smith so that the numerator in $\zeta$ instead represents the number of voxel samples $N$ required to achieve $R$. Thus, for a Nyquist-sampled system, the two definitions for $Q$ differ by a factor of two in each dimension: Allington-Smith's ideal value for $Q$ is 1/8, whereas the ideal value under our definition is $Q = 1$. Letting $M_u$, $M_v$ denote the 2-D detector sampling elements, we have

$$Q = \eta \zeta = \eta \, \frac{N_x N_y N_w}{M_u M_v}$$

for optical efficiency $\eta$. Allington-Smith also obtains specific formulas for $Q$ for each instrument in terms of the system design parameters, such as the aperture diameter and system magnification.
In order to show that the value for Q among technologies stems from even more fundamental considerations than these, we assume ideal conditions for each instrument type and derive the relevant efficiency from the required margins at the focal plane needed to prevent significant crosstalk among elements of the datacube. Here crosstalk is defined as the condition where multiple voxels within the measured datacube each collect a significant amount of signal from the same voxel in the true object datacube and where these two voxels are not physically adjacent to one another in the datacube. For voxels satisfying this condition but that are physically adjacent, we can call the effect blur rather than crosstalk. For optical efficiency estimates η for each technology, we assume ideal components, so that lenses, mirrors, prisms, and gratings are assumed to have no losses (100% transmission or reflectivity), and that all detectors have an external quantum efficiency of 1.
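As a numerical illustration of the metric, the sketch below evaluates the utilization as voxel samples divided by detector pixels, and the specific information density as its product with the optical efficiency. The input figures are the 180 × 180 × 20 datacube and 1280 × 1024 CCD quoted earlier for the nonastronomical IFS-L system; treating those dimensions as the voxel sample count and counting the full detector are assumptions made here, so the result is illustrative arithmetic rather than a value reported by the authors. The function names are hypothetical.

```python
def detector_utilization(n_x, n_y, n_w, m_u, m_v):
    """Voxel samples in the datacube divided by detector pixels used."""
    return (n_x * n_y * n_w) / (m_u * m_v)

def specific_information_density(eta, zeta):
    """Product of optical efficiency and detector utilization."""
    return eta * zeta

# Assumed inputs: 180 x 180 x 20 datacube on a 1280 x 1024 detector.
zeta = detector_utilization(180, 180, 20, 1280, 1024)

# Ideal components (no optical losses, unit quantum efficiency): eta = 1.
q_ideal = specific_information_density(1.0, zeta)

print(f"zeta = {zeta:.4f}, Q (ideal optics) = {q_ideal:.4f}")
```

With ideal components the optical efficiency is unity, so the comparison among technologies reduces to the utilization alone.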
One of the reasons why we choose the detector utilization ζ to define a metric for comparing technologies is that it is in many ways a proxy for other important measures such as manufacturability and system size. The connection arises because, in various ways, all of the snapshot techniques encode the spectral information by expanding the system étendue. If all things are held constant except for the wavelength dimension of the cube, then, in every instance, increasing $N_w$ requires increasing étendue. And this quickly runs into difficult design constraints: for high-performance systems one can only increase étendue by using larger and more expensive optics (i.e., larger-diameter optical elements that can also handle a wide range of angles). Thus, snapshot systems with lower ζ will generally reach this design ceiling before the higher ζ systems will, and either system size or the angular acceptance of the optics must compensate for the difference in ζ.
The basic premise from which we derive the detector utilization ζ for each technology is that each technique requires a margin around each subsection of the datacube, without which blurring will cause significant crosstalk. For some technologies, smaller margins are easier to achieve than for others, but this factor is ignored here. Those technologies that minimize the number of marginal pixels make the most efficient use (have the highest utilization ζ) of a given detector array, but the actual value of ζ depends on the aspect ratios of the datacube dimensions. For example, from Fig. 16 we can see that the IFS-L and IFS-F technologies use a similar format of projecting elements of the datacube onto a 2-D detector array: each individual spectrum is dispersed, and because its neighbor spectrum is not necessarily a neighboring spatial element, a margin must be used around each spectrum to minimize crosstalk. If each spectrum is allowed a margin of s pixels, then the number of detector pixels M needed to capture an $(N_x, N_y, N_w)$ datacube can be determined by the following calculation. For each individual spectrum in an IFS-L or IFS-F, Fig. 16 shows that we need $N_w$ pixels for the spectrum itself, $2N_w$ pixels for the top and bottom margins around each spectrum, and 6 pixels for the margins on the two ends. Doing the same calculation for $s > 1$, we see that each spectrum uses a rectangle on the detector array with dimensions $(N_w + s) \times (2s + 1)$.
Multiplying this by the total number of spectra in the datacube, $N_x N_y$, we have $M_{\text{IFS-L}} = N_x N_y (N_w + s)(2s + 1)$. The value for $\zeta \equiv N/M$ follows directly from this equation as $\zeta_{\text{IFS-L}} = N_w / [(N_w + s)(2s + 1)]$. If the system architecture requires two pixels to measure each voxel in the datacube, then the utilization is $\zeta = 0.5$.
For the IFS-M, IFS-μ, and IMS technologies, an $N_y \times N_w$ swath is measured in a contiguous region on the detector array, so that each swath requires a rectangular space of $(N_w + 2s)(N_y + 2s)$ pixels. Multiplying by the total number of x-resolution elements in the datacube gives $M_{\text{IFS-M}} = N_x (N_y + 2s)(N_w + 2s)$. For the IRIS, TEI, MSBS, and MAFC technologies, each single-channel slice of the datacube is measured as a contiguous region, so that each wavelength requires a rectangular space of $(N_x + 2s)(N_y + 2s)$ pixels, and the total number of pixels needed is $M_{\text{MAFC}} = N_w (N_x + 2s)(N_y + 2s)$. For the filter-array implementation of SRDA, each pixel samples an individual voxel, so that the utilization is inherently equal to 1. In the case of CASSI, we find that $M_{\text{CASSI}} = (N_x + N_w - 1) N_y < N$; that is, the utilization is >1. In fact, the greater the number of wavelengths in the datacube, the greater the utilization for CASSI. Note that, due to problems achieving the micron-scale imaging required to map code elements 1:1 to detector pixel elements, existing CASSI instruments map code elements 1:2, so that they use about four times as many detector pixels as the theoretical value given here, i.e., $M_{\text{CASSI}} \approx 4 (N_x + N_w - 1) N_y$. There are two architectures used for CASSI: a single-disperser design (CASSI-SD) and a dual-disperser configuration (CASSI-DD). Ref. 155 describes CASSI-DD as even more compressive than CASSI-SD, so that $M_{\text{CASSI-DD}} = N_x N_y$, achieving a detector utilization equal to $N_w$. The error in this description can be found by a careful look at Fig. 1 in Ref. 151, as well as in the mathematical description of Sec. 2 there. Whereas the authors indicate that the form of the data at the detector array is a cube, the architecture shows that it must in fact be a skewed cube (an "oblique cuboid"). This error also implies that the results reported in the paper are in fact corrupted with spatial misregistration errors. Table 2 summarizes the η and M values used to calculate Q for each technology.
In the table, note that for the computational sensors (CTIS and CASSI), the number of datacube voxels is related to the number of resolution elements N not through the Nyquist sampling limit but through more complex criteria. When calibrating these computational sensors, M is technically an arbitrary value, but in practice one finds little value in allowing M to exceed the values shown in the table. In addition, for the SRDA row in Table 2, it is assumed that the implementation uses the filter-array camera. From Table 2 we can see that the MAFC/MSBS technologies offer the highest Q for high spatial/low spectral resolution datacubes (squat-shaped cubes), whereas the IFS-M/IFS-μ/IMS options offer the highest Q for low spatial/high spectral resolution datacubes (tall cubes). The latter do especially well when the spatial dimensions of the datacube are rectangular ($N_x \neq N_y$). As indicated in Table 2, the IRIS approach behaves exactly as the MAFC/MSBS technologies, but loses a factor of two due to its need to work with polarized input. The IFS-L/IFS-F approaches suffer a 3× loss in Q relative to the mirror-based IFS technologies due to the extra factor of $(2s + 1)$ shown in the formula for M given in Table 2, arising from the need to separate all spatial elements from one another to avoid crosstalk.
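The pixel-count expressions described above can be evaluated numerically to see how the utilization ζ depends on the datacube shape. The following is a minimal Python sketch, not the authors' code: the function names, the `Datacube` helper, and the two example cube sizes are hypothetical, and the IFS-L/IFS-F rectangle uses the dimensions exactly as quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datacube:
    """Datacube sampling dimensions (N_x, N_y, N_w)."""
    nx: int
    ny: int
    nw: int

    @property
    def voxels(self) -> int:
        # N = N_x * N_y * N_w
        return self.nx * self.ny * self.nw


def pixels_ifs_lenslet_or_fiber(cube: Datacube, s: int = 1) -> int:
    # IFS-L / IFS-F: each spectrum occupies the (N_w + s) x (2s + 1)
    # rectangle quoted in the text.
    return cube.nx * cube.ny * (cube.nw + s) * (2 * s + 1)


def pixels_ifs_mirror_or_ims(cube: Datacube, s: int = 1) -> int:
    # IFS-M / IFS-mu / IMS: one (N_w + 2s) x (N_y + 2s) swath per x element.
    return cube.nx * (cube.nw + 2 * s) * (cube.ny + 2 * s)


def pixels_channel_slices(cube: Datacube, s: int = 1) -> int:
    # IRIS / TEI / MSBS / MAFC: one (N_x + 2s) x (N_y + 2s) image per wavelength.
    return cube.nw * (cube.nx + 2 * s) * (cube.ny + 2 * s)


def pixels_cassi(cube: Datacube) -> int:
    # Theoretical CASSI value with code elements mapped 1:1 to pixels.
    return (cube.nx + cube.nw - 1) * cube.ny


def utilization(cube: Datacube, pixels: int) -> float:
    # zeta = N / M
    return cube.voxels / pixels


# Hypothetical example cubes (sizes chosen for illustration only).
squat = Datacube(nx=500, ny=500, nw=10)  # high spatial, low spectral resolution
tall = Datacube(nx=50, ny=50, nw=500)    # low spatial, high spectral resolution

for name, cube in (("squat", squat), ("tall", tall)):
    print(
        name,
        f"IFS-L/F: {utilization(cube, pixels_ifs_lenslet_or_fiber(cube)):.3f}",
        f"IFS-M/IMS: {utilization(cube, pixels_ifs_mirror_or_ims(cube)):.3f}",
        f"MAFC/MSBS: {utilization(cube, pixels_channel_slices(cube)):.3f}",
        f"CASSI: {utilization(cube, pixels_cassi(cube)):.3f}",
    )
```

With a margin of s = 1, the squat cube favors the channel-slice layout and the tall cube favors the swath layout, in line with the comparison drawn from Table 2. Multiplying each ζ by the corresponding η from Table 2 would give Q.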
Each of the technologies listed in Table 2 is also classified according to the method used to divide the light into voxel elements. The majority of technologies use division of field [F] (also called division of focal plane), in which the light is either filtered or divided into separate beams according to its placement within the image. Division of amplitude [A] is the next most common method in which the light is divided into separate beams by allocating a portion of light into each beam, as a simple cube beamsplitter does. Only two other methods exist: division of pupil [P] (also called division of aperture) and compressive sensing [X].

Comments on Instrument Throughput
Fellgett 156 and Jacquinot 157 were the first researchers to compare light collection efficiency for various spectrometer technologies, and their work led to what are now commonly referred to as the Fellgett (multiplex) advantage and the Jacquinot (throughput) advantage, both of which are widely associated with Fourier transform and Fabry-Perot spectroscopy. 158 More recently, researchers have argued that with the advance of detectors to 2-D array formats and with the majority of optical detector arrays used, from the ultraviolet to MWIR, now being shot-noise-limited, both of these advantages no longer provide the improvement in SNR that they once did. 159 Sellar and Boreman, on the other hand, argue that while this appears to be true for the Fellgett advantage, imaging Fourier transform spectrometers (imaging FTS, or IFTS) retain the Jacquinot advantage not because of their higher étendue but because they are able to maintain a longer dwell time on each datacube voxel than alternative technologies can. 38 The authors also provide a convincing case that the Jacquinot advantage can be considered as freedom from the requirement of having an entrance slit, while the Fellgett advantage can be considered as freedom from the requirement of having an exit slit. For filterless snapshot imaging spectrometers, both of the traditional advantages are automatically satisfied: no exit slit is used (Fellgett), and the instrument dwell time on every voxel is equal to the full measurement period (Jacquinot).
It is useful to note that FTS approaches to scanning and snapshot spectral measurement suffer from sampling efficiency losses at low spectral resolution, since a substantial portion of the reconstructed spectrum (located at very low wavenumbers) will lie outside the system's true spectral response range and so must be discarded. In detector-noise-limited applications, this effect is mitigated by the fact that while a fixed percentage of these samples do not contribute to actual spectral samples after Fourier transformation, they do contribute to improving SNR in the measured spectrum. 160,161 Spatial heterodyne interferometry can be used to overcome the FTS's sampling limitation at low spectral resolution. 162-165
In a previous publication, we have tried to steer the throughput comparison argument away from its historical focus on étendue 101 since the complexity of modern snapshot instruments makes any fundamental limits on étendue difficult to determine. Moreover, it has also been argued that, for scanning instruments at least, the differences in étendue among different technologies are not large. 38 Rather, we try to focus on a more important factor: the portion of datacube voxels that are continuously visible to the instrument. For scanning systems, this portion can be quite low (often <0.01), while filterless snapshot systems can achieve a value of 1 (i.e., all voxels are continuously sensed during the measurement period). This creates a large difference in light collection, a difference we have termed the snapshot advantage. While a snapshot instrument's absence of motion artifacts and ability to work without moving parts are both important, the snapshot advantage in light collection is the difference from scanning systems that holds the most promise for opening up new applications.
Table 2 The classification type of each technology, and ideal values for the optical efficiency η and number of detector pixels used ($M = N/\zeta$) for each snapshot technology. Columns: Technology, Class, η, M (pixels used).
While not all snapshot implementations can be considered equal, Table 2 indicates that all but one technology (TEI) share optical efficiency values within a factor of four of one another. For most of the technologies summarized in the table, the efficiency values shown are straightforward to obtain and are generally not subject to major disagreement. Perhaps surprisingly, MAFC is the exception. As the argument leading to our choice of $\eta = 1$ for MAFC requires a lengthy discussion, it has been moved to the Appendix.

Using Snapshot Instruments in Scanning Applications
Pushbroom-configuration spectral imagers have long been used on moving platforms in remote sensing because their view geometry is well suited to the system measurement geometry: linear motion along one axis provides the needed scanning to fill out the third dimension of the data, so that no moving parts are required in the system. Up until now, snapshot spectral imaging systems have been absent from environmental remote sensing, though computational speeds and data transmission rates have reached a level at which one can now fully utilize the snapshot advantage in light collection to improve SNR. Because these types of measurements take place from a moving platform, achieving the snapshot advantage requires performing what one can call video-rate software time delay integration (TDI). That is, with each acquired frame, the system must coregister and add the new datacube with the previous set in order to build a large high-SNR single datacube from the entire sequence of data. Figure 17 shows how this works. Unlike with hardware TDI, 166-169 where the data are not actually digitized until they have been fully summed, software TDI performs the summing after digitization and so is more prone to detector and digitization noise. In the regime of shot-noise-limited data, however, these effects are small. While it is possible, in principle, to design specialized detector arrays that would be capable of performing hardware TDI for a given snapshot imaging spectrometer, these arrays would be highly specialized and thus expensive and difficult to obtain.
As an illustration, a snapshot system capable of collecting a 200 × 500 × 200 datacube at standard frame rates (on the order of 100 fps) can use software TDI to dwell on a given spatial region for 200 times longer than an equivalent pushbroom spectrometer can, allowing for a factor of $\sqrt{200} \approx 14$ improvement in SNR for shot-noise-limited data. Recent advances in data transmission formats (such as multilane CoaXPress, Camera Link HS, and SNAP12 fiber optics) have shown that the transmission rates required by such a setup are now achievable in commercially available hardware. Moreover, because of the parallel nature of the software TDI operation on datacubes, recent GPUs can process this data stream at high-enough rates to keep up. Together, these developments make the full snapshot advantage realizable even for moving platforms that are nominally optimized for pushbroom operation.
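The coregister-and-add step of software TDI can be sketched in a few lines of NumPy. This is a simplified illustration rather than a flight implementation: it assumes ideal integer-pixel platform motion along the x axis with no rotation or distortion, and the function name `software_tdi`, the array sizes, and the photon level are hypothetical choices made for the example.

```python
import numpy as np


def software_tdi(frames: np.ndarray, shift_per_frame: int = 1):
    """Coregister and sum snapshot datacubes acquired from a moving platform.

    frames has shape (n_frames, nx, ny, nw). Frame k is assumed to view the
    along-track columns [k*shift, k*shift + nx) of the scene, i.e. ideal
    integer-pixel motion with no rotation or distortion.
    Returns the summed mosaic and the number of frames that saw each column.
    """
    n_frames, nx, ny, nw = frames.shape
    length = (n_frames - 1) * shift_per_frame + nx
    mosaic = np.zeros((length, ny, nw), dtype=np.float64)
    dwell = np.zeros(length, dtype=np.int64)
    for k, cube in enumerate(frames):
        x0 = k * shift_per_frame
        mosaic[x0:x0 + nx] += cube  # add the new cube in scene coordinates
        dwell[x0:x0 + nx] += 1      # count the frames that viewed each column
    return mosaic, dwell


# Synthetic shot-noise-limited data: a uniform scene of 100 photons per voxel
# per frame (illustrative numbers only).
rng = np.random.default_rng(0)
n_frames, nx, ny, nw = 64, 16, 8, 4
scene = np.full((n_frames - 1 + nx, ny, nw), 100.0)
frames = np.stack([rng.poisson(scene[k:k + nx]) for k in range(n_frames)])

mosaic, dwell = software_tdi(frames)
full = mosaic[dwell == nx]                 # columns seen by all nx frames
snr_single = frames.mean() / frames.std()  # one frame, comparable to one pushbroom dwell
snr_tdi = full.mean() / full.std()
gain = snr_tdi / snr_single                # expected to be close to sqrt(nx)
print(f"dwell = {dwell.max()} frames, SNR gain = {gain:.2f}")
```

In this toy case each fully covered column is viewed by nx = 16 frames, so the SNR gain should come out near $\sqrt{16} = 4$; the same reasoning with 200 along-track elements gives the factor of roughly 14 quoted above. A real system would also need subpixel registration and platform attitude correction, which are omitted here.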

Disadvantages of Snapshot
Snapshot approaches are not without their tradeoffs. The system design is generally more complex than for scanning systems and makes use of recent technology such as large FPAs, high-speed data transmission, advanced manufacturing methods, and precision optics. Moreover, the snapshot advantage in light collection often can be fully realized only by tailoring the design to its application. For example, we have taken outdoor measurements with a $(N_x, N_y, N_w) = (490, 320, 32)$ snapshot imaging spectrometer that reads out at 7 fps, but used an exposure time of only 6 ms to avoid saturation. This exposure time is poorly matched to the readout rate, so that most of the snapshot system's light collection advantage is thrown away. An application for which these are much better matched would require a much dimmer scene or a much faster readout rate.
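The mismatch in this example can be quantified as an exposure duty cycle, i.e., the fraction of each frame period during which the detector is actually integrating. The short Python sketch below uses the 6 ms exposure and 7 fps readout quoted above; the function name is a hypothetical helper, and the calculation ignores any readout dead time.

```python
def exposure_duty_cycle(exposure_s: float, frame_rate_hz: float) -> float:
    """Fraction of each frame period spent integrating light."""
    duty = exposure_s * frame_rate_hz
    if not 0.0 <= duty <= 1.0:
        raise ValueError("exposure cannot exceed the frame period")
    return duty


duty = exposure_duty_cycle(exposure_s=6e-3, frame_rate_hz=7.0)
print(f"duty cycle = {duty:.3f}")  # prints "duty cycle = 0.042"
```

The detector therefore collects light for only about 4% of the available time, which is why a dimmer scene or a faster readout would be needed to make full use of the instrument.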
It is also important to recognize that there are measurement configurations for which snapshot spectral imaging is actually impossible to realize. Confocal microscopy is a clear example. Here the light is confined by a small aperture in order to reject light emerging from unwanted regions of the sample (i.e., outside the focal volume). 170 This method for rejecting unwanted light means that only one spatial point is in view at any given time, and one must scan the optics about the sample in raster fashion in order to generate a complete $(x, y, \lambda)$ spectral image. By its nature this prevents a snapshot implementation. On the other hand, for the case of volumetric imaging microscopy, there exists an alternative technique, structured illumination microscopy, that is compatible with widefield imaging, and thus with snapshot spectral imagers, and that in some cases can achieve better SNR than can confocal microscopy. 152,171
An additional difficulty with using snapshot systems is the sheer volume of data that must be dealt with in order to take full advantage of them. Only recently have commercial data transmission formats become fast enough to fully utilize a large-format snapshot imaging spectrometer for daylight scenes. (Multilane CoaXPress is an example of such a format.) There are ways of reducing the data glut. For moving platforms, performing software TDI prior to transmitting the data allows high SNR data without requiring any bandwidth beyond that used by scanning systems. For target detection and tracking systems, one can run filter detection algorithms onboard prior to transmitting the data, so that rather than the full cube, one only needs to transmit the detection algorithm result. For transmitting complete datacubes, one can also resort to onboard compression. 172

Conclusions
Over the past 30 years, scanning techniques have seen an impressive improvement in performance parameters, including calibration stability, SNR, and spatial, spectral, or temporal resolution. This trend can be attributed to larger detector arrays, reduced detector noise, improved system design, and better optical/optomechanical manufacturing; but the underlying technology and concepts have not changed significantly in this time period.
The advent of large-format (4 megapixel) detector arrays, some 20 years ago, brought with it the capability to measure millions of voxels simultaneously, and it is this large-scale measurement capacity that makes snapshot spectral imaging practical and useful. Almost all research in snapshot spectral imagers uses novel 2-D multiplexing schemes, each of which involves fundamental tradeoffs in detector pixel utilization, optical throughput, etc. While many advantages can be realized for these snapshot systems over their temporally scanned counterparts, it is only by making use of large arrays of detector elements that these advantages can be achieved. And it is only in the past 10 years that the spatial and spectral resolution achieved by snapshot imaging systems has become sufficient that the devices are now commercially viable. We can anticipate that the snapshot advantage will open up a number of new applications that leverage the improvements in light collection, temporal resolution, or ruggedness. The next 10 years should see further improvements in the technologies reviewed here with continued advancements in detector array technology, optical fabrication, and computing power.

Appendix: Optical Efficiency of MAFC Systems
In order to explain our choice of $\eta = 1$ for MAFC's efficiency factor, we attempt to provide an argument from two sides and explain why we feel one perspective should be given more weight.
One way to view the optical throughput is from a voxel's view of the system. That is, we consider a voxel emitting light as a Lambertian source and thus fully and uniformly illuminating the instrument's pupil. When comparing each instrument, if we set the pupil area and system focal length (and thus the f-number) to be the same for all instruments, then the efficiency is simply the fraction of light entering the pupil that reaches the detection layer. For the MAFC, light emitted by the object voxel illuminates the system pupil, but only one of the $N_w$ lenses in the MAFC objective lens array can transmit the voxel's light to the detection layer. The remaining $N_w - 1$ lenses have to reject this voxel's light, so from the voxel perspective, the MAFC is effectively performing spatial filtering of the pupil, so that the transmitted light flux is reduced by $1/N_w$. And thus this perspective argues for giving the MAFC an efficiency $\eta = 1/N_w$.
A second way to view the optical efficiency is to see how the system's optical efficiency scales with a change in the number of wavelengths $N_w$. For the MAFC, increasing $N_w$ means increasing the number of lenses in the objective lens array, and with them the number of filters as well. When scaling a lens array like this, if one momentarily ignores the filters, it can be readily observed that the irradiance on the detector is invariant to scale when viewing extended objects. This contradicts the previous voxel view, a fact that can be explained as follows.
When we increase the number of lenses in the array by a scale factor $S^2$ (e.g., $S = 2$ increases the number of lenses by a factor of four), the lens focal lengths drop by the factor S. If the scaled, smaller lenslets have the same f-number that the unscaled, larger lenslets had, then we should find that the irradiance at the focal plane (ignoring filters) is independent of the number of lenslets when imaging an extended object. But this would appear to conflict with the voxel view expressed above. The feature that is easy to miss is that by scaling the lenses we have changed the magnification, so that the voxel we were imaging in the unscaled system is now $1/S^2$ of a voxel in the scaled system. The scaled version is effectively integrating across larger regions of the object. One can also explain this as an increase in the étendue of the pupil. That is, after scaling the system, the maximum range of angles passed by the pupil (what may be called the pupil acceptance angle) has increased by S in each axis. And this is the difference between the two perspectives on how to measure the system's efficiency: the pupil étendue of the scaled system is $S^2$ times that of the unscaled system, effectively cancelling out the $1/N_w$ optical efficiency factor. Thus, although increasing $N_w$ means that the spectral filters in the system will transmit a smaller fraction of the incident light, the shorter focal length lenses allow a higher étendue, and the two effects cancel to give $\eta = 1$.
At this point, neither of the two views can be described as the correct view; they both simply describe different properties of the system. So we make use of some empirical observations. In general, one finds that MAFC systems have a substantially larger system pupil (the pupil of each individual objective lens multiplied by the number of lenses in the array) than do the objective lenses of other instruments such as CTIS, IRIS, IMS, etc. Moreover, one can observe that the MAFC also has a larger acceptance angle of the pupil than do the other systems. However, if we compare MAFC's pupil étendue with, say, IMS's étendue (measured not at the IMS objective lens pupil but rather at the system pupil of the IMS's back-end lenslet array), then we obtain comparable values. Thus, the limiting étendue of these systems is generally determined not by their monolithic front optics but rather by their optical arrays. Although this stretches the definition of optical efficiency from a system performance perspective, it is clear that we should choose the system scaling view over the voxel view, so that MAFC's efficiency factor η is 1 and not $1/N_w$.