## I. Introduction

Compressive sensing is a novel technology that samples radiometric and spectroscopic signals at a rate lower than the one dictated by the classical theory of ideal sampling. The possibility of undersampling a signal without losing significant information rests on the signal admitting a sparse mathematical representation, which can be made accessible to an instrument through a specific integral transformation performed by a dedicated optical subsystem.

Compressive sensing is a cutting-edge technology belonging to the general field of signal compression; its distinguishing feature is that compression takes place before signal registration, during the sampling phase. Owing to this characteristic, compressive sensing promises substantial savings in sensor design and realization in terms of the memory required for temporary data storage, the bandwidth necessary for data transmission, and the electrical power consumption. In turn, these reduced requirements would yield further reductions in mass, volume, and cost. The possible impact on the cost of future space missions could be remarkable, motivating new investigations and research programs concerning this technology.

This work addresses the theoretical issue of defining the maximum radiometric performance attainable with the compressive sensing technology. We show that, according to recent studies, signal multiplexing yields a radiometric performance poorer than that of a direct measurement method, which bounds the maximum signal-to-noise ratio obtainable with compressive sensing. Finally, we discuss the multiple benefits originating from the compressive sensing technology.

## II. Historical Background

Multiplex spectrometry [1,2] was developed in the early 1950s as a remedy for the lack of array detectors in the infrared spectral interval, where the measurement required a time-consuming scanning process to collect all the spectral bands of interest by means of a single-element detector. The adaptation of a telecommunication technique made it possible to multiplex all the spectral samples onto the single photosensitive element, obtaining a higher-amplitude signal and, possibly, a proportional reduction of the integration time. The overall acquisition time required to observe a wide spectral interval was therefore reduced by a factor equal to the number of spectral samples, without compromising the Signal-to-Noise Ratio (SNR). Alternatively, this apparent radiometric advantage of the new multiplex technique could be traded for a higher SNR, keeping the overall measurement time the same as that of traditional dispersive spectroscopy. Multiplex spectroscopy received two alternative implementations: two-beam interferometers [1] and coded aperture dispersive spectrometers [2]. Two-beam interferometers (Fourier Transform Spectrometry or FTS) implement multiplexing by means of the set of harmonic functions, producing the cosine transform of the spectrum of the observed source. Coded aperture dispersive spectrometers, instead, can implement various sets of orthogonal functions, such as Hadamard functions, harmonic functions, Legendre polynomials, and so forth. Usually, the spectrum of the observed source modulates the selected set of orthogonal functions, which are coded as a two-dimensional spatial pattern of transmittance (or reflectance) in the input and/or the output aperture. The main difference between FTS and Multiplex Dispersive Spectrometers (MDS) is that FTS devices realize interferometric amplitude multiplexing, while dispersive spectrometers implement intensity multiplexing. As a common point, all multiplex spectrometers measure not the spectrum itself but a complex transformation of it. Hence, such instruments require specific data pre-processing to transform the observed parameters back into the spectrum of the observed source.

Multiplex Imaging (MI) is a relatively recent domain aimed at investigating those optical configurations and instrument features that provide some radiometric and SNR advantage when observing a remote source. The possible radiometric advantage of multiplex imaging is easily understood by considering the simple example depicted in Fig. 1, where a multiple-pinhole instrument acquires a coded signal made up of the superposition of many overlapping images of the same object. Superposition gives rise to the expected radiometric advantage, while the resulting signal can even be measured by a fast time-scanning procedure that adopts a single-element detector.

Compressive Sampling (CS) is a novel research domain founded on the paradigm that sparse signals can be undersampled without losing relevant information. Candès [3] has given the following striking description of CS: “*Modern transform coders such as JPEG2000 exploit the fact that many signals have a sparse representation, meaning that one can store or transmit only a small number of adaptively chosen transform coefficients rather than all the signal samples…. This process of massive data acquisition followed by compression is extremely wasteful and raises a fundamental question: because most signals are compressible, why spend so much effort acquiring all the data when we know that most of it will be discarded? Wouldn’t it be possible to acquire the data in already compressed form so that one does not need to throw away anything? “Compressive sampling” aka “compressed sensing” shows that this is indeed possible.”*

The application of CS to imaging can be regarded as an evolution of MI, since the integral transformation that connects the data of interest *i*(*ξ*) to the actually measured datagram *I*(*x*) is chosen so that the datagram itself is a sparse representation of *i*(*ξ*) and can be sampled more efficiently. Compressive imaging was the first application to be investigated [4], while in recent years several studies have been undertaken that aim at investigating the application of CS to several fields of spectroscopy [5,6]. CS spectroscopy can in turn be considered an extension of earlier spectroscopic techniques, such as FTS or MDS, and some authors have claimed a possible radiometric advantage connected with the use of the CS approach [7].

However, recent theoretical investigations [8,9] addressing the radiometric performance of various multiplex spectroscopic techniques have shown that signal multiplexing does not provide any SNR benefit when the informative signal component is correctly identified and taken into account. Even if CS technology does not offer any radiometric advantage, it gives a relevant improvement in terms of signal sampling and remains one of the most promising fields of signal compression, justifying a thorough investigation of its true radiometric characteristics.

## III. Mathematical Framework

The mathematical framework for representing the MI scheme is described by the following equations, which are conceptually the same as those involved in MS. Multiplex imagers produce in their focal plane an integral transform *I*(**x**) of the incoming intensity *i*(**ξ**), as shown in the next relationship:

$$I(\mathbf{x}) = \int_{D_{\xi}} F(\mathbf{x},\boldsymbol{\xi})\, i(\boldsymbol{\xi})\, d\boldsymbol{\xi} + N(\mathbf{x}), \qquad \mathbf{x} \in D_{x} \tag{1}$$

where *F*(**x**,**ξ**) is the integral kernel of the implemented (forward) transform, *N*(**x**) is the noise due to the detector and readout circuitry, and **x** is the coordinate of the focal plane position (conjugate of the vector coordinate **ξ**), which is measured within a limited interval *D*_{x}. Depending upon the characteristics of the considered instrument, **ξ** and **x** can be scalar or vector coordinates, while their physical interpretation can be spatial or spectral position. The source estimation is obtained by performing the inverse integral transform by means of the analysis kernel *B*(**x**,**ξ**). The symbol *D*_{ξ} designates the observed measurement interval. In view of Eq. (1) we have:

$$\tilde{i}(\boldsymbol{\xi}) = \int_{D_{x}} B(\mathbf{x},\boldsymbol{\xi})\, I(\mathbf{x})\, d\mathbf{x} = i(\boldsymbol{\xi}) + \int_{D_{x}} B(\mathbf{x},\boldsymbol{\xi})\, N(\mathbf{x})\, d\mathbf{x} \tag{2}$$

The previous equations should be regarded as a mathematical framework for MS and MI that avoids the complexity arising from a detailed analysis of the optical characteristics of each specific implementation. The same modeling can be applied to the CS approach, which is based on a specific integral transformation having the same properties shown in Eqs. (1) and (2). Every implementation of multiplex imaging or spectroscopy adopts an orthogonal set of functions *e*(**x**,**ξ**) in order to obtain a practical definition of the two integral kernels *F*(**x**,**ξ**) and *B*(**x**,**ξ**), which must obey Eq. (2). Almost always the selected orthogonal functions are normalized and comply with general orthogonality relationships. The following equations give a definition general enough to describe nearly every multiplex spectrometer or imager,

where *F*_{0} and *B*_{0} are real constants. According to Eqs. (1) to (3), the light intensity distribution *I*(**x**) measured in the instrument's focal plane is far greater than the incoming intensity *i*(**ξ**), which yields a radiometric gain in terms of the physical signal available to the detector, *I*(**x**) >> *i*(**ξ**). Let us note that the amplitude boost allowed by multiplex techniques always involves the physical signal, i.e. the measured datagram.
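As an illustration of the framework above, the following minimal sketch discretizes the forward and inverse transforms for a small noise-free case. It assumes, purely for illustration, a Sylvester-Hadamard set as the orthogonal functions and a binary (0/1) transmission mask as the forward kernel. The names `hadamard`, `forward`, and `inverse`, as well as the source values, are hypothetical and are not taken from any of the instruments discussed here.

```python
def hadamard(n):
    """Sylvester-Hadamard matrix of order n (n a power of two), entries +1/-1."""
    return [[-1 if bin(r & c).count("1") % 2 else 1 for c in range(n)]
            for r in range(n)]


def forward(source):
    """Datagram I = F i with the 0/1 mask F = (1 + H) / 2 (assumed kernel)."""
    h = hadamard(len(source))
    return [sum(((1 + h_rc) // 2) * s for h_rc, s in zip(row, source))
            for row in h]


def inverse(datagram):
    """Source estimate from the datagram (analysis step for the mask above).

    Row 0 of F is all ones, so datagram[0] is the total flux; removing it
    turns the mask measurement into the bipolar transform H i = 2 I - I[0].
    """
    n = len(datagram)
    h = hadamard(n)
    bipolar = [2 * value - datagram[0] for value in datagram]
    # H is symmetric and H H = n * identity, so the inverse is H / n.
    return [sum(h[t][x] * bipolar[x] for x in range(n)) / n for t in range(n)]


source = [3, 1, 4, 2]            # hypothetical source samples i(xi_t)
datagram = forward(source)       # each sample collects several source samples
estimate = inverse(datagram)

print(datagram)   # [10, 7, 4, 5]
print(estimate)   # [3.0, 1.0, 4.0, 2.0]
```

In this toy case every datagram sample adds up several source samples, which is the amplitude boost of the physical signal, while the source is still recovered exactly in the absence of noise.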

### A. Compressive Sampling

Let us consider the general problem of reconstructing a vector *i*(**ξ**) from the datagram *I*(**x**) of the form:

where the subset of *K* adopted frequencies *x*_{k} gives rise to compression, since *K* < *N*. Here we have adopted a discrete representation of the signal in order to highlight the number of samples. The same equations can also be written adopting a continuous model, in which the number of samples *N* would be selected according to standard sampling theory.

**Definition 1** ([10]). *We will say that a signal i(ξ) is S-sparse if its support {t: i(ξ_{t}) ≠ 0} is of cardinality less than or equal to S*.

Candès, Romberg and Tao [10] showed that one could almost always recover the signal *i*(**ξ**) exactly by solving the convex problem:

**Theorem 1** ([10]). Assume that *i*(**ξ**) is *S*-sparse and that we are given *K* Fourier coefficients with frequencies selected uniformly at random. Suppose that the number of observations obeys

$$K \ge C \cdot S \cdot \log N \tag{6}$$

Then minimizing the ℓ1 norm ||*ĩ*(*ξ*)|| reconstructs *i*(*ξ*) exactly with a probability of success that exceeds 1 - *O*(*N*^{-δ}), provided that the constant *C* in (6) is roughly proportional to (*δ* + 1). The above theorem means that the signal *i*(**ξ**) can be reconstructed by measuring almost any set of *K* frequency coefficients, using an interpolation procedure that minimizes a convex functional and does not require any knowledge regarding the number of nonzero frequencies of *I*(**x**). It is worth noting that *S*-sparse signals cannot, in general, be reconstructed with the same accuracy from substantially fewer samples. It is possible instead to establish simple connections with the theorem of ideal sampling.
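The recovery of a sparse signal from fewer samples than unknowns can be illustrated with a small noise-free sketch. It is not the convex ℓ1 minimization of Theorem 1: for brevity it swaps in a greedy orthogonal matching pursuit. It also uses Hadamard coefficients instead of Fourier coefficients, and the retained coefficients are fixed by hand rather than drawn at random, so the example is reproducible. All names and values (`matching_pursuit`, `kept_rows`, the source vector) are hypothetical.

```python
def hadamard(n):
    """Sylvester-Hadamard matrix of order n (n a power of two), entries +1/-1."""
    return [[-1 if bin(r & c).count("1") % 2 else 1 for c in range(n)]
            for r in range(n)]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def solve(a, b):
    """Solve the small square system a x = b by Gauss-Jordan elimination."""
    m = len(b)
    aug = [[float(v) for v in a[r]] + [float(b[r])] for r in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[r][m] / aug[r][r] for r in range(m)]


def matching_pursuit(rows, measurements, sparsity):
    """Greedy (orthogonal matching pursuit) estimate of a sparse vector."""
    n = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n)]
    support, coefficients = [], []
    residual = list(measurements)
    for _ in range(sparsity):
        # Pick the column most correlated with what is still unexplained.
        best = max(range(n), key=lambda j: abs(dot(columns[j], residual)))
        support.append(best)
        # Least-squares fit on the selected columns (normal equations).
        gram = [[dot(columns[p], columns[q]) for q in support] for p in support]
        rhs = [dot(columns[p], measurements) for p in support]
        coefficients = solve(gram, rhs)
        fit = [sum(c * columns[p][k] for c, p in zip(coefficients, support))
               for k in range(len(measurements))]
        residual = [y - f for y, f in zip(measurements, fit)]
    estimate = [0.0] * n
    for c, p in zip(coefficients, support):
        estimate[p] = c
    return estimate


N, S = 8, 2
source = [0, 3, 2, 0, 0, 0, 0, 0]            # hypothetical 2-sparse signal
kept_rows = [0, 1, 2, 4, 7]                  # K = 5 < N coefficients, fixed by hand
H = hadamard(N)
A = [H[r] for r in kept_rows]
datagram = [dot(row, source) for row in A]   # undersampled, noise-free datagram
estimate = matching_pursuit(A, datagram, S)
print([round(v, 6) for v in estimate])
```

Unlike the ℓ1 program, this greedy variant needs the sparsity `S` as an input and carries no guarantee comparable to Theorem 1. It is used here only because it fits in a few lines of plain Python.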

Suppose the signal *i*(*ξ*) has support Ω in the frequency domain, with *B* = *μ*(Ω), where *μ*(·) is the Lebesgue measure of the set Ω. If Ω is a connected set, we can think of *B* as the bandwidth of *i*(*ξ*) and apply Shannon's theorem. Now suppose the set Ω, still of size *B*, is unknown and not necessarily connected, a situation in which the Shannon sampling theorem does not help. In this case, we can only assume that the connected frequency support is the entire domain, suggesting that all *N* samples are needed for exact reconstruction. However, Theorem 1 asserts that far fewer samples are necessary: solving Eq. (5) will recover *i*(*ξ*) perfectly from about *B* log *N* samples in the transformed domain. The implicit assumption that the signal of interest *i*(*ξ*) is sparse might seem unrealistic, since all real signals are nonzero at almost all points *ξ*. However, this apparent inconsistency is overcome if one considers that the signal *i*(*ξ*) is only required to admit a sparse representation in a generic and unknown transformed domain. This property limits the degrees of freedom to a value less than *N*, allowing in principle the application of the CS approach.
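To give a feeling for the *B* log *N* scaling discussed above, the short sketch below evaluates the number of samples it suggests for a given signal length and sparsity. The function name `cs_sample_budget`, the use of the natural logarithm, and the choice of a unit constant are assumptions made only for illustration. The constant required in practice is larger and depends on the desired probability of success.

```python
import math


def cs_sample_budget(n_samples, sparsity, constant=1.0):
    """Samples suggested by the K ~ C * S * log(N) scaling, capped at N."""
    k = math.ceil(constant * sparsity * math.log(n_samples))
    return min(k, n_samples)


n, s = 1024, 10                      # hypothetical signal length and sparsity
k = cs_sample_budget(n, s)
print(k)                             # 70
print(f"fraction of full sampling: {k / n:.3f}")
```

With these illustrative numbers the scaling calls for roughly 7% of the samples required by full sampling. Any realistic constant greater than one raises this fraction proportionally.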

Fig. 2 shows the schematics of a multiplex imaging or spectroscopic instrument. The fundamental component is an optical subsystem that performs the desired integral transform, for instance the discrete cosine transform at a frequency chosen at random. We note that CS unavoidably entails signal multiplexing, as reported in Eqs. (4) and (5).

## IV. Limitations to the radiometric behavior of multiplexed signals

We assume that the best radiometric performance attainable with a CS instrument observing a given source *i*(*ξ*) is bounded by the performance reached by the same sensor observing the same source when the signal sampling is set according to Shannon's theorem. Relying on the analysis of the previous Section, we can affirm that a CS instrument adopting the sampling rate required by the ideal sampling theorem performs signal multiplexing without compression, thus falling into the class of MS or MI devices. This is certainly true for the CS examples in [5,6]. Therefore, the problem of assessing the best radiometric performance of CS technology can be turned into the more familiar problem of assessing the best performance achieved by a generic multiplex instrument. Recent theoretical investigations [24] have revealed that the signal measured in interferential multiplex spectroscopy (FTS) always contains a non-informative contribution holding most of the power conveyed by the transformed signal *I*(**x**). In other words, it has been shown that the radiometric advantage exists only for the physical signal, while the part of the measured signal *I*(**x**) that conveys source information usually has half the amplitude of the original signal *i*(*ξ*). Moreover, the high-amplitude physical signal gives rise to high-power photon noise that cannot be separated from the informative signal component, resulting in a signal measurement of lower SNR. Evidence exists that this drawback also affects MDS as well as MI [8,9]. We re-examine this aspect in Sect. IV.A below, where we condense most of the results pointed out in [24].

### A. Informative signal and effective SNR in multiplexed signals

The datagram *I*(**x**) is the integral transform of the source *i*(**ξ**) obtained by means of a set of orthogonal functions *e*(**x**,**ξ**). These functions assume both negative and positive values, while any source intensity must be non-negative. Moreover, most of the functions *e*(**x**,**ξ**) that can be used for MDS and MI cannot be optically coded as a transmittance or reflectance mask as long as they assume negative values. Therefore, the function *f*(**x**,**ξ**) and the constant *F*_{0} are chosen so that the forward integral kernel *F*(**x**,**ξ**) can be optically coded (0 ≤ *F*(**x**,**ξ**) ≤ 1).

The supplementary function *b*(**x**,**ξ**) and the constant factor *B*_{0} guarantee that Eq. (2) holds true. Many implementations of MDS and MI adopt the following configuration [8]:

We note that any integral transform defined by a set of self-adjoint orthogonal functions *e*(**x**,**ξ**) must obey Plancherel's theorem, as shown in the following equation [8]:

The result above implies that the amplitude boost provided by any multiplex approach, including CS, is, perhaps surprisingly, connected with the offset term *f*(**x**,**ξ**) = ½ rather than with the integral transformation itself. The datagram can be written as:

The first term on the right-hand side of Eq. (9) is the integral transform originated by the orthogonal set of functions *e*(**x**,**ξ**) only, and the last term is a d.c. level proportional to the integrated source intensity. Therefore, the term *f*(**x**,**ξ**) does not contribute to the source estimates and does not contain information pertaining to the source intensity distribution. In other words, the multiplexed signal in any MI, MS, or CS instrument is made up of two components, *I*(**x**) = *S*(**x**) + *U*(**x**), the second of which does not convey useful information. Barducci et al. [8] have shown that:

The informative component *S*(**x**) of the measured datagram *I*(**x**) has half the amplitude of the source signal *i*(**ξ**) observed without multiplexing, hence we can write:

where *SNR*_{M} represents the physical signal SNR of an MI or MS sensor, and *SNR*_{M-Eff} represents its effective value, which limits the accuracy of source estimations and includes the power of the informative datagram component only; *n*(**ξ**) is the inverse transform of the experimental noise *N*(**x**), and the two expressions of the SNR obtained in the two conjugate domains are equivalent owing to Plancherel's theorem. With simple mathematical manipulations one can write:

Here we have introduced the detector noise variance σ^{2}_{Det} and the photon noise variance |*U*(**x**)|, roughly corresponding to the intensity. We point out that the accuracy of source estimations in MI and MS is always worse than that obtained with a direct (traditional) measurement model, and that the best reconstruction precision *SNR*_{CS-Eff} allowed by any CS sensor is always bounded by the value of Eq. (12). We recap this relevant outcome in the following relationship:

### B. Discussion

We have found in Eqs. (12) and (13) the maximal SNR allowed for a CS instrument, which is less than the value *SNR*_{Dir} achieved by an equivalent apparatus adopting a traditional (non-multiplexing) measurement model. This property is highlighted by the following relationship:

where the last inequality is a consequence of Eq. (10). The circumstance that instruments employing a direct measurement model with full sampling always achieve a better SNR than the equivalent CS multiplexing sensor can also be explained as a consequence of the Data Processing Inequality (DPI) [12]. This aspect becomes more evident when we compare the two models of observation, as done in Fig. 3. The additional optical subsystem in the CS measurement model can be considered a data-processing unit, and the DPI indicates that the source information flux reaching the detector cannot increase, and in general decreases, because of this subsystem. Here, the principal characteristic to be considered is that within the detector integration time the telescope collects a finite number of photons subject to a natural statistical variability obeying Poisson statistics. In this condition the telescope itself and any subsequent element in Fig. 3 should be regarded as a communication channel operating on a continuous signal limited in power or amplitude, having a finite bandwidth (set by the signal sparsity) and a finite SNR. Hence, the finite channel capacity of each block decreases the transmitted information flux. The presence of an additional block in the CS instrument implies that the information flux reaching the detector for this measurement model is less than or equal to the information flux available in the direct measurement model. This fundamental concept has been stated as the Optical Data Processing Inequality in [9].
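The ordering between the physical, effective, and direct SNR can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions, not the model of [8]: it uses a normalized Sylvester-Hadamard set for *e*(**x**,**ξ**), a forward kernel equal to ½(1 + *e*), an SNR defined as a ratio of signal power to summed noise variance, and a photon-noise variance approximated by the local intensity. The names `snr_budget` and `detector_variance` and the source values are hypothetical.

```python
import math


def hadamard(n):
    """Sylvester-Hadamard matrix of order n (n a power of two), entries +1/-1."""
    return [[-1 if bin(r & c).count("1") % 2 else 1 for c in range(n)]
            for r in range(n)]


def snr_budget(source, detector_variance):
    """Toy SNRs (power ratios) for direct and multiplexed observation."""
    n = len(source)
    scale = math.sqrt(n)
    # Informative component S(x): half of the normalized bipolar transform.
    informative = [0.5 * sum(h * s for h, s in zip(row, source)) / scale
                   for row in hadamard(n)]
    # Non-informative component U(x): d.c. level, half of the total flux.
    offset = 0.5 * sum(source)
    datagram = [s_x + offset for s_x in informative]

    def power(values):
        return sum(v * v for v in values)

    # Assumed noise model: detector variance plus photon-noise variance,
    # the latter approximated by the local intensity (i for the direct
    # measurement, the offset U for the multiplexed one).
    noise_direct = sum(detector_variance + v for v in source)
    noise_multiplex = n * (detector_variance + offset)
    return {
        "direct": power(source) / noise_direct,
        "multiplex_physical": power(datagram) / noise_multiplex,
        "multiplex_effective": power(informative) / noise_multiplex,
        "informative_to_source_power": power(informative) / power(source),
    }


budget = snr_budget([3, 1, 4, 2], detector_variance=1.0)
for name, value in budget.items():
    print(f"{name}: {value:.3f}")
```

In this example the informative component carries one quarter of the source power, i.e. half of its amplitude. The physical datagram shows the highest SNR, but the effective SNR that governs the source estimate falls below the direct-measurement value.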

## V. Compressive Sampling for Spaceborne Instruments

As noted before, CS does not produce any radiometric or SNR advantage. Therefore, we will consider the use of compressive sampling only in connection with its possible sampling advantages (a reduced data volume holding the same source information).

When spaceborne imagers are considered, this novel technique can be useful for bringing all sensor budgets down to their minimal allowed values. As an example, undersampling a sparse signal calls for Analogue-to-Digital Converters (ADCs) with a lower sampling rate, a circumstance that is expected to reduce the overall electrical power consumption of the instrument. Fewer samples per dataset also imply a lower capacity of the sensor electronic memory, a decrease of the required computing power, and a reduced bandwidth for the satellite downlink. These savings finally translate into lower mass and volume reserved for the instrument and reduced requirements for on-board data processing. The most appealing situation in which CS can be applied is that of hyperspectral imagers operating at high spectral and spatial resolution, where the acquired datasets have three dimensions, resulting in extraordinarily large volumes of collected data that exhibit a significant correlation in both the spectral and the spatial directions. In summary, the typical application of compressive sampling to the remote observation of the Earth and of other planets should be performed in the *x* - *λ* or *x* - *y* domain, providing the relevant advantages listed before. This implies that cost, mass, and volume budgets might be reduced or optimized simply by adopting the compressive sampling architecture for a standard spaceborne hyperspectral sensor. It is worth noting that recent investigations [13] have pointed out that sparsity in the *x* - *λ* and *x* - *y* domains would roughly range from 3% to 10% of the original spectral samples. In the worst case, the number of samples necessary for error-free signal reconstruction would not exceed 40% of the samples of the images of interest. Assuming that this rule of thumb holds true, it is possible to obtain relevant savings from the examined method of undersampling a signal.
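The order of magnitude of the saving implied by the 40% rule of thumb can be sketched with a few lines of arithmetic. The cube dimensions, the 12-bit quantization, and the function name `cube_budget` are hypothetical values chosen only for illustration and do not describe any specific mission.

```python
def cube_budget(rows, cols, bands, bits_per_sample, kept_percent):
    """Raw and compressively sampled data volume (in bits) of one data cube."""
    samples = rows * cols * bands
    kept = samples * kept_percent // 100      # samples actually acquired
    return samples * bits_per_sample, kept * bits_per_sample


# Hypothetical hyperspectral cube: 1000 x 1000 pixels, 200 bands, 12 bits.
raw_bits, cs_bits = cube_budget(1000, 1000, 200, 12, kept_percent=40)
print(raw_bits // 8_000_000, "MB with full sampling")          # 300
print(cs_bits // 8_000_000, "MB with 40% of the samples")      # 120
```

Under these assumptions each cube shrinks from 300 MB to 120 MB before any conventional compression is applied, and the same factor carries over to the memory and downlink budgets.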

## VI. Conclusions

The above analysis of compressive sampling has highlighted some valuable potential advantages of this technology, even though we have shown that the maximal radiometric performance of the CS measurement model is always poorer than that obtainable with the traditional direct measurement approach. The existence of potential advantages of high value is the basic motivation for further research activities in this field. The main technical goals of these investigations comprise: assessing the sparsity *S* of typical *x* - *λ* images; investigating the autocorrelation level of reflectance spectra of natural surfaces observed with spectral resolution varying from 0.5 nm up to 30 nm; analyzing the effects of noise on CS and the maximum interpolation error associated with assigned sparsity and sampling compression schemes; and developing new programmable optical modulators able to reach the higher frame rates that might be necessary for a full application of CS to hyperspectral imaging.