Straylight is a spurious effect that can seriously degrade the radiometric accuracy achieved by Earth observing optical instruments, as a result of the high contrast in the observed Earth radiance scenes and spectra. It is considered critical for several ESA missions such as Sentinel-5, FLEX and potential successors to CarbonSat. Although it is traditionally evaluated by Monte-Carlo simulations performed with commercial softwares (e.g. ASAP, Zemax, LightTools), semi-analytical approximate methods [1,2] have drawn some interest in recent years due to their faster computing time and the greater insight they provide in straylight mechanisms. They cannot replace numerical simulations, but may be more advantageous in contexts where many iterations are needed, for instance during the early phases of an instrument design.
In this paper, we describe a tool we have developed to assess diffuse straylight in imaging spectrometers, based on the Peterson formalism . Our aim is also to fill some information gaps in the literature and to standardize the methods and assumptions used by experts for straylight modelling. In a first section, we will review Bidirectional Scattering Distribution Function (BSDF) models for diffuse straylight, from particle contamination and roughness. In section 2, we will present and discuss the Peterson approach to combine the BSDFs of individual surfaces into a straylight kernel describing the complete optical system, its applicability to off-axis systems and how it can account for diffuse straylight vignetting. In section 3 we will describe how to use this kernel in a full straylight model, for both the telescope and spectrometer part of an imaging spectrometer. We will illustrate the computation principle with examples and finally discuss several possible definitions of straylight, and will explain how they can be evaluated with our approach.
BSDF FOR DIFFUSE STRAYLIGHT
A first source of diffuse straylight is contamination of the optical surfaces with dust particles. A synthetic review of this subject has been presented by Dittman . The dust particles, of random shape and orientation, are traditionally considered spherical to simplify the modeling, although it can be remarked that for a mirror surface, in principle two particles should be considered, the particle and its mirror image. It is also assumed that the particles follow a log-log size distribution according to the MIL-1246B standard . Dittman  discussed the possible use of two slope values (0.926 and 0.313) for respectively cleaned and uncleaned surfaces, as the cleaning process tends to remove preferably the largest particles. He also showed that Spyake and Wolfe  published BSDF curves are convolved with a finite detector aperture and thus should not be used as a starting point for a straylight model.
From the Dittman review, it appears that the most accurate way to get BSDFs is a numerical computation based on the Mie theory [6,7] and integration over the MIL-1246B size distribution. Scripts for Mie computation can be easily found on the web. It is also possible to write one’s own script, with the help of the Wiscombe criteria that tells when to stop the Mie infinite sums , online tools for validation , and some care for evaluating complex mathematical functions at very large orders. The Mie intensities i1 and i2 [6,7] evaluated for one specific particle size, are converted into a BSDF as follows. The scattered intensity I (square modulus of the electric field, proportional to an irradiance in W/m2) at a distance r from the particle is:
where I0 is the incident intensity on the particle, k=2π/λ, θ the scattering angle and unpolarized light is considered. We assume σ particles per unit surface area, dS is a small area perpendicular to the incident beam over which we have σdS particles, and dA a small surface where the scattered light is measured. The BSDF is the ratio scattered radiance L(θ)/incident irradiance I0:
With this formula we could recover the Dittman BSDF curves, compute them for a given distribution slope and several wavelengths on a regular grid. Our final particle contamination BSDF is obtained with a wavelength interpolation and scaling to reach the relevant contamination level. For a mirror, the forward scatter, after reflection adds up to the backscatter so it may be useful to sum up both contributions.
Roughness of optical surfaces: Total Integrated Scatter (TIS)
The TIS for a rough surface is traditionally computed with:
where k=2π/λ, σ is the rms roughness and Δn the refractive index contrast at the interface (Δn =2 for a mirror). To better understand the underlying assumptions and establish the angular dependency, we show how this equation is derived from first principles. We consider a mirror surface with a roughness histogram p(h). This histogram is normalized (its integral is 1) and centered (its average is 0). The optical path difference (OPD) between the real ray reflected by the mirror at the height h, and the ideal ray reflected on the average flat surface for which h=0, is 2hcosθ. The decrease of specular reflection is obtained by interference between all the real reflected rays. We assume that the TIS is equal to 1 minus this specular reflection, by energy conservation:
On the right side we see the Fourier transform of the histogram. Under normal incidence (θ=0), with very small roughness (σ<<1), a Taylor expansion of the exponential allows to recover equation (3). If the angular dependency is kept, for very small roughness we recover the formula given by Stover :
Finally, the usual formula with the exponential TISreflection = 1-exp(-4k2σ2cos2θ) is recovered with a gaussian histogram of arbitrary amplitude. In transmission, a similar reasoning allows to get the following TIS at arbitrary incidence and σ<<1:
where θ1 and θ2 are the incident and transmitted angles, respectively in media with indices n1 and n2. This equation is consistent with the predicted specular transmission established in the microwave remote sensing literature [11,12]. We note that, contrary to the case of reflection, it predicts an increase of TIS at large incidence angles.
The two TIS formulae (5) and (6) may be used to renormalize the considered roughness BSDFs. However, proper normalization is not easy to achieve as at non-normal incidence, the angular boundaries for computing a BSDF integral are shifted. Also, if the assumption TIS = 1-specular is reasonable in reflection on a metallic surface, it may be incorrect in transmission if the transmittance variations with angle are significant, for instance inside a material close to total internal reflection. In most cases, a simple BSDF normalization at normal incidence can be done with (3). If angles are large (e.g. folding mirrors with 45deg incidence) a more sophisticated normalization using (5) or (6) must be implemented.
Roughness of optical surfaces: overview of existing BSDF models
A large choice of BSDF models is available in the literature to describe rough optical surfaces. Some of them are empirical, and care must be taken to select the parameters of these laws so that the BSDF integral is equal to the expected TIS. Others are physically based and proper normalization is automatically ensured. At normal incidence, proper normalization is achieved when:
The ABg , the 2-parameters  and 3-parameters [1,3] Harvey-Shack models are empirical. The Wein formula, as presented in , seems to be physically based but is actually an empirical modification of a rigorous model . The exponent 3/2 was removed from equation (2.20) in , following the suggested empirical fitting law of equation (2.24). It can be easily checked that in the BSDF from reference , proper normalization is not automatically achieved. Finally, the K-correlation model [15, 16] is a physically-based BSDF obtained by inserting an analytical parametrization of the power spectral density (PSD) into the Rayleigh-Rice equation [10, 16]. As can be checked numerically, this model is inherently normalized to the correct TIS.
Dittman proposed to extend the K-correlation model to PSD functions with a slope s≤2 . Such PSD laws are fitting well the experimental BSDFs, and indeed for s=2 one recovers the Wein equation . Unfortunately, PSD laws with s≤2, if assumed mathematically correct over the entire frequency spectrum, result in an infinite roughness σ and cannot be associated to real physical surfaces. Further, the obtained BSDFs are not normalized, and if proper normalization factors are added the final BSDF expression does not follow anymore the Rayleigh-Rice theory. To resolve the issue, Dittman  makes the following reasoning: the TIS, given by the BSDF integral over one hemisphere, is also given by the PSD integral over a limited range of frequencies (limited by the hemisphere boundaries), and finally given by the traditional equation (3) with a band-limited roughness. This band-limited roughness is wavelength-dependent, with the consequences that it is no longer possible to associate a single roughness value to a surface, and the TIS is now determined by the surface PSD.
On the other hand, this concept of wavelength-dependent roughness seems to contain two inconsistencies: (a) it contradicts the basic result, demonstrated above in section II.B, that the TIS is fully determined by the roughness histogram (1st order statistics) and does not depend on 2nd order statistics, such as the PSD or autocorrelation. (b) the Rayleigh-Rice theory does not seem to explain what happens to the evanescent modes (light scattered at angles > 90 deg) and how their energy is redistributed in the propagating modes. Therefore, we can question its validity for PSDs with a significant energy in the high frequencies, such as the ones considered by Dittman for s≤2. In view of the confused situation, we decided to normalize the K-correlation (with s≤2) BSDFs with the usual TIS formula (3) and associate a single-valued roughness to each optical surface.
Roughness of optical surfaces: expected BSDF dependencies
In this section, we try to clarify the dependencies that one should expect from a physically-based BSDF model, with respect to wave vector k=2π/λ, roughness σ, refractive index contrast Δn and angles (incidence angle θ0, scattered angle θ and relative azimuth φ). In reflection, in the limit of very small roughness it can be shown that the Kirchhoff [11, 12], perturbation  and Rayleigh-Rice  rigorous theories all converge to a common law:
where no assumption (other than σ<<1 and isotropy) was done about the surface. We assume it is immersed in a medium of index n1, with Δn=2n1. R is a reflectance factor that also includes polarization projection terms, and F a general function that contains the angular variability. The obliquity factors (slow varying with the angles) have been neglected. For the transmission BSDF, very few references are available that provide rigorous equations. From [11,12] we find that the Kirchhoff and perturbation theories, in the limit σ<<1, converge to:
where light propagates from index n1 to index n2 so that Δn=n2-n1. Again, obliquity factors have been neglected, T is a generalized transmittance coefficient, and F the same function as for the reflection BSDF. In these two equations the TIS as given by (3) appears explicitly in front of the function F. If we exclude it, the remaining terms indicate that when the wavelength increases, the BSDF peak decreases and its curve becomes wider. We finally note that the transmission BSDF may be rewritten as a function of the scattered angle θout in medium n1 (e.g. after it exits a glass component of index n2), noting that the BSDF is a quantity per unit solid angle and per unit projected area:
The obtained expression, very similar to (8), justifies the practice of using the same BSDF models to describe scattering in reflection and transmission. As we will see in the next section, the angular dependency of the BSDFs in (8) and (11) is simplified to implement the Peterson formalism , by assuming normal incidence (θ0=0) and small angles (sinθ~θ).
We now explain how the BSDFs of individual optical surfaces can be combined into a straylight kernel that describes the properties of the entire instrument. Equation (14) in reference  explains how the BSDFs of all surfaces must be summed with different weighting and stretching factors. In this equation, the term πaent2 represents the entrance pupil area, so that πaent2Eent is the incident flux that we assume equal to 1. The transmittance T affects equally the scattered and non-scattered light so it is disregarded and the final kernel expression is:
where the sum index j runs over all optical surfaces, (na) is the instrument numerical aperture, aj the height of the marginal ray when it crosses surface j, and r the distance from the geometrical image measured in instrument focal plane, where the kernel is to be evaluated.
To establish this expression, Peterson only considered optical elements immersed into a medium of index n=1. We now derive a more general law by repeating Peterson’s demonstration and considering air/glass or glass/air interfaces. First, we note that equation (7) in  writes njθj = H/yj in the general case, where nj is the index of the medium where the scattered ray propagates. Then, we note that the power P and the product n2G, where n is the index and G the throughput, are conserved quantities through the instrument. Since P=LG where L is the radiance, we conclude that L/n2 is also conserved. We then get the generalized Peterson law:
From this reasoning we can draw two conclusions. First, the Peterson equation, although derived with simplified assumptions, can be used for all types of interfaces in an optical system. Second, in that equation the BSDFs must be expressed in the external medium with n=1, which (as discussed in the previous section) means the same BSDF models can be used for the reflective and transmissive interfaces.
Equation (12) was derived by Peterson under the assumption of a rotationally symmetric system. Its use for an off-axis instrument is not straightforward, as the height of the marginal ray aj that appears in (12) provides information about the distance to the optical axis but has no link any more with the size of the beam footprint on surface j. We propose to redefine aj as a function of the beam footprint area Sj on surface j:
To validate this approach we considered several simple configurations: (a) a parabolic mirror with a non-centered circular pupil, (b) a parabolic mirror with a non-centered elliptical pupil, (c) an Offner relay with 3 reflections on 2 concentric mirrors. The results are illustrated on figure 1, where the exact straylight kernel is computed with ASAP for a point source at infinity at the center of the field, and compared with the corresponding Peterson kernel computed with (12) and (15). The agreement is excellent.
Diffuse straylight vignetting
In a complex optical system, the argument of the BSDF in equation (12), computed numerically, may take very large values sometimes larger than 90 degrees for some optical surfaces and a given r. This is of course a limitation of the approach, and the BSDF must be set to 0 for angles larger than 90 degrees. On the other hand, this feature highlights the fact that the Peterson approach allows BSDFs of individual surfaces to scatter light at arbitrary large angles. In practice, most of the scattered light will actually fall outside of the physical aperture of the next optical surface, or may be vignetted by another surface further away in the instrument. To evaluate the impact of this effect we compared three levels of approximation:
(a) no vignetting: the BSDF is set to 0 for angles > 90 deg. Straylight is overestimated by the Peterson formalism.
(b) boxcar vignetting function: the BSDF is set to 0 for angles larger than a value called vignetting angle. Its computation is illustrated on figure 2. We consider an object on-axis, and pay attention to the principal ray originating from this object, that is superimposed with the system’s optical axis. At the surface where scattering occurs (here surface #11), the scattered rays (in red) may be vignetted by another optical surface (here surface #18). The vignetting angle is the largest scattering angle for which no vignetting occurs. Typical vignetting angles range from 4 to 9 deg in a telescope, and from 5 to 90 deg in a spectrometer (90 deg is achieved at the last optical surfaces).
(c) realistic vignetting function: the BSDF is multiplied with a vignetting function computed numerically for an object on-axis. The computation is monodimensional, and thus approximate, but nevertheless time-consuming. The obtained vignetting function is assumed constant over the field of view.
The comparison was performed with the spectrometers designed by the industrial primes Selex Finmeccanica (Italy) and Airbus (France) during the phase A-B1 study of the FLEX mission . The impact of the different assumptions is reported in Table 1, for the peak of the straylight spectrum in the high resolution (HR) channel. We see that the implementation of a realistic vignetting angle reduces straylight by 10 to 15%, while implementing a vignetting function with a realistic shape has a small impact (around 2%). On this basis, we have implemented assumption (b) in our model.
Impact of the vignetting assumptions used on the peak straylight in the HR FLEX spectrometer.
|FLEX instrument concept||No vignetting||Boxcar vignetting function||Realistic vignetting function|
|Airbus||100%||-10.2 %||-11.7 %|
|Selex Finmeccanica||100%||-13.5 %||-15.8 %|
STRAYLIGHT MODEL FOR AN IMAGING SPECTROMETER
We now describe how the straylight kernel, obtained with the Peterson formalism described previously and assumed to be constant over the instrument field of view and spectral range, is used to compute straylight in an imaging spectrometer. Imagers are much simpler so the method we present can be easily adapted for these instruments. Fourier-transform spectrometers are more complex and will not be addressed.
Pushbroom imaging spectrometers usually consist of (a) a telescope that images a scene at infinity (b) a slit situated in the telescope focal plane (c) a collimator, with the slit placed at its focus (d) a disperser such as a grating, prism or grism (e) finally a camera that focuses rays coming from the disperser onto a detector array. The slit plays the role of a field stop, so that the detector records in its two dimensions, the spectral information and only one spatial dimension (along-slit). The remaining spatial dimension (across-slit) is acquired temporally while the instrument flies over the observed scene, the slit being perpendicular to the flight direction.
We assume that such an instrument observes a scene that has separated spatial and spectral dependencies. Such scene is typically constructed by aggregating spectra computed with a radiative transfer model for a spatially uniform scene Lj(λ), and giving them some spatial dependency gj(x,y):
In this equation, x is measured along flight direction. The following two contributions are computed separately and will be discussed in the next sections:
- Straylight originating from the telescope (before slit): the Peterson kernel is convolved with the spatial scene. Thus the convolution is performed over the variables x and y. The resulting straylight corresponds to a spatial blur of the image, can be understood as a degradation of spatial resolution and is (at first order) the same at all wavelengths.
- Straylight originating from the spectrometer (after the slit): the kernel is convolved with the spectral-spatial final image over the two variables λ and y. The resulting straylight results from a spatial blur (along slit direction) and a spectral blur. It varies with wavelength and corresponds to a radiometric error affecting the measured spectra.
To limit the model complexity, we will make two additional assumptions:
- the observed scene (or at least the achieved slit illumination) is spatially constant over x, so that the instrument spectral response function (ISRF) is not distorted by a non-uniform slit illumination.
- the imaging PSFs (telescope, spectrometer, pixel size) are assumed to be separable, i.e. they can be written as a product of two functions depending on one variable: PSF(x,y) = F(x)G(y)
The nominal (straylight-free) illumination in the telescope focal plane becomes, ignoring spectral transmittance:
where V(x,y) is a function describing the vignetting that results from the telescope entrance baffle. It can be computed with a simple model, based on the convolution of the instrument pupil (oversized to get some margin) and the useful field, of rectangular shape, defined by the slit. The imaging PSFs before slit involve the telescope optical quality (diffraction, aberrations, multiple spots from a polarization scrambler) and potentially motion smear. The straylight is computed in the telescope focal plane by convolution with the telescope Peterson kernel K, and then propagated to the spectrometer focal plane:
If we neglect the imaging PSFs (replacing them with Dirac functions) we get:
This simple law has been implemented in our model with the following three steps: (a) computation of illumination on slit plane gj(y)V(x,y) (b) convolution with the Peterson kernel (c) implementation of slit opening (selection of the computed straylight for x=0) and multiplication with convolved spectra. Our model is programmed in Python, and makes use of the Numpy, Scipy and Matplotlib packages. The inputs and outputs are collected in Excel and the Python code is called by a Visual Basic macro. Some computation results are presented on figure 3 below to illustrate the method. A telescope with 17 optical surfaces, contamination levels ranging from 50 ppm to 400 ppm and roughness of 1nm rms was illuminated with a half-illuminated field. The simulation took less than 45 sec on a 2.80 GHz computer with 16 Gb RAM under Windows 7.
To compute spectrometer straylight, we make the same two additional assumptions as for telescope straylight (slit illumination uniform in x, and separable PSFs). The nominal illumination on the instrument focal plane in the absence of straylight is:
where the imaging PSF includes all PSFs in the instrument (telescope, spectrometer, detector) along the y axis. The corresponding straylight is obtained through a slightly more complicated equation:
where T(λ) is the product between the instrument spectral transmittance and the detector quantum efficiency. When the convolution with the Peterson kernel is omitted, and if the variations of T(λ) are neglected across the width of the ISRF, equation (20) is recovered. If scenes with simple patterns of spatial contrast are used, and the straylight is evaluated a few pixels away from the transitions, the imaging PSF may be omitted.
Equation (21) is implemented in our model. Some computation results are presented on figure 4 below for a half-illuminated scene, a complex spectrometer with 40 optical surfaces, and similar assumptions as on figure 3. The simulation took less than 75 sec on a 2.80 GHz computer with 16 Gb RAM. It is interesting to note that the spectrometer straylight has a spectrally smooth signature, contrary to the telescope straylight.
Possible straylight definitions
The Peterson method allows to address various interpretations of the straylight definition. The straylight calculated as described in the previous sections includes all the photons that underwent a scattering event. Other definitions are possible:
(a) straylight may be defined as the radiometric error achieved by comparison to an idealized instrument free of straylight. Then the radiometric losses (fraction 1-TIS of the nominal signal) must be subtracted from the computed straylight.
(b) straylight may also be defined as the amount of light being deviated from the geometrical image by more than a certain distance. Then the photons with a too small deviation must be excluded from the straylight we computed. The modelling is more complex due to the necessity to consider the combined effects of scattering and the imaging PSFs as causes of deviation. We implemented such a model and found that the impact on the final straylight is weak, respectively 3.3%, 6.5% and 9.7% for criteria of +/-1.5, +/-2.5 and +/-3.5 times the spatial sampling distances and spectral resolution (equal to the ISRF full width half maximum). Our approach can thus still be used with that definition, knowing that it slightly overestimates straylight.
This paper describes in details all the building blocks needed to set up a semi-analytical model based on the Peterson  formalism and compute diffuse straylight in an imaging spectrometer. Simpler instruments like imagers can be treated with a similar approach. Such a model has been implemented at ESA to perform analyses in support of preliminary phases of Earth observing optical instruments. This tool will allow to obtain the first straylight estimates with short computing times in a context where many iterations are often required. It is not intended to replace commercial softwares (such as ASAP, LightTools, Zemax) used for more accurate analyses later in the development phases of an instrument.
This paper is also a first step towards the harmonization and standardization of the assumptions made by our industrial partners for their straylight analyses. Efforts are ongoing to extend this work in three directions. First, we would like to include diffraction effects, e.g. as described in  in the presented method. Second, we also plan a validation work that will be done through comparisons between our approach and numerical simulations performed on complex instruments with ASAP. Finally, we hope to establish similar semi-analytical tools to predict more elaborate effects, such as optical ghosts and scattering from an instrument baffle.
G.L. Peterson, “Analytic expressions for in-field scattered light distributions”, Proc. SPIE, vol 5178, pp184–193 (2004).Google Scholar
N. Choi and J.E. Harvey, “Image degradation due to surface scatter in the presence of aberrations”, Appl. Opt. 51, 535–546 (2012).Google Scholar
M.G. Dittman, “Contamination scatter functions for stray-light analysis”, Proc. SPIE, vol 4774, 99–110 (2002).Google Scholar
Military Standard 1246B, “Product cleanliness levels and contamination control program” (Sep. 4, 1987).Google Scholar
P.R. Spyak and W.L. Wolfe, “Scatter from particulate contaminated mirrors Part 1, 2, 3”, Opt. 31(8), 1746–1784 (1992).Google Scholar
H.C. Van de Hulst, “Light scattering by small particles”, John Wiley & Sons, New York, 1957.Google Scholar
C.F. Bohren and D.R. Huffman, “Absorption and scattering of light by small particles”, John Wiley & Sons, New York, 1997.Google Scholar
W.J. Wiscombe, “Improved Mie scattering algorithms”, Appl. Opt. 19, 1505–1509 (1980).Google Scholar
J.C. Stover, “Optical Scattering, Measurement and Analysis”, 2nd ed. (SPIE Press, 1995).Google Scholar
F.T. Ulaby, R.K. Moore and A.K. Fung, “Microwave remote sensing, vol II Radar remote sensing and surface scattering and emission theory”, appendix 12D, Addison-Wesley, London 1982.Google Scholar
J.A. Kong, “Electromagnetic wave theory”, section 6.6, Wiley, New York, 1975.Google Scholar
E.R. Freniere, G.G. Gregory, R.C. Chase, “Interactive software for optomechanical modeling”, Proc. SPIE, vol 3130, 128–133 (1997).Google Scholar
S.J. Wein, “Small-angle scatter measurement,” PhD Thesis, Univ. of Arizona, Tuscon (1989).Google Scholar
M. Dittman, “K-correlation power spectral density and surface scatter model”, Proc. SPIE, vol. 6291, 62910R (2006).Google Scholar
J.E. Harvey, N. Choi, A. Krywonos, “Calculating BRDFs from surface PSDs for moderately rough optical surfaces”, Proc. SPIE, vol 7426, 74260I (2009).Google Scholar
FLEX Report for Mission Selection, ESA, SP-1330/2, June 2015.Google Scholar
R. Berlich, B. Harnisch, “Radiometric assessment method for diffraction effects in hyperspectral imagers applied to the Earth Explorer 8 mission candidate FLEX”, ICSO 2014 conference, Tenerife, 7-10 Oct. 2014.Google Scholar