Universal dictionaries for representation of turbulent atmosphere point spread functions

Abstract. Dictionary methods have proved very useful in compressive sensing. Herein we show that, for dictionaries to represent the point spread functions (PSFs) of image formation in atmospheric turbulence, it is possible to construct dictionaries computationally. Computationally created dictionaries make unnecessary the exhaustive collection of PSF data, for arbitrarily large numbers of conditions of turbulence and optical image formation systems, while also reducing the overall number of dictionaries that need to be created and stored.


Introduction and Overview
Compressive sensing methods have demonstrated that additional image and signal representations, beyond those produced by the Fourier transform, have significant benefits. For example, the wavelet basis-set was proven better than the Fourier basis-set for compression of images, and the Haar basis-set is more useful than Fourier representation for efficient representation of discontinuous functions, e.g., pulse trains. 1 A further theme of compressive sensing is the use of a dictionary to represent a signal. A dictionary is, in simplest terms, a basis-set of functions. Dictionaries can be constructed by machine learning from a sufficiently large set of exemplars collected within a specific data domain.
The ubiquity of compressive sensing was recently demonstrated for the representation of atmospheric turbulence. A dictionary, trained from images of turbulent point spread functions (PSFs), was able to synthesize new PSFs that possess the same characteristics as the original PSFs used to train the dictionary. 2 Herein, we present updated results demonstrating a method that frees the previously published, dictionary-based, PSF representation method from the requirement for collection of large sets of experimental PSF data.
We shall do this using dictionary basis-set representation as posed by the simple equation

$$\mathbf{y} = \mathbf{D}\mathbf{x}, \qquad (1)$$

where y is the signal to be represented and x is the vector of representation coefficients. The matrix D is a collection of basis-set functions, where each function in the basis-set is a column of the matrix. The challenge is not to represent a signal by an arbitrary basis-set but to use functions specifically "tuned" to the characteristics present in the data. The K-means Singular Value Decomposition (K-SVD) algorithm was developed to be a solution for the problem of creating a dictionary tuned to structures inherent in a class of signals 3 using an approach similar in general philosophy to the k-means algorithm. The algorithm assumes that a collection of data vectors exists that sufficiently characterizes the overall variations in conditions to be efficiently represented from the dictionary via k-means approximations and singular value decomposition. The K-SVD algorithm has been validated for constructing dictionaries of PSFs and published results of PSF representations are available. 2,4 The data used in these validation efforts was collected at the Air Force Research Laboratory in the Wright Patterson Air Force Base (AFRL/WPAFB), located in Dayton, Ohio. Various details of the experimental PSF data collection are found in Ref. 4 and an example of a 256-atom dictionary constructed from one of the turbulent PSF collections is seen in Ref. 2. Using six separate tests, originally documented in Ref. 2 and revisited in Sec. 3.2 below, we found dictionary-synthesized PSFs possessed the same characteristics as the original PSFs. In this paper, we shall use the same six tests to test the accuracy of dictionary representation of PSFs using what we refer to as a "universal dictionary."

Motivation and Approach: Efficient Representation of Atmospheric Turbulence PSFs
Representation of data is related to how the data will be used, transmitted, or stored. Wavelet functions are known independent of the coefficients, so only coefficients are needed to reconstruct the original data represented in a wavelet basis, 5 similar to the dictionary approach described by Eq. (1). As another example, consider the problem of multiframe blind deconvolution (MFBD) for deblurring imagery corrupted by atmospheric turbulence. In MFBD, each image frame is corrupted by a different PSF. The deconvolution must estimate the PSF and the underlying object. Solving for representation coefficients of a PSF during MFBD means fewer PSF estimation iterations over all pixels. This has been verified in actual practice with PSF dictionaries used in an advanced MFBD algorithm. 6 The specific details for making a PSF representation are simple. However, the nature of turbulent PSFs makes PSF representation difficult in a conventional basis-set. Turbulent PSFs contain many small "speckles" that require many high frequency components in a Fourier basis-set representation. If a Zernike representation is chosen for turbulent PSFs, the radial symmetry inherent in the Zernike polynomials requires many terms to represent the lack of any radial symmetry in an asymmetric turbulent PSF. A more desirable situation is to identify and capture significant structures in a representative collection of turbulent PSFs to make a functional basis-set. This is the role of PSF dictionaries and motivates our desire to generalize PSF dictionary creation, as compiled from PSFs created from atmospheric turbulence.
The initial development of dictionary representation for turbulent PSFs in Ref. 2 proved the validity of the following hypothesis, which we refer to as the dictionary synthesis hypothesis: A dictionary compiled from experimentally measured turbulent PSFs can be used to synthesize PSFs that are structurally and statistically equivalent to the turbulent conditions that were the source of the PSFs used to compile the dictionary. 2 It is important, at this point, to bear in mind the nature of a hypothesis. Some cogent aspects of a hypothesis are captured by its definition: a hypothesis is a tentative assumption made to draw out and test its logical or empirical consequences, or an idea proposed for the sake of argument so that it can be tested to see if it might be true. 7 Further, a hypothesis is usually tentative: it is an assumption or suggestion made strictly for the objective of being tested.
To this definition we add that a hypothesis can be tested in multiple different ways. When a hypothesis survives multiple different tests, its credibility is established and reinforced. This is the motivation for the multiple tests that were reported in Ref. 2. We note that tests other than in Ref. 2 can be further devised and applied, and we believe that novel tests can emerge from the dictionary synthesis hypothesis in the future.
We now wish to consider an extension of this hypothesis to the following dictionary universality hypothesis: A dictionary compiled from computationally created PSFs, for specific turbulence conditions, can be used to make dictionary representations of "real" PSFs (i.e., PSFs directly measured or present in indirect ways, e.g., via image formation), if the representations are structurally and statistically equivalent to the real PSFs.
Without the dictionary universality hypothesis, a dictionary that is to represent a PSF faithfully must be compiled by a machine learning algorithm applied to an ensemble of PSFs collected under atmospheric and optical conditions identical to those present in the PSF data to be represented. Unfortunately, equipment, cost, and variability of weather and collection environments rule out unbounded collections of PSF ensembles for this purpose.
Assume, for example, that a PSF dictionary has been compiled and verified by the methods of the six tests stated above, as shown in Ref. 2. The dictionary will be associated with a specific value of Fried's coherence diameter, $r_0$. 8 However, if the imaging system is much closer to or further away from a point object used to collect PSFs for a dictionary, the effects of turbulence in the optical path will be different than those for which PSFs were collected to compile the dictionary at hand. We expect this as the result of the distance of propagation interacting with the turbulence structure constant, $C_n^2$, in the computation that defines the coherence diameter, $r_0$. 9 Furthermore, as proved in the paper by Fried (that sets forth the probability of a "lucky (diffraction limited) image" in turbulence), 10 the ratio of the limiting pupil diameter to the value of $r_0$ determines the impact that turbulence makes in the quality of the image. Thus, even if all turbulence conditions remain unchanged, optical system or propagation distance changes will alter the behavior of turbulent PSFs in the image plane.
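The dependence of the coherence diameter on propagation distance can be illustrated with a short calculation. The sketch below assumes a constant $C_n^2$ along a horizontal path and uses the standard Kolmogorov expressions for the plane-wave and spherical-wave (point-source) coherence diameter. The 787-nm wavelength and 5-km path are taken from the text; the $C_n^2$ value, the aperture diameter, and the function name `fried_r0` are hypothetical illustration choices, not the measured values of the experiment.

```python
import math


def fried_r0(cn2, path_length_m, wavelength_m, spherical=True):
    """Fried coherence diameter r0 (m) for constant Cn^2 over a horizontal path."""
    k = 2.0 * math.pi / wavelength_m  # optical wavenumber (rad/m)
    # For a point source, Cn^2 is weighted by (z/L)^(5/3); integrating that
    # weight over a uniform path of length L gives (3/8) L.
    weight = 3.0 / 8.0 if spherical else 1.0
    return (0.423 * k ** 2 * cn2 * path_length_m * weight) ** (-3.0 / 5.0)


wavelength = 787e-9   # laser wavelength from the text (m)
cn2 = 1e-15           # ASSUMED structure constant (m^(-2/3)), illustration only
aperture = 0.2        # ASSUMED pupil diameter (m), illustration only

for path in (2500.0, 5000.0, 10000.0):
    r0 = fried_r0(cn2, path, wavelength)
    print(f"L = {path / 1000:4.1f} km  r0 = {100 * r0:5.2f} cm  D/r0 = {aperture / r0:4.2f}")
```

With the turbulence strength held fixed, doubling the path length scales $r_0$ by $2^{-3/5}$, so the ratio $D/r_0$ (and with it the character of the PSFs) changes even though the atmosphere itself has not.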
The comprehensive experimental collection of PSFs, for all possible conditions of turbulent atmosphere, optical designs, and collection geometry is neither practical nor reasonable. This brings forth the desire for a way to create dictionaries without reliance on experimental collection of PSFs. We seek to have a universal method for the creation of dictionaries, a method that does not rely on unbounded experimental collection of PSF data. As such, in the following we shall discuss and demonstrate an approach for what we refer to as "dictionary universality."

Method: Dictionary Universality by Simulation of Turbulent PSFs
Optical simulation has long been valued, as the result of significant research and modeling, for exploring different physical effects and trade-offs in the design and configuration of optical systems. 11 Simulation of optical systems is an accepted and standard tool, when correctly applied. The classic text of Imaging Through Turbulence 9 contains details for the simulation of turbulent image data for adaptive optical systems, as do two other "how-to-do-it" texts in Refs. 12 and 13. In the case of extreme turbulence, a recent publication demonstrated simulation success using these methods. 14 For simulation of turbulence, and the computation of the corresponding PSFs, we used a comprehensive algorithm using the phase screen approach as in Refs. 9, 12, and 13. The structure of the phase variations in the screens is controlled by the Kolmogorov power spectral density with the number of phase screens as a simulation parameter.
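As a rough illustration of the phase screen approach, the sketch below draws one random screen whose spectrum follows the Kolmogorov phase power spectral density, $\Phi(f) = 0.023\, r_0^{-5/3} f^{-11/3}$, using the common FFT filtering method. It is deliberately simplified and is not the simulation software used for the results reported here: a single screen is applied directly in the pupil, with no split-step propagation between multiple screens and no low-frequency (subharmonic) compensation. The grid size, sample spacing, $r_0$, and aperture diameter are hypothetical values.

```python
import numpy as np


def kolmogorov_phase_screen(n, dx, r0, rng):
    """One random phase screen (radians) on an n x n grid with spacing dx (m)."""
    df = 1.0 / (n * dx)                       # frequency spacing (cycles/m)
    fx = np.fft.fftfreq(n, d=dx)              # FFT-ordered spatial frequencies
    fxx, fyy = np.meshgrid(fx, fx)
    f = np.hypot(fxx, fyy)
    f[0, 0] = np.inf                          # zero out the undefined piston term
    psd = 0.023 * r0 ** (-5.0 / 3.0) * f ** (-11.0 / 3.0)
    noise = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    spectrum = noise * np.sqrt(psd) * df      # random Fourier coefficients
    # numpy's ifft2 divides by n^2; undo that to obtain the Fourier sum.
    return np.real(np.fft.ifft2(spectrum)) * n * n


def psf_from_screen(phase, pupil, pad=2):
    """Far-field intensity of the aberrated pupil, normalized to unit volume."""
    n = phase.shape[0]
    field = pupil * np.exp(1j * phase)
    image_field = np.fft.fftshift(np.fft.fft2(field, s=(pad * n, pad * n)))
    psf = np.abs(image_field) ** 2
    return psf / psf.sum()


rng = np.random.default_rng(1)
n, dx = 64, 0.005                             # ASSUMED grid: 64 samples, 5 mm apart
coords = (np.arange(n) - n / 2) * dx
xx, yy = np.meshgrid(coords, coords)
pupil = (np.hypot(xx, yy) <= 0.1).astype(float)   # ASSUMED 0.2 m circular aperture
screen = kolmogorov_phase_screen(n, dx, r0=0.05, rng=rng)
psf = psf_from_screen(screen, pupil)
print("phase rms (rad):", screen.std(), " PSF peak:", psf.max())
```

Because the spectrum amplitude scales as $r_0^{-5/6}$, the same random draw with a smaller $r_0$ gives a proportionally stronger screen, which is a convenient sanity check on an implementation.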

Selection of Simulation Parameters
As stated above, multiple collections of PSFs, imaged through atmospheric turbulence conditions, were available from previous experiments at AFRL/WPAFB. We summarize the optical system turbulence conditions for these PSFs in Table 1. The value of the structure constant, $C_n^2$, was obtained from scintillometer measurements made in the PSF collections. Note that focal plane sampling (or quality factor, Q) 11 was virtually free of aliasing for the telescope aperture. The altitude change from source to telescope was minimal, and the measured value of the structure constant was accepted as unchanged over the propagation distance. A laser provided the point source at the indicated wavelength.
The parameters summarized in Table 1 were utilized in phase screen PSF simulation software using the accepted wavefront propagation techniques described above. For example, the optical focal length and the focal plane sampling, as related by Q, were used to simulate the actual sampling of the PSFs in the target focal plane detector pixels, as summarized in Table 2.
The simulated aperture was matched to the laser emission wavelength (787 nm), and focal length was chosen to make the focal plane detector Nyquist sampled. The PSFs were simulated on 32 × 32 pixel frames, to maintain consistency with the size of the 32 × 32 pixel frames extracted from the experimental PSF collections. We provide the Nyquist pixel spacing in the object plane, calculated as λL/(2D), 14 in comparison. Finally, the simulation was set for a total of 10 phase screens from the source to the collection optical system. A total of 5000 PSFs were simulated, using the parameters in Table 2.

Parameter                                                Value
Number of atoms in the K-SVD dictionary                  256
Number of K-SVD iterations                               6
Initial K-SVD dictionary                                 DCT
Number of atoms used in fixed K-SVD representations      64
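The sampling quantities mentioned above reduce to simple arithmetic. The sketch below evaluates the object-plane Nyquist spacing λL/(2D) and, assuming the common definition Q = λf/(Dp) with Q = 2 corresponding to Nyquist sampling, the focal length that Nyquist-samples a given detector. The wavelength and 5-km range come from the text; the aperture diameter, pixel pitch, and function names are hypothetical.

```python
def nyquist_object_spacing(wavelength_m, range_m, aperture_m):
    """Object-plane Nyquist sample spacing, lambda * L / (2 * D), in meters."""
    return wavelength_m * range_m / (2.0 * aperture_m)


def sampling_q(wavelength_m, focal_length_m, aperture_m, pixel_pitch_m):
    """Focal plane sampling factor Q = lambda * f / (D * p); Q = 2 is Nyquist."""
    return wavelength_m * focal_length_m / (aperture_m * pixel_pitch_m)


def focal_length_for_q(q, wavelength_m, aperture_m, pixel_pitch_m):
    """Focal length (m) that yields the requested Q for a given detector."""
    return q * aperture_m * pixel_pitch_m / wavelength_m


wavelength = 787e-9    # from the text (m)
path_length = 5000.0   # from the text (m)
aperture = 0.2         # ASSUMED aperture diameter (m)
pixel_pitch = 5e-6     # ASSUMED detector pixel pitch (m)

spacing = nyquist_object_spacing(wavelength, path_length, aperture)
focal_length = focal_length_for_q(2.0, wavelength, aperture, pixel_pitch)
print(f"object-plane Nyquist spacing: {1000 * spacing:.3f} mm")
print(f"focal length for Q = 2: {focal_length:.3f} m")
```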
The computed PSFs in the image plane are saved with all the related parameter documentation. During the simulations of PSFs, information from each propagation, for a given turbulence level, is integrated to give a computed estimate for the coherence diameter, $r_0$, providing additional monitoring of the simulation results. The collection of simulated PSFs is then used as the input for the K-SVD algorithm to compile a dictionary for representation of the PSFs measured in the experiment summarized in Table 1 above.

Representation of PSFs from a Dictionary Compiled by Simulated PSFs
For economy of language, we refer in the following to the AFRL experiment collection of PSFs at WPAFB as "real" PSFs. To evaluate the accuracy of universal dictionary representation of real PSFs, we compiled a dictionary from simulated PSFs, generated as described above, in Table 2. The dictionary compilation code used to derive dictionary atoms was the publicly available version of the K-SVD algorithm. 3 The total number of derived atoms in the dictionary was 256. The number of iterations for the K-SVD algorithm was 6 with the representation flag set to use fixed length representations, i.e., a specific number of atoms were used for the representation of each PSF during the computations for the optimum dictionary. In this case, 64 dictionary atoms were used in the K-SVD representations of PSFs, with those 64 atom representations then used in the optimization cycle of dictionary updates. The K-SVD algorithm uses an initial dictionary for "iteration zero," and, for this, we chose the standard basis function set of the discrete cosine transform (DCT) matrix.
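To make the compilation procedure concrete, the following is a minimal sketch of K-SVD with fixed-length representations, written from the general description of the algorithm rather than from the publicly available code that was actually used. Each iteration sparse-codes every training vector with a fixed number of atoms and then updates each atom by a rank-one SVD of the residual restricted to the signals that use it. The function names, the one-dimensional DCT-like initial dictionary, and the toy problem size are assumptions chosen so the example runs quickly; the settings reported in the text were 256 atoms, 6 iterations, and 64 atoms per representation.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: sparse x with D @ x approximating y."""
    residual = y.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:              # no new atom reduces the residual
            break
        support.append(k)
        # Least-squares refit on all selected atoms (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def dct_dictionary(n_pixels, n_atoms):
    """DCT-like initial dictionary with unit-norm columns (1-D simplification)."""
    t = np.arange(n_pixels)[:, None]
    k = np.arange(n_atoms)[None, :]
    D = np.cos(np.pi * (t + 0.5) * k / n_atoms)
    return D / np.linalg.norm(D, axis=0)


def ksvd(Y, D0, n_nonzero, n_iter):
    """Learn a dictionary from the columns of Y; returns (D, X) with Y ~ D @ X."""
    D = D0.copy()
    for _ in range(n_iter):
        # Sparse coding stage: fixed-length representation of every signal.
        X = np.column_stack([omp(D, y, n_nonzero) for y in Y.T])
        # Dictionary update stage: one atom at a time.
        for k in range(D.shape[1]):
            users = np.nonzero(X[k])[0]
            if users.size == 0:
                continue              # atom unused in this pass, leave it as is
            X[k, users] = 0.0
            E = Y[:, users] - D @ X[:, users]   # residual without atom k
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            X[k, users] = s[0] * Vt[0]
    return D, X


# Toy data: 150 signals of length 16, each a mix of 3 hidden atoms.
rng = np.random.default_rng(0)
n_pixels, n_atoms, n_signals, sparsity = 16, 24, 150, 3
hidden_atoms = rng.standard_normal((n_pixels, n_atoms))
codes = np.zeros((n_atoms, n_signals))
for j in range(n_signals):
    picks = rng.choice(n_atoms, sparsity, replace=False)
    codes[picks, j] = rng.standard_normal(sparsity)
Y = hidden_atoms @ codes

D0 = dct_dictionary(n_pixels, n_atoms)
D, X = ksvd(Y, D0, n_nonzero=sparsity, n_iter=6)
print("relative error:", np.linalg.norm(Y - D @ X) / np.linalg.norm(Y))
```

For PSF data, each 32 × 32 frame would be flattened into one column of `Y`. Once a dictionary has been learned, `omp(D, y, n_nonzero)` can also be applied to a vector `y` that was not part of the training set, which is the role OMP plays in representing measured PSFs from a dictionary compiled from simulated ones.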
Following the compilation of the dictionary from the 5000 PSFs propagated through 10 phase screens, the dictionary representations of the real PSFs were computed using the Orthogonal Matching Pursuit (OMP) algorithm. 3 We varied the number of atoms used in dictionary representations from 32, 64, 128, 192 to 256 atoms in all the tests discussed below. The same six tests, as established for the dictionary synthesis hypothesis presented in Ref. 2, were applied to compare the original real PSFs to the PSFs represented by our 10-phase screen universal dictionary. The description and purpose of the six tests may be summarized as the following.

Results: The Six Tests
Initial experiments indicated errors in the accuracy of representations of the PSFs. A series of histogram examinations of the AFRL PSFs and dictionary representations of those PSFs showed a discrepancy in the way that the smallest PSF intensity values were emerging from the OMP representation process. This led to detailed examination of the background statistics of the AFRL PSFs and closer scrutiny of the PSF sources from the AFRL experimental data collection.
In the WPAFB PSF experiments, the camera frame rate and active laser rate were purposely set at different values. This resulted in collecting some frames with no laser source and, occasionally, some where the laser "clipped" as it was coming on or off, as illustrated by the orange pulse in Fig. 1. Frames with the laser completely off (indicated by the red pulse in Fig. 1) were extracted from the data and averaged to integrate out noise and random changes in the environment lighting. The integrated result revealed faint, but definite, horizontal bands, ∼3 to 4 pixels wide and separated by 3 to 4 pixels, as shown in Fig. 2. The mean intensity values of these bands were in the range of 300 to 400, which corresponded to the range of values that were often in error when represented by the dictionary. The banding effect is believed to be a problem with crosstalk somewhere in the electronics that supported the collection of the PSFs. Bands of this sort have occurred previously in digital image data due to system crosstalk. For example, in Ref. 15, electronic crosstalk was detected in the spectral bands of the MODIS satellite operated by NASA.
In recognition of the background and crosstalk issues described above, intensity values of the experimentally collected PSFs were accumulated in a 2 × 2 pixels border frame on the boundary of the 32 × 32 pixels frame of the PSFs. These values were used to estimate the mean background, which was subtracted from the PSFs. The background-corrected PSFs were then checked for any (nonphysical) negative values, which were set to zero, followed by normalization of the PSFs to unit volume. Similar procedures of nonnegative and normalization constraints were applied to the simulated PSFs, because dictionary atoms contain negative values and the basic OMP algorithm does not enforce positivity.

Fig. 1 The dashed lines illustrate how, if the laser pulse period is 10 μs longer than the camera frame period, a camera with the indicated integration times will capture ∼69 full-length pulses followed by 31 pulses missed (red) or partially missed (orange), repeated 10 times during a 1-s collect.

The six tests, as summarized above, were applied for the comparison of the PSF dictionary representations to the PSFs collected in the WPAFB experiment. The results from the application of the six tests were favorable in support of the dictionary universality hypothesis.
• Test 1: visual inspection, showed PSF representations by OMP of the real PSFs to be indistinguishable in general features, size, and shape from the PSFs collected in the experiments. Figure 3 shows examples of three PSFs, with the real AFRL experimental PSFs on the left and the OMP representations on the right.
• Test 2: comparing the ensemble mean of PSF representations with the real PSFs, showed indistinguishable visual displays, and the difference between the two ensemble means was equivalent to a signal-to-noise ratio (SNR) of 20.22 dB. Figure 4 is the side-by-side comparison of the mean of the two sets of PSFs. The mean of the real PSFs is on the left.
• Test 3: comparing the ensemble mean of the autocorrelation of PSF representations with the real PSFs showed indistinguishable visual displays, and the difference between the two ensemble means was equivalent to an SNR of 46.29 dB. Figure 5 is the side-by-side comparison of the mean autocorrelation of the two sets of PSFs.
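The background correction and the ensemble comparisons of Tests 2 and 3 can be sketched in a few lines. The code below is an illustration under stated assumptions, not the processing code used for the reported results: the border frame is interpreted as a 2-pixel-wide band around the 32 × 32 frame, the SNR is taken as the ratio of reference energy to difference energy in decibels (the exact definition behind the reported values is not stated in the text), and the autocorrelation is the circular one obtained from FFTs. The function names and the synthetic frames are hypothetical.

```python
import numpy as np


def preprocess_psf(frame, border=2):
    """Subtract the border-estimated background, clip negatives, normalize."""
    frame = np.asarray(frame, dtype=float)
    mask = np.ones(frame.shape, dtype=bool)
    mask[border:-border, border:-border] = False    # True only on the border band
    corrected = np.clip(frame - frame[mask].mean(), 0.0, None)
    total = corrected.sum()
    if total == 0.0:
        raise ValueError("frame contains no signal above background")
    return corrected / total                        # unit volume


def mean_autocorrelation(psfs):
    """Ensemble mean of the circular 2-D autocorrelation of a stack of PSFs."""
    power = np.abs(np.fft.fft2(psfs, axes=(-2, -1))) ** 2
    acf = np.real(np.fft.ifft2(power, axes=(-2, -1)))
    return np.fft.fftshift(acf, axes=(-2, -1)).mean(axis=0)


def snr_db(reference, estimate):
    """ASSUMED metric: 10 log10 of reference energy over difference energy."""
    noise = np.sum((reference - estimate) ** 2)
    return 10.0 * np.log10(np.sum(reference ** 2) / noise)


# Synthetic stand-ins for measured frames and their dictionary representations.
rng = np.random.default_rng(2)
yy, xx = np.mgrid[0:32, 0:32]
spot = 5000.0 * np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / 8.0)
raw = spot + 350.0 + rng.normal(0.0, 5.0, size=(20, 32, 32))
real = np.stack([preprocess_psf(f) for f in raw])
represented = np.stack([preprocess_psf(f) for f in real + rng.normal(0.0, 1e-4, real.shape)])

print("Test 2 style SNR (dB):", snr_db(real.mean(axis=0), represented.mean(axis=0)))
print("Test 3 style SNR (dB):", snr_db(mean_autocorrelation(real), mean_autocorrelation(represented)))
```

Applying the same clipping and normalization to both sets before comparison keeps the metrics from being dominated by a constant background offset.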

Discussion
An initial stage of the work presented herein took place with a significantly reduced simulation structure. The PSFs were simulated according to the conditions in Table 1, but the dictionary was compiled from PSFs simulated with only one phase screen, placed midway over the distance from object to receiver. The results from that initial simulation study of the dictionary universality hypothesis were not as favorable as the results reported immediately above. However, the real PSFs, when represented from the dictionary compiled from PSFs simulated with propagation through only one phase screen, did remarkably well in relation to the real PSFs. Applying the six tests presented above showed that most tests passed, even though the number of phase screens (one) in the PSF simulations was substantially less than recommended practice. The failures of the tests for the one-phase screen PSF representations were isolated to the cases where a smaller number of dictionary atoms was used. Specifically, using fewer than 196 dictionary representation atoms, of the 256 dictionary atoms available, caused failures for tests 4 and 5.
As readily accessible from the literature for simulation of turbulence, general recommendations for the number of phase screens to be used in turbulence simulations would advise that many phase screens are necessary for physical fidelity. 16 As shown in Ref. 17, focused on anisoplanatic image formation, 4-5 phase screens are recommended for simulation of anisoplanatic conditions over the short distance of 1 km. Further, simulation of turbulence using a "brightness function" process, found to be more efficient for the generation of turbulence realizations, was compared to wave-optics modeled examples simulated with 20 phase screens over distances of 10 km. 18 Thus, our use of 10 phase screens in our 5-km PSF simulations is consistent with recommendations widely found in the literature.
A question that immediately arises is how most of the six tests, using PSFs simulated with only one phase screen, showed favorable results. This question is natural since PSFs simulated with an order of magnitude fewer screens than recommended in the references cited above were not expected to be a close match with real PSFs. No direct answer to this question is on hand. We do offer a speculation about this question based on the following recently reported results:
• Two recent publications simulated and then demonstrated from experimental data the existence of "spatial stabilization of deep-turbulence-induced anisoplanatic blur." 19,20 The observed spatial stabilization phenomenon means that, in deep turbulence, observed PSFs, even in the presence of a very small isoplanatic angle, do not vary radically over small position changes in the image plane. Instead, PSFs with relatively simple structure occur over slightly larger regions, varying with image plane position in a well-correlated manner. Such PSFs are more predictable and less stochastic than the full nature of deep turbulence would initially lead one to expect. Thus, the necessity of using many phase screens for accurate simulation may thereby be reduced in such circumstances.
• We also note that the experimental data leading to the verification of the spatial stabilization in Refs. 19 and 20 is from the same data set that was used to compile the results reported in Ref. 2 and used in the tests reported directly above. If resources become available, we would urge investigating the spatial stabilization phenomenon demonstrated in Refs. 19 and 20 within the context of the number of phase screens needed to accurately simulate turbulent PSFs.
As noted above, any hypothesis is subject to being tested. We consider the results presented above to be an initial demonstration of the dictionary universality hypothesis. Further, we believe that the results reported herein are the basis and motivation for more tests of the dictionary universality hypothesis. Beneficial avenues of inquiry for such additional tests may be more complexity and fidelity in the simulation of the turbulence that gives rise to PSFs in an optical system, as well as more complexity and detail in the optical systems wherein PSFs are formed. We also suggest that the six tests utilized and described above can be replaced with other metrics for assessing whether a collection of PSFs, represented in a dictionary, is equivalent to real PSFs. The only limit to the carefulness and thoroughness of such dictionary universality hypothesis tests will be the amount of resources available to be devoted to additional efforts.