Efficient wavefront sensing for space-based adaptive optics

Future large space telescopes will be equipped with adaptive optics (AO) to overcome wavefront aberrations and achieve high contrast for imaging faint astronomical objects, such as earth-like exoplanets and debris disks. In contrast to AO that is widely used in ground telescopes, space-based AO systems will use focal plane wavefront sensing to measure the wavefront aberrations. Focal plane wavefront sensing is a class of techniques that reconstruct the light field based on multiple focal plane images distorted by deformable mirror (DM) probing perturbations. In this paper, we report an efficient focal plane wavefront sensing approach for space-based AO which optimizes the DM probing perturbation and thus also the integration time for each image. Simulation of the AO system equipped with a vortex coronagraph has demonstrated that our new approach enables efficient information acquisition and significantly reduces the time needed for achieving high contrast in space.


Introduction
One of the major goals for the next-generation large space telescopes 1-3 is to directly image faint earth-like planets. This requires that the telescope be equipped with a coronagraph 4-8 for suppressing starlight and adaptive optics (AO) for correcting wavefront aberrations (see Fig. 1). In ground-based telescopes, 9,10 adaptive optics typically works by first measuring the wavefront aberrations using a wavefront sensor 11 and then compensating for the aberrations using devices such as deformable mirrors (DMs). 12 However, this conventional approach is not suitable for space missions where a higher contrast (below 10 −9 rather than the 10 −6 typical of ground telescopes) is required because a separate wavefront sensor introduces non-common-path errors. Instead, focal plane wavefront sensing 13 must be used in space-based AO to retrieve the aberrated light field. This is done via small probing commands to the DMs, causing the light field to vary slightly, allowing it to be estimated by observing the corresponding focal plane intensity changes and solving a phase-retrieval optimization problem.
Currently, the benchmark method for focal plane wavefront sensing is pair-wise DM probing followed by a batch process estimation. 14,15 This approach constructs a linear observation of the light field using pairs of opposite DM probing commands and then formulates the wavefront sensing as a least-squares problem. Building on that architecture, several improved wavefront sensing approaches have also been proposed, such as the Kalman filter method 16 which combines information from previous AO control steps and the extended Kalman filter 17,18 which enables simultaneous incoherent source estimation. These improvements focus only on the formulation of the statistical estimation problem, not the DM probing and image acquisition process itself. However, in space-based AO, the latter is equally important. Unlike similar phase retrieval problems in other fields, such as quantitative phase imaging, 19 wavefront sensing in space-based AO does not provide any science results in and of itself; the retrieved light field is only used for wavefront correction. Nevertheless, the ultimate objective is to observe the faint incoherent astronomical objects hidden below the residual light from these coherent wavefront aberrations. Reducing the time spent on wavefront sensing and image acquisition significantly increases the available time for science observations.
In this paper, we propose an improvement to the stochastic modeling of a space-based AO system and accordingly introduce efficient wavefront sensing policies, where optimal DM probing commands and camera exposure times are used. Simulation results using a vortex coronagraph system 6 show that our new approach achieves almost the same accuracy of the field estimation with fewer images and much shorter exposures, thus significantly reducing the time spent on wavefront correction. shown here for simplicity, however, typically more than two deformable mirrors are used in a real AO system) and the coronagraph suppresses the starlight using a series of masks, including (a) pupil plane mask (a binary mask, fully transmissive or fully opaque), (b) a focal plane vortex mask 6 (a pure phase mask, its phase shown in the figure), and (c) a Lyot stop (a binary mask). After several AO control steps, a high-contrast annular observation region appears in the image.

Space-based AO
Adaptive Optics in a space telescope is used to correct the wavefront aberrations in the telescope optics and the coronagraph instrument. Figure 1 shows a representative space telescope system equipped with both AO and a coronagraph. A coronagraph 4-8 is a type of optical device designed for imaging faint companions around a star. In a coronagraph instrument, a series of amplitude and phase masks work together to block out the on-axis starlight but transmit the off-axis light sources, ultimately creating a high-contrast observation region, or so-called dark hole, in the image plane. However, the coronagraph is sensitive to complex wavefront aberrations (both amplitude and phase errors) in the optical system. When the wavefront is aberrated by the lens/mirror surface roughness, misalignments, and thermal effects, the focal plane observations are contaminated with bright speckles and the contrast in the dark hole is significantly degraded. In that case, DMs in the AO system are needed to restore the designed high contrast. The DM's surface can be controlled by applying various voltages to the DM actuators. 12 As mentioned in Sec. 1, DMs first apply probing commands to sense the light field, then apply control commands to compensate for the wavefront aberrations. This process is iterated, between the sensing and control, to dig a final dark hole. In this section, we describe the current space-based AO process, including system modeling and the wavefront sensing control (WFSC) policies.

System modeling
The light propagation through the coronagraph can be modeled as a linear operation. Figure 1 shows a space-based AO system equipped with a vortex coronagraph, 6 which consists of a binary (fully transmissive or fully opaque) pupil plane mask, a focal plane vector vortex coronagraph (a pure phase mask which introduces an azimuthal phase ramp) and a binary Lyot stop mask. It defines a general architecture of a coronagraph instrument, however, the pupil plane masks can also be replaced by apodizers and the focal plane mask can also be replaced by amplitude masks in other types of coronagraphs. [4][5][6][7][8] With the coronagraph's pupil mask, focal plane mask, and the Lyot stop respectively denoted as M P , M F and M L , the relationship between the pupil plane field, E p , and focal plane field, E f , is where F and F −1 represent the two-dimensional Fourier transform and inverse Fourier transform, respectively, and C is the composite linear coronagraph operator. To keep the representation clean, here we neglect a constant coefficient related to the light wavelength λ. Deformable mirrors introduce phase perturbations to the incident light field. Assuming the incident light field with complex wavefront aberrations is given by E ab , the corrected pupil wavefront downstream of the adaptive optics is where φ is the surface height of the DM, λ is the light wavelength, and the constant is 4 instead of 2 because the change in the optical path length is twice the mirror displacement. The DM surface height is a two-dimensional function of the actuators' voltage commands, φ = φ(u). It can be approximated as a linear superposition of each actuator's influence on the mirror surface, 20 where N act is the number of actuators on the DMs, u q is the voltage command to the q-th actuator, and f q is its unit voltage response on the DM surface. The DM surface varies over time when the AO control loop is running. After k wavefront sensing and control steps, the accumulated DM surface height becomes where ∆φ t and ∆u q,t are respectively the surface change and incremental voltage command at time step t, and ∆u k is collection of all the actuators' voltage changes at step k.
Combining the Fourier optics modeling of the adaptive optics and the coronagraph (Eq. 1, Eq 2 and Eq. 3), the focal plane field after k control steps is Assuming the DM surface change is very small in each sensing and control step, Eq. 5 can be linearized using a Taylor expansion. The space-based AO with coronagraph can thus be mathematically described as a linear time-varying (LTV) system, where G k−1 is a linear projection modeling the DM's influence on the focal plane light field, which is also known as Jacobian matrix after E f,k and ∆u k are discretized and vectorized. The observations of the focal plane light field are the camera images perturbed by the DM probing commands. Letting the DM probing command be ∆u p k , the corresponding camera image is where I in,k is the incoherent signal not influenced by the DMs, such as the exoplanet light, and | · | 2 represents the element-wise square of the amplitude of a complex vector/matrix. Equation 6 and Eq. 7 together describe a state space model (SSM) of the AO system. Only monochromatic light is considered in the above optical modeling. However, it is straightforward to extend the above mathematical formula to the broadband case by defining each wavelength's SSM independently and then concatenating the state vectors of different wavelengths to formulate a broadband SSM. 21

Wavefront sensing and control policies
Wavefront control typically minimizes the total energy in the focal plane observation regions (dark holes) for the DM voltage commands according to Eq. 6. This control policy is usually referred to as electric field conjugation (EFC) 14,22 and can be formulated as a regularized quadratic programming problem, where · 2 is the L-2 norm of a vector/matrix, E f,k−1 is the discretized focal plane electric field in the dark holes, ∆u k is the optimal DM control command and α k is a Tikhonov regularizer. The Tikhonov regularizer is introduced to avoid unreasonably large commands exceeding the operation limit, since the equation of the electric field is an under-determined system (the number of actuators is smaller than the number of pixels in the search region). Wavefront sensing solves the dual problem of estimating the focal plane light field. As mentioned in Sec. 1, the current benchmark wavefront sensing approach is the pair-wise DM probing and least-squares estimation. 14,15 According to the observation model in Eq. 7, differencing the perturbed images from opposite DM probing commands, ±∆u p k , constructs a linear observation of the electric field, where † represents the complex conjugate, • represents the Hadamard product, and p k = G k ∆u p k is the focal plane electric field perturbation introduced by the probing commands. The estimation problem based on the linear observation can be thus formulated as a least-squares problem, whereÊ f,k is the estimated focal plane light field, N p is the number of pairs of opposite DM probing commands applied, j is the index of the pair-wise probing commands, {p j k } are the perturbations generated by {∆u p,j k }, and {∆I p,j f,k } are the corresponding camera measurements. By visualizing the pair-wise probing observation in a complex plane, as shown in Fig. 2, we can see that the difference images measure the projections of the complex electric field on the perturbation directions. At least 2 pairs of probes are needed to estimate the field; the probing directions should be as different as possible. In addition, the probes should modulate the whole observation area, otherwise the difference between the positive and negative images at some pixels would be too small to be used for regression. To satisfy the above requirements, currently the most popular DM probing policy is to generate two-dimensional sinc waves on the DM surface, 21 where (x, y) are the coordinates in the pupil plane and β k , a, b, c and ψ j are constants. The Fourier transform of the above DM surface shape (assuming the first formula) is which produces two symmetric uniform rectangles with opposite phase diversities in the Fourier domain. According to Eq. 1, the coronagraph operator is similar to the Fourier transform except for some extra kernel convolutions, so the focal plane perturbations created by the sinc probes should still have a relatively uniform and symmetric structure (see Fig. 3 and case (a) in Sec. 4). The phase (or the direction in the complex plane) of each probe can be adjusted by changing the shift term ψ j . Typically, a set of probing phases that uniformly cover the range of [0, π] are selected. The amplitude of the sinc waves β k are computed based on the mean probing contrast, that is, the probe amplitude in units of contrast, C p k = |p k | 2 /N pix , where N pix is the number of pixels in the observation regions. A widely used heuristic law for determining the probing contrast 23 is where C k = |E f,k | 2 /N pix is the mean contrast of the original observation region. By using Eq. 13, the probe contrast is neither too large nor too small, which gives a relatively accurate measurement of the original focal plane electric field.
Two more advanced wavefront sensing approaches are the Kalman Filter and the (iterated) extended Kalman filter. The Kalman filter (KF) 16 incorporates the state transition information and solves a weighted least-squares problem for the electric field estimation, where · R k and · P k are the matrix weighted norms that balance the importance of the information from current observations and state transitions. The matrix weights R k and P k are computed based on the level of process noises and observation noises of the system. The Extended Kalman Filter (EKF) 17, 18 directly solves a phase-retrieval-like non-convex inverse problem, where now N p is the number of DM probing commands instead of pairs of probing commands. These two approaches improve the formulation of the statistical estimation problem by better utilizing the information acquired. However, the DM probing and image acquisition policies stay the same as in the benchmark case.

Optimal probing policy and efficient camera integration time
Based on the space-based AO framework described in Sec. 2, we now present our new approach for optimizing the DM probing policies and camera integration times. We here mainly consider the case of pair-wise probing and estimation.

A stochastic model of system noises
The state-space model introduced in Sec. 2.1 is a stochastic process with additive process noise, w k , in the state transition equation and observation noise, n k , in the state observation equation, Although not explained in detail in Sec. 2.2, the weighting matrices R k , P k , and P k in the Kalman filter and the extended Kalman filter depend on the statistics of these model noises.
Based on the experimental observations, we found that the process noises could be modeled as zero-mean Gaussian, where I is the identity matrix and the variance is approximately proportional to the summation of a constant and the electric field change multiplied by another constant, The observation equation and observation noise can thus be distributed using Eq. 18, where w p k is the DM probing noise and n d k is the detector noise. The observation noise can be modeled as a non-zero mean Gaussian, of which the covariance consists of three parts: the first term from the camera readout noise is a fixed constant, the second term from the Poisson noise is proportional to the image intensity, and the third term from DM probing has a similar formula to Eq. 18. The readout noise and Poisson noise terms are related to the camera integration time, t, via where the flux (proportional to the brightness of the planet's host star) is the number of photons hitting the detector in unit time at the center pixel of the starlight point spread function (PSF), n r is the camera's readout standard deviation, andr 0 andr 1 are the normalized noise coefficients. The covariance of the pair-wise probing observation is thus where C k = |E f,k | 2 /N pix and C p k = |p k | 2 /N pix are respectively the mean coherent contrast and mean probing contrast defined in Sec. 2.2 and C in k = I in,k /N pix is the mean incoherent contrast.

Optimal probing contrast
The optimal probing contrast should minimize the covariances of the observation noises. According to Fig. 2, the projection measurement of the electric field is z k = ∆I p f,k /(4|p k |), so the corresponding observation noise variance is According to the inequality of arithmetric and geometric means (AM-GM inequality), 24 with equality if and only if C p k = We now have a theoretical solution for the optimal probing contrast. When the camera readout noise is very small (r 0 → 0) and the incoherent contrast is small (C in k → 0), this theoretical solution is similar to the heuristic law in Eq. 13, where the mean probing contrast is proportional to the square root of the mean coherent field contrast. However, our optimal law has a clearer physical meaning and varies for different systems, depending on the DMs, the coronagraph, and the detectors used.

Optimal probing shape
With n pairs of probes and difference images, {p j k , ∆I p,j f,k }, j = 1, · · · , N p , we can write the overdetermined linear observation equation of the electric field, where θ k,j = arctan2( {p j k }, {p j k }) defines the orientation of the complex probe perturbation, p j k . The estimated mean and covariance of the electric field arê where Cov(x k ), consisting of H k and {Var(z j k )}, are functions of {∆u p,j k } and optical model parameters (Jacobian matrix, G k , and noise coefficients, q 0 , q 1 , r 0 , r 1 , defined in Sec. 3.1). The optimal probe shape can thus be computed by minimizing the log determinant of this covariance matrix with respect to the DM probing voltage commands, where P(·) is a user-chosen regularizer that prevents ill-posed solutions. One useful choice of the regularizer is a Tikhonov regularization of the DM probing voltage commands. This policy is typically called variance-minimizing 25 in active learning and optimal experiment design. When we have only two pairs of probing commands, the log determinant of the estimation covariance can be simplified to an easily interpreted formula, Minimizing the first two terms makes the DM probing commands satisfy the optimal probing contrast criterion in Sec. 3.2, while minimizing the third term makes the complex perturbations as perpendicular as possible to each other.

Efficient camera integration time
Assuming the optimal probing policy is applied, we can now determine the best camera integration time by analyzing the signal-to-noise-ratio (SNR). The SNR of the camera image is defined as where γ ∆ = C k t. The SNR is maximized when γ → +∞, i.e. the camera integration time becomes infinitely large. However, the last term in the square root and the last term of the equation are not influenced by the integration time, so the marginal benefit becomes very small when the integration time exceeds a certain threshold. flux n r q 0 q 1r0r1 2 × 10 9 12 10 −14 0.05 3.6 × 10 −17 5 × 10 −10 Based on this observation, we can define an adaptive camera exposure policy, t k = max(γ/C k , t min ), which results in shorter integration times when the contrast is low but longer for high contrast. We set a minimum integration time, t min , to avoid too short an integration time which would result in abnormal detector effects and too large a probing contrast that exceeds the DM linear operation regime. This policy results in a simplification of Eq. 29, when C k C in k . Given any SNR, we can solve for the camera integration time according to the above equation.

Simulation setup
In this section, we show the results of a simulation of a space-based AO system to demonstrate our new optimal probing and camera exposure policies. The layout of the system is almost identical to Fig. 1 except for an additional DM. A simple vortex coronagraph is used, consisting of a circular pupil aperture, a charge six vortex phase mask, and a Lyot stop. Two DMs with 34×34 actuators are used in the AO system, with the first placed at the conjugate plane of the coronagraph pupil plane. The image plane observation region (dark hole) is an annular area extending from 3−9λ/D, where λ = 635nm is the wavelength of the starlight and D = 1cm is the diameter of the coronagraph pupil mask. Both amplitude and phase wavefront aberrations are introduced in our simulation. In the wavefront sensing and control loop, both DMs are used to correct the wavefront aberrations but only the first one is used for probing the dark hole field. The system parameters defined in Sec. 3 are listed in Tab. 1. These parameters could either be easily measured (such as the flux and the detector statistics, n r ) or be computed using system identification algorithms 26,27 (such as the process noise parameters q 0 and q 1 ).
We explore five wavefront sensing approaches: (a) sinc wave probes (Eq. 11) and the heuristic probing contrast (Eq. 13), (b) sinc probes with optimal probing contrast (Eq. 24), (c) optimized probes initialized with sinc waves (Eq. 27), (d) optimized probes randomly initialized, and (e) the probing policy in (c) with adaptive camera integration times (Eq. 30). The camera integration time for each image in the first four cases is fixed at one second. We choose that number so that the WFSC reaches the desired high contrast (around 5 × 10 −10 ) for the simulated flux. In space, the flux is very low, so the camera exposure times in a real space mission would be much longer. Our simulations only reflect the relative time needed for different approaches. Although the DM probing and camera exposure policies are different, an identical least-squares estimator (Eq. 10) is applied for all cases.

Results
Since the annular dark hole setup is used in our simulation, cases (a) and (b) need to at least apply four pairs of sinc waves for probing. Only two pairs of sinc probes are not enough. As shown in Fig. 3, either the pixels located on the x-axis or on the y-axis are not well modulated by two pairs of sinc probes (probe intensity is too low), so we have to switch the probing axis to fully cover the dark hole regions. Typically, the first and the second pair perturb two symmetric regions on the left and the right, and the third and the fourth pair perturb two regions on the top and the bottom. The phase shifts between the first two pairs and the second two pairs are both 90 degrees, which results in almost orthogonal electric field perturbations in the focal plane. As indicated in Fig. 4, where we show the changes of image contrast over time in wavefront control, even though both cases use sinc probing profiles, case (b) with optimal amplitudes uses a much shorter time (fewer images) than the benchmark case (a). The optimal probing contrast law guides us to collect images with smaller observation noises.  (ξ, η) are the focal plane coordinates, and λ/D is the unit of the field of view (FOV), where λ is the star light wavelength and D is the telescope aperture size. The first two columns and the third column respectively display the probes' contrasts in log scale, {|p 1 k | 2 , · · · , |p 4 k | 2 }, and the angle differences, With the DM probing shapes optimized in cases (c) and (d), two-pairs-of-probe sensing becomes possible which further reduces the time and number of images needed for WFSC to achieve a high contrast. The probe shape optimization problem in Eq. 27 is solved using an Adam optimizer. 28 The initialization highly influences the final solutions since it is a non-convex program. As can be seen in Fig. 4, case (c) with a sinc wave initialization performs better. It slightly modifies the sinc probe shapes and introduces electric field perturbations to the previously unmodulated axial regions. As indicated by Fig. 5, the probe contrasts of the axial pixels now increase from below 10 −9 to above 10 −7 , and the probe angle differences become almost 90 • . However, with a random initialization, the focal plane perturbation is not always uniform (see Fig. 6) because the  Case (e), the adaptive exposure policy, is simulated using the same probing policy as in case (c) but with the adaptive integration time. The adaptive exposure policy significantly reduces the WFSC time while still reaching a high contrast, as shown in Fig. 4, because short camera integration times are sufficient for the wavefront sensing at low contrast. In the figure it appears that the adaptive camera exposure policy levels out at 3×10 −10 and converges to the fixed exposure case after one hundred seconds. However, this is not true and is purely because of the long camera exposure time at high contrast. In Fig. 7, we show a log-log contrast-versus-time graph from the adaptive camera exposure policies defined at four different SNRs. As can be seen, the contrast is still going down after a hundred seconds (can reach below 10 −10 ), and the logarithm of the final contrast is inversely proportional to the logarithm of the WFSC control time used. In contrast, the WFSC contrast curves using fixed integration time do get stuck, because the collected high contrast probing images have very low SNRs. Typically, reducing the SNR speeds up the wavefront correction. However, when the SNR is below 1, such as in the case of SNR = 1/2, WFSC is a little slower at the beginning because the estimation is not very robust. Even worse, when the SNR reaches below 1/2 (not shown), the WFSC no longer works. That also explains why the fixed integration time policy get stuck after reaching a high contrast due to low-SNR images. Therefore, the adaptive camera exposure policy defined at SNR = 1 is the best choice.

Conclusion and future work
In this paper, we have proposed a new stochastic model of a space-based AO system and developed optimal DM probing policies and adaptive camera exposure policies based on that. Our new approach enables efficient wavefront sensing, so the AO system can reach a high contrast within a much shorter time. We demonstrated the new approach by simulating a telescope system with a vortex coronagraph. Future work includes the validation of the new methods in experiment. We also want to explore the applications of our efficient wavefront sensing algorithms in correcting non-common-path errors in ground-based telescopes. In addition, we also plan to investigate the optimal probing and camera exposure policies for nonlinear wavefront sensing algorithms, such as the extended Kalman filter in Eq. 15. control, signal processing and machine learning. He is a member of the American Astronomical Society and the SPIE.
N. Jeremy Kasdin is a Professor of Mechanical and Aerospace Engineering at Princeton University. He is the Principal Investigator of Princetons High Contrast Imaging Laboratory and Coronagraph Adjutant Scientist for WFIRST, the Wide Field InfraRed Survey Telescope. He received his Ph.D. from Stanford University in 1991. Professor Kasdins research interests include space systems design, space optics and exoplanet imaging, orbital mechanics, guidance and control of space vehicles, optimal estimation, and stochastic process modeling. He is an Associate Fellow of the American Institute of Aeronautics and Astronautics and member of the American Astronomical Society and the SPIE.  A space telescope system equipped with AO and a high-contrast vortex coronagraph. The AO corrects the complex wavefront aberrations (only phase aberrations are shown here) using deformable mirrors (only one mirror is shown here for simplicity, however, typically more than two deformable mirrors are used in a real AO system) and the coronagraph suppresses the starlight using a series of masks, including (a) pupil plane mask (a binary mask, fully transmissive or fully opaque), (b) a focal plane vortex mask 6 (a pure phase mask, its phase shown in the figure), and (c) a Lyot stop (a binary mask). After several AO control steps, a high-contrast annular observation region appears in the image. 2 A graphical interpretation of pair-wise probes and difference images. 3 Focal plane perturbations at the first control step caused by the empirical sinc waves with optimal amplitudes. (ξ, η) are the focal plane coordinates, and λ/D is the unit of the field of view (FOV), where λ is the star light wavelength and D is the telescope aperture size. The first two columns and the third column respectively display the probes' contrasts in log scale, {|p 1 k | 2 , · · · , |p 4 k | 2 }, and the angle differences, {|∠p 1 k − ∠p 2 k |, |∠p 3 k − ∠p 4 k |}. 4 WFSC simulation contrast curves with different DM probing and camera exposure policies, (a) benchmark case using sinc wave probes and the heuristic probing contrast, (b) sinc probes with optimal amplitude, (c) optimized probes initialized with sinc waves, (d) optimized probes randomly initialized, and (e) adaptive camera exposure with probing policy in (c).