## 1.

## Introduction

Directly imaging exoplanets, particularly Earth-like exoplanets, has been a goal of the scientific community for some time and was identified as a key technological priority in the last decadal review.^{1} Starshades and coronagraphs are the key enabling technologies for direct imaging missions in space. Both are experiencing great successes in their technology development, with a few modern examples of these designs being the Exo-S,^{2} Exo-C,^{3} and WFIRST-AFTA^{4} studies. The starshade option requires formation flight of the telescope with a second satellite that occults the star to a diffraction-limited level so that only the planet light enters the telescope. A coronagraph relies on a series of apodizers, masks, and stops in the instrument to suppress the starlight so that the off-axis planets can be imaged. The trade for a single satellite means that the starlight enters the telescope, resulting in aberrations that leak starlight into the coronagraphic image. Even small aberrations affect contrast at a ${10}^{-10}$ contrast level, the relative intensity at which we expect to see Earth-like exoplanets. Furthermore, these aberrations are dynamic in even the most stable telescope and result in a quasi-static speckle field in the image plane. The solution is to include deformable mirrors in the instrument so that the resultant speckles can be actively suppressed, allowing us to recover a dark hole of high contrast in the image plane. The precision required to create a dark hole at extremely high contrast drives wavefront-sensing methods away from sensors that introduce noncommon path optics. Thus, the control loop is closed around the final science image where we must first estimate and then control the electric field to maintain a dark hole in which we can discover and spectrally characterize targets.

Here, we describe the current control and estimation methods for focal plane wavefront control at multiple facilities, most specifically drawn from practical experience at the Princeton University High Contrast Imaging Laboratory (HCIL) and the Jet Propulsion Laboratory’s (JPL) High Contrast Imaging Testbed (HCIT). We begin by formulating the linear problem under the framework of Fourier optics, a common approach to all such focal plane methods. A comparison of the mathematical basis for various controllers and estimators follows. Finally, we describe several probing methods, evaluate the quality of our observation as a function of the probe chGoice, and examine how this affects our ability to achieve high contrast.

## 2.

## Wavefront Control

We begin by deriving the underlying mathematics for a generalized focal plane wavefront control instrument. A schematic of the Princeton HCIL layout is given in Fig. 1 to illustrate the concept. One can consider the fiber launch in Fig. 1 to be the input image from the telescope. What follows is the simplest possible control system with two deformable mirrors (DMs). The beam is collimated by an off-axis parabola (OAP), reflects off two DMs in series, and propagates through the apodizer. In the case of the HCIL, the apodizer is a shaped pupil coronagraph,^{5}^{,}^{6} but the choice of apodizer only affects the transformation required to reach the image plane (which is generalized in this paper). With the shaped pupil located in the first pupil plane, we then form the first image with a second OAP, block the core of the point spread function (PSF) with a focal plane mask, and reimage onto the science camera. Note that there is no wavefront sensor, and the purpose of this section is to derive the transformations that describe the effect of perturbations in collimated space on the image plane electric field so that we can ultimately define model-based estimators and control laws where the sensing is done in the image plane. Augmenting the controller to multiple DMs, and mathematically correcting for moving DM surfaces to planes nonconjugate to the pupil is fully described by Pueyo et al.,^{7} with various details of controllability addressed by Shaklan and Green^{8} and Groff.^{9} Once again, these will have small effects on the form of the transformations that we have generalized here.

Treating the designed pupil field as $A(u,v)$, the DM surface as a phase perturbation at the pupil plane, $\phi (u,v)$, and including a distribution of complex aberrations, $g(u,v)$, the electric field at the pupil plane at any control step $k$ is given by

For controller design, we write the controlled phase at step $k$ as a sum of the previous DM surface and a small perturbation, $\delta \phi $, about which we will ultimately linearize the phase induced by the DM. It is important to note that we show the aberrated field as a separate additive complex error, $g(u,v)$, but this makes it less obvious that the initial field about which one linearizes should be the entire phase distribution in the pupil. In other words, ${\phi}_{0}(u,v)$ would include an estimate of the entire aberrated field. At HCIT, this comes from the initial phase diversity estimates used to flatten the pupil field and determine its starting complex field.Assuming that the phase perturbation from the DM, $\delta {\phi}_{k}$, is adequately small, we let ${\phi}_{k}={\phi}_{k-1}+\delta {\phi}_{k}$ and take a first-order approximation of the exponential, resulting in the linearized form of the pupil field.

## (2)

$${E}_{\mathrm{pup},k}(u,v)\cong A(u,v)[1+g(u,v)]{e}^{i{\phi}_{k-1}}[1+i\delta {\phi}_{k}(u,v)].$$## (4)

$${E}_{im,k}(x,y)=\mathcal{C}\{A(u,v){e}^{i{\phi}_{k-1}}[1+g(u,v)]\}+i\mathcal{C}[A(u,v){e}^{i{\phi}_{k-1}}\delta {\phi}_{k}(u,v)].$$## (7)

$${E}_{im,k}(u,v)=\mathcal{C}\{A(u,v){e}^{i{\phi}_{k-1}}[1+g(u,v)]\}+\frac{2\pi i}{\lambda}\sum _{q=1}^{{N}_{\mathrm{act}}}\delta {a}_{q}\mathcal{C}[A{e}^{i{\phi}_{k-1}}{f}_{q}(u,v)].$$^{10}

Finally, the measured intensity in the image plane is just the square of the field, ${I}_{im,k}(x,y)={|{E}_{im,k}(x,y)|}^{2}$. We often refer to the area in the image plane where sufficient contrast has been achieved as the dark hole and use the notation ${I}_{\mathrm{DH}}$ for the high-contrast intensity (either designed or controlled) there. The objective of the wavefront controller is to find the control matrix ${u}_{k}$ that will recover the desired contrast in the dark hole in the presence of the aberrations.

## 3.

## Speckle Nulling

Speckle nulling, whose fundamental equations are described fully in Ref. 11 along with a speckle energy minimization and sensing scheme, is the simplest form of suppression in the final image plane. It does not require an estimate of the electric field and, therefore, only closes the loop on intensity images in an attempt to slowly drive the unobservable field states to zero. All that is needed is a mapping of spatial frequencies at the DM plane to image plane locations. Speckles are then singled out in the image and suppressed based on their intensity and phase. The amplitude of the command for any spatial frequency is directly proportional to the intensity of the speckle. Specifically, the DM amplitude should scale as $\lambda \sqrt{{I}_{\text{speckle}}}$. The correct phase shift for each spatial frequency command on the DM is determined by minimizing the energy of the speckle in the image plane. Since the amplitude and phase of any spatial frequency on the DM are computed directly from intensity measurements at the detector, speckle nulling tends to be highly robust to model error and does not require as much computational overhead. However, it is highly inefficient compared to other control methods because speckles are dealt with one at a time and there is no guarantee that canceling a speckle will not degrade the contrast of other speckles. This has been mitigated in some sense by simultaneously solving for the multiple speckles at a time, which has been reported at the Ames Coronagraph Experiment Laboratory, where the algorithm is typically used in conjunction with electric field conjugation (EFC).^{12} Despite this, speckle nulling tends to be slow and is largely relied upon to clean up stray artifacts. The main advantage of speckle nulling is its robustness to model error, which is thoroughly demonstrated on sky by Martinache et al. without the aid of extreme adaptive optics.^{13}

## 4.

## Monochromatic Model–Based Control Algorithms

The primary control laws used in focal plane wavefront control for the past decade at Princeton and JPL are stroke minimization^{7} and EFC.^{14} Though other variants, such as the coronagraphic focal plane wavefront estimation for exoplanet detection (COFFEE)^{15} algorithm, have also been studied, stroke minimization and EFC have achieved the highest contrast of any controllers to date and are the primary control laws under consideration for the WFIRST-AFTA coronagraph system. In this section, we describe each, highlighting their differences and similarities.

## 4.1.

### Stroke Minimization—Lagrange Multipliers

The stroke minimization algorithm^{7} seeks to regularize the control solution by directly minimizing stroke under the constraint that the solution for the desired DM perturbation achieves a specified contrast level.

Applying the matrix form of the control amplitudes in Eq. (9) and taking the inner product (represented as $\u27e8\xb7,\xb7\u27e9$) of Eq. (4), we find the intensity in the dark hole at the operation wavelength, ${\lambda}_{0}$, to be

## (10)

$${I}_{\mathrm{DH}}(k)=\frac{4{\pi}^{2}}{{\lambda}_{0}^{2}}{u}_{k}^{T}{M}_{k}{u}_{k}+\frac{4\pi}{{\lambda}_{0}}{u}_{k}^{T}\mathfrak{I}\{{b}_{k}\}+{d}_{k},$$## (11)

$${M}_{k}=\u27e8\mathcal{C}(A{e}^{i{\phi}_{k-1}}{f}_{k}),\mathcal{C}(A{e}^{i{\phi}_{k-1}}{f}_{k})\u27e9={G}_{k}^{*}{G}_{k},$$## (12)

$${b}_{k}=\u27e8{E}_{ab,k},\mathcal{C}(A{e}^{i{\phi}_{k-1}}{f}_{k})\u27e9={G}_{k}^{*}{E}_{ab,k},$$## (15)

$$\begin{array}{ll}\text{minimize}& \sum _{q=1}^{N}{a}_{q,k}^{2}={u}_{k}^{T}{u}_{k}\\ \text{subject to}& {I}_{\mathrm{DH}}(k)\le {C}_{k}\end{array},$$To solve the optimization problem, Pueyo et al.^{7} realized a control law by modifying the inequality constraint in Eq. (15) to be represented as the equality constraint $({I}_{\mathrm{DH}}-{C}_{k})=0$. Incorporating the equality constraint for the central wavelength into the minimization on stroke via a Lagrange multiplier, ${\mu}_{k}$, yields a quadratic cost function given by

## (16)

$${J}_{k}={u}_{k}^{T}{u}_{k}+{\mu}_{k}[{I}_{\mathrm{DH}}(k)-{C}_{k}]\phantom{\rule{0ex}{0ex}}={u}_{k}^{T}{u}_{k}+{\mu}_{k}(\frac{4{\pi}^{2}}{{\lambda}_{0}^{2}}{u}_{k}^{T}{M}_{k}{u}_{k}+\frac{4\pi}{{\lambda}_{0}}{u}_{k}^{T}\mathfrak{I}\{{b}_{k}\}+{d}_{k}-{C}_{k})\phantom{\rule{0ex}{0ex}}={u}_{k}^{T}(\mathcal{I}+{\mu}_{k}\frac{4{\pi}^{2}}{{\lambda}_{0}^{2}}{M}_{k}){u}_{k}+{\mu}_{k}\frac{4\pi}{{\lambda}_{0}}{u}_{k}^{T}\mathfrak{I}\{{b}_{k}\}+{\mu}_{k}({d}_{k}-{C}_{k}),$$## (17)

$$\frac{\partial J}{\partial {u}_{k}^{T}}{|}_{{u}_{J,k}}=2(\mathcal{I}+{\mu}_{k}\frac{4{\pi}^{2}}{{\lambda}_{0}^{2}}{M}_{k}){u}_{J,k}+{\mu}_{k}\frac{4\pi}{{\lambda}_{0}}\mathfrak{I}\{{b}_{k}\}=0$$## (18)

$${u}_{J,k}=-{(\frac{1}{{\mu}_{k}}\frac{{\lambda}_{0}}{2\pi}\mathcal{I}+\frac{2\pi}{{\lambda}_{0}}{M}_{k})}^{-1}\mathfrak{I}\{{b}_{k}\}.$$## 4.2.

### Electric Field Conjugation—Tikhonov Regularization

EFC is an alternate controller approach first presented by Give’on et al.^{14} It was originally modeled on the traditional, pupil-based adaptive optics approach, where the DM is set to conjugate the phase in the pupil to recover as flat an entering field as possible. In focal plane wavefront control, the analogous field conjugation would be done in the image. That is, given an estimate of the electric field at step $k-1$ from Eq. (8), ${\widehat{E}}_{im,k}=\mathcal{C}(\widehat{A{e}^{i{\phi}_{k-1}}})+\mathcal{C}(\widehat{A{e}^{i{\phi}_{k-1}}}g)$, the EFC controller seeks to drive to a desired field, ${E}_{D}$, allowing us to define the targeted field perturbation as

## (20)

$${E}_{D}=\mathcal{C}\{A(u,v){e}^{i{\phi}_{k-1}}[1+g(u,v)]\}+\delta {E}_{k}.\phantom{\rule{0ex}{0ex}}={E}_{im,k}(x,y).$$In its original form, EFC seeks to find a control ${G}_{k}{u}_{k}$ that conjugates the residual aberration terms in Eq. (20), to create the desired final field in the image plane. The effect of different choices of ${E}_{D}$ will become clearer once the controller has been fully defined.

Unfortunately, given the limitations of the influence function form of the DMs, the image plane field perturbation in Eq. (20) is not always reachable. In other words, one cannot simply invert ${G}_{k}$ to solve for ${u}_{k}$ given the desired field perturbation defined in Eq. (19). Instead, EFC seeks to find a control as close to the desired field perturbation as possible. We do so by defining an error, ${\u03f5}_{k}$, between the controlled field and the desired field as

EFC seeks to minimize this error rather than minimize the stroke as in Sec. 4.1. Doing so via a simple least-squares solution, however, often produces an ill-posed problem. Thus, EFC uses a regularized cost function to guarantee inversion. The regularization parameter, ${\mathrm{\Gamma}}_{k}$, is introduced via a Tikhonov regularization, making the quadratic cost function## (22)

$${W}_{k}={({G}_{k}{u}_{k}-\delta {E}_{k})}^{*}({G}_{k}{u}_{k}-\delta {E}_{k})+{u}_{k}^{T}{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k}{u}_{k}.$$## (23)

$$\frac{\partial {W}_{k}}{\partial {u}_{k}}{|}_{{u}_{w,k}}=2({G}_{k}^{*}{G}_{k}+{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k}){u}_{w,k}-2{G}_{k}^{*}\delta {E}_{k}=0,$$## (24)

$${u}_{w,k}={({G}_{k}^{*}{G}_{k}+{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k})}^{-1}{G}_{k}^{*}\delta {E}_{k}.$$## (26)

$${W}_{k}={({G}_{k}{u}_{k}-\delta {E}_{k})}^{*}({G}_{k}{u}_{k}-\delta {E}_{k})+{\alpha}_{k}^{2}{u}_{k}^{T}{u}_{k},$$As with stroke minimization, the task is now to search through ${\alpha}_{k}$ to evaluate ${u}_{w,k}$ in Eq. (27) and minimize Eq. (26). What is interesting is that the solution directly weights the minimization of stroke against the residual from the targeted electric field. As the regularization parameter, ${\alpha}_{k}$, is increased, the solution will seek smaller solutions of $u$, but the targeted electric field may become unreachable.

What remains is to choose $\delta {E}_{k}$ for the field perturbation at each step, or equivalently, the desired field ${E}_{D}$. If we attempt to drive to the nominal electric field that the coronagraph would achieve in a perfect optical system, making ${E}_{D}=\mathcal{C}\{A\}$,^{7} the target field will be given by

## 4.3.

### Comparing Control Laws

Where stroke minimization is attempting to achieve a targeted average contrast in the image plane via an equality constraint, EFC is attempting to minimize the desired field error and stroke is minimized via the regularization parameter. Thus, the true mathematical difference between stroke minimization and EFC is only in the inherent priority lent in each cost function. Where the primary metric of minimization in stroke minimization is the total DM stroke, leaving the resultant field as a free variable, the primary metric of minimization in EFC is the field residual, leaving the required stroke as a free variable. In monochromatic light, this simply means the Lagrangian for our central wavelength, ${\mu}_{k}$, and Tikhonov regularization parameter, ${\alpha}_{k}^{2}$, are inverses of one another. This is entirely due to the fact that the inequality constraint of the intended algorithm in Eq. (15) has been implemented as an equality constraint in a quadratic cost function.

For monochromatic control, the real difference is not from the choice of control law but in its algorithmic application. Details in the choice of minimization parameters, $({\mu}_{k},{C}_{k})$ for stroke minimization and $({\alpha}_{k}^{2},\delta {E}_{k})$ for EFC, will yield different (sometimes dramatically different) solutions. The logic used in these choices seems to be a function of the user/laboratory in question and can have the greatest impact on varying control solutions to achieve a specific contrast level in the image plane.

In the case of the regularization parameter, the principal difference we see in their algorithmic applications is how this value is tuned, or if it is tuned at all. Here we point out that the proper mathematical application of these algorithms is to relinearize after every iteration and to perform a high-resolution line search on the Lagrange multiplier (or equivalently the regularization parameter) to find the true minimum for that time step.

Choosing the targeted contrast (or desired field) tends to be more of an ad hoc decision. Experimentally, this is generally driven by practically achievable step sizes in early iterations. At some point, one can pick a target that is unreachable by the controller at that iteration. There are essentially two scenarios for this value. The first is at the end of control when we have reached our final achievable contrast. This tends to be limited by noise, be it sensor or process noise. In this case, the contrast target is insignificant because we are ultimately limited by the estimator, which is discussed in Sec. 6. More interesting for the controller is the second scenario, where we observe unreachable states in early iterations. This would seem curious because these states are not below the instrument’s ultimate achievable contrast and we are not yet sensing at an unfavorable signal-to-noise ratio (SNR). The simplest explanation is that the actuator strokes are highest during early iterations and the contrast target will be limited by the accuracy of the current Jacobian. In this case, the next higher-order terms that were neglected in Eq. (8) become significant for the probe and control amplitudes in question. Thus, while theoretically reachable, the linearization defining the controller’s Jacobian, ${G}_{k}$, will effectively have lower accuracy in early iterations than in later iterations because higher-order terms in the linear expansion are a function of the DM amplitude. This is true even if the same Jacobian is used in later iterations and is a separate issue from whether one chooses to relinearize at each iteration of the algorithm. As a functional example, the Princeton HCIL has demonstrated dark holes that suppress the aberrated field from $\approx 1\times {10}^{-4}$ to $\approx 2\times {10}^{-7}$ without relinearizing the Jacobian,^{16} but we cannot achieve $2\times {10}^{-7}$ in a single time step when we start at $1\times {10}^{-4}$. The effect of probe amplitude does play a role in accuracy, and its error propagation is discussed in Sec. 8. Here we are only making the case that the controller amplitude limits the ability of the Jacobian to accurately predict the shape required to achieve a particular contrast value or field target. Thus, we tend to throttle the target so that we are not left with undue DM residuals that will have to be compensated anyway.

## 5.

## Broadband Model–Based Control Algorithms

Monochromatic control of the image plane field represents a significant step toward detection, but the controller must ultimately be capable of creating a dark hole in broadband light so that we can take spectra of the planets. We are typically only concerned with $\approx 10\%$ bandpasses, but it would be useful to achieve a bandpass as large as 20%. Some of the first broadband results, presented in Ref. 6, used the monochromatic controller at a central wavelength and accepted the contrast degradation in the rest of the bandpass. However, both EFC and stroke minimization have been developed into broadband control laws by augmenting the cost function with multiple monochromatic estimates that span the bandpass of concern.

## 5.1.

### Windowed Stroke Minimization

To define the bandpass of the controller, we begin with the monochromatic constraint in Eq. (15). The center wavelength constraint at ${\lambda}_{0}$ is now augmented with two more contrast constraints ${C}_{{\lambda}_{1},k}$ and ${C}_{{\lambda}_{2},k}$ at bounding wavelengths ${\lambda}_{1},{\lambda}_{2}$, to define a window over which the correction will be made. With three separate constraints, the optimization now becomes

## (32)

$$\begin{array}{lll}\text{minimize}& & \sum _{p=1}^{{N}_{\mathrm{act}}}{a}_{p,k}^{2}={u}_{k}^{T}{u}_{k}\\ \text{subject to}& & {I}_{\mathrm{DH}}({\lambda}_{0})\le {C}_{{\lambda}_{0},k},\\ & & {I}_{\mathrm{DH}}({\lambda}_{1})\le {C}_{{\lambda}_{1},k},\\ & & {I}_{\mathrm{DH}}({\lambda}_{2})\le {C}_{{\lambda}_{2},k},\\ \text{where}& & {\lambda}_{1}={\gamma}_{1}{\lambda}_{0}\\ & & {\lambda}_{2}={\gamma}_{2}{\lambda}_{0}.\end{array}$$## (33)

$${I}_{\mathrm{DH}}(\lambda )=w(\lambda )\frac{4{\pi}^{2}}{{\lambda}^{2}}{u}_{k}^{T}{M}_{\lambda ,k}{u}_{k}+w(\lambda )\frac{4\pi}{\lambda}{u}_{k}^{T}\mathfrak{I}\{{b}_{\lambda ,k}\}+w({\lambda}_{k}){d}_{\lambda ,k},$$## (34)

$${I}_{\mathrm{DH}}({\lambda}_{i})=w({\lambda}_{i})\frac{4{\pi}^{2}}{{\gamma}_{i}^{2}{\lambda}_{0}^{2}}{u}_{k}^{T}{M}_{{\lambda}_{i},k}u+w({\lambda}_{i})\frac{4\pi}{{\gamma}_{i}{\lambda}_{0}}{u}_{k}^{T}\mathfrak{I}\{{b}_{{\lambda}_{i},k}\}+w({\lambda}_{i}){d}_{{\lambda}_{i},k}.$$## (35)

$${J}_{k}={u}_{k}^{T}{u}_{k}+{\mu}_{k,0}[{I}_{\mathrm{DH}}({\lambda}_{0})-{C}_{{\lambda}_{0},k}]+{\mu}_{k,1}[{I}_{\mathrm{DH}}({\lambda}_{1})-{C}_{{\lambda}_{1},k}]+{\mu}_{k,2}[{I}_{\mathrm{DH}}({\lambda}_{2})-{C}_{{\lambda}_{2},k}]\phantom{\rule{0ex}{0ex}}={u}_{k}^{T}\{\mathcal{I}+\frac{4{\pi}^{2}}{{\lambda}_{0}^{2}}[{\mu}_{k,0}{M}_{{\lambda}_{0},k}+{\mu}_{k,1}\frac{w({\lambda}_{1})}{{\gamma}_{1}^{2}}{M}_{{\lambda}_{1},k}+{\mu}_{k,2}\frac{w({\lambda}_{2})}{{\gamma}_{2}^{2}}{M}_{{\lambda}_{2},k}\left]\right\}{u}_{k}+\frac{4\pi}{{\lambda}_{0}}{u}_{k}^{T}[{\mu}_{k,0}\mathfrak{I}\{{b}_{{\lambda}_{0},k}\}\phantom{\rule{0ex}{0ex}}+{\mu}_{k,1}\frac{w({\lambda}_{1})}{{\gamma}_{1}}\mathfrak{I}\{{b}_{{\lambda}_{1},k}\}+{\mu}_{k,2}\frac{w({\lambda}_{2})}{{\gamma}_{2}}\mathfrak{I}\{{b}_{{\lambda}_{2},k}\}]\phantom{\rule{0ex}{0ex}}+[{\mu}_{k,0}({d}_{{\lambda}_{0},k}-{C}_{{\lambda}_{0},k})+{\mu}_{k,1}w({\lambda}_{1})({d}_{{\lambda}_{1},k}-{C}_{{\lambda}_{1},k})+{\mu}_{k,2}w({\lambda}_{2})({d}_{{\lambda}_{2},k}-{C}_{{\lambda}_{2},k})].$$## (36)

$${J}_{k}={u}_{k}^{T}\{\mathcal{I}+{\mu}_{k,0}\frac{4{\pi}^{2}}{{\lambda}_{0}^{2}}[{M}_{{\lambda}_{0},k}+{\delta}_{1}\frac{w({\lambda}_{1})}{{\gamma}_{1}^{2}}{M}_{{\lambda}_{1},k}+{\delta}_{2}\frac{w({\lambda}_{2})}{{\gamma}_{2}^{2}}{M}_{{\lambda}_{2},k}\left]\right\}{u}_{k}\phantom{\rule{0ex}{0ex}}+{\mu}_{k,0}\frac{4\pi}{{\lambda}_{0}}{u}_{k}^{T}[\mathfrak{I}\{{b}_{{\lambda}_{0},k}\}+{\delta}_{1}\frac{w({\lambda}_{1})}{{\gamma}_{1}}\mathfrak{I}\{{b}_{{\lambda}_{1},k}\}+{\delta}_{2}\frac{w({\lambda}_{2})}{{\gamma}_{2}}\mathfrak{I}\{{b}_{{\lambda}_{2},k}\}]\phantom{\rule{0ex}{0ex}}+{\mu}_{k,0}[({d}_{{\lambda}_{0},k}-{C}_{{\lambda}_{0},k})+{\delta}_{1}w({\lambda}_{1})({d}_{{\lambda}_{1},k}-{C}_{{\lambda}_{1},k})+{\delta}_{2}w({\lambda}_{2})({d}_{{\lambda}_{2},k}-{C}_{{\lambda}_{2},k})].$$## (37)

$${u}_{J,k}=-{\{\frac{1}{{\mu}_{k,0}}\frac{{\lambda}_{0}}{2\pi}\mathcal{I}+\frac{2\pi}{{\lambda}_{0}}[{M}_{{\lambda}_{0},k}+{\delta}_{1}\frac{w({\lambda}_{1})}{{\gamma}_{1}^{2}}{M}_{{\lambda}_{1},k}+{\delta}_{2}\frac{w({\lambda}_{2})}{{\gamma}_{2}^{2}}{M}_{{\lambda}_{2},k}\left]\right\}}^{-1}\phantom{\rule{0ex}{0ex}}\xb7[\mathfrak{I}\{{b}_{{\lambda}_{0},k}\}+{\delta}_{1}\frac{w({\lambda}_{1})}{{\gamma}_{1}}\mathfrak{I}\{{b}_{{\lambda}_{1},k}\}+{\delta}_{2}\frac{w({\lambda}_{2})}{{\gamma}_{2}}\mathfrak{I}\{{b}_{{\lambda}_{2},k}\}].$$With the controller defined, we now evaluate its ability to produce a uniform contrast in wavelength space. The most recent broadband results from the Princeton HCIL producing symmetric dark holes in a targeted 10% band around ${\lambda}_{0}=633\text{\hspace{0.17em}}\text{\hspace{0.17em}}\mathrm{nm}$ are shown in Fig. 3.^{17} In all cases, the controller corrects two rectangular areas spanning 6 to $11\times -3$ to $3{\lambda}_{0}/D$ on both sides of the image plane. The contrast level is evaluated over two rectangular areas spanning 7 to $10\times -2$ to $2{\lambda}_{0}/D$ on both sides of the image plane. Note that this area is defined by the central wavelength for all measurements so that contrast performance correlates to a fixed sky angle, $\beta $, defined as $\beta ={\mathrm{tan}}^{-1}(n{\lambda}_{0}/D)$. Since both DMs are nonconjugate to the pupil and have very large wavefront error,^{9} we anticipate there to be a great deal of chromaticity in the Princeton HCIL. We will briefly address these limitations in the context of the experimental results, but for now, we chose to slightly underweight the bounding wavelengths in the optimization, choosing ${\delta}_{1}={\delta}_{2}=0.75$ at the bounding wavelengths, ${\lambda}_{1}=600$ and ${\lambda}_{2}=650\text{\hspace{0.17em}}\text{\hspace{0.17em}}\mathrm{nm}$. In this particular experiment, we estimate each wavelength separately using the batch process estimator discussed in Sec. 6.1. The final dark holes, shown in Fig. 3(d), reached a contrast of $5.67\times {10}^{-6}$ in an $\sim 10\%$ band and $1.364\times {10}^{-5}$ over the full band of the laser, shown in Fig. 3(b).

Individual monochromatic images of the final dark hole are given in Fig. 4. With the bounding wavelengths only slightly underweighted in the optimization, we see in Fig. 3(a) that the contrast is relatively constant in wavelength over the 10% band. We also see in Figs. 4(a) and 4(c) that there is very little evolution of the dark hole at the bounding wavelengths of the 10% bandpass.

The question that now arises is what limits contrast as a function of our choice in bandpass. Correcting over a 10% bandpass automatically costs us an order of magnitude in our final achievable contrast. The most obvious answer is the inherent chromaticity of our pupil from highly aberrated, nonconjugate planes. Shaklan et al.^{18} show that the ultimate achievable contrast is dependent on the propagation of phase-induced amplitude errors from optics that are nonconjugate to the pupil plane. They also point out that since the control system has a finite set of controllable spatial frequencies, the ultimate achievable contrast will become a function of the bandwidth we choose in our dark hole. Using the results of Shaklan et al.,^{18} we have previously estimated the lower limit of the Princeton HCIL to be $\approx 1\times {10}^{-6}$ for a 20% bandwidth.^{17} This order of magnitude bound only accounts for the quilting of the two DM surfaces, both of which are shown in Fig. 1 to be at planes nonconjugate to the pupil. This bounding estimate both highlights the importance of bandwidth in the optical design and indicates that our current performance is still worse than the fundamental limitations of this optical system.

Given that we do not believe we have reached a fundamental contrast floor in a 10% band, we now consider other limitations that could affect broadband control. At the time of these measurements, the contrast of speckle measurements in the Princeton HCIL was only stable to within $\approx 0.5$ to $1\times {10}^{-6}$ over the period of tens of minutes to an hour. It is possible that we reached a stability limit in the experimental setup in this particular broadband experiment. The broadband source is of much lower power than the monochromatic source, driving up the integration time for each image. The broadband solution also requires three individual estimates to achieve the results shown in Fig. 3. Thus, we are estimating over a time scale that will affect our measurements by $\approx 1\times {10}^{-6}$, and it is not surprising that we were only able to correct down to the $1\times {10}^{-6}$ level at any wavelength. This highlights the importance of setting stability requirements, which must scale with the minimum allowable integration time required to obtain an adequate SNR. HCIL, being an air experiment with no active temperature control directly on the optical bench, has very poor stability compared to what is expected from a space environment. However, we can control under equivalent time scales as the observatory environment by taking shorter exposures (between 100s of milliseconds to a minute depending on the experiment) with a brighter source. This makes yet another point; to accurately predict the performance of a flight instrument from an experimental testbed, one must extrapolate the system stability to time scales relevant to the expected integration time. Broadband estimation, taking inherently longer to accomplish than monochromatic estimation, should be the time scale used to impose a stability requirement on the telescope. It is possible to make the time scale of broadband estimation similar to the monochromatic case if we can estimate each wavelength simultaneously. To do so requires closing the control loop around an integral field spectrograph (or an analogous imaging spectrograph).

To take full advantage of an observatory’s stability, we clearly want to reduce the time required to produce estimates of the electric field over the optimization bandwidth. Furthermore, the high potential for stability limitations points us to the use of the Kalman filter estimators, discussed in Sec. 6.2, because we can reduce the number of images per control step. With coronagraphy generally living in the photon limited regime, we must keep in mind that in addition to other noise sources, the sensitivity of wavefront control to temporal stability will be driven by our broadband estimation steps.

## 5.2.

### Broadband Electric Field Conjugation

As with stroke minimization, we create a broadband form of EFC by augmenting the cost function with multiple wavelengths. We redefine the error function in Eq. (21) for the monochromatic case as

where ${G}_{{\lambda}_{i},k}$ and $\delta {E}_{{\lambda}_{i},k}$ are the wavelength-dependent Jacobian and field target, respectively. Summing a discrete set of these error metrics in wavelength, the broadband EFC cost function is given by## (39)

$${W}_{k}=\sum _{i=0}^{N}{({G}_{{\lambda}_{i},k}{u}_{k}-\delta {E}_{{\lambda}_{i},k})}^{*}({G}_{{\lambda}_{i},k}{u}_{k}-\delta {E}_{{\lambda}_{i},k})+{u}_{k}^{T}{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k}{u}_{k}.$$## (40)

$$\frac{\partial {W}_{k}}{\partial {u}_{k}}{|}_{{u}_{w,k}}=2(\sum _{i=0}^{N}{G}_{{\lambda}_{i},k}^{*}{G}_{{\lambda}_{i},k}+{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k}){u}_{w,k}-2\sum _{i=0}^{N}{G}_{{\lambda}_{i},k}^{*}\delta {E}_{{\lambda}_{i},k}=0,$$## (41)

$${u}_{w,k}={[\sum _{i=0}^{N}{G}_{{\lambda}_{i},k}^{*}{G}_{{\lambda}_{i},k}+{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k}]}^{-1}\left[\sum _{i=0}^{N}{G}_{{\lambda}_{i},k}^{*}\delta {E}_{{\lambda}_{i},k}\right].$$## (42)

$${W}_{k}=\sum _{i=0}^{N}{({G}_{{\lambda}_{i},k}{u}_{k}-\delta {E}_{{\lambda}_{i},k})}^{*}({G}_{{\lambda}_{i},k}{u}_{k}-\delta {E}_{{\lambda}_{i},k})+{u}_{k}^{T}{\mathrm{\Gamma}}_{k}^{T}{\mathrm{\Gamma}}_{k}{u}_{k},$$## (43)

$${u}_{w,k}={[\sum _{i=0}^{N}{G}_{{\lambda}_{i},k}^{*}{G}_{{\lambda}_{i},k}+{\alpha}_{k}^{2}\mathcal{I}]}^{-1}\left[\sum _{i=0}^{N}{G}_{{\lambda}_{i},k}^{*}\delta {E}_{{\lambda}_{i},k}\right].$$^{14}have pointed out, it is also possible to avoid the summation by augmenting the Jacobian as a larger diagonal matrix. The mathematics are identical to the summation presented here, and the approach is largely limited by the maximum practical (or allowable) matrix size in the computation. In either case, the task is once again the same. We must search through ${\alpha}_{k}$ to evaluate ${u}_{w,k}$ in Eq. (43) and minimize Eq. (42).

The resemblance between the broadband EFC algorithm and windowed stroke minimization is not as strong as it was in the monochromatic case. Their reciprocal treatment of the regularization parameter and Lagrange multipliers becomes more signifciant in the presence of multiple parameters. Since the constraints are applied to the contrast target in windowed stroke minimization, each wavelength has its own Lagrange multiplier. With the modification made in Eq. (36) to further constrain the solution, the two become much more similar in that there is only a single regularization parameter we search through in the optimization. If arbitrary weights were included in Eq. (39), then Eqs. (42) and (43) would be nearly identical in form to the optimization defined in Eqs. (36) and (37). As in the monochromatic case, their exact functional similarity will be associated with the choice of $\delta {E}_{{\lambda}_{i},k}$ in the EFC control law, and the algorithmic approach to their implementation remains unchanged save for the choice in weights.

## 6.

## Wavefront Estimation

The principle of focal plane wavefront control is that high contrast is best achieved by avoiding all possible noncommon errors that arise from path, temporal, or wavelength differences. Thus, the controller is provided with an estimate of the electric field at the final image plane, rather than from a measurement taken by a separate wavefront sensor. The three principal methods used to sense the image plane electric field in high-contrast imaging are Gerchberg-Saxton based multiplane solutions,^{19}20.^{–}^{21} differential fringe imaging, such as the self-coherent camera^{22} or pinhole masks,^{23}^{,}^{24} and DM probing. Multiplane solutions are typically used as an initialization step where the pupil field is flattened in an attempt to initialize control with the cleanest beam possible. Creating fringe patterns in the image is a very clever way to sense the field and has demonstrated great success in high-contrast imaging.^{25} As a flight instrument, this means adding a mechanism and is not necessarily compatible with all coronagraphs. This is particularly true if multiple classes of coronagraph are used in the same instrument, as is the case with the WFIRST-AFTA concept. With or without a fringe-based technique, DM probing is always possible and is attractive because it only relies on the DM already needed for wavefront control. No additional sensors or mechanisms are required to probe the electric field in the image plane. There are, however, limitations with such a method. Here we derive the estimation scheme associated with DM-based probing where the image plane is modulated with a set of diversity, or probe, images.

## 6.1.

### Batch Process Method

Pair-wise difference imaging, as first developed by Give’on et al.,^{14} allows the estimation of the focal plane electric field from several intensity images, each with a different DM setting. As originally conceived, at least two pairs of probed images are required to invert the measurements and obtain the real and imaginary parts of the electric field. The choice of these probe shapes is important and is the principal topic for the remainder of this paper. Here, in Sec. 6, we first derive the basic least-squares estimator, which will subsequently be compared to a Kalman filter estimator.

Letting ${u}_{j,k}$ be the signal for the $j$’th positive probe shape in the $k$’th iteration, we refer back to Eq. (10) to define the change in the focal plane electric field for a given pair of probe shapes, ${p}_{j\pm ,k}=\pm G{u}_{j\pm ,k}$. Recalling Eq. (14), we apply these shapes to Eq. (10) to define their image plane intensities as

## (44)

$${I}_{j\pm ,k}={|{E}_{ab,k}|}^{2}+{|{p}_{j,k}|}^{2}\pm 2\mathfrak{R}\{\u27e8{E}_{ab,k},{p}_{j,k}\u27e9\},$$## (45)

$$\mathrm{\Delta}{I}_{j,k}={I}_{j+,k}-{I}_{j-,k}=4\mathfrak{R}\{\u27e8{E}_{ab,k},{p}_{j,k}\u27e9\}.$$## (46)

$$\left[\begin{array}{c}\mathrm{\Delta}{I}_{1,k}\\ \vdots \\ \mathrm{\Delta}{I}_{{n}_{p},k}\end{array}\right]=4\left[\begin{array}{cc}\mathfrak{R}\{{p}_{1,k}\}& \mathfrak{I}\{{p}_{1,k}\}\\ \vdots & \vdots \\ \mathfrak{R}\{{p}_{{n}_{p},k}\}& \mathfrak{I}\{{p}_{{n}_{p},k}\}\end{array}\right]\left[\begin{array}{c}\mathfrak{R}\{{E}_{ab,k}\}\\ \mathfrak{I}\{{E}_{ab,k}\}\end{array}\right],$$## (48)

$${z}_{k}=\left[\begin{array}{c}\mathrm{\Delta}{I}_{1,k}\\ \vdots \\ \mathrm{\Delta}{I}_{{n}_{p},k}\end{array}\right],$$## (49)

$${H}_{k}=4\left[\begin{array}{cc}\mathfrak{R}\{{p}_{1,k}\}& \mathfrak{I}\{{p}_{1,k}\}\\ \vdots & \vdots \\ \mathfrak{R}\{{p}_{{n}_{p},k}\}& \mathfrak{I}\{{p}_{{n}_{p},k}\}\end{array}\right],$$## (50)

$${x}_{k}=\left[\begin{array}{c}\mathfrak{R}\{{E}_{ab,k}\}\\ \mathfrak{I}\{{E}_{ab,k}\}\end{array}\right].$$## 6.2.

### Filtering Method

A more recent form of estimation employs a Kalman filter to estimate the field.^{16} In its initial formulation, the state estimate, $\widehat{x}$, and the observation matrix, ${H}_{k}$, are identical to those constructed in Sec. 6.1. The construction of ${\widehat{x}}_{k}$ and ${H}_{k}$ define the form of the controller update, ${\mathrm{\Gamma}}_{k-1}$, used to extrapolate the field estimate given the control applied in the last iteration, ${u}_{k-1}$. With the observation and control update matrices defined, the five canonical discrete Kalman filter equations are given by

## (52)

$${\widehat{x}}_{k}(-)={\mathrm{\Phi}}_{k-1}{\widehat{x}}_{k-1}(+)+{\mathrm{\Gamma}}_{k-1}{u}_{k-1},$$^{16}$Q$ is defined by an estimate of the error on the DMs state and ${R}_{k}$ is defined by the detector noise. More recent versions modify the filter to an extended Kalman filter and augment the state with an estimate of the bias.

^{26}This has the advantage of using the control history to estimate the incoherent component of the field dynamically with the state, providing a measurement of the true coherent contrast independent of exo-zodiacal light. Groff and Kasdin

^{16}also point out that the bias estimate will detect planets over the course of the control history. Such a bias estimator is fully developed and experimentally tested in Ref. 27. This extended Kalman filter successfully demonstrates the ability of the filter to separate incoherent components of the image and successfully detects planets injected into the experiment using a star-planet simulator.

Common to both versions of the Kalman filter is the freedom to reduce the number of probe pairs to as low as a single pair per iteration. Since a measurement update at any time step is guaranteed to not increase the covariance of the state estimate from a prior iteration, this allows us to demonstrate that the control law is reaching a contrast target with the minimum number of measurements required by the estimator. The guarantee that new updates cannot make the estimate covariance worse also tends to yield better stability in the control as the controller reaches higher contrast levels and becomes noise limited. The improved stability at high noise levels has been demonstrated in Ref. 16, and the performance of such filters has been demonstrated by Riggs et al.^{28} at extremely high contrast levels, with initial comparisons to the batch process method presented in Sec. 6.1.

## 7.

## Sensing the Wavefront with Pair-Wise Probing

In Sec. 6, we took the probe images required to estimate the field for granted. We now define the probes and analyze their impact on the quality of the image. DM probes are a set of perturbations, ${\phi}_{j}$, that must be chosen to modulate the intended control area well to maximize the signal in the image plane and to guarantee a well-conditioned observation matrix, ${H}_{k}$, associated with that set of probes. As was discussed in Sec. 6.2, this is the only factor that can result in measurement updates making the estimate worse over time.

These shapes are traditionally based on analytical functions for which we know the Fourier transform. Following Give’on et al.,^{14} we will simplify the problem of coverage/shape of the dark hole by choosing symmetric rectangular regions that span the region we wish to estimate and, for the sake of choosing the shape only, assume that the operator is a Fourier transform (i.e., $\mathcal{C}\{\xb7\}=\mathcal{F}\{\xb7\}$). We define an offset pair of rectangles of width ${w}_{x}$ and height ${w}_{y}$ by convolving the rectangle function with a set of Dirac delta functions, $\delta (\xb7)$. Shifting the rectangles off axis in the image plane by a distance $a$ in the $x$ dimension and a distance $b$ in the $y$ dimension, the intended probe field in the image plane from a pupil plane perturbation, $A(u,v)\phi (u,v)$, is defined as

## (57)

$$\mathcal{F}\{A\phi \}=\mathcal{F}\{A\}*c\text{\hspace{0.17em}}\mathrm{rect}({w}_{x}x)\mathrm{rect}({w}_{y}y)*[\delta (x-a)+\delta (x+a)]*[\delta (y-b)+\delta (y+b)]\phantom{\rule{0ex}{0ex}}=\mathcal{F}\{cA\text{\hspace{0.17em}}\mathrm{sinc}({w}_{x}u)\mathrm{sinc}({w}_{y}v)\mathrm{cos}(au)\mathrm{cos}(bv)\},$$## (58)

$$\phi =c\text{\hspace{0.17em}}\mathrm{sinc}({w}_{x}u)\mathrm{sinc}({w}_{y}v)\mathrm{cos}(au)\mathrm{cos}(bv).$$It is important to note that Eq. (57) should not be used to construct the observation matrix. This should be done by solving for the DM commands from Eq. (58) and applying them to the true system Jacobian in Eq. (9). The PSF will alter the field so that the field from a DM shape given by Eq. (58) will not have exact unit amplitude and the edges of the rectangle will extend by one radius of the PSF. The distribution will still be relatively uniform, most likely modulating each pixel in the dark hole.

## 8.

## Error Analysis for Pair-Wise Probing

With DM probe shapes defined in Sec. 7, we analyze the variance of the electric field estimate given bounding cases of measurement noise. We begin with the effect of stellar shot noise on the estimate, demonstrate how zodi and dark current drive us to brighter probes, and determine the order of magnitude limit on how bright we may make the probes. We then determine what strategies can be used to obtain the most precise and accurate electric field estimates possible. In all cases, we propagate the uncertainty in the probe through the batch process estimator to more easily see the effect of noise in a single iteration. Since this amounts to a measurement error, the noise value will be identical in the Kalman filter and the optimal gain will more heavily weight information from the prior iteration. Thus, the batch process estimator represents the worst-case contribution of probe errors to our achievable contrast.

## 8.1.

### Stellar Shot Noise

Here, we will calculate the variance in the electric field estimate at a given pixel in the presence of only stellar shot noise. By utilizing probe shapes that actuate the entire focal plane evenly in field (such as the products of sinc and sine functions mentioned in Sec. 7), we can safely apply these calculations to all pixels in the correction region.

We choose probes ${p}_{j,k}$ in the $k$’th iteration such that they have the same amplitude $p=|{p}_{j,k}|\text{\hspace{0.17em}}\text{\hspace{0.17em}}\forall \text{\hspace{0.17em}}j$.

## (59)

$${p}_{j,k}=p{e}^{i{\theta}_{j,k}}=p\text{\hspace{0.17em}}\mathrm{cos}({\theta}_{j,k})+ip\text{\hspace{0.17em}}\mathrm{sin}({\theta}_{j,k}),$$## (60)

$$\left[\begin{array}{c}\mathrm{\Delta}{I}_{1}\\ \vdots \\ \mathrm{\Delta}{I}_{{n}_{p}}\end{array}\right]=4p\left[\begin{array}{cc}\mathrm{cos}(\alpha +\pi \frac{{n}_{p}-{n}_{p}}{{n}_{p}})& \mathrm{sin}(\alpha +\pi \frac{{n}_{p}-{n}_{p}}{{n}_{p}})\\ \vdots & \vdots \\ \mathrm{cos}(\alpha +\pi \frac{{n}_{p}-1}{{n}_{p}})& \mathrm{sin}(\alpha +\pi \frac{{n}_{p}-1}{{n}_{p}})\end{array}\right]\left[\begin{array}{c}{E}_{R}\\ {E}_{I}\end{array}\right],$$## (62)

$${H}_{\theta}^{T}{H}_{\theta}=\left\{\begin{array}{c}\sum _{j=0}^{{n}_{p}-1}{\mathrm{cos}}^{2}(\alpha +\pi \frac{{n}_{p}-j}{{n}_{p}})\sum _{j=0}^{{n}_{p}-1}\frac{1}{2}\mathrm{sin}\left[2\right(\alpha +\pi \frac{{n}_{p}-j}{{n}_{p}}\left)\right]\\ \sum _{j=0}^{{n}_{p}-1}\frac{1}{2}\mathrm{sin}\left[2\right(\alpha +\pi \frac{{n}_{p}-j}{{n}_{p}}\left)\right]\sum _{j=0}^{{n}_{p}-1}{\mathrm{sin}}^{2}(\alpha +\pi \frac{{n}_{p}-j}{{n}_{p}})\end{array}\right\}=\frac{{n}_{p}}{2}\left[\begin{array}{cc}1& 0\\ 0& 1\end{array}\right].$$## (63)

$$\left[\begin{array}{c}{E}_{R}\\ {E}_{I}\end{array}\right]=\frac{1}{4p}\frac{2}{{n}_{p}}\left[\begin{array}{c}\sum _{j=0}^{{n}_{p}-1}\mathrm{\Delta}{I}_{j}\text{\hspace{0.17em}}\mathrm{cos}(\alpha +\pi \frac{{n}_{p}-j}{{n}_{p}})\\ \sum _{j=0}^{{n}_{p}-1}\mathrm{\Delta}{I}_{j}\text{\hspace{0.17em}}\mathrm{sin}(\alpha +\pi \frac{{n}_{p}-j}{{n}_{p}})\end{array}\right].$$## (64)

$$={\left(\frac{1}{4p}\right)}^{2}\left(\frac{2}{{n}_{p}}\right){\sigma}^{2}\{\mathrm{\Delta}I\}\left[\begin{array}{c}1\\ 1\end{array}\right]+{\sigma}^{2}\{p\}/{p}^{2}\left[\begin{array}{c}{E}_{R}^{2}\\ {E}_{I}^{2}\end{array}\right].$$The fundamental limit to measurement of ${E}_{R}$ and ${E}_{I}$ comes from Poisson statistics on the intensity measurements in Eq. (64), which is rewritten (without the $\sigma \{p\}$ term) as

## (65)

$${\sigma}^{2}\{{E}_{R}\}={\sigma}^{2}\{{E}_{I}\}={\left(\frac{1}{4p}\right)}^{2}\left(\frac{2}{{n}_{p}}\right){\sigma}^{2}\{\mathrm{\Delta}I\}.$$To determine the variances relevant to Eq. (65), we define ${C}_{p}$ such that

where ${C}_{p}$ has a contrast interpretation, e.g., ${C}_{p}={10}^{-8}$ corresponds to probes that add ${10}^{-8}$ intensity in the image plane. With this formulation, the units of ${p}^{2}$ are photoelectrons. For an individual intensity measurement, Poisson statistics dictate that the variance in ${\mathrm{photoelectrons}}^{2}$ is equal to the expected number of photoelectrons. For large probes, remembering that the units of both sides of this equation are ${\mathrm{photoelectrons}}^{2}$. Since any $\mathrm{\Delta}I$ is the difference of two intensities, each with variance ${\sigma}^{2}\{I\}$, the variance of a differential measurement is given by Revisiting Eq. (65) for large probes, the variance of one component of the field in units of photoelectrons is given by## (72)

$${\sigma}^{2}\{{E}_{R}\}={\sigma}^{2}\{{E}_{I}\}\sim {\left(\frac{1}{4p}\right)}^{2}\left(\frac{2}{{n}_{p}}\right)2{p}^{2}\phantom{\rule{0ex}{0ex}}\sim \frac{1}{4{n}_{p}}.$$It is instructive to define a new term, a dimensionless noise-equivalent contrast ${N}_{c}$ as one metric of the noise in the final electric field estimation. The electric field estimates, ${E}_{R}$ and ${E}_{I}$, are expressed in units common to the measurement used in the probe sequence, so the appropriate normalization for ${E}_{R}^{2}$ and ${E}_{I}^{2}$ is the same as the intensity image ${I}_{\mathrm{pk}}$ from Eq. (67). Analogous to Eq. (68), ${N}_{c}$ is defined such that

## (73)

$${N}_{c}=({\sigma}^{2}\{{E}_{R}\}+{\sigma}^{2}\{{E}_{I}\})/{I}_{\mathrm{pk}}\phantom{\rule{0ex}{0ex}}=\frac{1}{2{n}_{p}{F}_{\mathrm{pk}}t},$$${E}_{R}$ and ${E}_{I}$ are computed from a sequence of probe images, comprising ${n}_{p}$ pairs. Each pair is taken with two images using integration time $t$. We define the total time for the entire probe sequence to be

allowing us to write the noise-equivalent contrast as A sample calculation of this would be where ${t}_{\mathrm{tot}}$ is chosen to be long enough that the peak intensity reaches ${10}^{9}$ photoelectrons. This ${t}_{\mathrm{tot}}$ time is broken into $2{n}_{p}$ separate integrations with time $t$, with DM modulation in between images, allowing a calculation of ${E}_{R}$ and ${E}_{I}$. The noise-equivalent contrast ${N}_{c}$ is then ${10}^{-9}$. In other words, ${t}_{\mathrm{tot}}$ is the time required to reach SNR=1 at a contrast level defined by ${N}_{c}$.Wavefront control, in particular, offers an excellent use of ${N}_{c}$ with an interpretation in terms of contrast. Assuming an initial contrast of ${|{E}_{0}|}^{2}/{I}_{\mathrm{pk}}={10}^{-7}$, and a set of large amplitude probes that produce ${N}_{c}={10}^{-9}$, the measured field will have real and imaginary components given by ${E}_{R}=\mathfrak{R}\{{E}_{0}\}+{\delta}_{R}$ and ${E}_{I}=\mathfrak{I}\{{E}_{0}\}+{\delta}_{I}$ with an expectation of $E[{\delta}_{R}^{2}+{\delta}_{I}^{2}]/{I}_{\mathrm{pk}}={N}_{c}$. If wavefront control operates on the electric field and removes the measured field, ${E}_{R}+i{E}_{i}$, then the residual field is $-{\delta}_{R}$ and $-{\delta}_{I}$, such that the contrast after correction has an expectation value $E[({\delta}_{R}^{2}+{\delta}_{I}^{2})/{I}_{\mathrm{pk}}]={N}_{c}$. Thus, the noise equivalent contrast ${N}_{c}$ is the limit to how far wavefront control can reduce the contrast for a given electric field estimate.

The form of Eq. (75) is very different from the Poisson noise associated with measurement of the contrast itself, in that the noise does not depend on $|{E}_{0}|$. The relevant comparison is that the electric field variance is ${N}_{c}=E[{|({E}_{R}+i{E}_{I})-{E}_{0}|}^{2}]/({F}_{\mathrm{pk}}{t}_{\mathrm{tot}})$, which is distinct from the standard deviation of measured intensity $\sigma \{{E}_{R}^{2}+{E}_{I}^{2}\}/{I}_{\mathrm{pk}}$. The calculation of $\sigma \{{E}_{R}^{2}+{E}_{I}^{2}\}$ is algebraically laborious, but leads to an answer whose variance is twice the direct-measurement Poisson limit. Integrating over the total time ${t}_{\mathrm{tot}}$, the variance is given by

## (76)

$$\sigma \{{E}_{R}^{2}+{E}_{I}^{2}\}/({F}_{\mathrm{pk}}t)={2}^{1/2}{({|{E}_{0}|}^{2}2{n}_{p})}^{1/2}/({F}_{\mathrm{pk}}{t}_{\mathrm{tot}}).$$Another observation about ${N}_{c}$ is worth revisiting. While the ${N}_{c}$ limit from Poisson statistics is a fundamental limit, the fractional error $\sigma \{p\}/p$ from Eq. (64) also affects the maximum improvement from a single probe estimate. For example, contrast improvements by more than a factor of 10, such as the previous example of a factor of 100 from ${10}^{-7}$ to ${10}^{-9}$, may not be possible if there are 10% errors on $p$.

## 8.2.

### Stellar Shot Noise Using Small Probes

Section 8.1 dealt with large probes, where ${p}^{2}\gg {|{E}_{0}|}^{2}$. While the intermediate case where ${p}^{2}\sim {|{E}_{0}|}^{2}$ does not have much algebraic simplicity, the case of small probes, ${p}^{2}\ll |{E}_{0}{|}^{2}$, is simple to express. For small probes, the individual measured intensities, and therefore their Poisson noise terms, are dominated by $|E{|}^{2}$; based on Eq. (44) and analogous to Eq. (66), the probe intensity can be approximated as

Following the same logic as with the large probes, the noise equivalent contrast ${N}_{c}$ for small probes becomes Since, by construction, small probes have ${|{E}_{0}|}^{2}/{p}^{2}\gg 1$, this noise is far larger than the corresponding large probe ${N}_{c}$.## 8.3.

### Other Noise Sources

We can expect there are other sources of measurement noise that will degrade our estimate quality. Here we will include the effects of zodi or exozodi light $Z$ (in units of photoelectrons, integrated over time $t$), detector dark current $D$ (also in units of photoelectrons), and readout noise variance ${\sigma}_{\mathrm{ron}}^{2}$ (in units of ${\text{photoelectrons}}^{2}$). Large probes are assumed, as they will have the least noise in the electric field estimates. A measurement $\mathrm{\Delta}I$ comprising ${n}_{\mathrm{exp}}$ stacked exposures per probed image, where ${n}_{\mathrm{exp}}$ exposures together accumulate $t$ integration time, will have a variance

## (79)

$${\sigma}^{2}\{\mathrm{\Delta}I\}=2{\sigma}^{2}\{{I}_{j\pm}\}+2{\sigma}^{2}\{Z\}+2{\sigma}^{2}\{D\}+2{n}_{\mathrm{exp}}{\sigma}_{\mathrm{ron}}^{2}\phantom{\rule{0ex}{0ex}}=2{p}^{2}+2Z+2D+2{n}_{\text{exp}}{\sigma}_{\mathrm{ron}}^{2}.$$## (80)

$${N}_{c}=\frac{1}{{F}_{\mathrm{pk}}{t}_{\mathrm{tot}}}(1+\frac{Z+D+{n}_{\mathrm{exp}}{\sigma}_{\mathrm{ron}}^{2}}{{p}^{2}}).$$We can now see some new properties of the variance of the electric field estimate. We see that increasing the probe brightness $p$ reduces the variance contribution from incoherent background (zodi, dark current) and readout noise. If probes can be made arbitrarily large, ${N}_{c}$ will approach the fundamental limit from stellar shot noise alone.

## 8.4.

### Maximum Allowable Probe Brightness

Earlier we found that brighter probes (larger $p$) reduce the variance of the electric field estimate. However, if we make $p$ too large, higher-order terms of the Taylor series neglected in Eq. (3) will become the dominant source of error in Eq. (44). Here we will find the maximum allowable intensity of a probe before nonlinear terms become significant relative to the nominal contrast level, ${|{E}_{ab,k}|}^{2}$. For simplicity, we will once again evaluate a single pixel in the image plane. Rather than assuming we already know the probe ${p}_{j,k}$, we will treat the additive electric field change of the $j$’th probe at the $k$’th iteration as a general value $\mathrm{\Delta}{E}_{j,k}$. In this case, the least-squares solution in Eq. (46) is rewritten as

## (81)

$$\left[\begin{array}{c}\mathrm{\Delta}{I}_{1,k}\\ \vdots \\ \mathrm{\Delta}{I}_{{n}_{p},k}\end{array}\right]=4\left[\begin{array}{cc}\mathfrak{R}\{\mathrm{\Delta}{E}_{1,k}\}& \mathfrak{I}\{\mathrm{\Delta}{E}_{1,k}\}\\ \vdots & \vdots \\ \mathfrak{R}\{\mathrm{\Delta}{E}_{{n}_{p},k}\}& \mathfrak{I}\{\mathrm{\Delta}{E}_{{n}_{p},k}\}\end{array}\right]\left[\begin{array}{c}\mathfrak{R}\{{E}_{ab,k}\}\\ \mathfrak{I}\{{E}_{ab,k}\}\end{array}\right].$$## (82)

$${E}_{\mathrm{pup},k}{e}^{i{\phi}_{k}}\approx \tilde{E}[1+i\delta {\phi}_{k}-\frac{1}{2!}\delta {\phi}_{k}^{2}-\frac{i}{3!}\delta {\phi}_{k}^{3}+\frac{1}{4!}\delta {\phi}_{k}^{4}],$$## (83)

$${E}_{ab,k}(\pm \delta {\phi}_{k})=\mathcal{C}\{{E}_{\mathrm{pup},k}{e}^{i{\phi}_{k}}\}\phantom{\rule{0ex}{0ex}}\approx \mathcal{C}\{\tilde{E}\}\pm {G}_{k}{u}_{j.k}-\frac{1}{2}\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{2}\}\mp \frac{i}{3!}\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{3}\}+\frac{1}{4!}\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{4}\}.$$## (84)

$$\mathrm{\Delta}E=\frac{{E}_{ab,k}(+\delta {\phi}_{j,k})-{E}_{ab,k}(-\delta {\phi}_{j,k})}{2}\approx {G}_{k}{u}_{j,k}-\frac{i}{3!}\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{3}\}.$$## (85)

$$\mathrm{\Delta}{I}_{j,k}=|{E}_{j+,k}{|}^{2}-{|{E}_{j-,k}|}^{2}\phantom{\rule{0ex}{0ex}}=4\mathfrak{R}\{\u27e8{E}_{ab,k},{G}_{k}{u}_{j,k}\u27e9\}-2\mathfrak{R}\{\u27e8{G}_{k}{u}_{j,k},\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{2}\}\u27e9\}-\frac{2}{3}\mathfrak{R}\{\u27e8{E}_{ab,k},\mathcal{C}\{i\tilde{E}\delta {\phi}_{j,k}^{3}\}\u27e9\}\phantom{\rule{0ex}{0ex}}+\frac{1}{3}\mathfrak{R}\{\u27e8\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{2}\},\mathcal{C}\{i\tilde{E}\delta {\phi}_{j,k}^{3}\}\u27e9\}\phantom{\rule{0ex}{0ex}}+\frac{1}{6}\mathfrak{R}\{\u27e8{G}_{k}{u}_{j,k},\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{4}\}\u27e9\},$$## (86)

$$4\mathfrak{R}\{\u27e8\mathrm{\Delta}{E}_{j,k},{E}_{ab,k}\u27e9\}=4\mathfrak{R}\{\u27e8{E}_{ab,k},{G}_{k}{u}_{j,k}\u27e9\}-\frac{2}{3}\mathfrak{R}\{\u27e8{E}_{ab,k},\mathcal{C}\{i\tilde{E}\delta {\phi}_{j,k}^{3}\}\u27e9\}.$$## (87)

$$\mathrm{\Delta}{I}_{\text{err}}=-2\mathfrak{R}\{\u27e8{G}_{k}{u}_{j,k},\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{2}\}\u27e9\}+\frac{1}{3}\mathfrak{R}\{\u27e8\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{2}\},\mathcal{C}\{i\tilde{E}\delta {\phi}_{j,k}^{3}\}\u27e9\}\phantom{\rule{0ex}{0ex}}+\frac{1}{6}\mathfrak{R}\{\u27e8{G}_{k}{u}_{j,k},\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{4}\}\u27e9\}.$$## (88)

$$\mathrm{\Delta}{I}_{\mathrm{err}}\approx -2\mathfrak{R}\{\u27e8{G}_{k}{u}_{j,k},\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}^{2}\}\u27e9\}.$$## (89)

$$\mathrm{\Delta}{I}_{\mathrm{err}}\approx {|{G}_{k}{u}_{j,k}|}^{3}={|\mathcal{C}\{\tilde{E}\delta {\phi}_{j,k}\}|}^{3}.$$## 9.

## Conclusions

We have outlined what we consider to be the current best practices of focal plane wavefront probing, estimation, and control. Both the EFC and stroke minimization control laws are model-based, and their performance relies heavily on the accuracy of the Jacobian used to define the transfer function between DM actuation and image plane field. When stroke minimization algorithm is implemented with an equality constraint in a quadratic cost function, it is identical to EFC when its desired field is set to zero. If the true inequality constraint were applied, the two algorithms would behave differently, but stroke minimization has not been implemented in this fashion to date. More differences appear in the broadband control cases, with stroke minimization more explicitly allowing a greater set of degrees of freedom since each wavelength has its own Lagrange multiplier. That is not to say we could not attempt weighting the broadband EFC algorithm, but a multiparameter optimization is inherent to the formulation of windowed stroke minimization. However, the biggest difference in the control solution for a specific dark hole comes in the user’s choice of contrast/field target, how often the Jacobian is relinearized, and how the regularization parameters are chosen at each iteration. The choice in size and location of the dark hole also plays a major role in performance, but the outcome is highly dependent on the initial state realized in any individual optical system.

Control will only be as accurate as the wavefront estimation and probing steps. Here we have described the original batch process method as a segue to more advanced Kalman filtering algorithms. So long as the Kalman filter’s observation matrix is well conditioned, we are guaranteed that the estimate will not get worse over time, a very important fact as the dark hole reaches higher contrast levels and noise plays a more significant role. Since the estimator will be limited by the probes we apply, we propagated measurement errors through the estimator to quantify how they limit our contrast performance. Various noise sources and uncertainties bound the brightness of the probes, suggesting that brighter probes are better until nonlinear errors in the estimate become significant.

## Acknowledgments

Tyler D. Groff, A. J. Eldorado Riggs, and N. Jeremy Kasdin are funded for this work by the NASA Nancy Grace Roman Technology Fellowship (NNX15AF28G), the NASA Space Technology Research Fellowship (NNX14AM06H), and Jet Propulsion Laboratory (JPL) subaward 1499194 under contract from National Aeronautics and Space Administration (NASA). For Brian Kern, this research was carried out at JPL, California Institute of Technology, under a contract with NASA.

## References

## Biography

**Tyler D. Groff** is an associate research scholar at Princeton University. He received his BS in mechanical engineering and astrophysics from Tufts University in 2007 and his PhD from Princeton University in 2012. His current research interests include exoplanet imaging, optomechanics, optical design, wavefront estimation and control, infrared spectroscopy, integral field spectrographs, deformable mirrors, and ferrofluidics. He is a member of the American Astronomical Society and SPIE.

**A. J. Eldorado Riggs** is a PhD candidate in the Department of Mechanical and Aerospace Engineering at Princeton University. He received his BS in physics and mechanical engineering from Yale University in 2011. His research interests include coronagraph design and wavefront estimation and control for exoplanet imaging.

**Brian Kern** is an optical engineer at NASA Jet Propulsion Laboratory (JPL). He received his PhD from Caltech in 2002. His research interests are in high-contrast imaging. He works on the JPL High Contrast Imaging Testbed.

**N. Jeremy Kasdin** is a professor of mechanical and aerospace engineering and vice dean of the School of Engineering and Applied Science at Princeton University. He is the principal investigator of Princeton’s High Contrast Imaging Laboratory. He received his PhD from Stanford University in 1991. His research interests include space systems design, space optics and exoplanet imaging, orbital mechanics, guidance and control of space vehicles, optimal estimation, and stochastic process modeling. He is an associate fellow of the American Institute of Aeronautics and Astronautics and member of the American Astronomical Society and SPIE.