Optimal scheduling of exoplanet direct imaging single-visit observations of a blind search survey

Abstract. We present an algorithm, effective over a broad range of planet populations and instruments, for optimizing integration times of an exoplanet direct imaging observation schedule, to maximize the number of unique exoplanet detections under realistic mission constraints. Our planning process uses “completeness” as a reward metric and the nonlinear combination of optimal integration time per target and constant overhead time per target as a cost metric constrained by a total mission time. We validate our planned target list and integration times for a specific telescope by running a Monte Carlo of full mission simulations using EXOSIMS, a code base for simulating telescope survey missions. These simulations encapsulate dynamic details such as time-varying local zodiacal light for each star, planet keep-out regions, exoplanet positions, and strict enforcement of observatory use over time. We test our methods on the Wide-Field Infrared Survey Telescope (WFIRST) coronagraphic instrument (CGI). We find that planet, Sun, and solar panel keep-out regions limit some target per-annum visibility to <28  %   and that the mean local zodiacal light flux for optimally scheduled observations is 22.79 mag arcsec  −  2. Both these values are more pessimistic than previous approximations and impact the simulated mission yield. We find that the WFIRST CGI detects 5.48  ±  0.17 and 16.26  ±  0.51 exoplanets, on average, when observing two different planet populations based on Kepler Q1-Q6 data and the full Kepler data release, respectively. Optimizing our planned observations using completeness derived from the more pessimistic planet population (in terms of overall planet occurrence rates) results in a more robust yield than optimization based on the more optimistic planet population. We also find optimization based on the more pessimistic population results in more small planet detections than optimization with the more optimistic population.


Introduction
The 2010 astronomy and astrophysics decadal survey highly prioritized exoplanet bulk population statistics and inventorying planets around nearby stars (within 30 pc). 1 The Wide-Field Infrared Survey Telescope (WFIRST), 2 prioritized by the 2010 decadal survey, will include a coronagraphic instrument (CGI) capable of directly imaging and detecting new exoplanets unobservable by modern radial velocity or transit techniques. The expected performance of CGI in blind search surveys can be estimated through probabilistic methods using completeness. 3,4 An alternative method is to execute a Monte Carlo of full survey simulations on simulated universes. This process creates an ensemble of design reference missions (DRMs) containing a list of target stars observed, the integration time used for each star, when the simulated observations occurred, and the simulated outcome of each observation. Such a collection of DRMs, produced by our method, effectually certifies the ability of the instrument to make the expected number of detection observations claimed in a probabilistic evaluation such as completeness. It is important to note that both methods are still equally limited in their overall prediction accuracy by the assumptions made about the true population of exoplanets to be discovered.
Detailed DRMs enable requirement definition and design iteration optimization for future telescopes, including the large-scale mission concepts under development by science and technology definition teams for NASA's 2020 decadal survey. Both the Habitable Exoplanet Observatory (HabEx) 5 and the Large UV-Optical-Infrared Surveyor (LUVOIR) 6 mission concepts contain a significant exoplanet direct imaging component with HabEx reserving 1.95 years for coronagraph science operations and LUVOIR reserving 50% of the total mission time for exoplanet science. While target revisits and spectral characterizations could represent a substantial portion of the executed mission, we purposefully omit optimization with revisits or characterizations due to the complexity of that problem. We do include the rare spectral characterization in our WFIRST mission simulations. In our optimization process, we focus solely on delivering an estimate of the maximum possible number of uniquely detected exoplanets under realistic mission constraints by evaluating the number of detections made through single-visit observations of stars, henceforth referred to as yield. We leave the inclusion of revisits, orbit characterizations, and spectral characterizations in full survey optimization for future work.
Brown, in Ref. 3, used single-visit completeness to estimate the number of extrasolar planets potentially discoverable with the terrestrial planet finder, an earlier direct imaging mission concept. Single-visit completeness, hereafter referred to as completeness, is the probability of detecting a planet, drawn from a particular population, using a particular instrument, should one exist about a given target star. 3,4,7 When completeness is evaluated for each star in a full target list, summed, and scaled by the expected number of planets from that population per star (η), we arrive at the expected number of planets to be detected from that population by that mission. 8 While this technique can be used to quickly evaluate a mission's performance, it abrogates temporal constraints and uncertainties, such as target visibility, variable overhead times, changing local zodiacal light intensity, and unscheduled characterizations of newly detected planets. Solely using completeness to evaluate a mission can therefore only provide an upper bound for expected performance.
Completeness has previously been used as a reward metric for multiple observation integration time optimizations. Brown demonstrated a method for finding the group Δmag (difference in brightness between the planet and star in magnitudes) and number of target trade-off point by optimizing a target list subset Δmag against the number of targets in that subset assuming different fixed mission times. 3 While this method approximates the reward gradient for achieving a specific group Δmag, it overlooks the gain made by customizing Δmag i for each individual star, i. Hunyadi et al.,in Ref. 9, advanced Brown's work by maximizing summed completeness over all targets assuming a fixed mission time and using integration times as the decision variables. In this new approach, star integration times are optimized to equivalent slopes beyond the completeness versus log integration time inflection point and the highest completeness per integration time of these targets is then selected. To practically achieve this, the authors of that study discretized integration times into 1 h increments and calculated completeness values for each integration time. Their final target list contained the set of highest completeness per integration time targets. Hunyadi et al. specifically investigated Earth-analog planets in the habitable zone (as also done in Ref. 3), but also explored Jupiter and Saturn-analogs. Alternatively, a limiting search observation, as defined by Brown, 10 would observe a target for the fixed exposure time sufficient to achieve the system's limiting planet-star magnitude, Δmag lim , for each target. The creation of the final observation schedule in Ref. 10 involved selecting the subset of targets with precalculated integration times and overhead time per observation that fit within the total observing time. While Brown implemented target revisits, which can improve yield, the use of precalculated Δmag lim makes the resulting target list suboptimal. While optimization of integration times and inclusion of revisits mark improvements in yield and target list planning realism, the omission of overhead times and discretization of integration times limits the ability to practically implement the desired observations within a finite span of time and under dynamic mission constraints.
A refinement of Brown and Hunyadi's work-altruistic yield optimization (AYO) 4 -uses completeness versus integration time as a figure of merit as in Ref. 10 to incrementally sacrifice stars from a target list and reallocate the integration time, t i , from star i to the largest dc j ðt j Þ∕dt j star j in increments of dt. At its core, this represents a form of "greedy optimization" which incrementally converges each observation to a constant slope similar to that described in Ref. 9. This method does not allow for the reintroduction of targets stars to the target list, a necessity of a dynamically evolving mission schedule. Finally, the original AYO method does not fully account for overhead times in the calculation of integration times but rather states that the addition of time can occur after the fact and the use of a finite dt parameter results in a loss of potential completeness.
None of these yield optimization processes use optimization with continuous integration times or test the ability to schedule observations via full mission simulations accounting mission elapsed time (MET). Brown stressed that Monte Carlo simulations of the mission as a whole should be used to produce confidence in the proposed mission's integrity. 3 Brown's work seeded the founding pillars of our well book-kept full mission survey simulator to include: 1. tracking individual exoplanets around target stars versus MET; 2. tracking spacecraft position versus MET; 3. accounting for solar system body locations and keep-out regions versus MET; 4. accounting for variations in local zodiacal light versus MET; 5. potential restriction of telescope observations to prescribed observing blocks and timesharing with other observatory instruments versus MET.
The EXOSIMS 11 code base was specifically developed to book-keep these parameters across MET. In EXOSIMS and this paper, we account for the locations of individual exoplanets around target stars versus MET, the tracking of our observatory on a nominal L2 Halo orbit 12 versus MET, solar system body locations versus MET from ephemeridies, keep-out occlusion of target stars versus MET, changes in local zodiacal light intensity versus MET, and possesses the capability to account for cordoned off observing blocks reserved for other instruments at varying MET, and portion of mission life reserved for observatory science.
This paper describes our process for producing a set of planned observations maximizing unique exoplanet detection yield and subsequently validating this prediction under realistic mission conditions. The observation planning process incorporates a combination of filters applied to a planet population and star catalog, described in Secs. 2.1 and 2.2, completeness calculations detailed in Sec. 2.3, and our optimization algorithms with binary-integer programming (BIP) 13 and sequential least squares quadratic programming (SLSQP), 14 discussed in Sec. 2.4. Our validation process is outlined in Sec. 2.5 where we discuss the framework of EXOSIMS 11,[15][16][17] survey simulation as well as incorporation of time-varying keep-out regions (Sec. 2.5.1), timevarying local zodiacal light noise (Sec. 2.5.2), and the convergence of our Monte Carlo simulation (Sec. 2.5.4). We then show practical application on WFIRST in Sec. 3, where we discuss attributes of the observing plan and a comparison of planet populations input versus instrument capabilities versus detected planet population. We also include instrument and fit file parameters 18 in Sec. 7 and a full optimized target list in Sec. 8.

Planet Populations
To calculate completeness for each target for a specific instrument, we first generate a joint probability density function of Δmag and planet-star separation projected onto the image plane, s, using Monte Carlo as in Ref. 3 for an assumed planet population. We use two planet populations in this paper; one derived from Kepler's detections from Q1-Q6 data 19 (Keplerlike) 20 and another derived from NASA's Exoplanet Program Analysis Group (ExoPAG) Study Analysis Group 13 (SAG13). 21,22 The Kepler-like and SAG13 populations both share the same eccentricity and albedo distributions but differ in their occurrence rates, semimajor axis, and planetary radii distributions. The final output necessary for calculating completeness is the joint probability distribution of Δmag and s which is found by sampling the respective planet population and fitting a rectilinear bivariate spline. We use the same sampling methods and distributions for both calculating the joint probability distributions and generating planets in our simulated universes.

Kepler-like planetary semimajor axis and radii
For the Kepler-like planet population, we adopt a power-law distribution for semimajor axis (a) modified from the power law fit in Ref. 23 to be of the form E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 1 1 6 ; 6 9 2 (1) where −0.62 is adopted from Ref. 24. In this model, we include an exponential decline in semimajor axis past a "semimajor axis knee" (a knee ). This "knee" is motivated by a lack of giant planet discoveries by the Gemini Planet Imager, which indicates the RV-based power-law planet occurrence rate from Ref. 23 approaches 0 between 10 and 100 AU. 25 Using giant planet radial velocity data, Ref. 26 fit many period break points to the giant planet occurrence rate power-law model. We assume an intermediate knee equivalent to a knee ¼ 10 AU. The normalization factor is given by integrating the non-normalized distribution over a specific a range E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 1 1 6 ; 5 6 3 where we consider values of a range in a min ¼ 0.1 AU to a max ¼ 30 AU, again based on the paucity of wide-separation planets discovered to date. The per-simulation average distribution of a is shown in the histogram above Fig. 1(a). We note that a min < minðs min;i Þ for WFIRST and this planet population. Since the closest target list star has distance minðd i Þ ¼ 2.63 pc and has the smallest observable planet star separation minðs min;i Þ ≈ IWA × minðd i Þ, then WFIRST's inner working angle (IWA) of 0.15 arcsec means the smallest planet-star separation observable by WFIRST is 0.394 AU. By assuming a maximum planet eccentricity ðe max Þ ¼ 0.35, the smallest observable semimajor axis a min from ðs min Þ ≈ a min ð1 þ e max Þ ¼ 0.292 AU. We base our Kepler-like planetary radii (R) based on Fig. 7 in Ref. 19. We define the bin counts in Fig. 7 of Ref. 19 as R 85 . These bins range from R min ¼ 1R È to R max ¼ 22.6R È and are based solely on data from planets with periods >0.8 days, but <85 days. To properly tune the overall Kepler-like planet occurrence rates (η KL ) starting with R 85 , we first use Eq. (2) evaluated at a min ¼ a 0.8 to a max ¼ a 85 to get a 85;norm . Here, a 0.8 and a 85 are the semimajor axes corresponding to periods of 0.8 and 85 days around a Sun mass star. We then scale R 85 by Eq. (2) and a 85;norm to get the adjusted planetary radii occurrence rates: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 3 ; 1 1 6 ; 3 3 5 We multiply the last five bins in R vals by 2.5 to account for longer orbital baseline data and more closely match the larger period orbit distributions available from radial velocity surveys at the time when this distribution was first derived. 23,27 Even with this multiplication, the right-hand side histogram of Fig. 1(a) shows the characteristic drop-off in planetary radius clearly observable in Fig. 7 of Ref. 19. The Kepler-like planet occurrence rate per star is η KL ¼ P R vals ¼ 2.375 over the specific a and R ranges discussed in this section. When we simulate a universe of Kepler-like planets, we first calculate the number of planets to generate for each planet radius bin (N q ) for all bins in R vals . This is given as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 4 ; 1 1 6 ; 1 9 6 In this paper, N p is the number of planets to generate. In our Monte Carlo completeness calculations, we generate 10 8 planets. We take N q samples from a log-uniform distribution over the q'th bin range in R vals to get a set of planets radii (R q ). 19 Finally, we randomly select N p radii from the collective nine sets of planetary radii to get R ¼ S 9 q¼1 R q . At the same time, we use an inverse transform sampler to generate a ¼ fa k ∀ k ∈ 1::N p g based on Eq. (1).

SAG13 planetary radii and semimajor axis
The SAG13 planet population represents a significant update on the Kepler-like population, incorporating all available Kepler data circa 2017. The occurrence rate model presented in Ref. 28 is a power-law model fit as a function of ln P and ln R but does not include an occurrence rate turnover indicated in giant planet occurrence rates from radial velocity studies 26 and the lack of detections of large a planets by subsequent Gemini Planet Imager survey results. 25 Fig. 1 The universe of planets generated over all simulations for (a) Kepler-like and (b) SAG13 planet populations. (c) and (d) The populations of detected planets for simulations run with completeness calculated using the Kepler-like planet population observing a universe of (a) the Kepler-like planets and (b) SAG13 universe of planets. (e) and (f) The populations of detected planets for simulations run with completeness calculated using the SAG13 planet population observing a universe of (a) the Kepler-like planets and (b) SAG13 universe of planets. Overlay text on (a) and (b) shows average occurrences per grid-space averaged over the Monte Carlo of simulations. Overlay text on (e) and (f) shows average detections per grid-space averaged over the Monte Carlo of simulations. These plots reference RpvsSMAdetectionsDATA_ WFIRSTcycle6core_CKL2_PPKL2_2019_04_05_19_34_.txt.
We need an occurrence rate model that is in terms of linear Keplerian orbital elements and decreases occurrence rates for large a planets so our extrapolation of the model beyond the bounds in Ref. 28 does not produce substantially more large a planets than expected. We include a full derivation of the SAG13 occurrence rate model implemented 22,29 in EXOSIMS in Sec. 7.8 but summarize the steps, important components, and results here.
We start with the original occurrence rate model fit from Ref. 28 included as Eq. (37) and convert from ðln P; ln RÞ to ðP; RÞ resulting in Eq. (38). We reconstructed the occurrence rate grid from the linear model in Fig. 2(a). When directly compared to the occurrence rate grid for G-type stars in Ref. 28, we see a maximum deviation of 0.74 (100× planets per bin occurrence) and maximum percent deviation of 13%. We then convert Eq. (38) from period (P) space to a space using Eqs. (39) and (40) assuming a solar mass star to arrive at Eq. (41). We compare the analytical integral over a space of this new occurrence rate model to arrive at Fig. 2(b) which shows identical occurrence rates in each of their counterparts' grid spaces in Fig. 2(a). We now append a semimajor axis "knee" to decay occurrence rates of larger a planets to 0, which results in Eq. (42). We calculate the overall SAG13 exoplanet occurrence rate per planet (η SAG13 ) by performing the double integral over the entire a and R space as in Eq. (43). The SAG13 model parameters in Table 1 are split at 3.4R È , so we use the notation ; t e m p : i n t r a l i n k -; e 0 0 5 ; 1 1 6 ; 5 0 7 where γ i , β i , and α i are constants from Table 1. The α i in the denominator and ðα i − 1Þ to α i exponent are a result of the indefinite integral over R. The four R terms are the result of evaluating the indefinite integral over R min to 3.4R È to R max . Finally, K i is an intermediate calculation representing the marginalization of Eq. (42) over a written as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 6 ; 1 1 6 ; 4 2 0 Here, μ is the gravitational parameter assuming a solar mass star. We use the same a knee ¼ 10 AU in the SAG13 planet population that we use for the Kepler-like planet population. The average number of planets generated per star in the EXOSIMS implemented SAG13 universe over the range specified is η SAG13 ¼ 5.62. In order to generate planets with R and a from the occurrence rate model, we still need a joint probability density function f R p ;a ðR; aÞ as well as a single variable probability density function and conditional probability density function. By normalizing Eq. (42) by η SAG13 , we get the joint probability distribution in Eq. (44). From here, we calculate the probability distribution of R for the SAG13 planet population by marginalizing Eq. (44) over a only. This gives us the planetary radius distribution E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 7 ; 1 1 6 ; 2 5 0 This probability density function inherits the parametric conditions of the joint probability density function. Since f R ðRÞ is a marginalization over the joint probability density function, the integral of f R ðRÞ over the range R min ≤ R ≤ R max is 1. The distribution of generated planets in R is shown in the right-side histogram of Fig. 1(b). We now calculate the conditional probability distribution f ajR ðaÞ using f R;a ðR; aÞ, f R ðRÞ, and Bayes rule. The conditional probability distribution is a simple division of the joint probability density function by the marginalized probability density function to get E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 8 ; 1 1 6 ; 1 2 3 We now have the tools to sample semimajor axis and planetary radius. We calculate the SAG 13 exoplanet occurrence rate cumulative distribution by integrating over the linear occurrence rate model derived from Ref. 28 and included in Eq. (38) over R and P of each bin in (a). The purple numbers in (a) represent 100× the per-bin per star exoplanet occurrence rate and can be directly compared to the planet occurrence rate grid per star in Ref. 28, which shows good agreement with the largest deviation between implementations of 0.74%. The total SAG 13 occurrence rate per star found using Eqs. (38) and (43) over the plot range in (a) is 1.95 planets per star. The black percentages in (a) are calculated by dividing the 100× per bin occurrence (purple number) by the occurrence rate over the entire grid range and represents the probability a single planet will be in that bin. The exoplanet occurrence rate cumulative distribution in (b) is found using Eqs. (41) and (43) and integrating over the R and a ranges in (b). The P range in (a) and the a range in (b) are mapped using Eq. (39) assuming a Sun mass star. The 100× and percentage occurrence rates per bin between (a) and (b) are identical. We show the joint probability distribution of R and a from Eq. (44) implemented in EXOSIMS in (c) which is extrapolated beyond the range of (a) and includes the semimajor axis "knee." The apparent difference in color gradient between (a) or (b) and (c) is due to the logarithmic scale of (c); the top right bins are orders of magnitude larger in area than the bottom left bins.
To generate a planet from the probability distributions in this section, we first sample Eq. (7) using a single variable inverse transform sampler to get a set of planetary radius, R p . We also use a single variable inverse transform sampler to sample Eq. (8) and get a. The a min and a max used in the SAG13 population are the same as in the Kepler-like planet population. SAG13's a frequency distribution is included above Fig. 1(b). The two-dimensional (2-D) contour plot and associated grid values in Fig. 1(b) show R and a interdependence.

Planet eccentricity
We assume a Rayleigh distribution for orbital eccentricities in both the Kepler-like and SAG13 planet populations as done in Ref. 24, such that the eccentricity (e k ) of the k'th simulated planet is where n is a uniform random variable between 0 and 1, σ e is the Rayleigh parameter for eccentricity, e min is the minimum allowed eccentricity, defined as 0, and e max is the upper 95th percentile for e. The mean eccentricity μ e ¼ σ e ffiffiffiffiffiffiffiffi π∕2 p . In Ref. 24, the authors fit the Rayleigh distribution to radial velocity-detected planets and found μ e ≈ 0.225 to be a best fit with p-value of 0.5. They additionally found 0.125 < μ e < 0.25 has a p-value above 0.05 and placed strict limits of μ e < 0.35 and μ e > 0. 24 For this work, we use μ e ¼ 0.175 (splitting the difference between 0.225 and 0.125) as done in Ref. 19 and adopt their independence of star spectral type assumption.

Planet Albedo from semimajor axis
In both the Kepler-like and SAG13 planet population, we calculate albedo (p), using a cubic 2-D interpolant over metallicity and a using the V-band column from Table 4 of Ref. 30. The individual values in Table 4 from Cahoy's work are based on the atmospheric modeling of Jupiter and Neptune at a from 0.8 to 10 AU over varying metallicity ranging from 1× solar abundances to 30× solar abundances. To accommodate these restrictions for each p k interpolated, we use the randomly generated a k truncated to be between 0.8 and 10 AU. This means any planets with a k < 0.8 AU will be treated as having 0.8 AU in the interpolant and conversely and planets with a k > 10 AU will be treated as having 10 AU in the interpolant. Jupiter at 0.8 AU in Cahoy's work were found to be cloud-free, resulting in a geometric albedo of 0.322 which is comparable to Earth's geometric albedo. We assume a uniform random planet metallicity multiplication factor between 1× and 30×. Metallicity of a planet is a descriptor for the abundance of elements heavier than hydrogen or helium in that planet. The least reflective planet would be a Neptune at 0.8 AU with 30× solar abundance metallicity, which is slightly larger than the geometric albedo of Mercury and the Moon. The most reflective planet would be a Jupiter at 2 AU with 3× solar abundance. Planets formed through different processes will have varying metallicity representative of the planet's atmospheric metallic composition relative to solar abundances.

Joint probability density function
The final remaining parameters needed to describe planets at the time of an observation are the physical location of the planets relative to their host stars (r k∕i ), which is defined by Eq. (1) in Ref. 31. We assume that the direction of the orbit eccentricity vector is uniformly distributed in space, so that the orbit inclination is sinusoidally distributed between 0 and π, while the remaining Keplerian elements (longitude of the ascending node, argument of periapsis, and mean anomaly) are all uniformly distributed from 0 to 2π. 31 We sample all of the parameters aforementioned for each planet which is sufficient information to calculate a projected planet-star separation ðs k Þ and difference in magnitude between the host star and the planet ðΔmagÞ. We calculate the projected planet-star separation as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 0 ; 1 1 6 ; 6 0 7 We calculate Δmag k using Eq. (4) from Ref. 7 and adopt their use of a Lambert phase function. Sampling all of the parameters described above for N p ¼ 10 8 planets allows us to calculate the joint probability density of projected separation and star-planet magnitude difference, f s;Δmag ðs; ΔmagÞ. As in Eq. (7) in Ref. 7, we assume independence between all parameters, except for semimajor axis and planet radius in the SAG13 population as well as the planet albedo dependence on planet radius. We bin these planets into a 1000 × 1000 grid and fit a rectilinear bivariate spline to the resulting 2-D histogram. This rectilinear bivariate spline consists of integrable high order polynomials with total volume under the surface of 1. The resulting f s;Δmag ðs; ΔmagÞ densities for the two populations are shown in Fig. 3.

Star Catalog
We need a list of target stars, along with their positions on the sky, distance (d i ), and apparent brightness in B and V bands for our calculations of completeness and integration time. We derive the list of targets stars from the EXOCAT-1 star catalog discussed in Ref. 32, which contains a variety of targets out to a distance of ∼30 pc. Some of these stars are missing photometric values, including the luminosity, absolute magnitude, V band bolometric correction, and the apparent VBHJK magnitudes, all of which we attempt to fill in by interpolating a table of standard stars by spectral type in Ref. 33. From the N ¼ 2396 targets, we have an initial set of target stars I which we pare down through a series of filters. There are many stars in the EXOCAT-1 star catalog with "not a number" data entries which can propagate through our equations so we remove targets with these kinds of data entries. Not all of these entries are absolutely necessary so some targets may have been unnecessarily filtered. Some of these star systems are binary systems which we do not account for in our equations so we omit them from I. In addition, some stars have low apparent visual magnitudes so we filter any target stars we know would take excessively long to achieve a reasonable Δmag on. Missing data filter removes 429 targets with any fields containing a "not a number" value within the set of IPAC fields {hip_name, st_spttype, parx, st_vmag, st_j2m, st_h2m, st_vmk, st_dist, st_bmv, st_mbol, st_lbol, coords, st_pmra, st_pmdec, wds_sep} as well as whenever we could not fill in a missing photometric value of a star.
Binary star filter removes 164 targets using the Washington Double Star catalog (filters targets with companion stars within <10 arcsec). 34 Integration time cut-off filter removes 1436 targets with integration times t i > 30 days, where t i is calculated assuming local zodiacal light to be Z ¼ 23.0 mag arc sec −2 , exozodiacal light with magnitude EZ After paring down I using a missing data filter, binary star filter, and integration time cut-off filter, I is reduced down to 651 targets (the filters are not mutually exclusive). Our initial filtering of target stars reduces the number of degrees of freedom in the optimization process and increases computation time. Larger telescopes will result in larger target lists with less candidate stars removed via the initial integration time filter. The statically optimized target list is included in Table 7. This optimized target list contains mainly FGK-type stars and 8 A-type stars. All of these targets are located within 20 pc.

Calculating Completeness
Completeness is calculated for the i'th target by integrating over the joint probability density function of s and Δmag 15 E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 1 ; 1 1 6 ; 6 9 2 s min;i f s;Δmag ðs; ΔmagÞds dΔmag: (11) The limits on the inner integrand are strictly obscurational. For star i, at a distance d i from the Sun, the minimum planet-star separation observable is s min;i ¼ IWA d i and the maximum planet-star separation is s max;i ¼ OWA d i , where IWA and OWA are the instrument's inner and outer working angles, respectively. For the outer integrand, we use the fundamental lower limit on Δmag min;i ¼ 0 as opposed to the analytical lower bound in Eq. (18) Here, ν i ðλÞ is the target star's B-V color, implemented as an empirical fit to data from Ref. 33 (see Sec. 7), which is accurate to about 7% in the wavelength range 400 nm < λ < 1000 nm as discussed in Ref. 36. SNR is the signal-to-noise ratio threshold chosen for determining planet detections. 3 ϵ PC is the photon-counting efficiency of the system and is used to give the ideal planet count rate. Tðλ; WAÞ is the instrument's core throughput (see Sec. 7). T is a function of WA, and by extension s so the integral over Δmag is not separable unless we assume a single representative WA at which to evaluate each star. Assuming a WA substantially speeds up calculations, but may not be representative of instruments with strong Δmag lim dependence on WA. C b;i is the net background count rate, and C sp;i is the net speckle residual count rate, including all postprocessing assumptions. We use the calculations of C p;i , C b;i , and C sp;i from Ref. 35 and include them in Sec. 7. The spectral flux density is given as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 3 ; 1 1 6 ; 3 5 0 Here, F 0 is the zero-magnitude flux calculated as in Eq. (23) of Sec. 7 and presented in Ref. 36, A is the pupil area, Δλ is the spectral bandwidth, ϵ q ðλÞ is the detector quantum efficiency from Fig. 5 in Sec. 7, ϵ inst is the optical attenuation due to the science instrument, and ϵ syst is the optical attenuation due to the coronagraph, treated separately since a single instrument can have multiple coronagraphs. By integrating Eq. (11) with the limiting Δmag i from Eq. (12), we arrive at a formulation for completeness as a direct function of integration time, c i ðt i Þ.
The theoretical maximum completeness for the i'th target (c ∞;i ) is found by integrating Eq. (11) to the upper limit of Δmag i at t ∞ (this assumes an infinite observing time is available). Using the WFIRST parameters, we see a universal upper limiting Δmag i of 23.137. It is important to note that the inclusion of speckle residuals means that integrating past a certain point will not produce any improvements in the achieved SNR, meaning there is a specific time for every target past which it makes no sense to integrate further.
While the equations derived thus far are sufficient to perform continuous integration time optimization, gradient-based solvers, such as the ones we employ as follows, all perform significantly better with analytical gradient functions. To expedite the optimization process in Sec. 2.4, we calculate the derivative of completeness with respect to integration time as a function of integration time. As in Ref. 15, the derivative of Eq. (12) with respect to time is E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 4 ; 1 1 6 ; 1 1 1 and the derivative of completeness with respect to integration time is therefore E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 5 ; 1 1 6 ; 5 0 0 We now have all the analytical expressions needed to optimize our planned observing schedule and have filtered down the original >2000 deg of freedom represented by the required integration times for each star in the input catalog to 651 (see Sec. 2.2). Input decision variables t ¼ t i ∀ i ∈ I of 651 deg of freedom are still quite large and could take a long time to compute, especially given the nonconvexity of the sigmoid-shaped c i ðt i Þ curves. In order to ensure fast convergence of the nonlinear optimization, we need to provide a feasible starting guess preferably close to the final solution. We know from experimentation with AYO 4 that an optimal observation schedule will converge to a fixed ε ¼ dc i ∕dt i ∀ i ∈ I. By combining Eq. (15) with Eq. (12) and inverting, we can find integration time as a function of dc i ∕dt i ≡ ε E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 6 ; 1 1 6 ; 2 7 7 which allows us to solve for integration time of all targets in a specific subgroup of I at ε. 15 This provides us with everything we need to both formulate the optimization problem and find a good initial solution, as described in the next section.

Optimization Process
Our goal is to maximize the number of unique detections throughout a blind-search survey.
We evaluate the reward potential of a target using the completeness metric and the cost as the integration time required to achieve that completeness. In our optimization formulation, we seek to maximize the summed completeness that fits within the rigid mission time constraints coupled with an additional time cost per observation. Maximizing summed completeness with fixed overhead per observation in a time constrained mission is the full nonlinear optimization problem we aim to solve in this section. Our optimization process is broken down into three major steps: The first is the calculation of an initial feasible solution via a BIP.
The second is a scalar minimization problem solving the target subset and collective derivative using Brent's method wrapped around a BIP (similar to Ref. 9).
The third step uses the output from the first or second step depending on which produces a higher summed completeness as an initial solution to optimize the solution by adding, removing, and finely tuning integration times via SLSQP.
Both step 1 and step 2 enable the solving of the full nonlinear optimization problem in step 3.
Step 3 can be seeded with any "good" initial guess (an initial solution sufficiently close to an optimal solution). A "good" initial guess for SLSQP can even be an infeasible solution (i.e., in the case of an unexpected reduction in total observing time, we can seed step 3 with the previous solution to step 3). Optimization step 3 is fundamentally important because it allows individual adjustment of each t i , individual removal of targets, and individual addition of targets (something Ref. 4 cannot do) while conforming to all time-related constraints of the full nonlinear optimization problem.

Mission time constraints
We formulate the summed completeness maximization problem to ensure any nonzero, fractional, observation incurs the observatory overhead and instrument settling time costs of making an observation. Settling time, T settling , is time required by the observatory to start a new observation. This includes waiting out transient vibrations from the slew, time needed to reach thermal equilibrium, and time for the initial generation of the high-contrast region, either by "digging the dark hole" for a coronagraph or by completing the precision alignment required by an external starshade. Overhead time, T OH , on the other hand, is any additional time required by the observatory during the science integration. This includes time reserved for momentum dumping and orbit maintenance (if these operations will interrupt science data collection), dark hole maintenance for coronagraphs, and stationkeeping burns for starshades. The inclusion of T settling and T OH makes it difficult to find an initial feasible solution in most cases, as the overhead time required for observing every target in the target list is typically greater than the total amount of mission time available. In the specific case of the WFIRST CGI explored here, we have over 650 days of overhead time associated with observing the full target list and <100 days of allotted exoplanet observing time. This means that we cannot simply evenly distribute our available time between targets to get an initial state for the optimization, as this would generate a constraintbreaking total required time. In general, initializing gradient-based optimizations on nonlinear and nonconvex search spaces (such as the completeness versus integration time sigmoids) leads to poor optimizer performance and frequently results in no feasible solution being found.

Step 1: the binary integer program
Step 1 is designed to create a guaranteed initial feasible solution to the nonlinear optimization problem and uses reasonable inputs.
Step 1 uses an initial calculation of background count rate (C b0 ) and residual speckle count rate (C sp0 ) using f Z0 , f EZ0 , and WA int . f Z0 is the zodiacal light surface brightness, in arcsec −2 , calculated using E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 7 ; 1 1 6 ; 1 7 9 where the default Z we use in step 1 is a static 23.0 mag arcsec −2 . 3 f EZ0 is the exozodiacal light surface brightness in arcsec −2 calculated using E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 1 8 ; 1 1 6 ; 1 2 2 where the default EZ we use in steps 1 to 3 is 22.0 mag arcsec −2 . 4 WA int is the working angle used for calculating integration times (this sets the specific values of instrument contrast, throughput, and other angular-separation-dependent terms) and is 2 × IWA. We use 0.3 arcsec for all targets in step 1 which is near a maximum balance of core mean intensity and throughput at λ ¼ 565 nm based on Figs. 4(a) and 4(c). The aforementioned parameters are used to calculate an initial c 0 and t 0 . We additionally impose a constraint on the total time spent using T max as the maximum amount of time to spend observing and T OH þ T settling as a fixed overhead time for making any observation. We use the coin-OR branch and cut solver, 13 as provided by Google OR-Tools to solve our BIPs, as described in Algorithm 1. This gives us an initial feasible solution of targets, denoted by x Ã 1 , with integration times, t 0 , and summed completeness, x Ã 1 c 0 .

2.4.3
Step 2: scalar minimization with group dc∕dt In step 2, we reuse C p0 , C b0 , and C sp0 specifying a solution tolerance of 10 −2 on a bounded scalar minimization problem with bounds on ε of [0,7]. We know ε at t ¼ ∞ and t ¼ 0 is 0, but we have determined from software experiments that seven work well as an upper bound in this case. Realistically, this upper bound on ε should be max½dc i ∕dt i ∀ i ∈ I, but only one target could achieve that value and the optimization fails so the upper bound on ε must be reachable by multiple targets. This minimization in step 2 uses the Python implemented scipy "minimize scalar" function, as described in Algorithm 2. Successful execution of this procedure produces Input: I, c 0 , t 0 , T OH , T settling , T max , and an optimization time limit maximum of 5 min Output: x Ã 1 , the list of binary values signaling to keep (1) or remove (0) each target E Q -T A R G E T ; t e m p : i n t r a l i n k -; t 0 0 2 ; 1 5 6 ; 4 3 2 x i ∈ f0;1g; ∀ i ∈ I: Algorithm 2 Bounded scalar minimization wrapping binary integer program.
Input: I, C p0 , C b0 , C sp0 , T OH , T settling , T max , and an optimization time limit maximum of 5 min Output: ε Ã , the value of dc∕dt evaluated for each target which maximizes yield Output: t Ã , integration times for each target evaluated at ε Ã Output: x Ã 2 , the list of binary values signaling to keep or remove each target E Q -T A R G E T ; t e m p : i n t r a l i n k -; t 0 0 3 ; 1 4 1 ; 1 9 4 a separate, feasible solution, distinct from the solution arrived at in step 1. By varying the ε and solving the BIP subproblem in Sec. 2.4.2, we achieve a different set of targets x Ã 2 with integration times t 2 .

Step 3: SLSQP minimization
In step 3, we formulate the SLSQP optimization process with an initial solution seeded with x Ã 1 t 0 or x Ã 2 t Ã 2 , whichever produces a larger summed completeness. In practice, we could take any xt as a good initial guess and resolve with new mission time constraints based on new information. This seeded solution should prove sufficiently close to an optimal solution such that the cðtÞ sigmoid-like inflection point is exceeded. We now replace our previous assumed f Z0 with a more optimistic surface brightness. We calculate zodiacal light surface brightness every ≈1∕3 of a day for a year, interpolating the lookup tables from Ref. 37 as shown in Fig. 6. For our optimization, we specifically use the per annum minimum, f Z;min for all targets excluding f Z;i in keep-out regions shown in Fig. 7(a). This is crucial as the f Z;i intensity of targets with 0 deg heliocentric ecliptic latitude, (b), has local zodiacal light minimum within 56 deg of the observatory's antisolar point (see Fig. 6), a region not visible to WFIRST due to solar panel pointing requirements in Table 3 (Algorithm 3).
The output t Ã 3 is an optimal allocation of integration times to each target accounting for the nonlinear overhead time assignment. This optimal allocation encodes our observation priority over all possible targets and means each target in the target list is equivalently important to observe to achieve the expected summed completeness. In the validation section, we choose the minimum zodiacal light intensity target selection metric to complement our optimization assumptions. The drawback is, if variations occur in the total mission time for any reason, high reward targets might not be observed. There are technically infinite "near-optimal" solutions. Equivalent maximum summed completeness target lists can be achieved for a range of different numbers of target stars. This is directly caused by the increasing number of approximately equivalent target stars at further stellar distances. These integration times are used as inputs to our validation process discussed in Sec. 2.5 and implemented in EXOSIMS, which runs a Monte Carlo of full mission survey simulations, observing each target for the integration times prescribed in t Ã 3 and bookkeeping all dynamic aspects of the mission.

Optimization performance
By breaking our optimization problem into three distinct parts, we see some benefits from each part. The BIP guarantees an optimal solution based on the input and has a subsecond solve time.
The scalar minimization problem solves within a few seconds and achieves ≈99% of the maximum completeness we achieve. The SLSQP part on WFIRST seeded with the scalar minimization solution marginally improves the summed completeness and takes a little over a minute. The solution arrived at by SLSQP is almost certainly a local minimum and not a global optimum. In this paper, rows 1, 2, and 3 in Table 2 are how we solved for the optimal observation target list and integration times, using f Z;0 as input to 1 and 2 of the optimization problem. We contrast optimizing with f Z;0 to f Z;min in rows 4, 5, and 6 of Table 2 which causes a marginal decrease in summed completeness for the BIP and scalar minimization problems. Row 7 is optimization seeded with t Ã 3 from row 6 and a slightly longer total observing time which solves in a shorter time than from the scratch optimization process. Rows 8,9,10,11, and 12 are based on a substantially larger telescope which is reflected in the higher overall summed completeness. Solve times are strictly larger on the Big Telescope than for WFIRST, but we see rows 11 and 12, which are seeded with the optimal solution from row 10 have substantially shorter solve times than row 10.

Algorithm 3 SLSQP optimization.
Input: I, f Z , t, T OH , T settling , and T max Output: t Ã 3 , the integration times to spend on each star E Q -T A R G E T ; t e m p : i n t r a l i n k -; t 0 0 4 ; 1 9 6 ; 4 3 5

Validation
We validate the ability to schedule our optimized integration times from Sec. 2.4 and listed in Table 7 of Sec. 8, via a Monte Carlo of full survey simulations using the EXOSIMS code base, detailed in Ref. 16. This software allows us to bookkeep dynamic aspects of the mission while scheduling observations on randomly generated planetary systems about a real set of target stars. Using EXOSIMS, we account for exoplanet, solar system planet, and observatory orbit propagation as well as recalculating look vectors, keep-out regions, and local zodiacal light noise. At the start of each survey simulation, we randomly generate planets around stars; drawn from the Kepler-like or SAG13 planet populations discussed in Sec. 2.1 using sampling methods described in Ref. 16. To define the observable times of individual stars, we combine solar system planet locations with instrument-specific keep-out angles and observatory look vectors for each target in the star catalog throughout the duration of the mission. These solar system planet locations are taken from the Horizons Ephemeris System furnished by NASA's Navigation and Ancillary Information Facility (NAIF) 40,41 based on an assumed mission start modified Julian date (60,634 for WFIRST). The EXOSIMS framework default observatory orbit is a quasiperiodic, stable, halo orbit at the second Earth-Sun Lagrange point with period of ≈180 days based on Ref. 42. We stitch the observatory position along the orbit using a onedimensional (1-D) spline interpolant and propagate the observatory along the interpolant throughout MET assuming a general observatory start location on the Halo orbit when the Earth is at the Earth-Sun equinox (60,575.25 MJD for WFIRST). The starting location of the observatory for a single-visit blind search survey coronagraph mission has negligible impact on yield. We discuss the calculation of keep-out regions in Sec. 2.5.1. In each simulation, we incrementally filter available targets, simulate observations and their outcomes, and propagate orbits as shown in Fig. 8.
At the start of each main loop shown in Fig. 8 (steps 1 to 9), filters remove targets with too long of integration times (t i > 30 day); targets currently in keep-out regions of planetary bodies; previously observed stars not currently scheduled for revisits; and filters targets not observable within the nearest time constraint. In this paper, we do not revisit targets, so we filter out any targets that have been observed during the mission. We then use an intelligent method of . This identifies target star i which we proceed to observe for the precalculated time t i from the method described in Sec. 2.4. We divide the observations into discrete time intervals and calculate the signal and noise at each interval, calculate the total achieved SNR, and propagate planets around the star as well as the observatory and solar system planets. Splitting up observations into time intervals enables us to approximate the achieved SNR via Riemann sum using the new location of the observatory, planets, and simulated exoplanets along with the associated zodiacal light, planet phase angle, and planet position in the instrument working angle. We divide our observations into two equal intervals over the duration of the integration time, t i , which are all strictly <2 days in this case. We then advance the mission time by t i þ T OH þ T settling and check if the mission is over. The EXOSIMS framework relies upon probabilistic planet generation and random draws. However, our Python implementation is capable of not only replicating results but also fully reproducing each survey simulation run by resetting the simulation from the simulation's random seed. EXOSIMS also keeps track of all inputs required to replicate the simulation.   EXOSIMS is additionally capable of strictly adhering to predefined observing blocks, but this functionality was not used for the results presented here.

Keep-out regions
WFIRST has keep-out regions specified in Table 3. We define these keep-out regions as a subset of the sky which cannot be entered by the telescope look vector throughout an observation. Our strict accounting for time and geometry in simulations enables us to ensure any observation's look vector (r i∕SC ) does not start, stop, or pass through a keep-out region. Nominally, each solar system body in Table 3 has a minimum keep-out region the telescope cannot look within. The Sun has a maximum keep-out region of 124 deg the telescope cannot look outside of. This is set by the minimum incidence angle of light on the spacecraft's solar panels to power the observatory and CGI. The bore-axis vector of WFIRST cannot point farther than 124 deg away from the spacecraft Sun vector (r ⊙∕SC ) in order to meet spacecraft power requirements. There is an additional minimum solar panel keep-out angle at 56 deg which was not modeled. Since our implementation in EXOSIMS in this specific case enforces observations at local zodiacal light minimum indicated by the black squares shown in Fig. 6, these observations either occur near the spacecraft antisolar point or edge of the maximum solar panel keep-out region.
We cache a keep-out map to substantially increase simulation speed. This keep-out map spans the duration of the mission and is discretized into small time bins (1 day in this paper) and encodes the visibility of each star in the catalog. The keep-out map encodes the visibility status of targets at each point in time by calculating the angular distance between r i∕SC ∀ i ∈ I and the observatory to planet vectors. If the formed angle is less than the minimum keep-out angle or greater than the maximum keep-out angle for that body, the target star is marked as not visible. In Fig. 7(a), we see the target visibility for 100 stars in the star catalog ordered by right ascension with their visible times (white) and obstructing bodies. The Sun (yellow) is the major contributor to keep-out regions with the Earth (blue) being the second largest contributor. Star occultation by Mars, Venus, or Jupiter keep-out regions is nearly negligible. Figure 7(b) also shows some targets are visible <28% of the time while others are visible all of the time. We can conclude from these plots that increasing WFIRST's maximum solar keep-out angle or decreasing the minimum keep-out angle can increase the amount of time a target is visible by maximally ≈2 day per change in degree.

Local zodiacal light
The local zodiacal light intensity (f Z;i ) is the largest, known, time varying noise source we account for and manifests itself in the background noise count rate introduced in Eq. (12) and explicitly written in Sec. 7. Variations in local zodiacal light can vary the summed completeness by up to 10%. 17,43,44 In Sec. 2.4, for step 1 for our integration time optimization, we use a static Z 0 of 23.0 mag arcsec −2 as done in Ref. 4. However, in step 3, we optimize our final integration times using f Z;min (or equivalently Z max ) which implies specific times of the year when these observations can be made, as shown by the black lines in Fig. 6. The black points in Fig. 6 are curved at low l, and this is directly caused by the large anti-solar keep-out region which marginally inhibits observation at f Z;min . There is a distinction between the f Z;i used in the integration time optimization and the f Z;i used in the Monte Carlo validation. When we evaluate whether a detection has been made at the top of step 7 in Fig. 8, we calculate the planet SNR using f Z;i based on r i∕SC ðt c Þ where t c is the current time in the simulation. We calculate the interpolant in Fig. 6 Here, h and c 0 are the Planck's constant and speed of light in a vacuum, and f λ ½log 10 ðλÞ is a quadratic 1-D interpolant of Ref. 37 data in Fig. 9 of Sec. 7. Since the Table 17 from Ref. 37 is in the geocentric ecliptic coordinates, but the spacecraft will physically be located on an Earth-Sun L2 orbit, we assume the table's b ¼ 0 deg and l ¼ 0 deg is coincident with r ⊙∕SC , and the additional distance of the spacecraft from the Sun has no effect. Further discussion of these two assumptions is included in Sec. 7.
Since the statically optimized integration time assumes a fixed zodiacal light, which coincides with a specific time of year and our actual observation takes into account the local zodiacal light conditions, our target selection method has an impact on the zodiacal light of the target we observe. In previous work, we tested different target selection metrics incorporating completeness, integration time, and deviations from the maximum or minimum zodiacal light intensity. 15,17,42 For a statically optimized target list, we found the strategy observing at minimum zodiacal light to be most effective. However, this assumes we have all the observing time allotted. If we included instrument degradation due to radiation or possibly early mission terminations, other selection metrics which preferentially select higher performing targets may become more attractive.
Using the 2-D interpolation of Fig. 6 combined with instrument specific filters in Sec. 3 and optimization process in Sec. 2.4, we can plot the histogram of minimum (black dots in Fig. 6) and maximum (at edge of solar keep-out) local zodiacal light intensity in Fig. 10. Depending on the time of year observations are made, the variation in zodiacal light intensity changes (excluding when targets are in keep-out) by upward of two magnitudes. We also see using an estimated Z 0 of 23.0 mag arcsec −2 is an overly optimistic approximation of the local zodiacal light noise. Realistically, for our filtered set of target stars, the appropriate μ Z min ≈ 22.79 mag arcsec −2 and μ Z max ≈ 21.59 mag arcsec −2 . Since f Z;min and f Z;max in Fig. 10 include the instrument-specific keep-out regions for WFIRST, increasing the maximum solar keep-out could decrease f Z;min and decreasing the minimum solar keep-out angle would increase f Z;max .

Detection and characterization observations
When making a detection observation of a planet, we do not use the same parameters as in the optimization process. Our algorithm for confirming a detection is also different. In a survey simulation, we define a planet as being detected when the collected SNR k > 5. We calculate this SNR using Nemati's SNR model from Ref. 35 discussed further in Sec. 7. The planet signal is E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 0 ; 1 1 6 ; 3 7 0 S ¼ C p t i ∕N dt ; (20) and the noise is E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 1 ; 1 1 6 ; 3 2 6  We include spectral characterization observations in our survey simulations but not in our optimization process. This is due to both the rarity in finding a strong enough planet signal to make characterization of with WFIRST and the high mission value placed on acquiring spectral characterizations. We do not model characterizations in our optimization process because it adds complexity and we have no good method of weighting new planet detections to spectral characterizations. We attempt to immediately perform a spectral characterization upon successfully detecting any new exoplanet. We calculate integration times for this characterization observation using the measured Δmag k , WA k , f EZ;k , and f Z;i at the current time as well as the characterization mode parameters in Sec. 7. We check if the new integration time plus overhead would exceed the total mission time, would enter into a keep-out region, or would exceed the 30-day integration cutoff filter. In the vast majority of cases, an SNR of 10 is not achievable given what we know about the exoplanet from a single detection. In addition, we make observations at local zodiacal light minimum immediately before a planet enters keep-out or immediately after a planet exits keep-out causing approximately half of the successfully detected planets to enter keep-out immediately following their observation.

Convergence
We determine the required number of simulations to ensure the accuracy of our results by executing 10,000 simulations of a generic input specification similar to that used in Sec. 3. By assuming the true mean yield is equivalent to the mean yield over 10,000 simulations (μ det;10;000 ), we can demonstrate convergence of μ det;i to μ det;10;000 , where i in this section is the number of simulations in the ensemble. We calculate the mean yield from a subset of ensembles incrementally for each i using Eq. (38) from Ref. 47. We then calculate the percent deviation between μ det;i and μ det;10;000 shown in Fig. 11. Figure 11 shows the mean number of unique detections converges for a sufficiently large number of simulations.
By assuming the mean number of detections per run are normally distributed, we show the 1σ, 2σ, and 3σ confidence intervals of the mean in Fig. 11. In reality, the ensembles are some form of gamma distribution because the numbers are all positive and priors are exponentially distributed. This normal distribution assumption fits better for high yield telescopes. The absolute percent errors for varying confidence intervals for 100 and 1000 simulations are presented in Table 4. Absolute percent errors represent the uncertainty in the mean number of unique detections. Excluding Table 4 and Fig. 11, all results in this paper are derived from single simulations or ensembles of 1000 simulations. We use the 3σ confidence interval to make comparisons and determine whether a result is significant. We can say a general mean number of unique detections is accurate to within AE3.19% at 3σ for 1000 simulations. Fig. 11 Absolute percent error from μ det 10;000 convergence combined with 1σ, 2σ, and 3σ confidence intervals.

WFIRST Results
The results in this section are generated using all previously discussed assumptions and parameters as well as the parameters in Table 6 of Sec. 7 based on the cycle 6 description of the CGI. 18

Completeness and Planned Observations
By applying the optimization process from Sec. 2.4 to the target list filtered in Sec. 2.2 and assuming a Kepler-like planet population with per observation T OH ¼ T settling ¼ 0.5 day and maximum observing time of T max ¼ 91.3125 day (3 months), we get the integration times in Table 7 in Sec. 8. The planned summed completeness of this target list is P c i ðt 3;i Þ ¼ 2.35. Since completeness is the probability of detecting planets from a population around a star, multiplying P c i by the population occurrence rate (η KL from Sec. 2.1) gives the expectation value of planets detected, in this case equal to 5.58 detections. We calculate the ultimate completeness 10 for all 651 potential targets by evaluating P c i at an infinite integration time (t ∞ ) to get P c i ðt ∞ Þ ¼ 9.01 and an expectation value of 21.40 exoplanets detected. We note here that the theoretical maximum summed completeness, if there were no Δmag or WA constraints, for all 651 target stars is 651. The summed ultimate completeness of only the targets in Table 7 is P c i ∀ i ∈ I ¼ 2.99 with an expectation value of 7.10 exoplanets detected. The ratio 2.35∕9.01 × 100 ¼ 26.1% is a measure of planned target list yield to the maximum theoretical summed completeness observing all 651 targets with t ∞ . The ratio 2.35∕2.99 × 100 ¼ 78.6% is a measure of the summed completeness of the planned target list to the maximum theoretical summed completeness of only the target stars observed with t ∞ . These ratios represent the fraction of planet phase space about all scheduled targets that could be probed in the assumed, finite, total mission time. We can conclude from these results that, in the short amount of time allocated for WFIRST, our optimized target list is capable of observing 78.6% of the summed ultimate completeness it could possibly gather.
There is a non-negligible difference between the planned completeness, c 3;i , and the completeness actually observed c t obs ;i . The planned observations and Monte Carlo simulations entirely based on the Kepler-like population are shown in Fig. 12. Note the loss of one observation between the planned and actual observations, which we attribute to accumulation of machine precision errors and our strict adherence to the T max upper bound on observing time as well as the allowance of characterizations in the implemented survey. The loss of an observation is evident by the single red square without an associated blue circle in Fig. 12. The P c i actually observed in this particular simulation of the Monte Carlo was 2.33. For each observation made in a survey simulation, observed completeness (blue circles) coincident with planned completeness (red squares) indicates each simulated observation occurs under optimal conditions for a single visit. Because we are observing targets solely at the local zodiacal light minimum and do not modify integration times, all observations made are optimal. If targets were observed at suboptimal zodiacal light levels, the c t obs ;i would be below the c 0;i as shown in Ref. 17. From the combined time varying limit in Eq. (12) applied to Eq. (11), we get the black sigmoid-like lines in Fig. 12, specifically plotting c i ðt i Þ for the top 10, median, and lowest completeness optimized targets in the target list. The median and lowest c i ðt i Þ lines are characteristic for the majority of similar low completeness targets and the addition of targets will typically be below the lowest completeness target in this list. The c i ðt i Þ and completeness side histogram shows a clustering of targets at lower completeness values, which can be attributed to the increased number of targets at larger d i . The upper limit of completeness lines is consistent with the theoretical maximum completeness values.
Demonstrating the importance of including overhead and settling times in observations are the max c i ∕t i diamonds in Fig. 12. These are universally located at some small integration time (t i < 10 −3 day, ∀ i ∈ I) and small completeness meaning any optimized target list maximizing P c i ðt i Þ t i without overhead constraints will have strictly nonzero t i > 0, ∀ i ∈ I. This means optimizing summed completeness without T OH and T settling results in an observation target list of length N (651 in the case of WFIRST), which cannot be executed under realistic conditions. Similarly, observing at Δmag lim would result in suboptimal individual target completeness and also leaves a substantial amount of unobserved phase space around each target.
Calculating completeness using the SAG13 planet population shows the summed ultimate completeness of all 651 targets is 13.538. The summed ultimate completeness of the observed targets is 3.641. The minimum completeness of these targets has increased due to the increased likelihood of larger Δmag planets at larger s in the SAG13 distribution in Fig. 3(b). As in the Kepler-like optimized target list, we see a single target is not being observed due to strict enforcement of observing time constraints. The major difference from Figs. 12 and 13 is the change in shape of the c i ðtÞ lines. The SAG13 c i ðtÞ lines have a lower ultimate completeness, but more targets have this limit. In addition, each of these targets is observed for shorter integration times, which mostly have a higher theoretical maximum completeness and general shift toward higher completeness at lower integration times. There is additionally a larger separation of lower completeness targets in Fig. 13 compared to Fig. 12. In general, completeness calculated using the SAG13 population is larger than completeness calculated using the Kepler-like population. Fig. 12 Completeness as a function of integration times calculated using the Kepler-like distribution, EXOCAT-1 star catalog, Nemati SNR model, 35 and Leinert Table Zodiacal Light. 37 Black lines show completeness versus integration time for the 10 highest planned completeness targets, the median completeness target, and the lowest completeness target. Red squares indicate planned integration time and planned completeness based on ε maximizing summed completeness. Blue dots indicate observation integration time and observation completeness of the simulated observation. White dots represent completeness at Δmag lim and are plotted for the 10 highest, median, and lowest completeness planned targets. Blue diamonds show the completeness and integration time of the maximum c i ∕t i point for the 10 highest, median, and lowest completeness targets. This plot references data generated using C0vsT0andCvsTDATA_WFIRSTcycle6core_CKL2_ PPKL2_2019_10_07_11_58_.txt with specific target stars, integration times, and completeness included in Table 7.

Sky Distribution of Completeness, Integration Times, and Targets
We are able to take the optimized target list included in Table 7 and bin the heliocentric ecliptic coordinates ðl; bÞ of each target, into triangular regions of approximately equivalent size and approximately isotropic distribution on the sky. When we sum integration time for all targets in each bin and normalize by bin area, we get the skymap distribution shown in Fig. 14. Since Fig. 12 shows all integration times are between 0.1 and 2 days, we can conclude the ðl; bÞ ¼ ð−140 deg; 0 degÞ and ðl; bÞ ¼ ð20 deg; 50 degÞ bins have the highest concentrations of observing time. By summing the bins over heliocentric ecliptic latitudes (b), we see a large disparity in P t i versus l, target count versus l, and P c i versus l. Since WFIRST is planned to be on an L2 halo orbit and has a Sun-orbital period of ∼365.25 days, the Leinert local zodiacal light 37 is fixed in this rotating frame, and the time-distribution of stars is uneven, the optimally scheduled mission will have preferential observing times consistent with the distribution in Fig. 14. This result is important for optimally distributing limited CGI time under the constraints of a 5-year mission shared with multiple other instruments. Using this distribution, we can create preferentially distributed observing blocks for the CGI.

Detected Planet Properties
From our ensembles of survey simulations, we can look at how a and R of the detected planets are distributed. The top row of Fig. 1 is the average distribution of R and a for all generated planets in a universe. In each subplot of Fig. 1 Table 5. Subplots of Fig. 13 Completeness as a function of integration times calculated using the SAG13 distribution, EXOCAT-1 star catalog, Nemati SNR model, 35 and Leinert Table Zodiacal Light. 37    Figs. 1(c) and 1(d) use a target list optimized for the Kepler-like planet population to observe universes of Kepler-like and SAG13 simulated planets. Similarly, subplots of Fig. 1(e) and 1(f) use a target list optimized for the SAG13 planet population to observe universes of Kepler-like and SAG13 simulated planets. We can do an analysis on the kinds of planets WFIRST is expected to detect in a blind search survey. Since our universe is randomly generated, we show the distribution of generated R versus a planets for all stars for the Kepler-like and SAG13 planet populations in Fig. 1. The implemented planet generation rate for the universe of the Kepler-like planets is η KL ≈ 2.377, consistent to two decimal places with the planet population model in Sec. 2.1. Here, η is simply calculated by dividing the sum total of planets in the ensemble of universes by the number of simulations in the ensemble (1000 from Sec. 2.5.4) and number of target stars (651 from Sec. 2.2). The η SAG13 ≈ 5.618 for SAG13 is also consistent to within two decimal places with the model in Sec. 2.1. The generated planet populations are consistent with the limits presented in Sec. 2.1. In Sec. 2.1, we showed the smallest observable planet-star separation observable to be 0.292 AU and each of the four detected planets plots shows all planets detected have a > 0.292 AU. A distinctive feature of the planet a generation is the "knee" applied at 10 AU which can be seen by the sharp drop-off in both populations.
There is a marginal difference in the y axis of the contour plots between Kepler-like and SAG13 populations, as SAG13 generates smaller planet radii than Kepler-like, so direct comparisons cannot be made between individual gridspace averages across universes. We also note the SAG13 universe generates an order of magnitude more large R and large a planets as indicated by the 24.29 and 221.19 grids from Kepler-like and SAG13, respectively [Figs. 1(a) and 1(b) gridspace (4,4)]. Comparisons between planet populations used to calculate completeness observing the same planet population universe indicates that optimizing with the Kepler-like population results in marginally smaller planet R detections. Specifically for the Kepler-like universe's largest average detection bin, we see that optimizing with the Kepler-like planet population results in 1.35 detected exoplanets on average and optimizing with the SAG13 planet population results in 1.11 detected exoplanets on average, which is significant, given our convergence results above. For the SAG13 universe's largest averaged detection bin, we see optimizing with the Kepler-like planet population results in the 3.51 detections and optimizing with SAG13 yields 3.55 detections but scaling by the number of detected planets gives 3.54 ¼ 3.51 × 16.266∕16.101 which is nearly identical. From these observations, we can conclude observing a universe of SAG13 planets with integration times optimized using completeness from a Kepler-like planet population result in more detections of small R planets. Each gridspace in the bottom two rows of Fig. 1(d) is greater than or equal to each gridspace in the bottom two rows of Fig. 1(f). We can specifically point to the 0.81 and 0.69 grid spaces that most evidently confirms this observation.
We can draw several conclusions by inspecting the average population of planets detected. Our simulations show WFIRST will not detect planets with a < a mercury . We also see WFIRST is not capable of detecting planets with R < R È . We also note that WFIRST is not sensitive to planets beyond a saturn . The majority of planets detectable by WFIRST are cold-Jovians and Super Earths. Based on Fig. 1, we can conclude that optimizing integration times with the more pessimistic Kepler-like planet population universally biases detections toward smaller planets.

Overfitting
We have chosen to use the Kepler-like and SAG13 planet populations to optimize target lists in this paper, both of which are created based on the known population of exoplanets. Since a motivation for the WFIRST CGI is to observe new exoplanets in an unexplored region of space, we must investigate how yield for a target list of integration times optimized for one planet population changes when observing a universe full of planets based on another planet population.
Optimizing a target list using completeness based on a planet population and observing that same planet population results in the highest yield in given in Table 5. However, observing a universe of Kepler-like planets with a target list optimized for SAG13 planets results in a 5.06% decrease in exoplanets detected, a greater decrease than observing SAG13 planets with a target list optimized using Kepler-like planets (a 1.01% decrease). A possible explanation for this difference is our inclusion of the rare characterization. The characterization part of Table 5 does not indicate this is the case and shows more planets are characterized on average when optimized with the wrong planet population range. A characterization observation is triggered whenever a planet is detected with sufficiently small Δmag and separation such that the immediate reobservation could achieve an SNR > 10 with a newly calculated t i < 30 days and all other detection observation filters in Sec. 2.5. Since the observation would be carried out with a different instrument, the changed parameters are included in Sec. 7.
The conclusion we can draw from this exercise in overfitting is that optimizing with the more pessimistic Kepler-like planet population yields more robust detections and more characterizations if the planet population is actually SAG13. This warrants reinvestigation with varying margins placed on t obs;i .

Conclusions
In this work, we presented our method of optimizing integration times and validating the optimized target list in a Monte Carlo of survey simulations using the EXOSIMS code base. We presented our implementation of a Kepler-like and SAG13-based planet populations and our associated methods for calculating completeness and simulating universes of planets. We presented our generalized target list integration time optimization process accommodating per observation overhead time constraints for a blind search single-visit survey and showed how the inclusion of overhead times is necessary when optimizing surveys. We discussed how EXOSIMS book-keeps all time varying aspects of a survey mission including keep-out regions of the Sun and major planets using ephemeridis, the observatory's position on an L2 halo orbit, local zodiacal light for each target, positions of the target stars, positions of simulated planets around target stars, and our strict enforcement of total observing time. Under WFIRST's constraints, we show the optimal heliocentric ecliptic longitudes for target observation based on local zodiacal light intensity as well as more realistic local zodiacal light average minimum and average maximum magnitudes of 22.79 and 21.59 mag arcsec −2 , respectively. The strictness of WFIRST's keep-out regions causes the visibility of some targets to drop below 28% which severely constrains both single visits and would constrain revisits. Making telescopes less sensitive to solar panel-based and planet-based keep-out regions keep-out regions enable observing at more ideal local zodiacal light conditions, increase the total target visibility, and will prevent aliasing of similar period planets when considering revisits. By making only optimal observations in our Monte Carlo survey simulations, we see the WFIRST optimized target list's preferential observing times are unevenly distributed on the sky and across heliocentric ecliptic longitudes, but these longitudes do not necessarily line up with the completeness longitude distribution. The observed mismatch between the observing time distribution across the sky and the completeness distribution across the sky means observation at specific times of year will be more valuable than others throughout the mission. Most importantly, it means that in the case where a mission is designed with preallocated observing blocks, blocks reserved for exoplanet imaging must take into account both the integration time and completeness distributions over ecliptic longitudes.
We validated our WFIRST optimized target list's summed completeness of 2.35 by performing a Monte Carlo of survey simulations on the Kepler-like planet population, achieving an observed summed completeness of 2.33 using our model of WFIRST. We have also demonstrated the convergence of the mean number of detected exoplanets from Monte Carlo of 1000 simulations to have 3.19% error at 3σ, giving additional confidence in the Monte Carlo results. Our Monte Carlo simulations also indicate that the WFIRST CGI detects 5.48 exoplanets, on average, when observing the Kepler-like planet population and 16.26 exoplanets on average when observing the SAG13 distribution. The vast majority of the WFIRST detectedable planets range in size from Super-Earths to Jovians at semimajor axes from 0.4 to 6 AU. Observing a population of the Kepler-like planets with a target list optimized for a SAG13 population of planets results in a yield decrease in detections of >5% whereas the inverse indicates a yield decrease <1% indicating the Kepler-like planet population optimized target list is less sensitive to variations in the underlying planet populations and should be used to optimize CGI target lists. We also see that optimizing with a more pessimistic planet population universally results in more detection of small radius planets. The simulation inputs and results included in this paper are publicly available via Cornell's eCommons service at: https://doi.org/10.7298/90n2-5j40.

Future Work
There are many improvements to be made in the input assumptions and underlying models used. We can likely improve yield by increasing the number of targets available to use in optimization by filtering stars missing only the parameters we use in our calculations instead of the blanket missing data filter we use currently. WFIRST might not be able to detect small planets, but improving our planetary albedo interpolant for small rocky bodies would be useful for larger and more capable telescopes. The local zodiacal light model used is realistically only suitable for Earth and L2-based telescopes and does not show the high-resolution structure other missions have found. A randomly generated star inclination could also be used for calculating the exozodiacal light at the planet. There are better exozodiacal light models we could use when simulating detection observations as well as better estimates for assuming f EZ in the initial optimization step. We could also use a prediction of the planets angular separation to predict the f EZ;k at the next observation. A newer version of Nemati's SNR model with more parameters could more realistically model WFIRST and other future telescopes. We specifically omit the effects of radiation in Nemati's model in order to ensure we can calculate Δmagðt i Þ. Including this (or instrumentation faults) would incentivize us to select a selection metric with some preference toward higher completeness targets early on in the mission.
WFIRST is limited by not only keep-out regions and but also observing blocks. The solar panel power keep-out region of WFIRST is substantial and, while not significantly impacting single-visit direct detection yield, will greatly impact revisit yield and orbit characterization of planets with orbital period similar to Earth. An analysis on the subspace of the Earth-like planets unobservable caused by the selection of an orbit with an Earth-like period should be conducted. In our implementation, we make observations at local zodiacal light minimums, but some targets have two local zodiacal light minimums. Our first observation of a target should preferentially be made when targets exit keep-out so subsequent detection or characterization observations can be made in that observing block. Our scheduling process does not assume the existence of observing blocks, which would further limit visibility of some targets and could force observations at suboptimal observing times.
Since our survey simulation is allowed to make characterization observations and our optimization does not account for characterizations or revisits, adding some heuristic describing the expected time making orbit characterizations or spectral characterizations can improve overall yield of a survey. WFIRST is not particularly sensitive to this issue because of a relatively short CGI mission time with a lower throughput resulting in a target-starved environment and less characterization candidates. Running SLSQP on larger, more capable observatory designs such as HabEx and LUVOIR result in unplanned characterizations at nearly every star which consumes a large portion of mission time and many planned detections are not made.
The multiplanetary systems simulated around stars are statistically consistent with the Kepler-like or SAG13 planet population occurrence rates, but these planets do not share the same ecliptic plane and are likely in unstable orbits. Simulating a random inclination for each target star and then randomly generating planet deviations from the star's inclination would be more useful and more representative of real star systems. This could enable the estimation of some orbital properties from images of stars with multiple planets. By simulating real planetary systems, it could be possible to determine whether a planet could exist in the habitable zone merely by characterizing the orbit of a Jupiter-like planet. Applying some simple star system stability filters on subsets of planets around stars would enable us to generate more realistic star systems and even help in fitting orbits of multiplanetary systems.
Science reward for various images and discoveries needs to be enumerated, categorized, and weighted to accommodate more than single-visit unique detections. 48 Future survey mission optimization and planning will require orbital characterizations in addition to spectral characterizations and will heavily weight exo-Earths. If the only discovery of scientific value is a specific type exoplanet with full orbital and spectral characterization, then incremental metrics and probabilities need to be created informing a mission optimizer or operator the probability (after subsequent images) that a potential Earth-like exoplanet is indeed an Earth-like exoplanet. The first step in this is determining the probability a detected planet with a s k and Δmag k is of a specific exoplanet type based on the planet binning scheme in Ref. 21. In addition, the reward associated with revisits needs to be separated by planet into some set of orbital parameters, and planet properties as well as new planet detection potential. With these newly formulated metrics, it will be possible to construct a dynamic program which calculates the incremental expected value from new observations at each step of the mission to maximize orbital and spectral characterizations of Earth-like exoplanets.
6 Appendix A

Symbols
Throughout this paper, we use some common subscript conventions and notation discussed here. We use a generic variable X to describe these conventions. The subscript k, such as in X k or X x;k , is referring to a X or X x of an individual planet. The subscript i, such as in X i or X x;i , is referring to a X or X x of a individual target star. The subscript min, such as in X min or X x;min , describes that this parameter is a minimum of X or X x ; this extends to the max subscript. We use bold variables such as X when we are referring to a set or collection of something. We use underlines such as X to identify position vectors andX to identify unit vectors.
We make use of common variables differentiated by their subscript or boldness to differentiate between what they specifically refer to. All ϵ parameters are related to efficiency of something. Lower case a refers to a semimajor axis-related parameter. Lower case p refers to an albedo-related parameter. Lower case c refers to a completeness. Lower case r refers to a direction or position vector. t variables are all time related. s refers to a planet-star separation related value. a refers to a semimajor axis-related value. R refers to a planet radius-related value.

Appendix B
This appendix contains functions and interpolants used in the calculation of Δmag in Eq. (12), including expressions for the zero-magnitude flux, star apparent V magnitude, throughput, intensity transmission of extended background sources, core mean intensity, core area, and quantum efficiency.

Zero-Magnitude Flux
The zero-magnitude flux, F 0 , used in the calculation of spectral flux density, C F 0 , in Eqs. (12), (13), and (19), is given as V mag;i is the V-band apparent magnitude of the i'th target star and is taken from EXOCAT star catalog used in Ref. 32. BV i is the color of the star as measured by the difference between B and V bands (in magnitudes). The parameters for the target stars used in the final target list are included in Table 7 in Sec. 8.

WFIRST Cycle 6 Parameters
This section presents the WFIRST cycle 6 parameters 18 in Table 6 used to calculate completeness and integration times.

Characterization Parameters
In this work, we include characterization observations, which slightly detract from executing the planned observation schedule in Table 7 of Sec. 8. These characterization observations immediately follow the detection of a planet so long as the star r i∕SC is not obstructed by a keep-out region, is not filtered by integration time, and there is sufficient time to make the observation before the target enters a keep-out region. A detected planet is considered for characterization if SNR > 10. These characterization observations use slightly modified set of parameters. Our characterization observation integration time is calculated using SNR ¼ 10, at λ ¼ 660 nm, PPL ¼ 4.0, ϵ syst ¼ 0.46584, N pix ¼ 76, PS ¼ 0.02631, and Rs ¼ 50. Rs, the spectral resolving power specifically changes the Δλ to λ∕Rs.

Electron Count Rates
This section lays out Nemati's SNR equation and all subcomponents from Ref. 35. The SNR equation used in this paper is E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 5 ; 1 1 6 ; 2 1 7 This uses C p;i , C b;i , and C sp;i which are the planet signal, background signal, and residual speckle spatial structure in electron count rates in s −1 , respectively. We calculate C p;i using E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 6 ; 1 1 6 ; 1 3 2 NCTE is the net charge transfer efficiency. We calculate C b;i using E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 7 ; 1 1 6 ; 7 2 3 C sr;i , C z;i , C ez , C dc , C cc , and C rn are electron count rates with units of s −1 for starlight residual, zodiacal light, exozodiacal light, dark current, clock-induced charge, and readout noise, respectively. ENF is an excess noise factor. We calculate the starlight residual count rate, C sr;i , for each target star i as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 8 ; 1 1 6 ; 6 3 7 where N pix is the number of pixels in the photometric aperture [Γðλ; WAÞ∕θ 2 ] calculated as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 2 9 ; 1 1 6 ; 5 8 7 where PPL is the number of pixels per lenslet and is simply the lenslSamp 2 , a parameter specified in Table 6, where PS is the detector pixel scale in arcsec per pixel. Ψðλ; WAÞ is the core mean intensity from Fig. 4 in Sec. 7. ν i is given by Eq. (24) in Sec. 7. We calculate the zodiacal light count rate, C z;i , using E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 0 ; 1 1 6 ; 4 9 2 where γðλ; WAÞ comes from Fig. 4 in Sec. 7. We calculate the exozodiacal light attributed count rate, C ez , as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 1 ; 1 1 6 ; 4 3 2 We calculate the dark current count rate, C dc , as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 2 ; 1 1 6 ; 3 8 4 where i dark is the dark current per pixel. We calculate clock-induced charge count rate, C cc , as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 3 ; 1 1 6 ; 3 2 4 where CIC is the clock-induced charge per pixel and t exp is the exposure time.
We calculate the readout noise count rate, C rn , as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 4 ; 1 1 6 ; 2 6 3 where s read is the readout noise per pixel. We calculate C sp;i using E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 5 ; 1 1 6 ; 2 0 2 where ϵ pp is the post-processing efficiency. Using these photon count rate models in applied to HIP 25278, a fifth magnitude star assuming f Z;0 , f EZ;0 , with a planet at WA ¼ 0.28 arcsec and Δmag ¼ 22.5, at λ ¼ 565 nm, using the instrument parameters in Fig. 4, we get the photon count rates C p ¼ 0.00174175 s −1 , C b ¼ 0.00646741 s −1 , and C sp ¼ 0.00016929 s −1 .

Local Zodiacal Light
Calculation of local zodiacal light is broken down into two major components: an intensity wavelength dependence correction factor, f λ ðλÞ, and intensity at the spacecraft centered look vector, f β ðr i∕SC Þ.
For f β , we know the zodiacal dust cloud has structure, but the degree to which structure and phase/scattering properties contribute to the zodiacal light intensity from a general observer location in space is currently uncertain (although missions have been proposed to model such a dust cloud 51 ). Knowing the degree of contribution determines whether the antisolar point of Table 17 of Ref. 37 should be modeled as fixed relative to the Earth or fixed relative to the observer. To simplify our work, we assume the latter so λ − λ 0 ¼ 0 deg when r i∕SC ¼ r ⊙∕SC and the corresponding antisolar point is when r i∕SC ¼ r SC∕⊙ and λ − λ 0 ¼ 180 deg.
To calculate local zodiacal light, we first find the position of the observatory in the heliocentric ecliptic frame r SC∕⊙ ðt c Þ. We then calculate E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 6 ; 1 1 6 ; 5 6 8 We get the longitude of the Sun relative to the spacecraft in the heliocentric frame l ⊙∕SC ¼ ðl SC∕⊙ þ 180Þ%360. We find the position vector describing the star position in the heliocentric true ecliptic frame r i∕⊙ and calculate the star position with respect to the observatory r i∕SC ¼ r i∕⊙ − r SC∕⊙ ðt c Þ. We then transform r i∕SC into spherical coordinates using Astropy's SkyCoord and extract the target star's latitude (b i∕SC ) and longitude (l i∕SC ) relative to the spacecraft. We then convert to absolute values for interpolation in the latitude and longitude range of  Table 17 in Ref. 37 and by extension f Z ðl; b; λÞ in Eq. (19).
To assess the validity of our spacecraft centered versus geocentric ecliptic frame, we need to assess how much the angular position of zodiacal light intensity interpolant inputs would differ. Since WFIRST is on an L2 Halo orbit, its out of ecliptic motion is <0.004 AU and orbital distance from the Sun is ≈1.010 AU, resulting in a geocentric ecliptic frame interpolation input deviation of Δb < 0.22 deg. When r i∕SC is 180 deg or 0 deg from r ⊙∕SC , the l and b used for interpolation are correct. However, interpolating for a target at say l ¼ 90 deg has the value somewhere between 89 deg < l < 90 deg due to the actual position of the spacecraft at the L2 Halo orbit and not Earth. We expect Δl < 1 deg for a Sun-Earth L2 orbit. We now make note that the smallest griddspacing of the input data is 5 deg meaning Δl and Δb are within these bounds. It is also important to note the accuracy of this zodiacal light model is, at best, 10%. 37 The final component necessary to complete Eq. (19) is f λ ðλÞ, a wavelength correction factor, which has a detailed explanation in Ref. 37.

SAG 13 Occurrence Rate Derivation
At some point, we need to convert occurrence rate models from period (P) space into semimajor axis (a) space. We chose to do this at the probability distribution level so we may arrive at a distribution of Δmag versus s, which will be a function of a. In this section, we convert the joint probability density function of planet occurrence rate in ln P and ln R space from the study analysis group 13 model in Ref. 28 to a and R space. The parametric fit exoplanet occurrence rate model they present is E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 7 ; 1 1 6 ; 1 5 2 Here, P refers to the planet period in years which is defined in the SAG 13 model over the range 10 ≤ P ≤ 640 days. R must be in units of R È and P must be in units of years. The constants γ i , α i , and β i all refer to constants from Ref. 28 included in, where i in this instance refers to whether i ¼ 0 or i ¼ 1 is used. We would like to transform this occurrence rate distribution into linear functions of R and a.
We first need to convert this parametric model fit from log-scaled distributions such as ∂ ln R and ∂ ln P to linear scale distributions of the form ∂R and ∂P. The general expression y ¼ ln x has derivative xdy ¼ dx which we can apply to both of these separable parameters since the γ i , α i , and β i terms are all constant. We arrive at the linear distribution of occurrence rates over R and a as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 8 ; 1 1 6 ; 6 4 0 We now need to convert from an occurrence rate model in terms of P to a. This conversion exists in Ref. 52 as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 3 9 ; 1 1 6 ; 5 5 1 Here, μ ¼ GM s where M s is the mass of the Sun and G is the gravitational constant. M s is close to the average mass of G spectral type stars, the spectral type the SAG 13 occurrence rate model is defined for. We assume the SAG 13 model extends to stars of all spectral types. To replace ∂P we calculate the partial derivative of Eq. (39) with respect to a as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 4 0 ; 1 1 6 ; 4 5 2 Substituting this into the linear occurrence rate model in Eq. (38) gives the linear SAG 13 occurrence rate model in terms of R and a of The Keck Planet Search used radial velocity observations of exoplanets to describe an occurrence rate as a power-law in planet mass and orbital period, but was limited to planets with P < 2000 days (a < 14.4 AU). 23 However, subsequent direct imaging surveys by the Gemini Planet Imager discovered the power-law did not extend throughout the entirety of the instrument's sensitivity range and supports that planet occurrence rates approach 0 between 10 and 100 AU. 25 Reference 26 introduces a period break to describe giant planet occurrence turnover in radial velocity data and presents many possible knee values ranging from ≈7 to ≈15 AU, depending on the model fit being used. We transform their period break into a similar semimajor axis knee value of a knee ¼ 10 AU making the adjusted SAG 13 exoplanet occurrence rate model The total planet occurrence rate (η SAG13 ) over an assumed planet range is given by marginalizing over both parameters of the SAG 13 occurrence rate model with the appended a knee to get a min ∂ 2 ηðR; aÞ ∂R∂a da dR: (43) It is important to note that a cubic form of the roll-off on the semimajor axis distribution is used, rather than the quadratic form from the Kepler-like case [c.f., Eq. (1)]. This is motivated by recent limits placed on wide-separation planets by direct imaging and longer-baseline radial velocity data. 26 We now calculate the joint probability density function of a and R by normalizing based on the integral over the occurrence rate parametric model to arrive at E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 4 4 ; 1 1 6 ; 6 3 9 f R p ;a ðR; aÞ ¼ 1 η SAG13 × 8 > > > < > > > : The integral of this joint probability distribution over the a and R range is 1. Previous work in Ref. 22 that originally derived the equations in this section showed that independently sampling R and sampling a marginalization over R of Eq. (44) could achieve less biased sampling of the exoplanet distribution. We marginalize Eq. (44) over a to arrive at an intermediate constant E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 4 5 ; 1 1 6 ; 4 9 3 We use K i to find the probability density function of semimajor axis conditional on R of We can independently sample the planetary radius distribution and subsequently sample the probability density function of semimajor axis conditional on the sampled R. We demonstrate the similitude and differences between our EXOSIMS implementation of the SAG 13 model from Ref. 28 in Fig. 2. In Fig. 2(a), we directly replicate linear exoplanet occurrence rate model around G spectral type stars from Ref. 28 included in Eq. (38). The purple numbers represent 100× the double integral over their respective bin areas. The largest absolute difference of these numbers from those in Ref. 28 is 0.74 in the range of 2.2 ≤ R < 3.4 and 320 ≤ P < 640. The largest percent difference of these numbers is 13.8% in the range of 11 ≤ R < 17 and 20 ≤ P < 40. We transformed these into percentages which sum to 100% over the whole grid. The occurrence rates and percentages of Fig. 2(b) are calculated by integrating over the transformation of the SAG 13 grid to a space with Eq. (41). The integrated values in each bin are identical to their counterpart in Fig. 2(a). Finally, Fig. 2(c) is the probability density function used in EXOSIMS with the "knee" included. The coloring difference between (c) and (a) or (b) is because (c) is a per bin area density and the bin areas of the top right bin is several orders of magnitude larger than the bottom left bin in both (a) and (b).

Appendix C
Using the SLSQP optimization method presented in Sec. 2.4 on the planet population discussed in Sec. 2.1, we arrived at the planned observation target list in Table 7. Table 7 Planned observation target list optimized using the Kepler-like planet population. sInd refers to the index of the planet in the filtered, initial target list of 651 targets. Name refers to the Hipparcos star catalog name of the target star. V mag refers to the visual magnitude of the target star. d i is the distance from the Sun to the host star. BV is the color of the star as measured by the difference in B and V bands (in magnitudes). t obs;i is the observing time planned by the optimization algorithm. c i ðt obs Þ is the expected completeness reward for making an observation of this star for the prescribed integration time. The "known planet" column in Table 7 was generated by taking all target star names in the optimized target list derived from the EXOCAT-1 star catalog, and cross-referencing them using a list of aliases from SIMBAD and the NASA Exoplanet Archive. In total, 9 of the 60 planned targets already have known exoplanets. The final column contains each target star's spectral type. Data in this