Starshade Rendezvous: Exoplanet Orbit Constraints from Multi-Epoch Direct Imaging

The addition of an external starshade to the {\it Nancy Grace Roman Space Telescope} will enable the direct imaging of Earth-radius planets orbiting at $\sim$1 AU. Classification of any detected planets as Earth-like requires both spectroscopy to characterize their atmospheres and multi-epoch imaging to trace their orbits. We consider here the ability of the Starshade Rendezvous Probe to constrain the orbits of directly imaged Earth-like planets. The target list for this proposed mission consists of the 16 nearby stars best suited for direct imaging. The field of regard for a starshade mission is constrained by solar exclusion angles, resulting in four observing windows during a two-year mission. We find that for habitable-zone planetary orbits that are detected at least three times during the four viewing opportunities, their semi-major axes are measured with a median precision of 7 mas, or a median fractional precision of 3\%. Habitable-zone planets can be correctly identified as such 96.7\% of the time, with a false positive rate of 2.8\%. If a more conservative criteria is used for habitable-zone classification (95\% probability), the false positive rate drops close to zero, but with only 81\% of the truly Earth-like planets correctly classified as residing in the habitable zone.


INTRODUCTION
The Starshade Rendezvous Probe (SRP) mission concept proposes adding a Starshade to the Nancy Grace Roman Space Telescope enabling the detection of habitable zone exoplanets and characterization of their atmospheres (Seager et al. 2019). Romero-Wolf et al. (2020) described in detail the technical basis for the SRP study report (Seager et al. 2019) along with the simulations used to estimate sensitivity of the observatory. The corresponding software is publicly available for reproduction of the results below and for comparison with similar simulations. 1 . The main result of these studies is that SRP is capable of discovering Earth-size planets in the habitable zones of nearby stars using the relatively moderate aperture of the Roman space telescope (Romero-Wolf et al. 2020) along with the Coronagraph Instrument (CGI).
While the SRP science objectives include quantifying the amount of habitable zone dust around nearby stars and measuring the metallicity of known gas giant planets, the primary driver is the detection and characterization of Earth-like planets. The overall strategy, described in more detail in Romero-Wolf et al. (2020), involves three main steps: 1) initial detection via direct imaging, 2) habitable zone determination via orbit tracing, and 3) atmosphere characterization via spectroscopy. The integration times necessary to image and to take spectra of Earth-like planets (steps 1 and 3) were taken into account with a model of the observatory. However, step 2 is more complicated since the observatory field of regard is constrained by solar exclusion angles, typically limiting the target observing windows to two ∼30 day blocks per year -a total of 4 observing opportunities during the assumed 2-year mission lifetime. Depending on the orientation and phase of a planet's orbit, it may or may not be visible during each of these 4 observing windows. In Romero-Wolf et al. (2020), we assumed that detecting the planet during at least 3 of the 4 epochs would be sufficient to determine if a planet lies within its parent star's habitable zone. In this paper, we consider this step in more detail, performing multi-epoch orbit fitting for the target list and expected signal-to-noise given by the Romero-Wolf et al. (2020) observatory model.
Measurement of planetary orbits has been previously modeled for several types of observations -radial velocity, astrometric wobble, and coronagraphic direct imaging. Examples include: 1) Mawet et al. (2019) combining radial velocity measurements with direct imaging upper limits to improve the orbital fit for the planet eps Eri b, 2) Ford (2006) quantifying the robustness of planetary orbit determination via the stellar astrometric signal, concentrating on the difficulty posed by multi-planet systems, and 3) Guyon et al. (2013) considering simultaneous stellar astrometry and direct imaging, finding that Earth-like planets can be characterized in just a few observations. Among the studies that, like this paper, concentrate on direct imaging, Blunt et al. (2017) performed a detailed analysis of orbital constraints based on the small fraction of an orbit that is traced by known long-period direct-imaged planets. For theoretical cases where the observations span at least half an orbital period, Horning et al. (2019) find that three equally spaced observations with SNR ≥ 10 can measure the semi-major axis and eccentricity to 10%. Guimond & Cowan (2019) also consider direct imaging of shorter period planets, finding that a habitable zone planet's semi-major axis can be measured to within 5% if it is observed with precision 3.5 mas over three epochs each spaced by at least 90 days apart.
Here we consider the results obtainable by a specific mission concept -the Starshade Rendezvous Probe. Unlike previous work, this includes: 1) a realistic signal-to-noise calculation as a function of stellar illumination, rather than an assumed astrometric precision, 2) a starshade-specific inner working angle that obscures planet images close to the star, and 3) observing windows based on known target sky positions and observatory pointing constraints, not just arbitrarily spaced images. Furthermore we focus here not on general orbit fitting results, but on a specific science question -whether or not we can determine if a planet lies in its parent star's habitable zone.
In the following, we first summarize the observatory model from Romero-Wolf et al. (2020) used to calculate the signal-to-noise for each planet image §2. For many sets of simulated observations, we then extract orbital elements for each injected planets §3. We give the resulting precision for the orbital fits in §4 and summarize in §5.

OBSERVING MODEL
We briefly summarize the models used for planet properties, orbit propagation, and the observatory. A detailed presentation of these models can be found in Romero-Wolf et al. (2020).  Leinert et al. (1998) Exozodi dust brightness c 4.5 zodi a For the assumed isotropic scattering, this geometric albedo is equivalent to 0.3 spherical albedo. b The semi-major axis ap is modified by a factor of √ L to account for stellar irradiance. c The unit of 1 zodi is equivalent to 22 mag/arcsec 2 .

Targets
Planet sizes and orbital periods are drawn randomly from these defined ranges for Earth-like planets, based on the distribution defined by SAG-13 (Belikov & et al. 2017) and modified by HabEx to include the dependence of the orbital semi-major axis on the lower limit of planet radii. The orbital period P defines the orbital radius a p by way of the stellar mass M using Kepler's third law. For sampling of Earth-like exoplanets, the orbits are assumed to be circular, consistent with most previous studies, e.g. Stark et al. (2016). However, when fitting, we allow eccentricity to be a free parameter (see §3).
The orbital radii are sampled over a range from inside the inner habitable zone (defined as 0.95 √ L , where L is the stellar luminosity relative to the Sun) to outside the outer habitable zone (defined as 1.67 √ L ; Kasting et al. 1993). The range of planet radii considered is bounded above at r pl ≤ 1.4 r ⊕ , based on evidence that suggests that planets with radius below are predominantly rocky (Rogers 2015). The lower bound on terrestrial planet radii depends on the planet's ability to retain an appreciable atmosphere, which, in turn depends on their stellar illumination. This results in a dependence on the planet's semi-minor axis a p , modified by the stellar luminosity to give r pl ≥ 0.8a We model Earth-like exoplanets assuming the scatter light isotropically using a Lambertian illumination phase function with a geometric albedo of 0.2 (Robinson & Reinhard 2018). We model the star as a blackbody radiator with the parameters provided in ExoCat (Turnbull et al. 2012). We include obscuring dust, both in the target system (exozodiacal dust with a fiducial value of 4.5 zodi (Ertel et al. 2020)) and locally in the Solar System (zodiacal dust using the model of Leinert et al. (1998)). Assumed planet and dust characteristics are summarized in Table 1.
While the use of an occulting starshade does allow for detection of fainter and closer planets, it also puts constraints on the allowed times of observation. The telescope-starshade system has a region of allowed Sun angles over which it can operate with the lower limit defined by the exclusion angle from the baffle of the telescope and the outer limit defined by reflection and scattering of sunlight off the starshade into the telescope baffle (Table 2).

Observatory
The integration times are based on the Starshade/Roman system parameters provided in Table 2. Roman has a telescope diameter of 2.4 m resulting in a point spread function of 65 mas at 750 nm wavelength. We assume observations in the 615 -800 nm band with an end-to-end efficiency, including optical throughput and detector efficiency, of 3.5% in imaging mode. The Starshade has an inner working angle (IWA) of 100 mas, below which planets are assumed to not be observable. The instrument contrast at the IWA and above is assumed to be 4 × 10 −11 , as calculated by Seager et al. (2019). Integration times vary between 1 -6.3 days, depending on the target (explained below).
Besides the sensitivity, determined by the parameters given above, the main constraints to orbit reconstruction are the nominal mission lifetime of 2 years and the solar exclusion angles. The minimum solar exclusion angle of 54 • is determined by the telescope baffle while the maximum solar exclusion angle of 83 • is determined by scattering off the Imaging integration time 1 -6 days edge of the Starshade. For Roman's orbit around the L2 Sun-Earth Lagrange point, the calculated observing windows are shown in Figure 1. During the 2-year lifetime of the mission, there are generally 4 opportunities to observe each target. It is assumed that targets with Earth-like planet candidates, following the decision tree laid out in Romero-Wolf et al. (2020), will be visited once in each of the 4 observing windows available. While spectral characterization can be performed with only a single visit with favorable illumination phase, multiple epochs are needed to constrain the planet's orbit, in particular its semi-major axis, which determines whether the planet is in the habitable zone. To best trace the orbit, the timing of the observations should be as evenly spaced as possible over the orbital period, given the limited observing windows and mission lifetime. The properties of the target stars are shown in Table 3. In Romero-Wolf et al. (2020) we defined the singlevisit completeness as the probability that an Earth-like planet would be detected during one target observation at a random time. The imaging integration time for each target is set by identifying what is required to reach a single-visit completeness of at least 50% with a minimum value of 1 day and a maximum value of 6.3 days (see Table 3). We also defined the orbital completeness as the probability that a randomly selected orbit will be detectable (S/N ≥ 7) during at least 3 of its 4 observing epochs. For most of the targets about half of the planet orbits meets this criterion, but the orbital completeness can sometimes fall below 20% for systems where the planet signal is relatively weak (again, see Table 3). In the next section, we calculate whether 3 detections in 4 visits is sufficient to determine whether a planet lies within its habitable zone.

ORBIT RECONSTRUCTION
Having identified the best targets for detection of Earth-like planets with their observation availability windows, we now describe our approach to orbit reconstruction. We assume the planet is observed at the beginning of each window as shown in Figure 1, which provides 4 observing epochs per target in most cases. In cases where the star has a single long availability window per year, we have set the observing times to the beginning and middle of that window.
For each Monte Carlo sampled planet, we propagate its circular orbit to each of the observing epochs and calculate its illumination phase (see §2). We apply the observatory model to estimate the planet signal to noise ratio (SNR). Observations with SNR ≥ 7 are considered to be detections. Otherwise, the observation is rejected as a non-detection. The one-dimensional astrometric uncertainty is approximated according to δθ = (65 mas)/SN R. The median astrometric precision for each star ranges from 2.3 to 5.9 mas (see Table 3). The simulated data are created by taking the true position of the planet and adding two-dimensional Gaussian scatter based on the astrometric precision, δθ.
For the orbit reconstruction, we implemented a forward modeling of Kepler's laws, as described in Mede & Brandt (2017), into our own software package. We sample all six Keplerian parameters, also including uncertainties in the star's mass and distance, treating them as nuisance parameters. Table 4 lists the parameters that we fit for each orbit, along with their assumed ranges and prior constraints. We use the emcee Markov chain Monte Carlo (MCMC)   (15) Delta Eridani (14) Fomalhaut (13) 1 Ori (12) beta CVn (11) beta Hyi (10) sigma Draconis (9) 82 Eridani (8) delta Pavonis (7) Altair (6) omicron 2 Eridani (5) Sirius A (4) epsilon Indi A (3) Procyon A (2) tau Ceti (1) Figure 1. Target star observing windows (reproduced from Romero-Wolf et al. (2020)) resulting from the telescope and starshade solar exclusion angles. Each target typically has two ∼30-day-long observing windows per year. Targets at high ecliptic latitude can have longer observing windows per year. The black dots mark the four desired observation start times in a two-year period. This is driven by the need to allow for sufficient time to spectrally characterize a planet if it is bright enough. software package (Foreman-Mackey et al. 2013) to fit the orbit. The MCMC fitting procedure calculates the quality of fit for a series of parameter values, not only converging toward the best set of values but also finding the full parameter ranges that are consistent with the data. Periodic orbital elements (ω, Ω, T 0 , and i) are modulated to stay within their prescribed bounds (typical from 0 to 2π).
Three examples of the orbit reconstruction simulations are shown in Figure 2. The first panel shows Procyon, a relatively luminous star (7.1 L ), meaning that its habitable zone is relatively distant both in angular scale (0.7-1.3 ) and in physical space (2.5-4.5 AU). With a mass of ∼1.5 M , planets in the habitable zone have relatively long periods. The randomly selected planet orbiting Procyon is detected in all four observations, but because of the long period, only a fraction of the orbit is traced. The second panel shows a planet orbiting tau Ceti that is only detected in three of the four observing epochs; during the first observation, the planet falls behind the starshade mask (the grey circle in the center of each panel), a common occurrence for planets on inclined orbits. In the third panel (sigma Dra), there are again only 3 successful observations, but in this case the planet is too faint to be detected during the first epoch due to unfavorable illumination phase. Also, because sigma Dra is near the ecliptic north pole, it has only one observing window per year (see Figure 1) and its orbital phase coverage is limited (epoch pairs 1/2 and 3/4 are within the same window). With only three closely-spaced epochs, the fit is relatively poorly constrained. An example of the MCMC posterior distributions is shown in Figure 3. For this inclined orbit, the inclination (i) and longitude of ascending node (Ω) are well determined and are accurately retrieved. The retrieved eccentricity (e) is necessarily larger than the assumed circular orbit, but is still close to zero (0.03±0.02). Given the circular orbit, the true argument of periastron (ω) is undefined and the retrieved value is only loosely constrained. Most importantly, the semi-major axis (a) is well constrained by the observations, enabling us to determine that the planet lies well within the habitable zone.

RESULTS
For each of the 16 target stars listed in Table 3 we simulate 1000 random orbits and then extract orbital parameters as described above. Table 5 lists the number of orbital calculations for the full set of simulations (see Foreman-Mackey et al. (2013) for details on MCMC parameters).
Our first objective is to accurately determine each planet's semi-major axis. The ability to make this measurement depends not only on the astrometric precision for individual observations (2.8 to 5.9 mas; Table 3), but also depends critically on the orbital sampling. If planets do not trace out their full orbit during the two-year observing window, the quality of the fit is reduced. This is particularly true for stars with high luminosity, which translates to a more distant habitable zone and hence longer orbital periods. Sirius, the most luminous star in our sample ( the worst precision in its orbit fitting (72.5 mas), whereas eps Ind, the least luminous star (0.23 L ), has the best determined orbit (3.5 mas). (Table 3 lists the median semi-major axis precision obtained for the other target stars.) Our ultimate objective is to determine whether a planet lies within its parent star's habitable zone. The key metric for this determination is not the absolute precision, but rather the fractional precision on a planet's semi-major axis. While the absolute precision varies between Sirius and eps Ind by a factor of 20, the fractional precision for the two is comparable (2.7% and 2.1%, respectively), since Sirius' habitable zone is a factor of 16 larger than eps Ind's. For the overall sample, the median fractional precision varies from 2.1% up to 7.6% for Fomalhaut. Figure 4 shows semi-major axis measurement precision versus true semi-major axis for each of the 1000 simulated planets around three of our target stars. 82 Eri (left panel) has one of the best precisions (4.4 mas median), although the performance degrades significantly for more distant orbits, where the planets are relatively faint. Fomalhaut (central panel) has one of the worst precisions (50.5 mas median), primarily because the planets in its habitable zone around this A-type star have periods considerably longer than our 2-year mission lifetime (5-12 years), such that only a fraction of each orbit is traced. The effect of limited phase coverage can be seen in Figure 5, which plots semi-major axis precision as a function of orbital period in the center of the habitable zone. Planets with periods less than our 2-year mission lifetime are well constrained, but those farther out have semi-major axis precision increasing roughly linearly with the period.
While the semi-major precisions for other target stars (shown in Figure 8 in the Appendix) follow a similar pattern of smoothly decreasing precision with increasing a p , sigma Dra (right panel in Figure 4) has an unusual bump around a p 160 mas, corresponding to orbital periods of ∼1 year. The decrease in performance is due to the sampling being repeated on 1-year cycles, where observations taken during the second year of the mission have about the same orbital anomaly as those taken in the first year of the mission (i.e. there is a 1-year aliasing). The resulting small range of orbital-anomaly coverage (see the right panel of Figure 2) makes it more difficult to fit the orbit.    Our ability to pin down a planet's semi-major axis depends on the fraction of its orbit that is traced by the observations. Periods less than the mission lifetime (2 years) are well sampled, while those with longer periods are only observed for a partial arc, resulting in lower precision in determining the orbit.
While the 1-year aliases just discussed is most pronounced for sigma Dra, several other systems exhibit a similar effect at orbital periods that match the phasing of the observations. For all of the precision plots (Figures 4 and 8), the semi-major axis corresponding to a 1-year period is indicated by a red hash mark; the other black marks correspond to other time differences between observing epochs (e.g. for 82 Eri, the first and second observations are separated by 40 days, the second and third by 325 days, and the first and fourth by 405 days; see the observing windows in Figure  1). While the 1-year aliasing generally causes the strongest effect, other orbital period/observing period alignments can also degrade performance.
Our MCMC fitting procedure calculates a (non-Gaussian) posterior distribution for each orbital parameter. From these distributions we derive the probability that each planet lies within its star's habitable zone. Figure 6 shows the Figure 6. As in Figure 4, orbital parameters are measured for 1000 random planetary orbits around each of three stars -82 Eri, Fomalhaut, and sigma Dra. The derived probability of each planet residing in its star's habitable zone is is shown as a function of its true semi-major axis. The habitable zone is indicated by the dashed lines. . False positive and false negative rates for 82 Eri, Fomalhaut, and sigma Dra, shown as a function of the habitablezone probability threshold. A more liberal threshold (lower probability) reduces the false negative rate, but increases the false positive rate. A very conservative threshold (to the right of each panel) can ensure that no false detections are made, but misses a significant number of true habitable-zone planets. A performance metric combining the two rates is given by the F1 score, the harmonic mean of the precision (the fraction of detected habitable zone planets that truly are located in the habitable zone) and recall (the fraction of habitable zone planets that are correctly classified as such).
results for the same three target stars as in Figure 4. For 82 Eri (left panel), the orbit fitting is fairly deterministicplanets well inside the habitable zone are correctly identified as such with high probability (>99%), while those well outside are ruled out (probability <1%). As one would expect, there is some ambiguity near the edges of the habitable zone, but for the overall sample there is just a 2.4% chance of a habitable zone planet being falsely classified as falling outside the habitable zone, while 98.8% of the planets classified as residing in the habitable zone are truly habitable zone planets (i.e. a false positive rate of 1.2%). These rates are based on a nominal classification threshold, where planets with habitable zone probability >50% are categorized as habitable zone planets. If a more conservative approach is desired, less planets can be included as habitable zone. If a 95% threshold is used for classification, for example, then 82 Eri will have only 0.2% false detections. However, only 88.9% of the true habitable zone planets will be included (11.1% false negative rate).
For Fomalhaut (middle panel of Figure 6), the worse orbital precision translates to much more scatter in the plotted probabilities and less certainty for determining whether a planet lies in its habitable zone. Still, there is only 4.0% probability of a habitable zone planet being misclassified, and only a 4.9% chance of a habitable-zone-classified planet not being truly in the habitable zone. For the conservative (95%) classification threshold, the false positive rate falls to zero, but at the expense of only 59% of the true habitable zone planets being included (i.e. a false negative rate of 41%).
The false positive and false negative rates for 82 Eri, Fomalhaut, and sigma Dra are shown in Figure 7, as a function of the classification threshold. These plots also display an overall success metric -the F 1 score, the harmonic mean of (1 -false positive rate) and (1 -false negative rate). While a nominal threshold of 50% results in an optimal balance between these two factors (i.e. it give maximum F 1 score), an emphasis on avoiding false detections would warrant a more conservative approach.
The false positive/false negative rates for each star are listed in Table 3 for both the nominal classification threshold (50%) and a conservative classification threshold (95%). For the nominal threshold, the average performance for the overall sample is a 2.8% false positive rate and a 3.3% false negative rate. For the conservative threshold, the average false positive rate is just 0.05%, but the average false negative rate goes up to 19%.

CONCLUSIONS
Based on a model for the Starshade Rendezvous Probe (SRP) mission concept with target-specific observing windows and SNR calculations dependent on the planet illumination during each window, we have quantified the ability of SRP to identify habitable zone planets. We find that detection of a planet in at least 3 out of the 4 observing epochs will adequately measure the planet's semi-major axis. For a 16 star sample observed with this strategy, we find that habitable zone planets are correctly identified as such 96.7% of the time, with 2.8% contamination by false classifications. Including the full range of planet masses, the mission is expected to detect ∼10 planets in the vicinity of the habitable zone (Romero-Wolf et al. 2020), such that a very small number of planets (less than 1) are expected to be misclassified. zeta Tuc Figure 8. For each of our 16 target stars, the orbital parameters are retrieved for 1000 random planet orbits, each of which is directly imaged at least three times. The precision for measuring each planet's semi-major axis is shown here. Figure 9. For each target star, orbital parameters and their uncertainties are measured for 1000 random planetary orbits. The probability of each planet residing in its star's habitable zone is is shown as a function of its true semi-major axis.