Optical Verification Experiments of Sub-scale Starshades

Starshades are a leading technology to enable the detection and spectroscopic characterization of Earth-like exoplanets. In this paper we report on optical experiments of sub-scale starshades that advance critical starlight suppression technologies in preparation for the next generation of space telescopes. These experiments were conducted at the Princeton starshade testbed, an 80 m long enclosure testing 1/1000th scale starshades at a flight-like Fresnel number. We demonstrate 1e-10 contrast at the starshade's geometric inner working angle across 10% of the visible spectrum, with an average contrast at the inner working angle of 2.0e-10 and contrast floor of 2e-11. In addition to these high contrast demonstrations, we validate diffraction models to better than 35% accuracy through tests of intentionally flawed starshades. Overall, this suite of experiments reveals a deviation from scalar diffraction theory due to light propagating through narrow gaps between the starshade petals. We provide a model that accurately captures this effect at contrast levels below 1e-10. The results of these experiments demonstrate that there are no optical impediments to building a starshade that provides sufficient contrast to detect Earth-like exoplanets. This work also sets an upper limit on the effect of unknowns in the diffraction model used to predict starshade performance and set tolerances on the starshade manufacture.


Introduction
Starshades have the potential to discover and characterize the atmospheres of Earth-like exoplanets in the habitable zone of nearby stars. [1][2][3] Their ability to achieve high contrast while maintaining high optical throughput and broad wavelength coverage makes them the most promising technology to produce the first spectrum of an exo-Earth atmosphere. 4,5 In recent years, significant technological advances have demonstrated the feasibility of building and deploying a starshade, [6][7][8] which has led to a significant increase in the community's interest in a future starshade mission. There is interest in a starshade to rendezvous with NASA's next flagship mission, the Nancy Grace Roman Space Telescope, 2 and a starshade is baselined for the proposed flagship mission, the Habitable Exoplanet Observatory (HabEx). 3
arXiv:2011.04432v1 [astro-ph.IM] 9 Nov 2020
Though significant progress has been made, starshades are still an unproven technology, particularly with respect to their optical performance. The distributed architecture's size (tens of meters in diameter over tens of thousands of kilometers) and sensitivity ($10^{10}$ relative change in intensity) are unprecedented; when constructed, the starshade will be the largest visible-light optic ever made.
Consequently, we need a reliable means to experimentally validate the design tools that rely on an accurate prediction of the optical performance. The state-of-the-art optical models [9][10][11][12]  Program to advance starshade technology to TRL 5 in a time frame compatible with a starshade rendezvous with the Roman mission. 6,8 The work presented here was conducted under S5 to advance the seminal optical technology of starshades: starlight suppression. We group the experiments into two categories, "optical verification" and "model validation", which reflect the two milestones set as the criteria needed to reach TRL 5. 6 Optical verification experiments show we can design an apodization function, which specifies the starshade's shape, that provides sufficient contrast to achieve our stated science goals. These experiments validate the fundamental operation of the starshade and demonstrate that the aforementioned assumptions are valid. Model validation experiments show that the optical models correctly capture the performance sensitivity to perturbations in the starshade shape and set an upper limit to the model uncertainty used in design tools to derive the starshade shape error budget and tolerances for future missions. This result means we need only carry a contrast margin of 1.35× in the design error budget. The results from these experiments build confidence in our ability to successfully design a starshade that will provide the contrast needed to detect Earth-like exoplanets.
In Sec. 2 we describe the layout and individual components of the starshade testbed and outline the experiments performed. Section 3 presents the results of the optical verification experiments, Sec. 4 presents the results of the model validation experiments, and we discuss the implications of these results in Sec. 5. We summarize and conclude in Sec. 6. Additional details and results from the optical verification experiments can be found in the S5 milestone final reports, 21,22 which have been accepted by the Exoplanet Exploration Program Technical Advisory Committee. 23

Experiment Design
The experiments presented here were conducted at the Princeton starshade testbed, a dedicated facility in the Frick chemistry building on the Princeton campus. The testbed was designed to replicate the flight configuration at 1/1000th scale as closely as is possible given the differences in size and environment. The scale of the experiment is ultimately limited by the longest separation available in an indoor facility on campus. We use the Starshade Rendezvous Probe Mission 2 (SRM) and HabEx mission 3 as the reference flight configurations; parameters for the flight and laboratory configurations are presented in Table 1. The table lists the operational range of the Fresnel number, $N$, defined as
$$N = \frac{R^2}{\lambda Z_{\rm eff}}, \qquad (1)$$
where $R$ is the starshade radius, $\lambda$ is the wavelength of light, and $Z_{\rm eff}$ is the effective starshade-telescope separation:
$$Z_{\rm eff} = \frac{Z_{\rm tel}\, Z_{\rm src}}{Z_{\rm tel} + Z_{\rm src}}, \qquad (2)$$
where $Z_{\rm tel}$ is the distance between telescope and starshade and $Z_{\rm src}$ is the distance between starshade and light source. $Z_{\rm eff}$ accounts for the finite distance to the diverging beam light source in the laboratory configuration and is derived in Eq. (4). For the laboratory experiments, $Z_{\rm eff} = 17.7$ m; for the flight configuration, the source (target star) is effectively infinitely far away, making $Z_{\rm eff} = Z_{\rm tel}$. In this work we use the geometric inner working angle (IWA $= R/Z_{\rm tel}$) as the point of reference, though future design studies should follow Ref. 24 and set requirements relative to the effective IWA, which accounts for the width of the telescope's point spread function (PSF). For the range of $N$ under study, the Fresnel approximation of diffraction is sufficiently accurate 12 to compute the diffraction pattern to contrast levels better than $10^{-10}$, an assertion borne out by the successful demonstration of a dark shadow in these experiments. We use the Fresnel-Kirchhoff 25 diffraction equation to describe the diffraction in both the laboratory and flight configurations.
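The scaling argument can be checked numerically. The following is a minimal sketch, assuming the standard definitions $N = R^2/(\lambda Z_{\rm eff})$ and $1/Z_{\rm eff} = 1/Z_{\rm tel} + 1/Z_{\rm src}$, a nominal mask radius of 12.5 mm (half of the roughly 25 mm diameter quoted for the masks), and a hypothetical scaled configuration with 1000 times the radius and 1000² times the separation. The function and variable names are illustrative only.

```python
def effective_separation(z_tel, z_src):
    """Effective separation for a diverging source: 1/Z_eff = 1/Z_tel + 1/Z_src."""
    return z_tel * z_src / (z_tel + z_src)


def fresnel_number(radius, wavelength, z_eff):
    """Fresnel number N = R^2 / (lambda * Z_eff); all lengths in meters."""
    return radius**2 / (wavelength * z_eff)


R_LAB = 12.5e-3   # m; assumed nominal radius of a ~25 mm diameter mask
Z_EFF_LAB = 17.7  # m; effective separation quoted in the text

# Fresnel number at each in-band laser wavelength (keys in nm).
lab_numbers = {
    wl_nm: fresnel_number(R_LAB, wl_nm * 1e-9, Z_EFF_LAB)
    for wl_nm in (638, 641, 660, 699, 725)
}

# Hypothetical scaled-up analogue: 1000x the radius, 1000^2 x the separation.
SCALE = 1000.0
scaled_number = fresnel_number(SCALE * R_LAB, 638e-9, SCALE**2 * Z_EFF_LAB)

# With the source at (effectively) infinity, Z_eff tends to Z_tel.
z_eff_far_source = effective_separation(2.6e7, 1e30)
```

Under these assumptions the Fresnel number at 638 nm is about 13.8, and it is unchanged when the radius is scaled by a factor $s$ and the separation by $s^2$, which is why a 1/1000th scale starshade can be tested over tens of meters.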
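We invoke the standard paraxial and Fresnel approximations and assume circular symmetry (with radial coordinate at the starshade $r$). The electric field incident on the starshade ($U_0$) is due to a spherical wave of amplitude $u_0$ emanating at a distance $Z_{\rm src}$:
$$U_0(r) = \frac{u_0}{Z_{\rm src}} \exp\left(\frac{2\pi i Z_{\rm src}}{\lambda}\right) \exp\left(\frac{i\pi r^2}{\lambda Z_{\rm src}}\right). \qquad (3)$$
We assume the starshade's shape is an approximation of a smooth radial apodization function, $A(r)$, a valid approximation given a sufficient number of petals. 10 The Fresnel-Kirchhoff diffraction integral to compute the on-axis electric field $U$ at the telescope pupil plane (dropping the leading phase factor) is given by
$$U = \frac{2\pi}{i\lambda Z_{\rm tel}} \frac{u_0}{Z_{\rm src}} \int_0^R A(r) \exp\left[\frac{i\pi r^2}{\lambda}\left(\frac{1}{Z_{\rm src}} + \frac{1}{Z_{\rm tel}}\right)\right] r\,dr. \qquad (4)$$
The distance terms in the exponential can be combined into an effective separation parameter, $Z_{\rm eff}$, given by Eq. (2). We set the amplitude of the incident wave to be unity at the starshade, to give $u_0 = Z_{\rm src}$. Equation (4) can be rewritten as
$$U = \frac{\pi}{i} \frac{Z_{\rm eff}}{Z_{\rm tel}} \int_0^N A(n)\, e^{i\pi n}\, dn, \qquad (5)$$
with dimensionless quantity $n = r^2/\lambda Z_{\rm eff}$ spanning the range of Fresnel numbers up to $N$.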
Written in the dimensionless form of Eq. (5) and neglecting the constant amplitude scale factor ≈ 1, the integral is independent of R, λ, and Z eff , and depends solely on the Fresnel number. Altogether, by utilizing an 80 m long facility, we are able to test at a flight-like Fresnel number a starshade that is large enough to be accurately manufactured with existing technology.
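As a sanity check on the dimensionless form, the sketch below evaluates $(\pi/i)\int_0^N A(n)\,e^{i\pi n}\,dn$ numerically, with the constant amplitude scale factor set to 1 as in the text. It assumes $A$ is supplied as a function of $n$, and it uses a clear circular aperture ($A = 1$) only because that case has the closed-form on-axis result $1 - e^{i\pi N}$. The function names and the test value of $N$ are illustrative, not taken from the experiments.

```python
import numpy as np


def on_axis_field(apod, fresnel_number, num_samples=200_001):
    """Trapezoid-rule estimate of (pi / i) * integral_0^N A(n) exp(i pi n) dn."""
    n = np.linspace(0.0, fresnel_number, num_samples)
    integrand = apod(n) * np.exp(1j * np.pi * n)
    dn = n[1] - n[0]
    # Composite trapezoid rule: full weight inside, half weight at the ends.
    integral = dn * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return (np.pi / 1j) * integral


def open_aperture(n):
    """A(n) = 1: a clear circular aperture, used only as a sanity check."""
    return np.ones_like(n)


N_TEST = 13.0  # arbitrary illustrative Fresnel number
numeric = on_axis_field(open_aperture, N_TEST)
analytic = 1.0 - np.exp(1j * np.pi * N_TEST)
```

The numerical and analytic values agree closely, and the result depends on the configuration only through $N$. An optimized apodization function would be passed in place of `open_aperture`.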

Summary of experiments
The experiments presented here can be sorted into two loose categories that track the two S5 milestones 6 that this work is tasked to complete: optical verification 21,22 (presented in Sec. 3) and model validation (presented in Sec. 4).
Table 2 Summary of experiments and production number of starshades tested (see Table 4 for details on specific starshades).

Testbed configuration
The design of the experiment is simple: image a light source from within the deep shadow created by a starshade and measure the efficiency with which the starshade suppresses the on-axis light.
The testbed (shown schematically in Fig. 1) consists of three stations containing a laser, starshade, and camera. The main driver in the testbed design was to maximize the starshade size while maintaining a flight-like Fresnel number, which translates to maximizing the separation. The testbed design is set by the longest, straight-line facility to be found on campus, which gives a total testbed length of 80 m. Since a fraction of the length is needed for propagation of the diverging beam, the effective separation between starshade and telescope is 17.7 m, which sets the starshade size to 25 mm diameter.
The beam line is contained in 1 m diameter steel tubing (not a vacuum) to seal the testbed from stray light and dust and to help stabilize the atmosphere. The tube is wrapped in fiberglass insulation to minimize the effect of external thermal changes. All equipment is built to be remotely operated to minimize how often the testbed is opened, which generates atmospheric turbulence and stirs up dust. Additional details of the testbed design can be found in Refs. 21, 26, 27.

Light source
The light source serving as the artificial star is a multi-channel laser diode operating at 405 nm, 638 nm, 641 nm, 660 nm, 699 nm, and 725 nm. The 405 nm light is outside the starshade's operating bandpass and is used for alignment. The laser is located outside the enclosure and fed in via a polarization-maintaining single-mode fiber optic. The polarization out of the fiber depends on external environmental conditions, which vary with time. The fiber terminates with a collimator and the output Gaussian beam is focused by an objective lens through a pinhole to spatially filter high-order aberrations. Experiments #1, 2, and 6 of Table 2 were done using the resultant polarization out of the fiber. In the rest of the experiments, a linear polarizer is placed between the collimator and objective lens and is fixed horizontally (in the image and lab frames).
As the state of polarization out of the fiber varies with external conditions, the power transmitted through the polarizer (and ultimately incident on the starshade) varies with time. To account for this, a beamsplitter after the polarizer sends a fraction of the light to a photometer, which records the transmitted power during observations and allows for the contrast calibration to be adjusted accordingly. A cartoon diagram of the laser launching system is shown in Fig. 2.
Fig. 2 Light is launched from the fiber, collimated, and passes through a linear polarizer before reaching a beamsplitter. 10% of the light is reflected to a photometer to record the throughput. The other 90% continues to an objective lens which sends a diverging beam to the starshade. A pinhole at the focus spatially filters high-order aberrations.

Starshade masks
The starshades tested are roughly 25 mm in diameter and are etched into a 100 mm silicon wafer.
They are positioned in the middle testbed station and are held by a mask changer (shown in Fig. 3) with a motorized planetary gear that allows us to switch between starshade and calibration masks and to image the mask at different rotation angles.

Design
The starshade mask (shown in Fig. 4) consists of an inner starshade, representative of a free-floating occulter, that is supported in a silicon wafer via radial struts. The outer diameter of the support wafer is also apodized to minimize the diffraction that would occur from the truncation of the beam by the outer diameter. 15 This design results in the starshade mask consisting of $N_p$ (the number of petals) transmission regions bounded by the petals of the inner starshade, the radial struts, and petals of the outer diameter.
Both apodization profiles (inner and outer) are designed independently of each other using the numerical optimization scheme outlined in Ref. 10. Table 3 details the designs of a number of apodization functions; the minimum radius ($R_0$) is the radius at which the inner petals start and the maximum radius ($R$) is the radius at which the struts start (where the tips of a free-floating occulter would be). Design A of Table 3 is an earlier design with smaller gaps and a larger maximum radius.
Design B was specifically designed for these experiments. We impose a constraint on the radius of the inner starshade to have Fresnel number < 15 and constrain the gaps between the starshade petals to have widths > 16 µm to minimize non-scalar diffraction. We found this to be the largest gap width that provides a valid solution to the optimization problem. Design C16 is the same apodization profile as Design B, but is made 3% larger to shift the operating bandpass to cover the laser's available wavelength channels. Design C12 is the same as Design C16, but with 12 petals instead of 16, which was done to minimize the number of inner gaps between petals, which serve as sources of non-scalar diffraction. After a solution to the optimization problem is found, the apodization profile is multiplied by 0.9 to provide width to the radial struts and is then petalized to become the starshade shape shown in Fig. 4. Since the radial struts consist of a constant multiplication applied to the apodization profile, they do not diffract into the shadow.
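The petalization step can be illustrated with a short sketch. It assumes the common construction in which, at each radius, a petal-shaped region subtends an angular fraction equal to the (scaled) apodization value within each of the $N_p$ sectors. The profile `toy_apodization` is a hypothetical smooth taper and is not one of the optimized designs in Table 3. Whether the region returned as `True` is open or opaque depends on the mask convention.

```python
import numpy as np


def toy_apodization(r, r0, r_max):
    """Hypothetical smooth profile: 1 for r <= r0, cosine taper to 0 at r_max."""
    r = np.asarray(r, dtype=float)
    t = np.clip((r - r0) / (r_max - r0), 0.0, 1.0)
    return 0.5 * (1.0 + np.cos(np.pi * t))


def petalized_mask(r, theta, apod, num_petals, strut_factor=0.9):
    """Binary mask: True where (r, theta) lies inside a petal-shaped region."""
    sector = 2.0 * np.pi / num_petals
    # Angular offset from the center line of the sector containing theta.
    offset = np.abs((theta % sector) - 0.5 * sector)
    # The region fills a fraction strut_factor * A(r) of each sector.
    half_width = 0.5 * sector * strut_factor * apod(r)
    return offset < half_width


def profile(r):
    """Toy profile in units where the maximum radius is 1 (illustrative only)."""
    return toy_apodization(r, r0=0.3, r_max=1.0)


NUM_PETALS = 12
theta = np.linspace(0.0, 2.0 * np.pi, 120_000, endpoint=False)
radius = 0.6  # arbitrary test radius

# The azimuthal average of the binary mask recovers 0.9 * A(r).
azimuthal_fraction = petalized_mask(radius, theta, profile, NUM_PETALS).mean()
```

The azimuthal average of the binary mask at any radius equals 0.9 times the apodization value. The remaining 10% of each sector is never included in the petal-shaped region, which is what leaves room for the radial struts.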

Manufacturing
The starshade pattern is etched into the device layer of a silicon-on-insulator (SOI) 100 mm wafer via a deep reactive ion etching process. The allowed tolerances on the shape are very small, ∼100 nm, which is achievable with a direct write electron beam lithography process. The SOI device layer is made as thin as is practical (1 µm to 7 µm) to minimize non-scalar diffraction as light propagates past the optical edge. The 350 µm thick support wafer is etched from the backside to recess it 50 µm from the device layer's optical edge. The final step in the process is to coat the top of the device layer with a thin layer of metal to maintain opacity. Either 0.4 µm of gold or 0.25 µm of aluminum is used, both of which have thicknesses more than 50 times greater than their skin depth, so we expect the metal layer to be completely opaque. Details on the manufactured masks are found in Table 4. We refer the reader to Refs. 28, 29 for more details on the manufacturing process.
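The opacity argument can be checked to order of magnitude. This is a rough sketch, assuming the classical good-conductor skin depth $\delta = \sqrt{2\rho/(\omega\mu_0)}$ and handbook room-temperature DC resistivities for gold and aluminum. Both are assumptions introduced here and not stated in the text. Real metals at optical frequencies depart from this simple formula, so the numbers are indicative only.

```python
import math

MU_0 = 4.0e-7 * math.pi   # vacuum permeability, H/m
C_LIGHT = 2.998e8         # speed of light, m/s


def skin_depth(resistivity, wavelength):
    """Classical good-conductor skin depth sqrt(2 rho / (omega mu_0)), in meters."""
    omega = 2.0 * math.pi * C_LIGHT / wavelength
    return math.sqrt(2.0 * resistivity / (omega * MU_0))


# (assumed DC resistivity in ohm m, coating thickness in m from the text)
COATINGS = {
    "gold": (2.44e-8, 0.40e-6),
    "aluminum": (2.65e-8, 0.25e-6),
}

# Coating thickness expressed in skin depths at 638 nm.
thickness_in_skin_depths = {
    metal: thickness / skin_depth(rho, 638e-9)
    for metal, (rho, thickness) in COATINGS.items()
}
```

With these assumed resistivities, both coatings come out more than 50 skin depths thick at 638 nm, consistent with the statement above.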

Optics + detector
The optics system in the camera station has pupil plane and focal plane imaging modes that are toggled by remotely flipping a lens in/out of the optical path. Contrast measurements are made in the focal plane imaging mode with the camera focused to the plane of the light source, simulating an exoplanet observation. We use an f/100 system with a 5 mm diameter aperture, which provides the same number of resolution elements across the starshade as in the flight design. As will be shown in Sec. 3.2, the contrast improves as light at the geometric IWA rolls off with the telescope's PSF. A telescope that highly resolves the geometric IWA gets an added boost in contrast. As such, to test in a flight-like configuration, we scale the aperture to conserve the number of resolution elements across the geometric IWA:
$$\left.\frac{R/Z_{\rm tel}}{\lambda/D}\right|_{\rm lab} = \left.\frac{R/Z_{\rm tel}}{\lambda/D}\right|_{\rm flight},$$
where $D$ is the aperture diameter. In pupil imaging mode, we observe the out-of-band diffraction pattern incident on the entrance pupil and use the bright spot of Arago to align the camera with the starshade, precisely what is done in the formation flying scheme to maintain starshade alignment. 30,31 To perform calibration measurements, a neutral density filter (optical density > $10^{-7}$) is toggled into the optical train by a motorized stage. A linear polarizer on a motorized rotation stage serves as a polarization analyzer. The detector is an Andor iXon Ultra 888 EMCCD with 13 µm pixels. For low noise performance, the detector is operated with its conventional amplifier, i.e., not electron-multiplying, and is liquid cooled down to −90 °C.
Table 4 Descriptions of manufactured masks including the apodization design (detailed in Table 3), the thickness of the device layer (optical edge), the thickness and type of metal coating, and the perturbations built into the shape. [Column headings: Name, Apodization, Edge Thickness, ...; table body not recovered except one row fragment: Al, 0.25 µm; sine waves & exposed tips.]
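The quoted camera parameters fix how finely the detector samples a resolution element. The sketch below uses only numbers given in the text (f/100, 5 mm aperture, 13 µm pixels) together with the standard relation that one $\lambda/D$ element spans $\lambda \times (f/\#)$ in the focal plane. The function names are illustrative.

```python
F_NUMBER = 100.0        # f/100 system, from the text
APERTURE_D = 5.0e-3     # m, aperture diameter, from the text
PIXEL_PITCH = 13.0e-6   # m, detector pixel size, from the text


def resolution_element_on_detector(wavelength, f_number=F_NUMBER):
    """Linear size of one lambda/D element in the focal plane: lambda * f/#."""
    return wavelength * f_number


def pixels_per_resolution_element(wavelength):
    """Number of detector pixels across one lambda/D element."""
    return resolution_element_on_detector(wavelength) / PIXEL_PITCH


focal_length = F_NUMBER * APERTURE_D  # m
sampling = {
    wl_nm: pixels_per_resolution_element(wl_nm * 1e-9) for wl_nm in (638, 725)
}
```

Under these relations the implied focal length is 0.5 m and each $\lambda/D$ element spans roughly 5 pixels across the in-band wavelengths.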

Calibrations
A circular aperture mask is used to calibrate the throughput of the system and convert measurements of the occulted light source to a contrast value (see Appendix A for a definition of contrast).
The calibration mask is a 50 mm diameter circle etched through a silicon wafer and switches position with the starshade mask via the motorized mask changer. For each set of observations, two sets of images are taken: one with the starshade mask in the beam and one with the calibration mask in the beam. When observing with the calibration mask, a neutral density filter is placed in the optical path. The measured count rates for both sets of images are used in Eq. (11) of Appendix A to calculate contrast. We refer the reader to Ref. 21 for additional details on the calibration process.

Optical Verification Experiments
The first category of experiments is meant to verify the fundamental concept of a starshade by demonstrating that we can design the starshade's shape to provide the optical performance needed to image exoplanets. These experiments validate most of the assumptions (e.g., binary approximation to a smooth function) made in the equations used to design the apodization function that defines the starshade's shape. Verification is achieved by demonstrating better than $10^{-10}$ contrast across a wide bandpass with a starshade in a flight-like optical configuration. The results presented in this section represent the completion of Milestones 1A 21 and 1B 22 of the S5 Project, which satisfy the first of two main requirements in the Starlight Suppression technology development plan. 6

Monochromatic contrast
In this first experiment, we demonstrate the best contrast achieved with the highest quality mask (DW17) at a single wavelength (λ = 638 nm). In the contrast image shown in Fig. 5, the brightest features are two lobes that are aligned with the polarization vector of the incident light (the intrinsic polarization of the fiber is slightly elliptical at a 40° angle) and which remain fixed as the mask is imaged at different rotation angles. The bright lobes are due to polarization-dependent changes in the electric field as light propagates through the narrow gaps between petals. We call this the "thick screen effect" and describe it in detail in Sec. 3.1.1. While these lobes are relatively bright at their peak, they are confined to the inner gaps between petals, and the contrast significantly improves in regions of the image away from the lobes. This means that despite the lobes, $10^{-10}$ contrast is achieved over a significant fraction of the image at the geometric IWA.
This is a key feature of the starshade: any light leaking around the starshade is confined to the edge of the starshade in the image and rolls off with the telescope's PSF at image locations away from the edge.
The primary impact of the lobes is to slightly reduce the image area over which $10^{-10}$ contrast is achieved. The contrast is better than $10^{-10}$ over 44% of a λ/D wide annulus centered at the geometric IWA, and this fraction quickly rises to 100% at 1.05× the IWA. 21
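The annulus statistic quoted above can be sketched as follows. This is an illustrative sketch, assuming the annulus spans the IWA ± half of λ/D in detector pixels and that a calibrated contrast image is available as a 2D array. The synthetic image, pixel scales, and function names are all hypothetical and do not represent the testbed data.

```python
import numpy as np


def annulus_mask(shape, center, r_inner, r_outer):
    """Boolean mask of pixels with r_inner <= distance from center < r_outer."""
    yy, xx = np.indices(shape)
    rr = np.hypot(xx - center[0], yy - center[1])
    return (rr >= r_inner) & (rr < r_outer)


def annulus_stats(contrast, center, iwa_pix, lod_pix, threshold=1e-10):
    """Mean contrast and fraction of pixels below threshold in a
    lambda/D-wide annulus centered on the IWA (radii given in pixels)."""
    mask = annulus_mask(
        contrast.shape, center, iwa_pix - 0.5 * lod_pix, iwa_pix + 0.5 * lod_pix
    )
    values = contrast[mask]
    return values.mean(), np.mean(values < threshold)


# Synthetic image: left half at 5e-11, right half at 2e-10 (hypothetical).
image = np.full((201, 201), 5e-11)
image[:, 101:] = 2e-10

mean_contrast, fraction_below = annulus_stats(
    image, center=(100, 100), iwa_pix=60.0, lod_pix=5.0
)
```

For this synthetic image about half of the annulus lies below the $10^{-10}$ threshold. Applied to a real contrast map, the same bookkeeping would yield the type of fraction reported above, and repeating it at larger radii would trace how quickly the fraction rises.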

A note on the thick screen effect
The bright lobes in Fig. 5 are aligned with the input polarization vector, remain stationary as the starshade rotates, and have a brightness that is an order of magnitude above that predicted by scalar diffraction theory. This was a new discovery that only appeared once high contrast levels were achieved at a flight-like Fresnel number. We have since developed a theory, deemed the thick screen effect, 33 which readily explains their origin.
The Fresnel-Kirchhoff diffraction formula, which is used to derive the apodization function represented by the starshade's shape, 10 makes the assumption that the electromagnetic field can be represented by a single scalar wave function that satisfies the scalar wave equation, 25 a valid assumption for most optical systems with features larger than the wavelength of light. More specifically, F-K diffraction assumes an infinitely thin, perfectly conducting diffraction screen and that the field in the plane of the screen takes Kirchhoff's boundary conditions, where the field is zero on the screen and is unchanged in the aperture.
The starshades in the lab configuration are small enough that these assumptions begin to break down. The gap between two petals is ∼ 20 wavelengths across and the screen (optical edge) is up to half as thick as the gaps are wide, so the gaps resemble waveguides more than thin screens.
As light propagates past the thick edge of the starshade mask (through the waveguide), energy is lost due to the finite conductivity of the walls 34  beyond the scope of this paper, but will be addressed in the future.

Broadband contrast
Testing over a wider wavelength range brings the experiment closer to a flight-like configuration and demonstrates that a starshade can maintain its high contrast performance over a scientifically interesting bandpass. In this experiment we tested mask DW21 at four discrete wavelengths that span a 10% (85 nm) bandpass. The design of DW21 is identical to, but 3% larger than, DW17 in order to shift the starshade's operating bandpass to cover the available laser wavelengths. Figure 6 shows again that better than $10^{-10}$ contrast is achieved at the IWA and that the performance is dominated by the thick screen effect. The contrast is relatively constant across the bandpass, while the peak contrast is higher than in the monochromatic experiment due to DW21 being slightly thicker than DW17, which produces a larger thick screen effect. The morphology of the polarization lobes in the λ = 660 nm data is most likely due to a misalignment between the camera and starshade; Fig. 7 shows a model image where the camera is shifted off-axis by 1 mm and the lobes are distorted to one side of the starshade. Figure 8 shows the contrast averaged over a λ/D wide annulus, where the contrast is slightly worse at longer wavelengths as the PSF broadens and more light from the polarization lobes leaks into the IWA. The average contrast at the IWA across the wavelengths is $2.0 \times 10^{-10}$. This experiment shows the starshade does not suffer any fundamental degradation in performance by operating across a wider bandpass, as is expected by theory.
Fig. 7 (caption, partial) In the model the camera is shifted off-axis by 1 mm, which distorts the polarization lobes, suggesting this can account for the morphology of the lobes seen in the data.

Exposed tips
In order to suspend the fragile silicon starshade in the testbed, the starshade design incorporates radial struts that keep the starshade attached to the outer supporting wafer. This means the end of the petal never terminates at a tip. The diffraction equations show that the critical features of the starshade are the inner valleys between the petals and the outer tips of the petals, i.e., where the petal shape has a large azimuthal component. To verify that critical features such as tips behave in an expected manner, and to make the test article reflect a more flight-like configuration, we tested a mask built with exposed tips. Figure 9 shows the design of mask M12P3 where the inner starshade is slightly rotated relative to the struts and outer apodization function to expose tips at the end of the petals. As the petalized starshade is an approximation to a radial, azimuthally-   Fig. 10. In addition to the exposed tips, M12P3 has sine wave perturbations built into its shape for model validation (see Sec. 4.2.2) and a defect leftover from the manufacturing process, which dominate the contrast in the residual image.
Taking a λ/D wide annulus at the angular separation of the tips, and excluding the portion that lies on the sine wave perturbation (at 6:00 in the image), we find the average residual contrast in the annulus is $7 \times 10^{-11}$. Inspection of Fig. 10 shows there is not significant residual light at the location of the tips and that most of the residual is due to leakage of unmodeled non-scalar diffraction from the inner gaps of the starshade and from the perturbations. From this we conclude that the exposed tips behave as expected (to the $10^{-10}$ contrast level) and the starshade still provides sufficient contrast outside the IWA.
Fig. 10 (caption, partial) [...] and an unintentional manufacturing defect. High contrast is still achieved at the location of the exposed tips (marked by the dashed circle).

Model Validation Experiments
There will always be slight deformations in the starshade's shape due to manufacturing errors, deployment errors, and thermal and mechanical stresses. A contrast error budget sets tolerances on the allowed deformations by balancing aspects of the mechanical design deemed to be most challenging. 37 One purpose of the model validation experiments is to validate the accuracy with which the models used to derive the error budget capture the sensitivity of contrast performance to shape perturbations; by observing starshades with known perturbations built into the shape, we can verify that the contrast changes in the predicted way. The validation accuracy set by the experiments determines the Model Uncertainty Factor (MUF) between contrast and shape in the error budget.
The MUF is a multiplicative term applied to the intensity (contrast) of each term in the error budget that provides a margin for inaccuracy of the model. Reducing the MUF through experimentation allows us to reduce the contrast margin budgeted to model uncertainty and leads to a more efficient design. The shape changes selected for the validation experiments are related to the mechanical architecture of the SRM design 2 and are representative of the shape errors in the error budget. 37 The model we are trying to validate uses scalar diffraction only, which is believed to be sufficiently accurate for the flight-scale starshade. However, since the discovery of non-scalar, thick screen effects in the lab configuration, additional work must be done to include these effects in the model in order to properly demonstrate that the perturbations are accounted for at flight scales.
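The role of the MUF can be made concrete with a toy budget. This is a minimal sketch, assuming the MUF equals one plus the fractional model accuracy (so 35% agreement gives the 1.35× margin mentioned in the Introduction) and that the contrast terms add linearly. The term names and values are hypothetical placeholders, not entries from the actual error budget.

```python
def muf_from_accuracy(fractional_accuracy):
    """Model Uncertainty Factor implied by a fractional model accuracy."""
    return 1.0 + fractional_accuracy


def budget_with_margin(contrast_terms, muf):
    """Apply the MUF multiplicatively to each contrast term and sum them."""
    return sum(muf * contrast for contrast in contrast_terms.values())


# Hypothetical contrast allocations (illustrative values only).
hypothetical_terms = {
    "petal edge shape": 1.0e-11,
    "petal position": 5.0e-12,
    "thermal deformation": 2.0e-12,
}

muf = muf_from_accuracy(0.35)
total_without_margin = sum(hypothetical_terms.values())
total_with_margin = budget_with_margin(hypothetical_terms, muf)
```

A smaller MUF leaves more of the total contrast allocation available for the physical shape errors themselves, which is the sense in which reducing it leads to a more efficient design.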
In this section, we first present results from two experiments testing perturbations that dominate

Perturbed shape experiments
Here we present results from testing two masks with different classes of perturbations: displaced edges and sine waves. Each mask has two perturbations of different sizes built into its design.
The sizes are chosen to produce a signal in the image that is bright enough to overcome contributions from the thick screen effect, but faint enough to be informative to model validation. One perturbation is located on the inner starshade and is made brighter since it is closer to the central polarization lobes; the other defect is located on the outer starshade and is allowed to be fainter.
The perturbed masks have 12 petals to minimize the thick screen effect; fewer gaps between petals mean there are fewer sources of non-scalar diffraction and the lobes are $(12/16)^2 \approx 0.56$ times as bright.
Details of the perturbations are presented in Table 5 and the locations of inner and outer perturbations are shown in Fig. 11. In comparing contrasts between experiment and model, we draw a photometric aperture of radius λ/D around the perturbation and calculate the average contrast in that aperture using Eq. (12) of Appendix A, with error given by Eq. (14) of Appendix A. The expected contrast presented in Table 5 is the average contrast in the photometric aperture, calculated under the assumption of scalar diffraction only.

Displaced edge segments
The displaced edge perturbation simulates the effect of a petal edge segment being displaced from its nominal position during petal assembly on the ground. Figure 12 shows a 3.7 µm tall dis-    where the values for Model and Experiment are the photometric aperture averaged contrast. The error bars in Figure 15 are the experimental uncertainty propagated to the percent difference, and are calculated as where σ Experiment is calculated from Eq. (13) of Appendix A.
The agreement is better than 20% for all perturbations except for the outer perturbation at λ = 699 nm. The difference for the outer perturbation at 699 nm is ∼60%; Fig. 13 shows that it is much dimmer than expected and does not follow the trend of the different wavelengths. This effect is seen in three orientations of the mask (rotated by ±120°) and has been repeated several times.
We currently do not have a good explanation as to why this happens, but we are continuing to refine the characterization and model of the optical edges in hopes of explaining this observation.
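The comparison metric used in this section can be sketched as below. The exact forms of Eqs. (6) and (7) are not reproduced here, so this assumes the percent difference is normalized by the model value and that only the experimental uncertainty is propagated. The input numbers are hypothetical and chosen only to illustrate the arithmetic.

```python
def percent_difference(model, experiment):
    """Assumed form: |Model - Experiment| / Model, expressed in percent."""
    return 100.0 * abs(model - experiment) / model


def percent_difference_uncertainty(model, sigma_experiment):
    """Experimental uncertainty propagated to the percent difference,
    treating the model value as exact (an assumption of this sketch)."""
    return 100.0 * sigma_experiment / model


# Hypothetical aperture-averaged contrasts (not measured values).
model_contrast = 2.0e-10
measured_contrast = 1.7e-10
measured_sigma = 1.0e-11

difference = percent_difference(model_contrast, measured_contrast)
difference_error = percent_difference_uncertainty(model_contrast, measured_sigma)
```

With these placeholder inputs the difference is 15% with a 5% uncertainty. A measurement much dimmer than the model, such as the outer perturbation at 699 nm, would show up as a large positive percent difference under this convention.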

Sine waves
Sinusoidal changes to the edge shape can occur if individual edge segments are misplaced in such a way that their envelope creates a sine wave with respect to the nominal edge position. This is a particularly harmful perturbation if the sine wave is in sync with the Fresnel half-zones and they constructively interfere. This also places a strong wavelength dependence on the contrast they induce. The locations of the sine wave defects are shown in Fig. 11. Figure 16 shows the images taken of this perturbed mask. For M12P3, in addition to the intentional perturbations, a large defect is left over from the manufacturing process and is the brightest source in the image. SEM images show this is a large defect that extends vertically below the wafer's device layer and has a complicated, protruding structure. We estimate the area to be ∼750 to 1500 µm², but projection effects make it difficult to know how much is seen by the camera. In the model, we adjust the area of the defect to 1375 µm² so that it matches the data and contributes the appropriate amount of leakage to the other perturbations.
The sine wave perturbation has a strong response with wavelength, and the inner and outer perturbations occasionally switch which is brightest, behavior that agrees with model predictions. Figure 18 shows the comparison between the experimental and model contrasts, with the percent difference calculated from Eq. (6) and the uncertainty on that difference calculated from Eq. (7).
For this mask, since the inner and outer perturbations are of equal brightness, the inner one generally performs worse as it is closer to and suffers more contamination from the central polarization lobes. Both the inner and outer perturbations agree with the model to better than 35% at all wavelengths. This agreement is not as good as that of the displaced edges mask, which we attribute to the presence of the large manufacturing defect. Due to the complicated, three-dimensional structure of the manufacturing defect, it is difficult to model the interaction between the defect and the sine wave perturbations. We believe this unmodeled interaction is the source of the larger discrepancy.

Polarization study
Our proposition that the bright central lobes are due to the thick screen effect can be validated by  Figure 19 confirms this as we view hori-

Contrast dependence on Fresnel number
The sub-scale experiments presented here are informative for a starshade mission because of the   a scalar-only regime, we build confidence that the scalar model will accurately predict performance in configurations where non-scalar diffraction is less prominent.

Discussion
The starshade enables the detection of exoplanets because it provides high contrast at a small IWA, which means it operates at a moderately low Fresnel number where the diffraction is governed by complicated, near-field equations. The apodization function that provides sufficiently high contrast is found by solving an optimization problem that minimizes the electric field in the Fresnel-Kirchhoff diffraction equation. This radial apodization function is then approximated by a petalized binary occulter that is the starshade's edge. Until now, it had not been shown that we have the tools capable of designing an apodization function that sufficiently suppresses diffraction at low Fresnel numbers and that the petalized approximation is valid. As the Fresnel number is decreased, diffraction becomes harder to control, as the apodization function spans fewer Fresnel zones over which to average out imperfections. As such, it was necessary to demonstrate that high
The observed thick screen effect represents a worst-case scenario in which our assumption of a purely scalar theory of diffraction is no longer valid and the optimized apodization is no longer applied to the appropriate problem. However, even at the small scales of the laboratory configuration, the contribution from non-scalar diffraction is below the target contrast level. Additionally, we expect deviations from scalar diffraction theory to go away at larger scales as the sizes of features get much larger than the wavelength; scaling to a larger starshade for flight should work in our favor.
If our theory of the thick screen is correct, the edges of the full-sized starshade will induce the same wavelength-wide boundary layer near the edge, but the size of the starshade will be 1000× larger, meaning the impact of the non-scalar diffraction will be $10^6\times$ smaller in intensity and will be reduced to a negligible amount (see Appendix B). Additional work may be needed to complete the validation of the non-scalar diffraction model, but the experiments completed thus far have built confidence that we can achieve the same, if not better, contrast at larger scales. Given historic understanding of how light behaves around features with sizes comparable to the wavelength, it was known that non-scalar diffraction could be present, but we previously did not have adequate tools to quantify the effect. The experiments in this work have helped to develop those tools and will allow us to apply them to future configurations.
The model validation experiments aim to determine the accuracy to which the model can predict the contrast sensitivity to known effects. Some of the experimental noise is due to unknown manufacturing errors, misalignment, turbulence, and stray reflections, and is not attributable to the model. Without detailed measurements of these error sources, we can only set an upper limit to the model inaccuracy. The results presented in Sec. 4.2 show the model remains accurate to at least the 35% level, with an average difference of 20%. The model agreement is even better (< 20%) for the displaced edge perturbation data on mask M12P2, as that mask did not have a large manufacturing defect as mask M12P3 did. This leads us to believe that the dominant source of model disagreement is due to uncharacterized perturbations in the manufactured mask, where due to the thick screen effect, the three-dimensional structure of the mask becomes relevant. The large manufacturing defect in M12P3 has a complex vertical structure that is difficult to adequately characterize and thus difficult to include in the model. Global defects, such as a variable overetching, are also difficult to characterize and can contribute to the model disagreement.

Conclusion
We have presented results from a number of experiments demonstrating the best contrast achieved with a starshade at a flight-like Fresnel number. We achieved better than $10^{-10}$ contrast, the level needed to detect Earth-like exoplanets, at the geometric IWA and across a scientifically interesting bandpass. The starshades tested were not perfectly manufactured, their small size introduced non-scalar diffraction, and the experiments were conducted in air. That we were able to achieve such high contrast even in the presence of these factors shows the efficiency and robustness in which
In future work, we will continue model validation experiments on test masks with more perturbations representative of those in flight. These include: displacing a single petal radially; displacing all petals radially; and combining an edge segment displacement with a petal displacement.
We will also continue to refine the thick screen model through tests of masks with different thicknesses and formally apply that model to the flight design to show these effects are negligible at flight scales. Completion of these experiments will advance the starlight suppression technology for starshades to TRL 5 and starshades will be ready for selection for the next exoplanet mission.

Appendix A: Contrast Definition
Contrast is defined as the amount of light within a resolution element of a telescope (at the image plane), divided by the peak brightness of the main light source as measured by that telescope when there is no starshade in place. The following equations define the contrast in terms of quantities measured in the lab; Table 6 provides descriptions of the variables used in the definition. Free space propagation is not possible within the confines of the lab, so we use a circular calibration mask to measure the unocculted brightness and convert to a free space brightness through modeling. In the following definitions, the subscript of a symbol denotes the observation mode, with $m$ denoting measurements made when the starshade mask is in place and $u$ denoting unocculted measurements when the calibration mask is in place.
We define the contrast at pixel i as which is a theoretical construct specifying the reduction in brightness the starshade mask provides, relative to the on-axis unocculted source.
The peak value of the apodization function ($A^2$) in the denominator accounts for the fraction of light that is blocked by the radial struts supporting the inner starshade. The value $\gamma$ is the ratio of the peak of the PSF after propagation through free space to that after propagation through the circular calibration mask, and relates the contrast measured in the lab to that expected from a free-floating starshade. More details on this conversion can be found in Ref. 21. The transfer function is tied to a measurement through the equation where $x \in \{m, u\}$. We drop the superscript on $s_u$ and assume it is on-axis. We assume there is no ND filter during starshade measurements ($\nu_m = 1$, $\nu_u \equiv \nu$) and that the photon energy, camera gain, and camera throughput do not change between observation modes. Substituting Eq. (9) into Eq. (8), we rewrite the contrast as We assume that the values in the right parentheses have the same mean between observation modes, but that their true value during a given observation is distributed normally around the mean with variance $\sigma^2$. In other words, $Q_u = Q_m \equiv Q$, $\sigma^2_{Q_u} = \sigma^2_{Q_m} \equiv \sigma^2_Q$, and similarly for $\varepsilon$. This simplifies the contrast definition to For model validation, we calculate the average contrast over the $n$ pixels that lie in a photometric aperture of radius $\lambda/D$ centered at that pixel as where we have wrapped values independent of pixel into the constant variable $\alpha$. The uncertainty in $\alpha$ is given by The variance in the counts of the unocculted image, $\sigma^2_{s_u}$, is given by Eq. (15). We estimate the values of the rest of the uncertainties of $\alpha$ in Ref. 21 to find $\sigma_\alpha \sim 2.5\%$. Assuming independent measurement errors, the uncertainty in the average contrast is propagated to where $\sigma^2_{s_m^i}$ is the variance of pixel $i$ in the mask image.
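The photometric averaging and error propagation described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the average contrast takes the form $\alpha$ times the mean mask-image counts in the aperture, that $\alpha$ and the per-pixel counts are independent, and that the relative uncertainty on $\alpha$ is the quoted 2.5%. The function name and argument layout are hypothetical.

```python
import math

def photometric_contrast(counts, variances, center, radius_pix, alpha,
                         rel_sigma_alpha=0.025):
    """Return (average contrast, 1-sigma uncertainty) in a circular aperture.

    counts, variances: 2-D lists of mask-image counts and their variances.
    center: (row, col) of the aperture; radius_pix: lambda/D in pixels.
    alpha: pixel-independent calibration constant (assumed known here).
    """
    r0, c0 = center
    selected = [
        (counts[r][c], variances[r][c])
        for r in range(len(counts))
        for c in range(len(counts[r]))
        if (r - r0) ** 2 + (c - c0) ** 2 <= radius_pix ** 2
    ]
    n = len(selected)
    if n == 0:
        raise ValueError("aperture contains no pixels")
    contrast = alpha * sum(s for s, _ in selected) / n
    # Assumed independent errors: relative error on alpha plus pixel variances.
    var_alpha_term = (contrast * rel_sigma_alpha) ** 2
    var_counts_term = (alpha / n) ** 2 * sum(v for _, v in selected)
    return contrast, math.sqrt(var_alpha_term + var_counts_term)
```

With uniform counts the first term dominates once the per-pixel noise is small, which is why the 2.5% calibration uncertainty sets a floor on the relative error of any measured contrast in this sketch.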
The dominant contributions to the uncertainty in counts collected during each exposure ($s_u$, $s_m$) are: photon noise from the source, background light, and detector dark noise and read noise. We can ignore noise from clock-induced charge in the detector electronics, as this is estimated to be $< 3 \times 10^{-3}$ events/pixel. Read noise is estimated from the variance of 2 bias frames of 10 µs exposure time to be $\sigma_R = 3.20\ e^-$/pixel/frame. We combine the number of counts from background light and detector dark noise into the variable $d$, which is estimated from dark exposures taken with an equal exposure time. The uncertainty in the measurement of $s$ counts in a single image $j$ is given by For each observation mode, a number ($n_\mathrm{frames}$) of frames are taken and median-combined into a master image. The variance in $s$ counts obtained from $n_\mathrm{frames}$ frames is given by Additional details on noise sources and calibrations can be found in Ref. 21.
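A generic detector noise budget built from the terms listed above might look like the following sketch. It is a textbook model under stated assumptions, not necessarily the exact form of the paper's equations: counts are in electrons, source and background/dark counts are Poisson distributed, read noise adds in quadrature, frames are independent, and the optional $\pi/2$ factor is the large-sample Gaussian penalty for a median relative to a mean. The function names are hypothetical.

```python
import math

READ_NOISE_E = 3.20  # e-/pixel/frame, the value quoted in the text

def single_frame_variance(s, d, read_noise=READ_NOISE_E):
    """Variance (e-^2) of one pixel in one frame.

    s: source counts, d: background + dark counts (both treated as Poisson).
    """
    return max(s, 0.0) + max(d, 0.0) + read_noise ** 2

def master_variance(s, d, n_frames, read_noise=READ_NOISE_E, median=True):
    """Variance of the combined pixel value from n_frames independent frames."""
    if n_frames < 1:
        raise ValueError("need at least one frame")
    var = single_frame_variance(s, d, read_noise) / n_frames
    # Asymptotic efficiency loss of a median relative to a mean (assumption).
    return var * (math.pi / 2.0) if median else var
```

For example, `master_variance(100.0, 10.0, 25, median=False)` divides the single-frame variance by 25; enabling `median=True` inflates that by about 57%.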

Appendix B: The thick screen effect at flight scales
In this appendix, we estimate how the thick screen effect scales with starshade size and argue the effect is negligible at flight scales. To start, we derive an expression for the change in intensity at the telescope due to the thick screen effect (assuming it induces a local change in amplitude only) and show that the expression is consistent with experimental results. We then apply the expression to the flight-scale starshade and show the effect is negligible.

B.1 Derivation of intensity estimation
We begin with the Fresnel-Kirchhoff diffraction equation and assume an incident plane wave with $z = Z_\mathrm{eff}$. The on-axis electric field $U$ is given by where $A(r)$ is the circularly symmetric apodization function. We characterize the thick screen effect as a change in the apodization function ($\alpha$) relative to the nominal shape ($A_0$) such that $A(r) = A_0(r) + \alpha(r)$. Equation (17) then becomes $U_\mathrm{nominal}$ is the nominal electric field in the scalar diffraction limit and $\Delta U$ is the change in electric field as a result of the thick screen effect. In Sec. 3.1.1 we posited, and the models confirm, 36 that the presence of the thick screen induces a change in the electric field in a narrow ($\sim \lambda$) boundary layer around the edge. For now, we restrict ourselves to the case where the presence of the screen induces a change in amplitude only, resulting in a $\sim \lambda$ wide boundary layer around the starshade edge with zero transmission. Since the width of the boundary layer, which we define as $\delta$, is roughly constant for each edge (we neglect differences between polarization states), the change in the apodization function is related to the boundary layer width by where $N_p$ is the number of starshade petals, $R_0$ is the minimum radius at which the petals start, and we have acknowledged that there is no change in the apodization before the petals start. The thick screen effect can now be written as The integral in Eq. (20) can readily be evaluated in terms of Fresnel integrals. We define the complex Fresnel integral as and make the appropriate substitutions to write the thick screen effect as where $N$ is the Fresnel number at the starshade radius and $N_0$ is the Fresnel number at the start of the petals. The change in (on-axis) intensity due to the thick screen effect is then We note that the leading factor looks like a Fresnel number across the width of the boundary layer.
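For reference, one self-consistent way to write out the chain of steps described in this subsection is sketched below. It is a reconstruction from the surrounding definitions rather than a verbatim copy of Eqs. (17)-(23). It assumes unit incident amplitude, an aperture-type apodization extending to radius $R$, two edges per petal (each losing an arc length $\delta$), the convention $N = R^2/(\lambda z)$, and the Fresnel integral normalized as $F(x) = \int_0^x e^{i\pi t^2/2}\,dt$. Overall signs and phase factors depend on convention.

```latex
\begin{align}
U &= \frac{2\pi}{i\lambda z}\int_0^{R} A(r)\,e^{i\pi r^2/(\lambda z)}\,r\,dr
   = U_\mathrm{nominal} + \Delta U, \\
\alpha(r) &=
  \begin{cases}
    0, & r < R_0 \\[2pt]
    -\dfrac{2 N_p \delta}{2\pi r} = -\dfrac{N_p\,\delta}{\pi r}, & R_0 \le r \le R
  \end{cases} \\
\Delta U &= \frac{2\pi}{i\lambda z}\int_{R_0}^{R} \alpha(r)\,e^{i\pi r^2/(\lambda z)}\,r\,dr
          = -\frac{2 N_p \delta}{i\lambda z}\int_{R_0}^{R} e^{i\pi r^2/(\lambda z)}\,dr, \\
t &= r\sqrt{\frac{2}{\lambda z}}
   \;\Rightarrow\;
   \Delta U = -\frac{2 N_p \delta}{i\sqrt{2\lambda z}}
              \left[F\!\left(\sqrt{2N}\right) - F\!\left(\sqrt{2N_0}\right)\right], \\
|\Delta U|^2 &= \frac{2 N_p^2\,\delta^2}{\lambda z}
   \left|F\!\left(\sqrt{2N}\right) - F\!\left(\sqrt{2N_0}\right)\right|^2 .
\end{align}
```

In this form the leading factor $\delta^2/(\lambda z)$ is the Fresnel number across the boundary layer width, and at fixed $N$ and $N_0$ the intensity scales as $z^{-1}$, which is the property used in Sec. B.3.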

B.2 Comparison to experimental results
We will now use Eq. (23) to estimate the change in intensity for the laboratory configuration and compare to experimental results of mask DW9. Equation (23) is the change in on-axis intensity at the telescope plane, so we will compare our estimates to the suppression values calculated from pupil plane images. Suppression is the total intensity incident on the telescope's aperture when the starshade is occulting the star, relative to the total intensity of the unblocked star.
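The suppression definition above translates directly into a ratio of aperture sums over two pupil-plane images. The sketch below is illustrative only: it assumes both images are already background subtracted and normalized to the same exposure time and filter attenuation, and the function name is hypothetical.

```python
def suppression(occulted, unocculted, center, radius_pix):
    """Ratio of total intensity inside a circular aperture: occulted / unblocked.

    occulted, unocculted: 2-D lists of pupil-plane intensities on the same grid.
    center: (row, col) of the aperture; radius_pix: aperture radius in pixels.
    """
    r0, c0 = center

    def aperture_sum(image):
        # Sum every pixel whose center lies within radius_pix of the aperture center.
        return sum(
            image[r][c]
            for r in range(len(image))
            for c in range(len(image[r]))
            if (r - r0) ** 2 + (c - c0) ** 2 <= radius_pix ** 2
        )

    unblocked_total = aperture_sum(unocculted)
    if unblocked_total <= 0:
        raise ValueError("unocculted image has no flux in the aperture")
    return aperture_sum(occulted) / unblocked_total
```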
At a wavelength of λ = 638 nm, finite-difference time-domain (FDTD) simulations of the mask DW9 geometry yield an average change in amplitude consistent with a boundary layer width of δ = 0.45 µm (averaging over polarization states). We input the parameters from Table 3 into Eq. (23) to estimate the change in intensity to be $|\Delta U|^2 = 1.6 \times 10^{-8}$. Figure 24 shows the suppression plot (image of the pupil plane) for DW9 at λ = 638 nm. These data were taken without any polarizing elements, so they represent a rough average over polarization states. The peak suppression is $1.5 \times 10^{-7}$ and the average over the aperture is $6.3 \times 10^{-8}$, which is greater than, but within an order of magnitude of, that predicted by Eq. (23). These results show our estimate of the thick screen effect is within reason, and we note that Eq. (23) was a lower limit as it assumed an amplitude-only change in field due to the thick screen.

B.3 Scaling up to flight
By examining Eq. (23), we can see why the non-scalar diffraction should be negligible at flight scales. Scaling from the lab configuration to that of flight, the Fresnel numbers are the same and δ will be roughly the same (the optical edges of the flight design are of similar thickness to those in the lab), so the thick screen effect intensity goes as $|\Delta U|^2 \sim z^{-1}$. The effective starshade-telescope separation for flight is $10^6\times$ that in the lab (the lab starshades are 1/1000th scale and the separation scales quadratically with size for a given Fresnel number), so we can expect the non-scalar diffraction to be $10^6\times$ lower in intensity.
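The scaling argument can be checked numerically with a short sketch. It assumes an amplitude-only estimate of the form $|\Delta U|^2 = (2N_p^2\delta^2/\lambda z)\,|F(\sqrt{2N}) - F(\sqrt{2N_0})|^2$ with $F(x) = \int_0^x e^{i\pi t^2/2}\,dt$ and $z = R^2/(\lambda N)$; this form is a reconstruction consistent with the description of Eq. (23), not a quotation of it. The radius, petal count, and Fresnel numbers in the usage lines are hypothetical placeholders, not the Table 3 values.

```python
import cmath
import math

def fresnel_segment(a, b, steps=20000):
    """Simpson's-rule estimate of the integral of exp(i*pi*t^2/2) from a to b."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = cmath.exp(0.5j * math.pi * a * a) + cmath.exp(0.5j * math.pi * b * b)
    for k in range(1, steps):
        t = a + k * h
        total += (4 if k % 2 else 2) * cmath.exp(0.5j * math.pi * t * t)
    return total * h / 3

def thick_screen_intensity(delta, wavelength, radius, fresnel_n, fresnel_n0, n_petals):
    """Assumed amplitude-only estimate of the on-axis intensity change."""
    z = radius ** 2 / (wavelength * fresnel_n)  # separation implied by N
    df = fresnel_segment(math.sqrt(2 * fresnel_n0), math.sqrt(2 * fresnel_n))
    return 2 * n_petals ** 2 * delta ** 2 / (wavelength * z) * abs(df) ** 2

# Hypothetical placeholder geometry: same Fresnel numbers, radius scaled by 1000.
lab = thick_screen_intensity(0.45e-6, 638e-9, 12.5e-3, 13.0, 5.0, 16)
flight = thick_screen_intensity(0.45e-6, 638e-9, 12.5, 13.0, 5.0, 16)
ratio = flight / lab
```

Because the Fresnel-integral factor depends only on $N$ and $N_0$, `ratio` reduces to the inverse ratio of separations, i.e. $10^{-6}$ for a 1000× larger starshade, regardless of the placeholder values chosen.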
Replacing λz in the denominator of Eq. (23) with $R^2/N$ gives $|\Delta U|^2 \sim \delta^2/R^2$, which is the same argument given in Sec. 3
where $U_0$ is the initial field incident on the screen, $A$ represents the aperture function of the screen, and F[·](x,y) is the Fourier transform. Using the Kirchhoff boundary values, $A$ is a binary function that is 0 on the screen and 1 in the aperture. Our method implements non-scalar diffraction into this model by replacing the boundary values of $A$ with a complex field that arises from local diffraction at the edge of the screen. The presence of the screen only affects its immediate surroundings, so we only change the values of $A$ in a narrow (∼10λ wide) seam around the edge of the screen, similar to the method proposed by Braunbek. 35 The field in the seam around the edge is solved for via an FDTD simulation 39
JPL. Interior to the inner red circle is the inner starshade representing a free-floating occulter. The inner starshade is supported in the wafer by radial struts. The outer red circle marks the start of the apodization function of the outer diameter.

5 Contrast map for monochromatic experiment at λ = 638 nm with mask DW17. The starshade pattern is overlaid and the dashed circle marks the geometric IWA.
6 Contrast images of mask DW21 at four discrete wavelengths spanning a 10% bandpass.
7 Experiment (left) and model (right) images of mask DW21 at λ = 660 nm. In the model the camera is shifted off-axis by 1 mm, which distorts the polarization lobes, suggesting this can account for the morphology of the lobes seen in the data.

List of Tables
1 Optical parameters for the laboratory experiments and the SRM and HabEx starshade architectures. The range of Fresnel numbers is set by the wavelength range. † For apodization design C12/C16 in Table 3.
2 Summary of experiments and production number of starshades tested (see Table 4 for details on specific starshades).
Descriptions of manufactured masks including the apodization design (detailed in Table 3), the thickness of the device layer (optical edge), the thickness and type of metal coating, and the perturbations built into the shape.
5 Description of shape perturbations, including their expected contrast (photometric average, see Eq. (12)) at two wavelengths. Each perturbed mask (a: M12P2, b: M12P3) has an inner and outer perturbation of different size.
6 List of variables used in contrast definition.