Revised astrometric calibration of the Gemini Planet Imager

Abstract. We present a revision to the astrometric calibration of the Gemini Planet Imager (GPI), an instrument designed to achieve the high contrast at small angular separations necessary to image substellar and planetary-mass companions around nearby, young stars. We identified several issues with the GPI data reduction pipeline (DRP) that significantly affected the determination of the angle of north in reduced GPI images. As well as introducing a small error in position angle measurements for targets observed at small zenith distances, this error led to a significant error in the previous astrometric calibration that has affected all subsequent astrometric measurements. We present a detailed description of these issues and how they were corrected. We reduced GPI observations of calibration binaries taken periodically since the instrument was commissioned in 2014 using an updated version of the DRP. These measurements were compared to observations obtained with the NIRC2 instrument on Keck II, an instrument with an excellent astrometric calibration, allowing us to derive an updated plate scale and north offset angle for GPI. This revised astrometric calibration should be used to calibrate all measurements obtained with GPI for the purposes of precision astrometry.


Introduction
The Gemini Planet Imager 1,2 (GPI) is an instrument, currently at the Gemini South telescope, Chile, that was designed to achieve high contrast at small angular separations to resolve planetary-mass companions around nearby, young stars. Many high-contrast imaging observations also require highly precise and accurate astrometry. One of the objectives of the large Gemini Planet Imager Exoplanet Survey 3 (GPIES) was to characterize via relative astrometry the orbits of the brown dwarfs and exoplanets imaged as a part of the campaign. 4 These measurements have been used to investigate the dynamical stability of the multiplanet HR 8799 system, 5 the interactions between substellar companions and circumstellar debris disks, 6,7 and to directly measure the mass of β Pictoris b. 8 Improved astrometric accuracy and precision can reveal systematic discrepancies between instruments that need to be considered when performing orbital fits using astrometric records from multiple instruments. Accurate, precise astrometry can also help with common proper motion confirmation or rejection of detected candidate companions.
Previous work has demonstrated that the location of a faint substellar companion relative to the host star can be measured within a reduced and postprocessed GPI image to a precision of ∼700'th of a pixel. 9 Since GPI's science camera is an integral field spectrograph (IFS)/polarimeter, "pixel" in this context means the spatial pixel sampling set by the IFS lenslet array rather than that of the subsequent Hawaii-2RG detector. Converting these precise measurements of the relative position of the companion from pixels into an on-sky separation and position angle (PA) requires a precise and accurate astrometric calibration of the instrument. Two quantities are needed: the plate scale of the instrument, to convert from pixels in the reconstructed data cubes into arcseconds, and the angle of north on an image that has been derotated to put north up based on the astrometric information within the header. The previous astrometric calibration (a plate scale of 14.166 ± 0.007 mas px⁻¹ and a north offset angle of −0.10 ± 0.13 deg) was based on observations of calibration binaries and multiple systems obtained during the first two years of operations of the instrument. 4,10 In the course of several investigations using GPI that relied upon astrometric measurements, it became apparent over time that systematic biases potentially remained after that calibration, particularly in regard to the north angle correction. This motivated a careful, thorough investigation of GPI astrometry, an effort that eventually grew to include cross checks of the GPI data processing pipeline, the performance of several Gemini observatory systems, and a complete reanalysis of all astrometric calibration targets observed with GPI.
This paper presents the findings of those efforts and the resulting improved knowledge of GPI's astrometric calibration. After introducing some background information regarding GPI and the Gemini architecture (Sec. 2), we describe two issues that we identified and fixed in the data reduction pipeline (DRP) (Sec. 3), a retroactive calibration of clock biases affecting some GPI observations (Sec. 4), and a model to calibrate for small apparent PA changes in some observations at small zenith distances (Sec. 5). With those issues corrected, we revisit the astrometric calibration of GPI based on observations of several calibration binaries and multiple systems (Secs. 6 and 7). Compared to the prior calibration values, we find no significant difference in the plate scale. However, we find that the true north correction differs by +0.36 deg, along with tentative low-significance evidence for small gradual drifts in that correction over time. Finally, we discuss the effect of the revised astrometric calibration on the astrometric measurements of several substellar companions (Sec. 8).

GPI Optical Assemblies
The GPI 1,2 combines three major optical assemblies (Fig. 1). The adaptive optics (AO) system is mounted on a single thick custom optical bench. The Cassegrain focus of the telescope is located within the AO assembly. On that bench, the beam encounters a linear thin-plate atmospheric dispersion corrector, steerable pupil-alignment fold mirror, an off-axis parabolic (OAP) relay to the first deformable mirror, and an OAP relay to the second deformable mirror. After that, the beam is refocused to f∕64. The last optic on the AO bench is a wheel containing microdot-patterned coronagraphic apodizer masks. 11,12 These apodizer masks also include a square grid pattern that induces a regular pattern of diffracted copies of the stellar point spread function (PSF). 13,14 The second optical assembly is an infrared wavefront sensor known as the calibration (CAL) system. 15 It contains the focal plane mask component of the coronagraph (a flat mirror with a central hole) and collimating and steering optics.
The third assembly is the IFS. 16,17 The input collimated beam is refocused onto a grid of lenslets that serve as the image focal plane of the system. After this, the spectrograph optics relay and disperse the lenslet images, but since the beam has been segmented, these can no longer introduce astrometric effects. The lenslet array samples the focal plane and produces a grid of "spots" or micropupils, each of which is an image of the telescope pupil. The only aberrations affecting the image quality of the field are from elements in front of the lenslet array. 17 Each of these three assemblies is independently mounted by three bipods. The bipods are supported by a steel truss structure that attaches to a square front mounting plate. The mounting plate attaches to the Gemini Instrument Support Structure (ISS) with large fixed kinematic pins. The ISS is a rotating cube located just above the Cassegrain focus of the telescope.
In typical Gemini operations, the ISS rotator operates to keep the sky PA fixed on the science focal plane. High-contrast imaging typically instead tries to fix the telescope pupil on the science instrument to allow angular differential imaging 18 (ADI). In GPI's case, this is always done at a single orientation (corresponding to GPI's vertical axis parallel to the telescope vertical axis). In the simplest case, this would involve stopping all rotator motion. However, as discussed in Sec. 5, in some but not all observations, the observatory software instead tries to maintain the absolute (sky) vertical angle (VA) stationary on the science focal plane, which must be accounted for in astrometric observations.

Software Interface and IFS Operation
The software architecture for GPI and the Gemini South telescope is complex, as is typical for a major observatory. Simple operations often require interactions between several different computers. For example, taking an image with the IFS is a process that involves four separate computer systems: the main Gemini environment that runs the observatory's control software, GPI's top level computer (TLC) that is interfaced with each component of the instrument, the IFS "host" computer that acts as an interface between the UNIX-based TLC and the Windows-based detector software, and the IFS "brick" that interfaces directly with the Hawaii-2RG detector. 17 Three of these four computer systems are responsible for populating the flexible image transport system (FITS) 19 image header keywords appended to each image. The Gemini environment handles telescope-specific quantities such as the telescope mount position, the TLC handles keywords associated with other parts of the instrument such as the AO system, and the IFS brick records detector-specific quantities. Each of these computer systems also maintains its own clock, although only the clocks of the Gemini environment and the IFS brick are relevant for the purposes of this study. These clocks are used when appending various timestamps to FITS headers during the process of obtaining an image. In theory, these clocks should all be synchronized periodically with Gemini's Network Time Protocol (NTP) server.
The IFS camera is controlled by the IFS brick, a computer used to interface with the Teledyne JADE2 electronics and Hawaii-2RG detector. This computer is responsible for commanding the camera, calculating count rates for each pixel based on raw up-the-ramp (UTR) reads, 20 sending completed images back to the observatory computers, and providing ancillary metadata including the start and end time of the exposure (UTSTART and UTEND) that are stored in the FITS header. The detector is operated almost exclusively in UTR mode; correlated double sampling (CDS) mode 21 images have been taken in the laboratory, but this mode is not available for a standard observing sequence. The IFS runs at a fixed pixel clocking rate, with a full read or reset of the detector taking 1.45479 s. The IFS software allows for multiple exposures to be coadded together prior to writing a FITS file. This mode has lower operational overheads and greater operational efficiency compared to individual exposures and is therefore frequently used for short exposures (from 1.5 to 10 s per coadd) but not generally used for long exposures (60 s per coadd) due to field rotation.

Improvements in the GPI Data Reduction Pipeline
The GPI DRP 22,23 is an open-source pipeline that performs basic reduction steps on data obtained with GPI's IFS to remove a variety of instrumental systematics and produce science-ready spectrophotometrically and astrometrically calibrated data cubes. The DRP corrects for detector dark current, identifies and corrects bad pixels and cosmic ray events, extracts the microspectra in the two-dimensional (2-D) image to construct a three-dimensional (3-D) (x, y, λ) data cube (or x, y, Stokes in polarimetry mode), and corrects for the small geometric distortion measured in the laboratory during the integration of the instrument. 4 Critically, the DRP calculates the average parallactic angle between the start and end of an exposure, an angle that is used to rotate the reduced data cubes so that the vector toward celestial north is almost aligned with the columns of the image. We have identified two issues with the calculation of the average parallactic angle that affect a subset of GPI measurements and have corrected them in the latest version of the pipeline. These issues are most pronounced for observations taken at a very small zenith distance, where the parallactic angle is changing very rapidly. An example dataset showing the combined effect of these issues, and those described in Secs. 4 and 5, on observations of a calibration binary is shown in Fig. 2.
Fig. 2 Each image has been rotated such that north is up based on the value of AVPARANG in the header of the reduced image (white compass). We use the prime symbol to denote the fact that the old reduction does not correctly rotate north up. The original detector coordinate axes are also shown (yellow compass). Note the flip of the x axis due to the odd number of reflections within the instrument. A significant change in the sky PA of the companion is seen between the two images in (a), (c), due to a combination of the errors described in Sec. 3. The PA of the companion is stable after the revisions to the pipeline.

Calculation of Average Parallactic Angle from Precise Exposure Start and End Times
Calculating the time-averaged parallactic angle during the course of an exposure requires accurate and precise knowledge of the exact start and end times of that exposure. We found that the GPI DRP was not originally using a sufficiently precise value for the start time in the case of an exposure with more than one coadd. Doing this correctly requires an understanding of the low-level details of the UTR readout of the Hawaii-2RG detector and the surrounding GPI and Gemini software. The header of a raw GPI FITS file contains four timestamps saved at various times during the acquisition of an image with the IFS: UT, MJD-OBS, UTSTART, and UTEND. The keywords UT and MJD-OBS contain the time at the moment the header keyword values were queried by the Gemini master process prior to the start of the exposure. UT is reported in the coordinated universal time (UTC) scale, whereas MJD-OBS is reported in the terrestrial time scale, a scale linked to International Atomic Time that is running ∼65 s ahead of UTC. Because these keywords are written during exposure setup by a different computer system, neither is a highly precise measure of the exact start time of the exposure. The other keywords (UTSTART and UTEND) are generated by the IFS brick upon receipt of the command to execute an exposure and after the final read of the last coadd has completed, respectively. These two timestamps are reported in the UTC scale. Because they are written by the same computer that directly controls the readout, these are more accurate values for exposure timing. UTSTART is written when the IFS software receives the command to start an exposure, but since the Hawaii-2RG will be in continuous reset mode between exposures, it must wait some fraction of a read time to complete the current reset before the requested exposure can begin. Thus, the true exposure start time will be some unknown fraction of a read time after UTSTART.
The final keyword UTEND is written with negligible delay immediately at the moment the last read of the last pixel is concluded. A schematic diagram of the read and resets of the Hawaii-2RG is shown for two example exposures in Fig. 3.
The pipeline was, therefore, written under the assumption that the UTEND keyword provides the most accurate way to determine the true start and end time of each exposure, which, in turn, is used to calculate the average parallactic angle during the exposure. The effective end time of the exposure can be calculated as occurring half a read time prior to UTEND, i.e., the time at which half of the detector pixels have been read. The effective start time of the exposure, i.e., when half of the detector pixels have been read for the first time, can be calculated by working backward from UTEND toward UTSTART. We do so based on the read time ($t_{\rm read}$), the number of reads per coadd ($n_{\rm read}$, where $n_{\rm read} - 1$ multiplied by $t_{\rm read}$ yields the integration time per coadd), and the number of coadds ($n_{\rm coadd}$). The pipeline writes two additional keywords to the science extension of the reduced FITS file that store the effective start (EXPSTART) and end (EXPEND) times of the exposure calculated using UTEND, $t_{\rm read}$, $n_{\rm read}$, and $n_{\rm coadd}$. EXPSTART and EXPEND are then used to calculate the average parallactic angle over the course of the exposure, which is written as keyword AVPARANG.
Inadvertently, versions 1.4 and prior of the GPI pipeline contained an error in this calculation by not correctly accounting for the number of coadds. The total exposure time including overheads was calculated as $t_{\rm exp} = t_{\rm read} \times (n_{\rm read} - 3/2)$, where $n_{\rm read}$ is the number of reads per coadd. Instead, the exposure time is more correctly calculated as in Eq. (1), where the additional terms account for the extra resets that occur between each coadd. The effect of this error was negligible for single-coadd exposures, the most common type of exposures taken with GPI; 89% of on-sky observations were taken with a single coadd. For images with multiple coadds, the effect can be very significant; the error on the estimated time elapsed during the complete observation, Δt, is given by Eq. (2). To demonstrate how large this error can get for exposures with multiple coadds, an exposure with an integration time of 1.45 s with 10 coadds has a Δt of 40 s, an error equivalent to 98% of the actual time spent exposing (see Fig. 4). A large Δt can cause a significant and systematic error in the parallactic angle used to rotate the reduced data cubes north up, as the EXPSTART and EXPSTOP header keywords are converted into the hour angle at the start and end of the exposure, from which the parallactic angle is calculated. This is most pronounced for targets observed at a small zenith distance where the parallactic angle is changing most rapidly.
Fig. 4 Combinations with more than 100 images are shown as red circles (size scaled by the number), whereas combinations with fewer than 100 are shown as small gray circles. The vast majority of GPI exposures are taken with a single coadd, but for some frames with multiple coadds Δt exceeded 120 s.
This error not only affects the astrometry of substellar companions, but also the measurement of binaries observed with other instruments that were used to calibrate GPI's true north offset angle.
After this inaccuracy was discovered, the GPI pipeline was updated to perform the correct calculation, as of version 1.5.
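To give a feel for the size of this timing error, the following minimal sketch compares the single-coadd expression used by pipeline versions 1.4 and earlier with one possible timing model. The model is an assumption made here for illustration and is not taken from the pipeline: the elapsed time is measured from the midpoint of the first read to the midpoint of the last read, and exactly one reset frame is assumed to separate consecutive coadds. The function names are hypothetical. Under these assumptions, the quoted example (an integration time of one read time, 10 coadds) gives a discrepancy of about 40 s, roughly 98% of the elapsed time.

```python
T_READ = 1.45479  # seconds per full read or reset of the detector (from the text)


def elapsed_old(n_read, t_read=T_READ):
    """Elapsed time as computed by pipeline versions <= 1.4 (number of coadds ignored)."""
    return t_read * (n_read - 1.5)


def elapsed_assumed(n_read, n_coadd, t_read=T_READ):
    """Elapsed time from mid-first-read to mid-last-read.

    ASSUMPTION (not from the paper): each coadd consists of n_read reads and
    consecutive coadds are separated by exactly one reset frame.
    """
    n_frames = n_coadd * n_read + (n_coadd - 1)  # reads plus inter-coadd resets
    return t_read * (n_frames - 1)               # midpoint of first to midpoint of last


if __name__ == "__main__":
    n_read, n_coadd = 2, 10  # integration time per coadd = (n_read - 1) * t_read = 1.45 s
    true_elapsed = elapsed_assumed(n_read, n_coadd)
    delta_t = true_elapsed - elapsed_old(n_read)
    print(f"elapsed = {true_elapsed:.2f} s, error = {delta_t:.2f} s "
          f"({100.0 * delta_t / true_elapsed:.0f}% of elapsed time)")
```

For a single coadd, the two expressions differ by only half a read time under this model, consistent with the statement that the effect is negligible for single-coadd exposures.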

Average Parallactic Angle During Transits
A second issue affecting a small number of observations is related to time-averaging during exposures that span transit.
The pipeline computes the average parallactic angle between the start and end of an exposure via Romberg's method. For northern targets that transit during an exposure, the function contains a discontinuity at an hour angle (H) of H = 0 rad, where the parallactic angle jumps from −π to +π. This discontinuity can easily be avoided by performing the integration between $H = H_0$ and H = 0 rad, and between H = 0 rad and $H = H_1$, where $H_0$ and $H_1$ are the hour angles at the start and end of the exposure. Versions 1.4 and earlier of the pipeline contained an error in how this calculation was performed. As an example, the average parallactic angle $p_{\rm avg}$ for an exposure with $|H_0| < H_1$ was calculated as in Eq. (3) rather than as in Eq. (4). This error only affects sequences where the target star transited the meridian between the start and end of an exposure. The magnitude of this error depends on exactly when transit occurred relative to the start and the end of the exposure and on the declination of the target. The net effect of the error on companion astrometry is small as it will only affect one of ∼40 images taken in a typical GPI observing sequence. This issue has also been corrected as of the latest version of the GPI pipeline.
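The idea of splitting the integral at transit can be sketched as follows. This is an illustration only, not the pipeline's code: it uses the standard hour-angle expression for the parallactic angle, replaces Romberg's method with a composite Simpson rule for brevity, and handles the ±π jump by mapping the angle onto [0, 2π) before integrating each side of H = 0 separately. All function names are hypothetical.

```python
import numpy as np


def parallactic_angle(hour_angle, dec, lat):
    """Parallactic angle in (-pi, pi] from hour angle, declination, latitude (all rad)."""
    return np.arctan2(np.sin(hour_angle),
                      np.tan(lat) * np.cos(dec) - np.sin(dec) * np.cos(hour_angle))


def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) intervals."""
    x = np.linspace(a, b, n + 1)
    y = func(x)
    h = (b - a) / n
    return h / 3.0 * (y[0] + y[-1] + 4.0 * y[1:-1:2].sum() + 2.0 * y[2:-1:2].sum())


def average_parallactic_angle(h0, h1, dec, lat):
    """Time-averaged parallactic angle (rad, wrapped to [-pi, pi)) for h0 < h1."""
    # The +/-pi jump occurs only when the target transits north of the zenith.
    crosses_jump = (h0 < 0.0 < h1) and (dec > lat)
    if not crosses_jump:
        return simpson(lambda h: parallactic_angle(h, dec, lat), h0, h1) / (h1 - h0)
    # Map onto [0, 2*pi) so the integrand is continuous through +/-pi, then
    # integrate separately on either side of H = 0.
    def continuous(h):
        return np.mod(parallactic_angle(h, dec, lat), 2.0 * np.pi)
    total = simpson(continuous, h0, 0.0) + simpson(continuous, 0.0, h1)
    mean = total / (h1 - h0)
    return (mean + np.pi) % (2.0 * np.pi) - np.pi
```

For an exposure symmetric about transit of a northern target, this returns an angle equivalent to ±π, whereas integrating the raw angle straight through the discontinuity would give a value near zero.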

Inaccuracies in Some FITS Header Time Information
The pipeline necessarily relies on the accuracy of the FITS header keywords in the data it is processing. However, it has become apparent that the FITS header keyword time information is not always as reliable as we would like. A review of FITS header timing information allowed us to uncover several periods in which misconfiguration or malfunction of time server software resulted in systematic errors in header keyword information. We were able to reconstruct the history of these timing drifts sufficiently well to retroactively calibrate them out when reprocessing older data. As a reminder, the UTSTART keyword is written by the IFS brick computer. The clock on the IFS brick is, at least in theory, configured to automatically synchronize once per week with Gemini's NTP server. This server provides a master time reference signal to maintain the accurate timings necessary for telescope pointing and control. In order to cause a noticeable error in the average parallactic angle, the IFS brick time stamps would have to be between a few and a few tens of seconds out of sync, depending on the declination of the star (Fig. 5). The regular synchronization of the clock on the IFS brick was intended to be sufficient to prevent it from drifting by such an amount relative to the time maintained by Gemini's NTP server.
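The dependence of the tolerable clock error on declination can be illustrated with a short calculation. The sketch below is not part of the pipeline; it evaluates the standard hour-angle expression for the parallactic angle at the true time and at a time shifted by the clock error, for a target at transit. The latitude is an approximate value for Gemini South assumed here, and the function names are hypothetical.

```python
import numpy as np

SIDEREAL_RATE = 2.0 * np.pi / 86164.0905  # Earth rotation in rad per second of time
LAT = np.radians(-30.24)                  # approximate latitude of Gemini South (assumed)


def parallactic_angle(hour_angle, dec, lat):
    """Parallactic angle in (-pi, pi] from hour angle, declination, latitude (all rad)."""
    return np.arctan2(np.sin(hour_angle),
                      np.tan(lat) * np.cos(dec) - np.sin(dec) * np.cos(hour_angle))


def pa_error_deg(clock_error_s, hour_angle, dec, lat=LAT):
    """Change in parallactic angle (deg) caused by a timestamp error of clock_error_s."""
    p_true = parallactic_angle(hour_angle, dec, lat)
    p_wrong = parallactic_angle(hour_angle + SIDEREAL_RATE * clock_error_s, dec, lat)
    diff = (p_wrong - p_true + np.pi) % (2.0 * np.pi) - np.pi  # wrap into [-pi, pi)
    return float(np.degrees(diff))


if __name__ == "__main__":
    # Targets transiting 5, 20, and 60 deg north of the zenith, 10 s clock error.
    for zenith_distance_deg in (5.0, 20.0, 60.0):
        dec = LAT + np.radians(zenith_distance_deg)
        print(zenith_distance_deg, pa_error_deg(10.0, 0.0, dec))
```

In this sketch, a 10 s clock error changes the parallactic angle by several tenths of a degree for a target transiting 5 deg from the zenith, but by only a few hundredths of a degree for one transiting 60 deg away.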
However, it was eventually discovered that this time synchronization has not always operated as intended, resulting in significant clock offsets for some periods. The history of the offset between the IFS brick clock and UTC cannot be recovered directly from the various logs and headers generated by the IFS. Instead, we can use the difference between the UT and UTSTART header keywords as a proxy. The first timestamp is generated when the command to execute an observation is issued by Gemini's Sequence Executor (SeqExec) and is assumed to be accurate; a significant offset in the observatory's clock would quickly become apparent when attempting to guide the telescope. The second timestamp is generated when the IFS brick receives the command to start an exposure from the GPI TLC. The difference between these two timestamps, UTSTART-UT, should be small and relatively stable, as there have not been any significant changes to these software components since the instrument was commissioned in 2014, and we show below that this time difference does prove to be stable for the majority of GPI data.
We, therefore, data mined all available GPI data to determine the time evolution of the offset between UT and UTSTART during the entire time GPI has been operational. We queried the GPIES Structured Query Language database, 24,25 which contains the header information for all images obtained in the GPIES Campaign programs, selected guest observer (GO) programs whose principal investigators have contributed their data into this database, and all public calibration programs. We augmented this with all GO programs that were publicly accessible in the Gemini Observatory Science Archive when this analysis was performed. We excluded engineering frames (images that are obtained via GPI's interactive data language interface), as the UT keyword is populated via a different process for these types of frames. A total of 99,695 measurements of the UT to UTSTART offset spanning the previous six years were obtained, including 93,575 from the GPIES database and 6120 from other GO programs not included within the database.
The evolution of this offset between the installation of the instrument at Gemini South and now is shown in Fig. 6. We identified several periods of time, two quite extended, where the IFS clock was not correctly synchronized with the Gemini NTP. From the initial commissioning of the instrument until the end of 2014, the offset varied significantly, from about 8 s slow to up to 30 s fast. The causes of these variations are not fully known, but we point out that during this first year, GPI was still in commissioning and shared-risk science verification, and the software was still significantly in flux. In several instances, negative shifts in the offset are correlated with dates on which the IFS brick was used after having been restarted but before the periodic time synchronization had occurred. The gradual negative drifts in offset observed at several points imply that the IFS clock was running too fast, gaining time at a rate of ∼1 s per day over this period. Later, other small excursions in April 2016, August 2018, and August 2019 were also apparently caused by the IFS brick being used after an extended time powered off but prior to the scheduled weekly time synchronization. It would, of course, have been better had the time synchronization occurred automatically immediately after each reboot, but that was not the case.
A second long period with a significant offset, between June 2015 and March 2016, was caused by the IFS brick being synchronized to the wrong time server; it was tracking the Global Positioning System (GPS) time scale rather than UTC, and therefore, ran 18 s ahead. Improved systems administration can prevent such drifts in the future, but in order to properly calibrate the available data, we must model out the drifts that occurred in the past. The offset between UT and UTSTART remained relatively stable from mid-2016 through mid-2018 and was independent of the observing mode. We measured the median offset value between 2016.5 and 2018.5 as −3.38 s and defined this as the nominal UT to UTSTART offset (Fig. 7). We used a rolling median with a width of 12 h to calculate the value of the offset at a resolution of 1 h between late 2013 and 2019. A lookup table was created that the pipeline queries when reducing an IFS image so that it can apply a correction to UTSTART and UTEND if the observation was taken during a period identified as having a significant offset (Fig. 6).
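The rolling-median lookup described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's implementation: the offset is taken to be UTSTART − UT in seconds, times are expressed in hours on an arbitrary scale, and the threshold used to decide whether an offset is "significant" is a hypothetical parameter that is not specified in the text. The function names are also hypothetical.

```python
import numpy as np

NOMINAL_OFFSET = -3.38  # s, median offset between 2016.5 and 2018.5 (from the text)


def rolling_median_offsets(times_h, offsets_s, grid_h, width_h=12.0):
    """Median offset within +/- width_h / 2 of each grid time; NaN where there are no data."""
    times_h = np.asarray(times_h, dtype=float)
    offsets_s = np.asarray(offsets_s, dtype=float)
    table = np.full(len(grid_h), np.nan)
    for i, t in enumerate(grid_h):
        in_window = np.abs(times_h - t) <= width_h / 2.0
        if in_window.any():
            table[i] = np.median(offsets_s[in_window])
    return table


def clock_correction(time_h, grid_h, table_s, threshold_s=1.0):
    """Seconds to subtract from UTSTART and UTEND for an exposure taken at time_h.

    threshold_s is a HYPOTHETICAL cut for what counts as a significant offset.
    """
    idx = int(np.argmin(np.abs(np.asarray(grid_h, dtype=float) - time_h)))
    excess = table_s[idx] - NOMINAL_OFFSET
    if np.isnan(excess) or abs(excess) < threshold_s:
        return 0.0
    return float(excess)
```

With a one-hour grid, `clock_correction` returns zero during periods when the measured offset sits at its nominal value and returns the excess, e.g., about 18 s while the clock tracked GPS time, otherwise.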

Modeling Apparent Image Rotation at Gemini's Cassegrain Port
Recall from Sec. 2.1 that GPI always operates in ADI mode, with its pupil fixed or nearly fixed relative to the telescope pupil. GPI is attached to Gemini's ISS, which itself is mounted on the Cassegrain port of the telescope. A Cassegrain instrument rotator is used to maintain a fixed PA between the columns on an instrument's detector and either celestial north or the zenith. For an ideal altitude-azimuth telescope with an elevation axis perfectly aligned with local vertical and with an azimuth platform perpendicular to vertical, an instrument mounted on the Cassegrain port would observe the north angle changing with the parallactic angle as the telescope tracked a star through the meridian. The angle between the columns on the instrument detector and the direction of vertical would remain fixed (Fig. 8). Differences between true vertical and the vertical axis of the telescope cause this angle to vary slightly, an effect most pronounced for stars observed near the meridian with a small zenith distance (≲5 deg). When enabled, Gemini South's instrument rotator compensates for this motion, keeping the VA fixed on the detector (Fig. 9).
Due to difficulties maintaining the AO guide loops for targets with a very small zenith distance, it became common for some operators to keep the instrument rotator drive disabled while GPI was in operation, regardless of the target elevation. However, this practice was inconsistently applied. The drive was disabled and the rotator was kept at a nominal home position for 99 of the 317 nights on which GPI was used over the last 6 years. For data taken on these nights, a small correction needs to be applied to the parallactic angle in the header to compensate for this small motion of the VA as a star is tracked through the meridian. Such a correction relies on precise knowledge of the telescope mount alignment. Sufficiently precise information on the Gemini South telescope mount is not publicly available. We, therefore, derived post facto knowledge of the Gemini South telescope mount based on the behavior of the Cassegrain rotator on nights when it was activated.
Fig. 9 Angle of the instrument rotator as a function of hour angle for GPI observations where the rotator drive was enabled. The color of the symbol denotes the declination of the target. The instrument rotator angle has a different behavior for northern and southern targets due to the nonperpendicularity of the Gemini South telescope.
Fig. 8 The angle of the vertical vector (green) remains fixed relative to the image coordinate system for an ideal altitude-azimuth telescope, here at an angle of ∼23.5 deg from the x axis within a reduced GPI data cube. Any offset between true vertical and the vertical axis of the telescope will cause the vertical vector within a reduced image to move slightly as the target crosses the meridian, the magnitude of which would be imperceptible in this diagram for a small offset as is the case for Gemini South, but significant relative to the precision of astrometric measurements made with GPI.
We constructed a simple model to predict the correction to the parallactic angle caused by the nonperpendicular nature of the telescope. 26 For a perfect telescope, the parallactic angle of a source p is calculated as (Fig. 10)

$$\tan p = \frac{-\cos\phi \sin A}{\sin\phi \cos E - \cos\phi \sin E \cos A}, \tag{5}$$

where A and E are the topocentric horizontal coordinates of the target, i.e., azimuth and elevation, and ϕ is the latitude of the observatory. If the telescope's azimuth platform is tilted at an angle of θ with an azimuth of Ω, the difference between the true p and apparent p′ parallactic angles is given by Eq. (6), 27 where $\Delta p = -\arcsin(\sin\theta_E / \cos E)$. These tilts will lead to a slight difference in the elevation and azimuth (E′, A′) of the telescope mount versus the topocentric elevation and azimuth (E, A) of the target. Separate expressions give the telescope elevation and azimuth as modified by the azimuth tilt and as modified by an elevation tilt, the latter being Eq. (9). To construct a model of the tilt of the azimuth and elevation axes of the Gemini South telescope, we assumed that the instrument rotator was only compensating for the change in parallactic angle induced by these tilts. We collected measurements of the telescope elevation and azimuth and instrument rotator position on the 207 nights where GPI observations were taken with the rotator drive enabled. As the header stores the mechanical position of the telescope, we inverted the previous equations to compute the topocentric elevation and azimuth. Using these, we predicted the change in parallactic angle, and thus the position that the instrument rotator would need to be at to compensate for nonperpendicularity for a given set of tilt parameters (θ, Ω, and $\theta_E$).
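As a consistency check on Eq. (5), the expression for the ideal-telescope parallactic angle in horizontal coordinates can be compared numerically with the standard hour-angle form. The sketch below covers only the ideal case and does not implement the tilt model. It assumes azimuth is measured from north through east, and it resolves the quadrant by passing the numerator and denominator of Eq. (5) to `arctan2`, which is an assumption beyond the tangent relation itself. Function names are hypothetical.

```python
import numpy as np


def horizontal_from_equatorial(hour_angle, dec, lat):
    """Azimuth (north through east, assumed convention) and elevation, all in rad."""
    sin_el = (np.sin(lat) * np.sin(dec)
              + np.cos(lat) * np.cos(dec) * np.cos(hour_angle))
    elevation = np.arcsin(sin_el)
    azimuth = np.arctan2(-np.cos(dec) * np.sin(hour_angle),
                         np.sin(dec) * np.cos(lat)
                         - np.cos(dec) * np.cos(hour_angle) * np.sin(lat))
    return azimuth % (2.0 * np.pi), elevation


def parallactic_angle_horizontal(azimuth, elevation, lat):
    """Eq. (5), with the quadrant resolved by arctan2."""
    return np.arctan2(-np.cos(lat) * np.sin(azimuth),
                      np.sin(lat) * np.cos(elevation)
                      - np.cos(lat) * np.sin(elevation) * np.cos(azimuth))


def parallactic_angle_equatorial(hour_angle, dec, lat):
    """Standard hour-angle form of the parallactic angle."""
    return np.arctan2(np.sin(hour_angle),
                      np.tan(lat) * np.cos(dec) - np.sin(dec) * np.cos(hour_angle))


if __name__ == "__main__":
    lat = np.radians(-30.24)  # approximate latitude of Gemini South (assumed)
    hour_angle, dec = np.radians(15.0), np.radians(-50.0)
    az, el = horizontal_from_equatorial(hour_angle, dec, lat)
    print(np.degrees(parallactic_angle_horizontal(az, el, lat)),
          np.degrees(parallactic_angle_equatorial(hour_angle, dec, lat)))
```

The two forms agree to numerical precision for targets away from the zenith, which supports the sign convention adopted in Eq. (5).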
We performed a least squares minimization to determine the set of tilt parameters that best reproduce the instrument rotator position for 10 roughly 6-month periods over the last 5 years. The break points were chosen arbitrarily to be at the start and midpoint of each year except for years in which a major earthquake occurred near Cerro Pachon (September 17, 2015, and January 19, 2019), and when a break point coincided with a period in which GPI was being used.
The tilt model parameters that best fit the measured instrument rotator positions are given in Table 1. A comparison between the model and data on the night of May 6, 2015, UT is shown in Fig. 11. The model is able to reproduce the commanded rotator positions with residuals smaller than the north calibration uncertainty (discussed below) in all but a handful of the images, specifically those taken at elevations ≳88 deg (Fig. 12). We identified all GPI images to which we had access that were taken with the instrument rotator drive disabled. We used the tilt model parameters in Table 1 and the telescope elevation and azimuth within the header to calculate the correction to apply to the parallactic angle to compensate for the slight change in the angle of vertical on the detector. We created a lookup table with these corrections using the DATALAB header keyword to uniquely assign a correction to a specific GPI observation taken with the rotator drive disabled. Files with DATALAB values not in the lookup table do not have a correction applied. This lookup table contains all GPI observations taken with the drive disabled that were accessible at the time of this study, including GPIES campaign data, GO program data that are ingested into the GPIES database, and GO program data that were public at the time of the analysis.

North Angle Calibration
The corrections to the GPI DRP described in Secs. 3, 4, and 5 necessitated a revision of GPI's astrometric calibration, specifically the true north angle. The north angle offset is defined as the angle between IFS pixel columns and north in an image that has been rotated to put north up based on the average parallactic angle during the exposure. Here, we define the direction of the north angle offset as θ true − θ observed , a correction that would need to be added to a PA measured in images reduced with the GPI DRP (after correcting for the x axis flip) to recover the true PA of a companion.
We calibrate true north in GPI data based on observations of astrometric reference targets on sky. The small field of view (2.8 arc sec × 2.8 arc sec) and relatively bright limiting magnitude (I < 10) of GPI exclude many of the typical astrometric calibration fields used by other instruments (e.g., M15 and M92). Instead, we rely on periodic observations of a set of calibration binaries that have near-contemporaneous measurements with the well-calibrated NIRC2 camera on the Keck II telescope. 28,29

Gemini South/GPI Observations
We have observed nine binary or multiple star systems since the start of routine operations in 2014. A summary of all these observations is given in Table 2 […] (λ_eff = 2.06 μm); note that since the spectral filter in the GPI IFS is after the spatial pixellation at the lenslet array, change of filter cannot affect the astrometric calibration. The majority of the observations were obtained in GPI's "direct" mode, a configuration where the various coronagraphic components are removed from the optical path. Some were obtained in "unblocked" mode, which includes the Lyot mask and pupil plane apodizer in the optical path to reduce instrument throughput, preventing saturation from brighter stars. The addition of a neutral density filter in 2017 allowed us to observe calibrator binaries that were significantly brighter than the nominal H-band saturation limit of the IFS in either direct or unblocked mode. Observations of the θ 1 Ori B multiple system were taken in the coronagraphic mode, the typical mode for planet search observations, allowing for a high signal-to-noise ratio (SNR) detection of the fainter stellar components B2, B3, and B4 that all lie within an arcsecond of the primary star. We do not expect the coronagraph optics to have a significant effect on astrometric measurements, except for those made for objects extremely close to the edge of the focal plane mask, which is not relevant here. The three coronagraph optics are in pupil and focal planes only, so cannot individually introduce distortions. By effectively weighting the beam profile across the pupil, they could, in principle, cause the beam to sample a different portion of any intermediate optics if those optics have polishing errors that could cause a slight field-dependent photocenter […] Fig. 1) are small, located in a slow beam, and superpolished to ∼1 nm rms wavefront error.
Measured distortions are ∼3 mas across the field of view 4 and completely dominated by the geometric effects of the telephoto relay inside the spectrograph, with no evidence for a polishing-error component.
These observations were processed using version 1.5 (revision e0ea9f5) of the GPI DRP, incorporating the changes described in Secs. 3, 4, and 5. The data were all processed using the same DRP recipe with standard processing steps. The raw images were dark subtracted and corrected for bad pixels using both a static bad pixel map and outlier identification. The individual microspectra in each 2-D image were reassembled into a 3-D data cube (x, y, λ) using a wavelength solution derived from observations of a calibration argon arc lamp. An additional outlier identification and rejection step was performed on the individual slices of the data cubes. A distortion correction was then applied to each slice based on measurements of a pinhole mask taken during the commissioning of the instrument. 4

Keck II/NIRC2 Observations
The same nine multiple systems have been observed with the NIRC2 instrument in conjunction with the facility AO system on the Keck II telescope. The isolated calibration binaries have between one and six NIRC2 epochs between 2014 and 2019. The Trapezium cluster that contains θ 1 Ori B has been observed periodically with NIRC2 as an astrometric calibrator field by multiple different teams, with archival measurements extending as far back as December 2001.
The observations were taken in a variety of instrument configurations and filters. A summary of these observations is given in Table 3. Datasets were taken in either PA mode, where north remains fixed at a given angle on the detector, or VA mode, where the VA remains fixed and north varies with the parallactic angle of the target.
We reduced these data using a typical near-infrared imaging DRP: correction for nonlinearity, 30 dark subtraction, flat fielding, and bad pixel identification and correction. Reduced images were corrected for geometric distortion using the appropriate distortion map. 28,29 For observations taken using a subarray of the NIRC2 detector, we zero-padded the images prior to applying the distortion correction as the distortion correction script is hard-coded for 1024 × 1024 px images. 31 The astrometric calibration of NIRC2 was derived from analyses of globular cluster observations and has been validated with measurements of the locations of SiO masers in the galactic center that were determined precisely using very long baseline radio interferometry measurements. 28,29 We used a plate scale of 9.952 ± 0.002 mas px⁻¹ and a north angle offset of −0.252 ± 0.009 deg for data taken prior to April 13, 2015, 28 and 9.971 ± 0.005 mas px⁻¹ and a north angle offset of −0.262 ± 0.020 deg for data taken after. 29
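The date-dependent NIRC2 calibration quoted above can be expressed as a small selection function. The sketch below is illustrative: the function and constant names are invented, observations taken on April 13, 2015 itself are assumed to use the later calibration, and the quadrature error propagation is an assumption rather than a statement of the procedure used here.

```python
import math
from datetime import date

CALIBRATION_CHANGE = date(2015, 4, 13)

# (plate scale, uncertainty) in mas/px and (north offset, uncertainty) in deg.
EARLY = {"scale": (9.952, 0.002), "north": (-0.252, 0.009)}
LATE = {"scale": (9.971, 0.005), "north": (-0.262, 0.020)}


def nirc2_calibration(obs_date):
    """Select the NIRC2 calibration that applies to a given observing date."""
    return EARLY if obs_date < CALIBRATION_CHANGE else LATE


def separation_mas(sep_px, sep_px_err, obs_date):
    """Convert a pixel separation to mas, combining fractional errors in quadrature."""
    scale, scale_err = nirc2_calibration(obs_date)["scale"]
    sep = sep_px * scale
    err = sep * math.hypot(sep_px_err / sep_px, scale_err / scale)
    return sep, err


print(separation_mas(100.0, 0.05, date(2014, 5, 1)))
print(separation_mas(100.0, 0.05, date(2018, 5, 1)))
```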

Relative Astrometry
We used PSF fitting to measure the position of the companion relative to the primary. For the calibration binaries other than θ 1 Ori B, we estimated the location of the primary star within each image (or wavelength slice) by fitting a 2-D Gaussian to a small 7 × 7 pixel stamp centered on an initial estimate of the primary star. The five parameters (x, y, σ_x, σ_y, and amplitude A) were allowed to vary except for the NIRC2 data obtained on 2019-04-25 (HIP 80628) and 2019-05-23 (HIP 44804), where σ_x and σ_y were fixed due to a strongly asymmetric PSF and the proximity of the companion. This process was repeated using the output of the first iteration as the initial guess for the second. We extracted a 15 × 15 px stamp centered on the fitted position of the primary to use as a template to fit the location of the secondary. We used the Nelder-Mead downhill simplex algorithm to determine the pixel offset and flux ratio between the primary and secondary stars by minimizing the squared residuals within a 2λ∕D radius aperture surrounding the secondary. We estimated the uncertainty in the centroid of each fit as the full-width-at-half-maximum divided by the SNR, measured as the peak pixel value divided by the standard deviation of pixel values within an annulus 15λ∕D from the star. We corrected differential atmospheric refraction caused by the different zenith angle of the two stars using the model described in Ref. 32. We used the simplifying assumption that the observations were monochromatic at the central wavelength of the filter, neglecting any dependence of the effective wavelength on stellar color. This effect causes a reduction in the separation of a binary star along the elevation axis and was typically very small: at most 0.3 mas for the NIRC2 observations of HIP 80628 taken at an elevation of ∼35 deg.
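The centroid uncertainty estimate described above reduces to a one-line formula. The following sketch assumes that the annulus pixel values have already been extracted into a flat list and uses the population standard deviation; both the helper name and that choice of estimator are assumptions of this example.

```python
import statistics


def centroid_uncertainty(fwhm_px, peak_value, annulus_values):
    """Centroid uncertainty (px) estimated as FWHM / SNR.

    SNR is the peak pixel value divided by the standard deviation of the
    pixel values in a noise annulus around the star.
    """
    noise = statistics.pstdev(annulus_values)
    snr = peak_value / noise
    return fwhm_px / snr


# Hypothetical values: FWHM of 4 px, peak of 100 counts, unit-scatter annulus.
print(centroid_uncertainty(4.0, 100.0, [-1.0, 1.0, -1.0, 1.0]))
```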
PAs measured in datasets taken in VA mode were corrected by the parallactic angle at the middle of the exposure such that they were effectively measured relative to north. The small angular separation between the two components of the θ 1 Ori B2-B3 binary required us to use either θ 1 Ori B1 for the NIRC2 observations or θ 1 Ori B4 for the GPI observations as a reference PSF. We used this template PSF to simultaneously fit the location and fluxes of the two components of the B2-B3 binary following a similar procedure. We used a Fourier high-pass filter to subtract the seeing halo from B1 that was introducing a background signal for both B4 and the B2-B3 binary. The relative astrometries are listed in Table 2 for GPI and in Table 3 for NIRC2. We did not apply any correction for the differential atmospheric refraction for these observations given the extremely small difference in zenith angle between the two stars. We did not use the relative astrometry of B1-B2, B1-B3, or B1-B4 as B1 was obscured by GPI's focal plane mask, nor did we use B2-B4 or B3-B4 as the relative motion of these three stars cannot be described using a simple Keplerian model.
As a verification of the relative astrometry presented here, we performed an independent analysis of a subset of both the GPI and NIRC2 observations using the procedure described in Ref. 4. The GPI data were reduced with the same version of the DRP, whereas the NIRC2 data were reduced with a separate pipeline that performed the same functions as described in Sec. 6.2. Once the data were reduced, relative astrometry was performed using StarFinder. 33 For this subset of observations, we measured separations and PAs consistent with the values reported in Tables 2 and 3.

Accounting for Orbital Motion
Orbital motion of the calibration binaries between the NIRC2 and GPI epochs can introduce a significant bias in the north angle offset measurement. We fit Keplerian orbits to each of the calibration binaries using the NIRC2 astrometry presented in Table 3. These fits allowed us to simulate NIRC2 measurements on the same epoch as the GPI observations listed in Table 2, mitigating the bias induced by orbital motion. We used the parallel-tempered affine invariant Markov chain Monte Carlo (MCMC) package emcee 34 to sample the posterior distributions of the Campbell elements describing the visual orbit and of the system parallax. A complete description of the fitting procedure as applied to the 51 Eridani system can be found in Ref. 35. We used prior distributions for the system mass based on the blended spectral type and flux ratios of the components, and for the system parallax using measurements from either Hipparcos 36 or Gaia. 37 We used a parallax of 2.41 ± 0.03 mas for θ 1 Ori B2-B3. 38 We also fitted the radial velocity measurements of both components of the HD 158614 binary 39 to help further constrain its orbital parameters. We purposely excluded astrometric measurements from other instruments and assumed that the NIRC2 astrometric calibration was stable before and after the realignment procedure in mid-2015.
The PA of the visual orbit and corresponding residuals are shown in Fig. 13 for the nine calibration binaries. We simulated NIRC2 measurements at the epoch of the GPI observations by drawing 10,000 orbits at random from MCMC chains and converting the orbital elements into separations and PAs at the desired epoch. We used the median of the resulting distribution of separations and PAs as the simulated measurement and the standard deviation as the uncertainty. These simulated measurements are reported in Table 4. The small semimajor axis of the HIP 43947 binary led to a significant uncertainty on the simulated NIRC2 observation despite the short 50-day baseline between the NIRC2 and GPI observations, precluding a measurement of the north offset angle with this binary. This was also the case for all but one epoch of both the HD 1620 and HD 6307 systems. Additional observations of these systems with NIRC2 to reduce the orbital uncertainties will be required for more precise predictions at these epochs. The remaining binaries (HD 157516, HD 158614, HIP 44804, HIP 80628, HR 7668, and θ 1 Ori B2-B3) either had enough NIRC2 measurements to sufficiently constrain the orbit at the GPI epochs or were close enough in time that the orbital motion between the NIRC2 and GPI epochs was smaller than the measurement uncertainties.
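The conversion from orbital elements to a simulated measurement can be sketched as follows. This is a minimal illustration under stated assumptions: each posterior draw is taken to be a tuple (a, P, T0, e, i, ω, Ω) with the semimajor axis already in angular units and angles in radians, the function names are invented, and PA wrap-around near 0/360 deg is not handled. It is not the fitting code used for this work.

```python
import math
import statistics


def solve_kepler(mean_anomaly, ecc, tol=1e-12):
    """Solve E - e sin E = M for the eccentric anomaly by Newton iteration."""
    m = mean_anomaly % (2.0 * math.pi)
    ea = m if ecc < 0.8 else math.pi
    for _ in range(100):
        step = (ea - ecc * math.sin(ea) - m) / (1.0 - ecc * math.cos(ea))
        ea -= step
        if abs(step) < tol:
            break
    return ea


def sep_pa(a, period, t_peri, ecc, inc, omega, node, epoch):
    """Separation (units of a) and PA (deg, east of north) at a given epoch."""
    ea = solve_kepler(2.0 * math.pi * (epoch - t_peri) / period, ecc)
    x = math.cos(ea) - ecc
    y = math.sqrt(1.0 - ecc**2) * math.sin(ea)
    co, so = math.cos(omega), math.sin(omega)
    cn, sn = math.cos(node), math.sin(node)
    ci = math.cos(inc)
    # Thiele-Innes constants.
    ti_a = a * (co * cn - so * sn * ci)
    ti_b = a * (co * sn + so * cn * ci)
    ti_f = a * (-so * cn - co * sn * ci)
    ti_g = a * (-so * sn + co * cn * ci)
    north = ti_a * x + ti_f * y
    east = ti_b * x + ti_g * y
    return math.hypot(north, east), math.degrees(math.atan2(east, north)) % 360.0


def simulate_measurement(samples, epoch):
    """Median and standard deviation of separation and PA over posterior draws."""
    seps, pas = zip(*(sep_pa(*s, epoch) for s in samples))
    return ((statistics.median(seps), statistics.stdev(seps)),
            (statistics.median(pas), statistics.stdev(pas)))


# Hypothetical draws: circular, face-on orbits differing only in semimajor axis.
draws = [(a, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0) for a in (99.0, 100.0, 101.0)]
print(simulate_measurement(draws, epoch=2.5))
```

In practice the list of draws would hold the 10,000 orbits selected at random from the MCMC chains.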

GPI Plate Scale
The plate scale for GPI was measured using the predicted separations in angular units from the orbit fit to the NIRC2 measurements and the pixel separations measured in the reduced GPI images (Table 4). We saw no evidence of a variation in the plate scale with time (Fig. 14) and adopted a single value of 14.161 ± 0.021 mas px⁻¹. This measurement is consistent with the previous plate scale of 14.166 ± 0.007 mas px⁻¹, 4,10 but with a larger uncertainty. The pipeline changes described in Secs. 3, 4, and 5 have no impact on the separation of two stars within a reduced GPI image. The slight difference in the inferred plate scale can instead be ascribed to […]

[…] trend of increasing north offset angle over the course of 6 years when comparing the calibration binary measurements in early-2014 and mid-2019. One plausible cause of a rotation of the instrument with respect to the telescope is the annual shutdown of the telescope when both the instrument and ISS are removed to perform maintenance. We fit a variable north offset angle that remains static between the dates of telescope shutdowns. A series of weighted means were calculated using measurements between each shutdown, as listed in Table 4 and plotted in Fig. 15. This model reproduces the trend of increasing north offset angle during the previous 6 years and is an improved fit (χ²_ν = 0.4, ν = 31) relative to the single-valued model. We opted to use this variable north offset angle model for the final astrometric calibration of the instrument.

Fig. 13 (a) PA and (b) residuals of the orbits (blue lines) consistent with the NIRC2 astrometry in Table 3 (squares). The dates of GPI observations are highlighted; green dashed lines denote epochs that were used for the astrometric calibration, and red dotted lines denote epochs where the orbital motion is significant relative to the GPI measurement uncertainties. In a subset of the plots in (b), the date range has been restricted to focus on the dates of the GPI observations.

Fig. 14 Measurements of the plate scale of GPI derived from calibration binaries (red circles) and the θ 1 Ori B2-B3 binary (black squares). The mean and standard deviation (blue solid line and shaded region) were calculated using a weighted mean and assuming that the measurements were not independent. The previous astrometric calibration is overplotted for reference (gray dashed line and shaded region).

Fig. 15 Measurements of the north offset angle of GPI derived from calibration binaries (red circles) and the θ 1 Ori B2-B3 binary (black squares). We fit the north angle assuming it is either (a) a constant calibration for the entire date range or (b) that it varies between telescope shutdowns. The mean and standard deviation (blue solid line and shaded region) are calculated as in Fig. 14.
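The plate scale and piecewise north offset calculations can be sketched as below. The sketch assumes inverse-variance weights, epochs expressed as decimal years, and an invented list of shutdown dates; the function names are hypothetical, and the treatment of non-independent measurements used for the quoted uncertainties is not reproduced.

```python
import bisect
from collections import defaultdict


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean."""
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def plate_scale(sep_pred_mas, sep_gpi_px):
    """Plate scale (mas/px) from a predicted angular and a measured pixel separation."""
    return sep_pred_mas / sep_gpi_px


def north_offset_by_interval(epochs, pa_pred, pa_gpi, sigmas, shutdowns):
    """Weighted-mean north offset (theta_true - theta_observed) between shutdowns."""
    groups = defaultdict(lambda: ([], []))
    for t, pred, obs, sig in zip(epochs, pa_pred, pa_gpi, sigmas):
        interval = bisect.bisect_right(shutdowns, t)
        diff = (pred - obs + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
        groups[interval][0].append(diff)
        groups[interval][1].append(sig)
    return {k: weighted_mean(v, s) for k, (v, s) in sorted(groups.items())}


# Hypothetical measurements straddling one hypothetical shutdown at 2015.5.
offsets = north_offset_by_interval(
    epochs=[2014.3, 2014.4, 2016.1],
    pa_pred=[10.2, 20.2, 30.4],
    pa_gpi=[10.0, 20.0, 30.0],
    sigmas=[0.1, 0.1, 0.1],
    shutdowns=[2015.5],
)
print(offsets)
```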

Instrument Stability
The cause of the change of the north offset angle over time is not known. In principle, a movement of the IFS or the CAL system on their bipod mounts could produce a clocking of the focal plane with respect to the telescope, although a movement of 5 mm would be required. We excluded rotations internal to the instrument by measuring the angle between two of the satellite spots within a postalignment image taken routinely before instrument operation. These satellite spots are generated by a periodic wire grid on the pupil plane apodizer, 13,14 located on the AO bench (Fig. 1). A physical rotation of the IFS relative to the apodizer would manifest itself as a rotation of the satellite spots within the focal plane as recorded by the IFS. We measured the angle between the bottom left and top right satellite spots in 406 postalignment images taken between late-2014 and mid-2019 using the satellite spot finding algorithm that is a part of the GPI DRP. We find no significant trend in this angle over the past 5 years (Fig. 16), although a significant offset of ∼0.1 deg is seen for a few months at the start of 2016 that coincides with mechanical difficulties with the wheel containing the pupil plane apodizers. Excluding this period, we find an angle between these two satellite spots of 335.96 ± 0.02 deg. The stability of this angle implies that the change in the north offset angle seen in Fig. 15 is caused by a mechanical rotation upstream of the pupil plane mechanism containing the apodizer. The GPI optics upstream of this are all rigidly mounted in a single plane onto a thick optical bench and are extremely unlikely to produce such a rotation. In principle, a rotation of the outer truss structure holding all three assemblies with respect to the mounting plate could rotate the focal plane, but again that would have to be on the order of 5 mm, essentially impossible. GPI has an extremely rigid truss structure supporting various subcomponents.
Integrated finite element analysis/optical modeling shows that flexure motions of any component relative to the optical axis are <25 μm over the operating range of gravity vectors. 40 Although we did not explicitly model rotation, if any hypothetical rotation component involves displacements on the same scale, the angular rotation would be on the order of 0.01 deg. The pins that locate GPI onto the ISS face have much more precise tolerances than that as well (<0.23 mm).
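The satellite spot angle used in this stability check can be computed from the two spot centroids. The sketch below assumes pixel coordinates with x increasing to the right and y increasing upward, and measures the angle from S1 to S2 counter-clockwise from vertical; the function name and the example coordinates are invented.

```python
import math


def spot_position_angle(s1, s2):
    """Angle (deg) of the vector S1 -> S2, counter-clockwise from vertical."""
    dx = s2[0] - s1[0]
    dy = s2[1] - s1[1]
    return math.degrees(math.atan2(-dx, dy)) % 360.0


# Hypothetical centroids: S1 at the bottom left, S2 at the top right.
print(spot_position_angle((100.0, 80.0), (180.0, 260.0)))
```

With this convention, a spot pair running from bottom left to top right yields an angle between 270 and 360 deg, in line with the ∼336 deg reported above.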

Revised Astrometry for Substellar Companions
The changes to the pipeline described in Secs. 3, 4, and 5 and the revised astrometric calibration of the instrument described in Sec. 7 both necessitate a revision of previously published relative astrometry of substellar companions measured using GPI observations. Revisions for β Pictoris b, 8 […] to the astrometry for the exoplanets in the HR 8799 5 and HD 95086 7 systems, and the brown dwarfs HR 2562 B 42 and HD 984 B, 43 that correct for the changes to the pipeline and the revised astrometric calibration of the instrument. We reduced the same images used in the previous studies with the latest version of the GPI DRP. The revisions described in Secs. 3, 4, and 5 all affect the AVPARANG header keyword. The change in this value is plotted as a function of frame number for each observing sequence in Fig. 17. Δ AVPARANG is typically small and static, only changing by at most ∼0.05 deg between the start and end of the J-band sequence on HD 984 taken on August 30, 2015. The effect of the parallactic angle integration error described in Sec. 3.2 is apparent in several epochs.

Fig. 16 (a) One wavelength slice of a reduced GPI data cube for a postalignment image taken using GPI's internal source on November 12, 2014. The four satellite spots generated by the grid on the pupil plane apodizer are clearly visible. (b) The PA between the bottom left (S1) and top right (S2) satellite spots, measured from S1 to S2 counter-clockwise from vertical, plotted as a function of date for each postalignment image taken since the instrument was commissioned.
The median Δ AVPARANG was used in conjunction with the revised north offset angle described in Sec. 7 to revise the previously published astrometry. We assumed that a single offset to the measured PA of a companion accurately describes the effect of the change to the parallactic angle for each frame within a sequence. As the maximum change in Δ AVPARANG over a sequence was 0.05 deg, the effect on the companion astrometry is likely on this order or smaller. For the majority of cases, Δ AVPARANG changes by less than a hundredth of a degree over the course of a full observing sequence. The previous and revised astrometries for each published epoch are given in Table 5. We find small but not significant changes in the measured separations, and significant changes in the measured PAs due to the significant change in the north offset angle described in Sec. 7.

Discussion/Conclusion
We have identified and corrected several issues with the GPI DRP that affected astrometric measurements of both calibration binaries and substellar objects whose orbital motion was being monitored. We reprocessed the calibration data after implementing these fixes into the pipeline and revised the astrometric calibration of the instrument. The most significant change was to the north offset angle, which changed from −0.10 ± 0.13 deg to between 0.17 ± 0.14 deg and 0.45 ± 0.11 deg, depending on the date. The plate scale of the instrument was also remeasured as 14.161 ± 0.021 mas px⁻¹, consistent with the previous calibration albeit with a larger uncertainty.
Although the change to the astrometric calibration of the instrument is significant relative to the stated uncertainties, the impact should be limited to studies that combine GPI astrometry with that from instruments of similar precision. The revised calibration should not have a significant impact on the results and interpretation of studies that used GPI astrometry either solely or in conjunction with astrometry from instruments with significantly worse astrometric precision; 6,9,10 an offset in the north angle will simply change the PA of the orbit on the sky (Ω). A more significant effect might be seen for orbit fits that combined astrometry from GPI with astrometry of a similar precision from other instruments. 7,44 The magnitude of the effect on the derived orbital parameters is likely small. All but one of the substellar companions studied with GPI have a small fraction of their complete orbits measured, and so the change of the shape of the posterior distributions describing the orbital elements is likely not statistically significant. The precision of astrometric measurements made with GPI is currently limited by measurement uncertainties except for widely separated companions such as the HR 8799 bcd, and the highest SNR measurements of β Pic b made in 2013 when the projected separation was ∼430 mas, where the north angle uncertainty dominates the PA error budget. Lower SNR measurements of faint companions such as 51 Eri b are less affected, with the north angle uncertainty being between a factor of two and five smaller than the measurement uncertainty.
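To give a sense of scale, the following sketch swaps the previous north offset for a revised one and converts the resulting PA change into a tangential displacement on the sky. It assumes that a published PA had the previous offset added in the sense defined in Sec. 7 (θ_true − θ_observed) and ignores the separate pipeline changes to AVPARANG; the function names are invented for this example.

```python
import math

OLD_OFFSET_DEG = -0.10  # previous north offset angle


def revise_pa(pa_published_deg, new_offset_deg, old_offset_deg=OLD_OFFSET_DEG):
    """Remove the old north offset and apply the revised one (deg)."""
    return (pa_published_deg - old_offset_deg + new_offset_deg) % 360.0


def tangential_shift_mas(sep_mas, new_offset_deg, old_offset_deg=OLD_OFFSET_DEG):
    """Small-angle tangential displacement caused by the offset change (mas)."""
    return sep_mas * math.radians(new_offset_deg - old_offset_deg)


# The two ends of the revised range, evaluated at a separation of 430 mas.
for new_offset in (0.17, 0.45):
    print(new_offset, revise_pa(100.0, new_offset), tangential_shift_mas(430.0, new_offset))
```

At 430 mas, a change of 0.27 deg in the north offset corresponds to a tangential displacement of roughly 2 mas.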
Future studies using archival GPI data will need to account for both the changes to the pipeline and the revision to the astrometric calibration. The updated pipeline is publicly available on the GPI instrument website 45 and on GitHub. 46 All users wishing to perform precision astrometry will have to reduce their data using the latest version of the pipeline, especially those obtained on the highlighted dates in Fig. 6, and apply the revised astrometric calibration presented in Sec. 7. The measurements presented here demonstrate the importance of continued astrometric calibration, especially for instruments on the Cassegrain mount of a telescope. Improvements to the limiting magnitude of GPI's AO system as it is moved to Gemini North will allow us to use globular clusters as astrometric calibrators instead of isolated binaries, allowing for a more precise determination of the north angle via a comparison to both archival Hubble Space Telescope and contemporaneous Keck/NIRC2 observations. This study also demonstrates the importance of precise and accurate astrometric calibration of instruments designed for high-contrast imaging of extrasolar planets. Instruments equipped with IFS necessarily have a small field of view, which is challenging for astrometric calibration that typically relies on images of globular clusters extending over several to tens of arcseconds. These results also demonstrate the importance of accounting for orbital motion, either between the two components of a calibration binary and/or the photocenter motion of one of the components if one of the components is itself a tight binary. A similar problem arises with the use of SiO masers near the Galactic Center; 28 the location of the infrared source is not necessarily coincident with that of the radio emission to which the infrared astrometric reference frame is tied. 47 Precise and accurate astrometric calibration of future instruments with very narrow fields of view such as the Coronagraphic Instrument on the Wide Field Infrared Survey Telescope 48 will require a careful calibration strategy to mitigate the effects of these and other biases.