GOES-R series image navigation and registration performance assessment tool set

Abstract. An image navigation (NAV) and registration (INR) performance assessment tool set (IPATS) was developed to assess the US Geostationary Operational Environmental Satellite R-series (GOES-R) Advanced Baseline Imager (ABI) and Geostationary Lightning Mapper (GLM) INR performance. IPATS produces five INR metrics for level 1B ABI images: navigation, channel-to-channel registration, frame-to-frame registration, swath-to-swath registration, and within-frame registration. IPATS also produces one INR metric for GLM: navigation of background images. The high-precision INR metrics produced by IPATS are critical to INR performance evaluation and long-term monitoring. IPATS INR metrics also provide feedback to INR engineers for tuning the navigation algorithms and parameters to further refine INR performance. IPATS utilizes a modular algorithm design that allows a user-selectable data processing sequence and configuration parameters. We first describe the algorithmic design and the implementation of IPATS. Next, we describe the optimization of the configuration parameters to reduce measurement errors. Finally, we present sample INR performance, including GOES-16 and GOES-17 ABI NAV performance from postlaunch test to November 2019 and a comparison of example 24-h INR performance against the mission performance requirements. The INR assessment results show that both GOES-R ABIs are in compliance with the mission INR requirements.

coordinate system, which is a two-dimensional (2-D) angle space centered at the assigned location of the satellite in geosynchronous orbit. 1 ABI has columns of detectors for each spectral channel, and it scans the fields of view of these detectors in a west to east direction to acquire swaths of Earth imagery. Each ABI L1B image is composed of two or more swaths, each swath having an effective angular height in the GOES fixed-grid of about 0.8 deg. During nominal operations, ABI produces a full disk (FD) image composed of 22 swaths every 10 to 15 min, a Conterminous US (CONUS) image composed of six swaths every 5 min, and a mesoscale (MESO) image composed of two swaths every 30 s. The FD is a circle of 17.4 deg angular diameter with center at nominal satellite nadir and circumference at the Earth limb. The CONUS images are rectangular and have a nadir extent of 5000 km in the east-west (EW) direction and 3000 km in the north-south (NS) direction. The MESO image can be acquired at any location and is also rectangular and has an extent of 1000 km EW × 1000 km NS. 2 The advanced temporal and spatial resolutions make ABIs a promising data source for imaging Earth's surface and atmosphere. ABI data are used not only for weather forecasting but also for the detection and observation of severe environmental phenomena and climate change studies. 3 Earth location, or geolocation, accuracy is a key quality indicator of satellite data. Accurate geolocation ensures that data from different channels of a sensor or data from different sensors/ sources can be applied together to retrieve high-level biogeophysical information. 4,5 An image navigation (NAV) and registration (INR) performance assessment tool set (IPATS) was designed and developed under the auspices of NASA's GOES-R Flight Project for independent verification of the ABI INR performance in the postlaunch period for performance evaluation and long-term monitoring. 
IPATS was also developed for analysis of the navigation accuracy of background images produced by the Geostationary Lightning Mapper (GLM) onboard both GOES-R series satellites. In this study, we will focus on ABI results to describe the IPATS algorithms and ABI INR performance. Further information concerning the evaluation of GLM INR performance with IPATS can be found in Ref. 6. IPATS produces INR performance metrics for all three types of ABI images. The assessment results are not only used to verify the INR accuracy but also provide in-depth analysis to help improve the INR algorithms, operational parameters, and future instrument design.
In this paper, we describe the IPATS software design and the algorithms employed by IPATS, including image preprocessing, image registration, and evaluation and quality screening of the IPATS results. Next, the selection of configuration parameters for optimal results is described. Finally, we present the latest GOES-16 and GOES-17 ABI performance measured by IPATS and how the in-depth analyses are performed based on IPATS measurement results.

IPATS Architecture
To fully assess the INR accuracy, IPATS produces five types of INR quality assessment metrics:
• ABI and GLM navigation (NAV) error: difference between the location of an image feature and its true location.
• ABI channel-to-channel registration (CCR) error: relative navigation error at corresponding image features of different channels in the same frame.
• ABI frame-to-frame registration (FFR) error: relative navigation error of corresponding image features of same channel in consecutive images.
• ABI within-frame registration (WIFR) error: difference between radial separation of two image features on the fixed grid coordinate system and their true angular separation. WIFR is calculated indirectly from the NAV results.
• ABI swath-to-swath registration (SSR) error: relative navigation error of two neighboring image features on opposite sides of the horizontal image swath boundary.
IPATS employs a modular algorithm architecture because (1) most of the processing steps of above metrics are common and (2) there are multiple algorithms available for each processing step. During IPATS development, members of the team who are image-processing subject matter experts brought various techniques to the table. Given the need to design, develop, test, and deploy IPATS in advance of GOES-R (now GOES-16) launch, the effort was focused on those techniques familiar to the IPATS team members, and a more comprehensive research effort to identify all alternatives was not undertaken. Sometimes, multiple algorithms to perform a given function, such as Pearson cross correlation (PCC) and normalized mutual information (NMI) algorithms in the image registration module, were both implemented into IPATS, with the selection of algorithm being left to the IPATS user. Figure 1 shows a high-level diagram of IPATS. Two input images are preprocessed and then the EW and NS registration differences are calculated. In the last step, the IPATS measurements are screened to identify the high-quality measurements used to produce the assessment reports. All the metrics above, except for WIFR, use similar modules with different image types, configuration, and screening parameters. Metrics, except SSR, are generated separately for each of the image types: FD, CONUS, and MESO. WIFR is an additional step that operates on the NAV results.
SSR is evaluated on two successive MESO images, which are specially tasked by mission operations to specific Earth locations. The top of the lower swath of the first MESO image overlaps the bottom of the upper swath of the second MESO image. The evaluation window with a size of 128 × 8 GOES pixels lies within the overlap region of the two tasked MESO images. The misregistration errors measured within the evaluation windows provide the assessment of SSR. 7

Landsat Chips
The ABI NAV accuracy is assessed through comparing subsets of ABI images with subsets of Landsat 8 images, called chips here, which are considered to have a negligible NAV error. The geolocation accuracy of the Landsat images is within 15 m, 8 which is 3% or less of the spatial resolutions of GOES-R ABIs.
The chips for assessing ABI NAV are mostly along the shorelines of North and South America (Fig. 2). The shorelines are emphasized because they tend to exhibit high contrast, low spatial frequency image features that are particularly suitable for image registration at the spatial scale of ABI images. The size of a Landsat chip is about 150 km × 150 km. The Landsat 8 images used to generate chips were acquired between years 2013 and 2017. Based on our experience with other instruments, the chips should be refreshed with latest Landsat images every 10 years or so. The Landsat chips are either cloud free or contain limited cloud coverage, usually <5%. For a single chip location, multiple chips corresponding to multiple seasons were collected when Landsat 8 images of sufficient quality were available.
The spectral response of different channels varies for the same target. This is a major error source when registering the images from different spectral channels. For example, the land/water boundary is the primary feature for registering ABI and Landsat subsets. The locations of the land/water boundary could be at different locations in different channels due to spectral response difference and ground feature characteristics, e.g., steep or mild slope on the shoreline. Therefore, it is necessary to use a multispectral chip library when assessing ABI NAV accuracy. Table 1 shows a comparison of the spectral channels of ABI and Landsat 8. For VNIR channels, ABI and Landsat 8 channels show good spectral overlap. For MWIR and LWIR channels, there is no close correspondence between ABI and Landsat 8 channels. We determined the corresponding channels based on the spectral response characteristics and also made necessary adjustment after carefully examining registration performance for ABI and Landsat subsets. Originally, we utilized channel 7 of Landsat 8 (short-wave infrared channel) to assess channel 7 of ABI (MWIR). 8 After examining operational ABI data, we found that channel 10 of Landsat 8 (LWIR) is a better choice to assess ABI channel 7 considering nighttime emissions. Such adjustments occurred throughout the IPATS development and testing process from 2014 to 2018.

Subimage resampling
For retrieving ABI metrics (except WIFR), subsets of ABI images, rather than the whole ABI image, are used to identify the registration errors. For ABI NAV, IPATS correlates each Landsat chip and the corresponding subsets of the ABI image. For ABI FFR/CCR, a user supplied list of evaluation windows is used to identify the subsets of the two ABI images to be correlated. Currently, the locations of the evaluation windows are mostly along the shorelines of North and South America. There are 651 and 629 locations for FFR/CCR for the current orbital longitudes of GOES-16 and GOES-17, respectively.
The ABI image subsets are upsampled to a common finer resolution for assessing registration accuracy, e.g., FFR and CCR. In the NAV assessment, the Landsat chips are downsampled to the common resolution. The Landsat chips were reprojected via uniform local averaging to GOES ABI fixed grid projection for each assigned orbital longitude at 12× finer resolution than the native resolution of the corresponding ABI band ( Table 1). The Landsat chips were preprocessed offline and stored in the IPATS multispectral landmark database to save processing time of ABI INR assessment. The possible resolution scales at which subimages may be correlated for ABI NAV are 12×, 6×, 4×, 3×, 2×, and 1× finer resolution than the native resolution of the ABI image. We introduce the term subpixel factor (SPF) for the amount of upsampling applied to ABI images during IPATS processing. For instance, an SPF of 2 means that the ABI images are upsampled to half an ABI pixel before the image registration occurs. We will discuss the optimal SPF value for the long-term monitoring later in this paper.
IPATS provides three different interpolation algorithms for upsampling of images:
• Nearest-neighbor interpolation: a resampled pixel is assigned the digital number value of the nearest pixel in the source image.
• Bilinear interpolation: the values of the four source pixels nearest to the destination pixel are linearly interpolated.
• Bicubic interpolation: the values of the 16 source pixels in a 4 × 4 array that includes the destination pixel within its central 2 × 2 subarray or on the boundaries of this subarray are interpolated using a cubic spline.
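As a minimal sketch of how an SPF could be applied with bilinear interpolation, the following pure-Python function upsamples a small 2-D array so that every SPF-th output sample coincides with a source pixel. The function name `upsample_bilinear`, the list-of-lists image layout, and the sample-aligned output grid are illustrative assumptions and are not taken from the IPATS implementation.

```python
def upsample_bilinear(img, spf):
    """Upsample a 2-D list of numbers by an integer subpixel factor (SPF).

    Assumes img has at least 2 rows and 2 columns. Output sample (r, c)
    sits at source coordinates (r / spf, c / spf).
    """
    rows, cols = len(img), len(img[0])
    out_rows = (rows - 1) * spf + 1
    out_cols = (cols - 1) * spf + 1
    out = []
    for r in range(out_rows):
        y = r / spf
        y0 = min(int(y), rows - 2)   # upper source row of the 2 x 2 cell
        fy = y - y0                  # fractional distance below y0
        out_row = []
        for c in range(out_cols):
            x = c / spf
            x0 = min(int(x), cols - 2)
            fx = x - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x0 + 1] * fx
            bottom = img[y0 + 1][x0] * (1 - fx) + img[y0 + 1][x0 + 1] * fx
            out_row.append(top * (1 - fy) + bottom * fy)
        out.append(out_row)
    return out
```

For example, `upsample_bilinear([[0, 2], [4, 6]], 2)` yields a 3 × 3 array whose corner values equal the four source pixels and whose center value is their average.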
The details of the bilinear and bicubic interpolation algorithms are described in Chapter 10 of Ref. 9.

Table 1 ABI channels and the corresponding Landsat channels utilized for NAV measurements. The ABI water vapor-sensitive channels (4, 8, 9, and 10) are excluded from the NAV measurements because they cannot see the ground and so cannot be compared with static Landsat chips.

The possibility of using the original channel-dependent modulation transfer function (MTF) during the interpolation was not studied mainly because we performed the INR assessment using L1B data. This L1B data had already been resampled from the original scan data into the GOES fixed-grid coordinates using resampling kernels for each channel that were derived in part in order to meet the L1B MTF specifications. It is possible that compensating for the residual MTF in the fixed-grid image could reduce the NAV measurement error, but this is outside the scope of this paper and could be a topic for a future study.

Image edge enhancement
The images, preprocessed through edge enhancement, give a sharper correlation peak than the original images. 10 The sharper peak is easier to detect, especially when noise in images is significant. IPATS provides the user with the option to perform edge enhancement operation to the resampled ABI subsets and Landsat chips on a channel-by-channel basis. Two edge enhancement operators are available, the Sobel operator 11 and the Roberts operator. 12

Sobel edge enhancement. Sobel edge enhancement is achieved using the Sobel operator, which is an isotropic 3 × 3 discrete differentiation operator. 11 Two 3 × 3 kernels are convolved with the original image to compute derivatives in the horizontal and vertical directions. Note that in these equations we use $(x, y)$ and $(u, v)$ to indicate image pixels in the EW (row) and NS (column) directions. If we call the original image $A$, we can define

$$G_x = \begin{bmatrix} +1 & 0 & -1 \\ +2 & 0 & -2 \\ +1 & 0 & -1 \end{bmatrix} * A, \qquad G_y = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} * A, \tag{1}$$

which represent the horizontal and vertical derivatives and where $*$ is the convolution operation. Then, at each point in the image, the derivatives are combined:

$$G = \sqrt{G_x^2 + G_y^2}. \tag{2}$$

$G$ is the final image with Sobel edge enhancement applied.
Roberts edge enhancement. Roberts edge enhancement is achieved using the Roberts cross operator. 12 It is a similar type of operation to the Sobel edge enhancement method, except it uses 2 × 2 kernels. Again, if we call the original image $A$, we can define

$$G_x = \begin{bmatrix} +1 & 0 \\ 0 & -1 \end{bmatrix} * A, \qquad G_y = \begin{bmatrix} 0 & +1 \\ -1 & 0 \end{bmatrix} * A. \tag{3}$$

Then, the final image $G$ is calculated from $G_x$ and $G_y$, using Eq. (2).
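The two edge enhancement operators can be sketched in a few lines of pure Python. The kernels below are the standard textbook Sobel and Roberts cross kernels; their sign convention is an assumption (it does not affect the gradient magnitude), as are the helper names `convolve_valid` and `edge_enhance` and the choice to return only the region where the kernel fully overlaps the image.

```python
import math

# Standard textbook kernels (sign convention assumed).
SOBEL_X = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
SOBEL_Y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
ROBERTS_X = [[1, 0], [0, -1]]
ROBERTS_Y = [[0, 1], [-1, 0]]


def convolve_valid(img, kernel):
    """True 2-D convolution (kernel flipped), keeping only full-overlap positions."""
    k = len(kernel)
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += kernel[k - 1 - i][k - 1 - j] * img[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


def edge_enhance(img, kernel_x, kernel_y):
    """Gradient magnitude G = sqrt(Gx^2 + Gy^2) from two derivative kernels."""
    gx = convolve_valid(img, kernel_x)
    gy = convolve_valid(img, kernel_y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]
```

Calling `edge_enhance(img, SOBEL_X, SOBEL_Y)` or `edge_enhance(img, ROBERTS_X, ROBERTS_Y)` on a subset containing a sharp land/water step returns large values along the step and zeros over uniform areas.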

Image Registration
The core of the IPATS algorithms is the correlation module that ingests two preprocessed subsets at the same resolution, in which one subset is larger than the other in both the EW and NS directions. The smaller subset, called the float image, is then shifted in both directions to all possible locations coincident with a subarray of the larger subset, which is called the fixed image. For each such shifted location, a similarity metric is computed, either PCC or NMI. The results of these computations are captured into a 2-D array of similarity metrics, called the correlation array, one for each shifted location. The size of the correlation array is always odd, with the center location corresponding to zero shift in both the EW and NS directions. The size of the correlation array is determined by the maximum anticipated registration error, a user provided input to IPATS, and by the resolution scale, defined by the SPF, at which the correlation is performed. Some additional padding of the correlation array is required to ensure that the raw correlation peak, prior to application of the peak location refinement algorithms, lies in the interior of the array, and furthermore that the subarray centered on the peak required for both peak location interpolation algorithms lies within the correlation array (Sec. 2.4.3).

Pearson cross correlation
PCC describes how well two images match each other. 13 The PCC is calculated as

$$\gamma(u, v) = \frac{\sum_{x,y} \left[f(x, y) - \bar{f}_{u,v}\right]\left[t(x - u, y - v) - \bar{t}\right]}{\sqrt{\sum_{x,y} \left[f(x, y) - \bar{f}_{u,v}\right]^2 \sum_{x,y} \left[t(x - u, y - v) - \bar{t}\right]^2}}, \tag{4}$$

where $(u, v)$ indicates the $(x, y)$ direction integer shift of the resampled subset of GOES-R image, $\gamma(u, v)$ is the PCC at the shifted location, $f(x, y)$ is the pixel value of the fixed image (either the resampled Landsat chip or another resampled subset of GOES-R image) at location $x$ and $y$, and $t(x - u, y - v)$ is the float image (another GOES-R subset) value at pixel location $x - u$ and $y - v$, $\bar{f}_{u,v}$ is the mean pixel value of the fixed image in the region overlapping with the float image, and $\bar{t}$ is the mean pixel value of the float image.
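The construction of the correlation array from PCC values can be illustrated with the following sketch. It assumes the fixed image exceeds the float image by exactly `2 * max_shift` pixels in each direction, so that the array center corresponds to zero shift. The names `pcc`, `correlation_array`, and `max_shift` are hypothetical, and no padding for peak refinement is included.

```python
import math


def pcc(fixed_patch, float_img):
    """Pearson correlation of two equally sized 2-D lists (0.0 if either is constant)."""
    f = [v for row in fixed_patch for v in row]
    t = [v for row in float_img for v in row]
    f_mean = sum(f) / len(f)
    t_mean = sum(t) / len(t)
    num = sum((a - f_mean) * (b - t_mean) for a, b in zip(f, t))
    den = math.sqrt(sum((a - f_mean) ** 2 for a in f) *
                    sum((b - t_mean) ** 2 for b in t))
    return num / den if den > 0 else 0.0


def correlation_array(fixed, float_img, max_shift):
    """PCC at every integer shift in [-max_shift, +max_shift] in both directions.

    Element [dv][du] corresponds to a shift of (du - max_shift) columns and
    (dv - max_shift) rows, so the array size is odd and its center is zero shift.
    """
    height, width = len(float_img), len(float_img[0])
    size = 2 * max_shift + 1
    corr = []
    for dv in range(size):
        corr_row = []
        for du in range(size):
            # Sub-array of the fixed image that the shifted float image overlaps.
            patch = [row[du:du + width] for row in fixed[dv:dv + height]]
            corr_row.append(pcc(patch, float_img))
        corr.append(corr_row)
    return corr
```

The location of the largest element of the returned array is the raw (integer) correlation peak that the peak refinement step would start from.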

Normalized mutual information
The mutual information (MI) of two images describes the amount of information about one image contained in the other image and vice versa. 14 The MI is calculated as

$$MI(u, v) = \frac{H[f(x, y)] + H[t(x - u, y - v)]}{H[f(x, y), t(x - u, y - v)]}, \tag{5}$$

where $H[f(x, y)]$ and $H[t(x - u, y - v)]$ are the Shannon entropies of the fixed and float images and $H[f(x, y), t(x - u, y - v)]$ is their joint entropy. When the two images are identical, we have

$$H[f(x, y), t(x - u, y - v)] = H[f(x, y)] = H[t(x - u, y - v)]. \tag{6}$$

When the two images are totally independent, we have

$$H[f(x, y), t(x - u, y - v)] = H[f(x, y)] + H[t(x - u, y - v)]. \tag{7}$$

Therefore, the range of MI values is [1, 2]. To shift the range to [0, 1], we define the NMI as

$$NMI(u, v) = MI(u, v) - 1. \tag{8}$$

To calculate the Shannon entropies for the distribution of radiance values in a single image and for the joint distribution of radiance values for a pair of images, we linearly divide the radiance range between mean plus and minus 3σ of the image radiance into 256 bins. The pixels with radiance values beyond 3σ are put in the bins at the two ends. The Shannon entropy of one image $A$ is then defined as follows:

$$H(A) = -\sum_{i=1}^{256} p_A(i) \log[p_A(i)], \tag{9}$$

where $p_A(i)$ is the fraction of radiance values in the $i$'th radiance bin. Similarly, the Shannon joint entropy of the images $A$ and $B$ is defined as follows:

$$H(A, B) = -\sum_{i=1}^{256} \sum_{j=1}^{256} p_{AB}(i, j) \log[p_{AB}(i, j)]. \tag{10}$$
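The binning and entropy steps can be sketched as follows. This example assumes that MI is the ratio of the summed marginal entropies to the joint entropy (consistent with the stated range of [1, 2]) and that NMI is obtained by subtracting 1; both are inferences from the text rather than confirmed IPATS formulas. The function names and the handling of constant images are likewise hypothetical.

```python
import math
from collections import Counter


def bin_indices(values, nbins=256):
    """Map values to bins spanning mean +/- 3 sigma; outliers go to the end bins."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sigma == 0:
        return [0] * n  # constant image: everything falls in one bin
    low = mean - 3 * sigma
    width = 6 * sigma / nbins
    return [min(nbins - 1, max(0, int((v - low) / width))) for v in values]


def entropy(labels):
    """Shannon entropy (natural log) of a list of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(fixed, floating, nbins=256):
    """Assumed NMI = [H(A) + H(B)] / H(A, B) - 1 for two equally sized 2-D lists."""
    a = bin_indices([v for row in fixed for v in row], nbins)
    b = bin_indices([v for row in floating for v in row], nbins)
    h_joint = entropy(list(zip(a, b)))
    if h_joint == 0:
        return 0.0  # both images constant: treated here as carrying no information
    return (entropy(a) + entropy(b)) / h_joint - 1.0
```

Under these assumptions, an image compared with itself gives a value of 1, and two statistically independent images give a value near 0.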

Peak interpolation
The correlation array generated in the last step has the same spatial resolution as the resampled subsets. To refine the correlation peak location beyond the limit of this spatial resolution, IPATS provides two different algorithms, parabolic and centroiding.
Parabolic interpolation. Parabolic interpolation refers to uniquely fitting a parabola in one direction, either X or Y, to the correlation array. The maximum of the fitted parabola, rather than the maximum of the correlation array, is the final peak location, effectively facilitating subpixel resolution in locating the correlation maximum. The location of the peak value in the correlation array serves as the middle point for each interpolation, with one point to the left and right considered for the x-direction interpolation, and one point above and below considered for the y-direction interpolation. Assume $(x_1, x_2, x_3)$ denote the consecutive integer x index values centered on the correlation array peak located at $(x_2, y_2)$ and the corresponding correlation values are $(z_{1x}, z_{2x}, z_{3x})$. Then, the interpolated peak location in the x direction is calculated as follows:

$$x_{pk} = x_2 + \frac{z_{1x} - z_{3x}}{2\,(z_{1x} - 2 z_{2x} + z_{3x})}. \tag{11}$$

The calculation of the interpolated peak location in the Y direction is the same as in the X direction. The interpolated peak correlation value is

[Eq. (12)]

where

[Eq. (13)]

The calculation of $a_y$ is the same as $a_x$. The peak sharpness in the X direction is calculated as

[Eq. (14)]

where $a_x$ is calculated in Eq. (13). The calculation of the peak sharpness in the Y direction is the same as in the X direction.
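The three-point parabolic refinement can be sketched as below. The vertex formula is the standard result for a parabola through three equally spaced samples; the function names, the restriction of the peak search to interior elements, and the fallback to zero offset when the three points do not form a strict maximum are illustrative assumptions.

```python
def parabolic_offset(z1, z2, z3):
    """Offset of the vertex of the parabola through (-1, z1), (0, z2), (1, z3)."""
    curvature = z1 - 2 * z2 + z3
    if curvature >= 0:
        return 0.0  # not a strict maximum: keep the integer location
    return 0.5 * (z1 - z3) / curvature


def refine_peak(corr):
    """Return the subpixel peak (x_pk, y_pk) of a 2-D correlation array.

    x indexes columns and y indexes rows. The integer peak is searched among
    interior elements only, so both neighbors exist in each direction.
    """
    rows, cols = len(corr), len(corr[0])
    interior = [(r, c) for r in range(1, rows - 1) for c in range(1, cols - 1)]
    r2, c2 = max(interior, key=lambda rc: corr[rc[0]][rc[1]])
    dx = parabolic_offset(corr[r2][c2 - 1], corr[r2][c2], corr[r2][c2 + 1])
    dy = parabolic_offset(corr[r2 - 1][c2], corr[r2][c2], corr[r2 + 1][c2])
    return c2 + dx, r2 + dy
```

For a correlation surface that is exactly quadratic and separable in x and y, this procedure recovers the true peak location; for real correlation arrays it is only an approximation.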
Centroiding interpolation. The centroid is calculated over the correlation array with a user selectable square window $W$ centered on the array peak. The centroid of the peak location, rather than the maximum of the correlation array, is the final peak location. The size of the correlation array is always odd, with the center corresponding to array peak location. The centroid in the x directions can be calculated as follows:

$$x_{pk} = \frac{\sum_{(x, y) \in W} x \, z(x, y)}{\sum_{(x, y) \in W} z(x, y)}, \tag{15}$$

where $x_{pk}$ is the refined peak location in the x direction, $z(x, y)$ is the correlation value at location $(x, y)$. The calculation of $y_{pk}$ is same as $x_{pk}$.
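A centroid refinement over a square window can be sketched as follows. The parameter `half_width` (window size `2 * half_width + 1`) and the function name are hypothetical, and the sketch assumes the correlation values inside the window are non-negative with a positive sum.

```python
def centroid_peak(corr, half_width=1):
    """Correlation-weighted centroid (x_pk, y_pk) in a window around the array peak.

    x indexes columns and y indexes rows. The integer peak is searched only
    where the full window fits inside the correlation array.
    """
    rows, cols = len(corr), len(corr[0])
    hw = half_width
    candidates = [(r, c) for r in range(hw, rows - hw)
                  for c in range(hw, cols - hw)]
    r0, c0 = max(candidates, key=lambda rc: corr[rc[0]][rc[1]])
    total = sum_x = sum_y = 0.0
    for r in range(r0 - hw, r0 + hw + 1):
        for c in range(c0 - hw, c0 + hw + 1):
            z = corr[r][c]
            total += z
            sum_x += c * z
            sum_y += r * z
    return sum_x / total, sum_y / total
```

A peak that is symmetric within the window returns the integer peak location, while excess correlation on one side pulls the centroid toward that side.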

Screening of IPATS Results
The accuracy of the IPATS measurements depends on the characteristics of the image pair.
In addition to the differences between the ABI and Landsat sensors, other factors, such as cloud coverage, seasonality, and image acquisition time, lead to additional registration errors between Landsat chips and the corresponding GOES-R subsets. The measured INR errors due to these factors could be as large as several GOES-R pixels, and so it is necessary to remove such poor-quality measurements to obtain a better estimate of the true INR errors.

Sun-view geometry screening
The solar and viewing geometry is an important factor of remotely sensed images, especially for the reflective channels. 16,17 The solar zenith angle (SZA) is the solar angle at the observation time as measured downward from the local vertical of an observed point on the Earth. It determines the intensity of the incident radiance on the ground surface. The signal level of reflective channels images decreases when SZA increases. IPATS applies an SZA of 75 deg as the maximum threshold and removes unscreened IPATS results with SZA above this value because the low image contrast and significant shadow effects cause INR assessments of such images to be of poor quality. The view zenith angle (VZA) is the angle to the GOES satellite as measured downward from the local vertical of a point observed on the Earth. It affects the quality of remotely sensed images in a different way from SZA. The pixel footprint increases as VZA increases due to the shape of the Earth. The actual pixel footprint at the edge of the globe is about eight times coarser than at nadir. 15 NAV is impacted most by VZA because the dimensions of the ABI image subsets are limited by the size of the Landsat chips. Typically, one Landsat chip covers about 80 × 80 GOES-R pixels at nadir. The dimension of coverage drops to about 10 × 10 GOES pixels at locations close to the limb of the Earth. The detailed spatial features in the original Landsat chips appear in only a few GOES-R pixels and are difficult to distinguish from noise in the images. Image correlation results between such image subset pairs are not reliable. Therefore, a VZA threshold of 70 deg is applied in screening NAV measurements.
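The geometry screening amounts to two threshold tests, sketched below. The thresholds of 75 deg (SZA) and 70 deg (VZA, NAV only) come from the text; treating the thresholds as inclusive, applying the SZA test to every channel, and the names `passes_geometry` and `metric` are assumptions made for illustration.

```python
SZA_MAX_DEG = 75.0  # maximum solar zenith angle stated in the text
VZA_MAX_DEG = 70.0  # maximum view zenith angle, applied to NAV measurements


def passes_geometry(sza_deg, vza_deg, metric="NAV"):
    """Return True if a measurement survives the sun-view geometry screening."""
    if sza_deg > SZA_MAX_DEG:
        return False  # low contrast and strong shadows
    if metric == "NAV" and vza_deg > VZA_MAX_DEG:
        return False  # Landsat chip covers too few GOES-R pixels near the limb
    return True
```

For example, a NAV measurement at an SZA of 60 deg and a VZA of 72 deg would be rejected, whereas the same geometry would pass for a metric to which the VZA test is not applied.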

Analytic measurement uncertainty screening
De Luccia et al. 7 introduced a parameter, called "analytic measurement uncertainty (aMU)," that quantitatively describes the quality of image registration. The aMU value is high when two images have very different scene content, which would typically result in unreliable image registration results. Dr. De Luccia then revised the original aMU equation and developed a second version of aMU, called aMU2, to provide an improved indicator of measurement quality:

[Eq. (16)]

where $aMU2_x$ is the measurement uncertainty in the x direction, $PkSh_x$ is the peak sharpness in the X direction [Eq. (14)], $z_{pk}$ is the refined PCC (refer to Sec.

[Eq. (19)]

where $f(x, y)$ and $t(x, y)$ are the fixed and float subsets, $\bar{f}$ and $\bar{t}$ are the mean values of the two subsets, and $\sigma_f$ and $\sigma_t$ are the standard deviations (STDs) of the two subsets.
Note that neither aMU2 nor aMU uses the misregistration measurement as an input; in other words, aMU2 does not reject misregistration measurements based on their magnitudes. Peak sharpness is calculated in the EW and NS directions separately. Therefore, aMU2 is also evaluated independently in each direction. Figure 3 shows the scatter in unscreened IPATS measurements versus PCC coefficient, peak sharpness of PCC, and 1/aMU2 in the EW direction. The plot is from GOES-16 channel 2 data acquired on April 11, 2019. The variation of measurements decreases quickly with increasing 1/aMU2 values, which indicates that aMU2 represents the measurement quality very well. On the other hand, there is still considerable scatter in the IPATS measurements when the PCC coefficient and peak sharpness are high. The plots show that aMU2 is more effective at discriminating between likely invalid measurements and likely valid measurements than screening with PCC coefficient or peak sharpness alone. Currently for NAV, the aMU2 threshold is set manually, based on visual examination, to 0.357 (1/aMU2 = 2.8) for all ABI channels, which is marked as the vertical line in Fig. 3(a). Measurements with an aMU2 value in the EW direction larger than the threshold are removed. A similar screening process is also applied in the NS direction.

Statistics-based screening
Most of the poor-quality NAV measurements are removed by the aMU2 screening. However, a few significant outliers occasionally pass the aMU2 screening. Therefore, a median absolute deviation from the median (MAD) screening is applied to clean up the remaining outliers. First, the MAD is calculated from all INR measurements that passed the aMU2 screening in a 24-h period. An INR measurement is then removed by the MAD screening when its absolute deviation from the median of the INR measurements in the 24-h period is larger than the MAD value times a user-specified factor, currently set to nine. Occasionally, the MAD screening masks short-term abnormal situations by removing legitimate measurements with large INR error. For example, in a 24-h period, if a small number of images have significant (and real) INR error, most of the INR measurements in those images are removed by the MAD screening because of the out-of-family large INR error readings. The remaining INR measurements are actually low-quality measurements, although they pass the aMU2 screening, because they do not represent the real INR error of these scenes. To correct the overshoot of the MAD screening, a scene is marked as a "real significant error" scene if more than 50% of measurements from this scene are removed by the MAD screening. The measurement outliers of such a scene are determined not by the MAD but by the STD of all the measurements of this scene. A measurement in such a scene is marked as an outlier and removed if it is greater than three times the STD of this scene. This process is named short-term abnormal detection (STAND). 18 The STAND screening is currently applied to ABI NAV only and is also planned to be applied to other metrics.
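The MAD screening and the STAND correction can be sketched as follows. The factor of nine, the 50% rule, and the three-STD rule come from the text. Measuring the STAND deviation from the scene mean, using the population STD, and the function names are assumptions, since the text does not specify them.

```python
from statistics import mean, median, pstdev


def mad_screen(values, factor=9.0):
    """Keep-mask for one 24-h set of measurements: |v - median| <= factor * MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [abs(v - med) <= factor * mad for v in values]


def stand_screen(scene_values, scene_mad_mask, sigma_factor=3.0):
    """Re-screen one scene if the MAD screening removed more than half of it.

    scene_mad_mask holds the MAD keep/remove decisions for this scene's
    measurements. For a flagged scene, outliers are instead judged against the
    scene's own mean and STD (deviation from the mean is an assumption).
    """
    removed = scene_mad_mask.count(False)
    if removed <= 0.5 * len(scene_values):
        return list(scene_mad_mask)  # ordinary scene: keep the MAD decisions
    center = mean(scene_values)
    std = pstdev(scene_values)
    return [abs(v - center) <= sigma_factor * std for v in scene_values]
```

In this sketch, a scene whose measurements all sit consistently far from the 24-h median is first rejected wholesale by `mad_screen` and then restored by `stand_screen`, which mirrors the behavior described for "real significant error" scenes.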

Assessment of Measurement Error
where $\epsilon_{INR}$ is the IPATS measured INR error, $\epsilon_{INR\_intrinsic}$ is the INR intrinsic error of the ABI system, and $\epsilon_{ME}$ is the measurement error due to uncertainty resulting from the IPATS algorithms.
To estimate $\epsilon_{ME}$ and determine the optimal configuration parameters to minimize $\epsilon_{ME}$, we ran IPATS with 136 Landsat chips to process test images with known, intentionally induced INR errors. The induced errors are up to one GOES-R pixel in all four directions (East, West, South, and North). The root mean square error (RMSE) of the 136 measured errors against the induced errors in the EW and NS directions is taken as the measurement error $\epsilon_{ME}$.
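The RMSE of measured against induced errors can be computed as in the following sketch, applied separately to the EW and NS directions. The function name `rmse` and the list-based inputs are illustrative assumptions.

```python
import math


def rmse(measured, induced):
    """Root mean square error of measured shifts against known induced shifts."""
    if not measured or len(measured) != len(induced):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - i) ** 2 for m, i in zip(measured, induced)]
    return math.sqrt(sum(squared) / len(squared))
```

In the test described above, `measured` would hold the 136 per-chip IPATS measurements for one induced shift and `induced` the corresponding known shift repeated for each chip.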
Stair-step error is one important measurement error introduced by the image registration algorithm. 18 The name "stair-step error" comes from the fact that the measured INR error plotted against the true INR error resembles a stair-step shape (see Fig. 1.1-1 in Ref. 19). The difference between measured and true error is therefore oscillatory, and the frequency and the amplitude of the oscillation depend on the spatial resolution at which the image correlation is performed. As mentioned, the spatial resolution for correlation is determined by the SPF parameter in the IPATS configuration. The theoretical stair-step error is 0 when the intrinsic error is zero. 18,19 However, it is observed that the $\epsilon_{ME}$ of the 136 measured errors is close to but not exactly 0 in this test when the intrinsic error is zero in both directions (Fig. 4). This is due to other error sources, e.g., an imperfect resampling algorithm, a possible true minor mismatch between Landsat chips and ABI subsets, and insufficient spatial/spectral information in the coarse resolution image. The $\epsilon_{ME}$ for the zero-shift case increases with increasing SPF. This indicates that measurement error increases when interpolating to finer and finer resolution.

Fig. 4 The relationship between the RMSE of measured errors and SPF. There is no induced error in the test.
The period of the stair-step error, equal to the size of the target resolution of the image registration, and the amplitude of the RMSE depend on the SPF value. 18,19 The stair-step error emerges when the intrinsic error is not zero. Table 2 shows the maximum RMSE for different SPF values. In contrast to the situation of zero intrinsic error, the maximum RMSE decreases with increasing SPF when the intrinsic error exists. This means the stair-step error is a more significant measurement error source than the error introduced by other sources. At the GOES resolution (SPF = 1), the maximum RMSE is about 0.19 pixels. There is a significant drop, from 0.19 to 0.06 pixels, with a change of SPF from 1 to 2. It then drops slowly from 0.06 to 0.02 pixels when the SPF increases from 2 to 12. The measurement quality improves only slightly, but the computation time increases significantly because the image registration algorithm has $O(\mathrm{SPF}^2)$ time complexity. After examining the tradeoff between the accuracy and the computation cost, the SPF was set to 2 for NAV, CCR, and FFR in the IPATS baseline configuration. The measurement error is only 1% of the pixel size when the intrinsic error is close to 0 (Fig. 4). This is sufficient for assessing the NAV accuracy. The intrinsic error of the GOES-16 and GOES-17 ABIs has been close to 0 since they became operational. The measurement error could drop from 1% of the pixel size to 0.5% if the SPF value were switched from 2 to 1. However, the intrinsic error has often increased unexpectedly for various reasons, which we will discuss later in this paper. In such a case, the measurement error increases significantly as the stair-step error emerges. IPATS is designed not only to monitor the normal operation status but also to detect and assess abnormal INR errors. Therefore, the optimal SPF value is set to 2 for NAV, CCR, and FFR throughout the mission.
The selection of the optimal algorithms and parameters for SSR, not shown here, is independent of the selection for the other metrics because of the small evaluation window over specially tasked MESO image pairs. For SSR, the optimal SPF value is set to 3.

IPATS Baseline Configuration for ABI
IPATS provides multiple algorithms for each processing step. In monitoring ABI INR performance, a set of IPATS algorithms and user-configurable parameters were selected as the baseline configuration for optimal monitoring results. The selection of the at-launch algorithms and parameters was based on simulated ABI data and surrogate Advanced Himawari Imager (AHI) data, and these algorithms and parameters were later refined with on-orbit GOES-16 data during the postlaunch test (PLT) phase. Table 3 shows the algorithm selections and parameter values for preprocessing, image registration, and screening IPATS results. For the metrics generated in the operational mode, including NAV, CCR, and FFR, the algorithm selections of preprocessing and image registration are the same. The primary difference is in screening results. The VZA screening is not applied in CCR and FFR because two ABI images are compared in these two metrics, and they do not suffer the small subset dimension issue when the VZA is large (Sec. 2.5.1). The STAND screening is not applied in CCR and FFR also, but we plan to include it for these two metrics in the future. The aMU2 threshold value for NAV is uniformly set as 0.357 for all channels. However, the aMU2 thresholds for CCR and FFR vary by channel pairs or channels. For SSR, there is no Sobel edge enhancement applied in preprocessing, and only SZA and aMU2 are applied in screening SSR results because they are sufficient to remove poor quality measurements and meet the accuracy requirement.

Image Navigation and Registration Results
IPATS has been used to assess the INR performance of the GOES-16 and GOES-17 ABIs since the start of their respective PLT phases. NAV, CCR, FFR, and WIFR are measured continuously on all MESO, CONUS, and FD images. SSR requires specially commanded MESO image pairs in which two MESO images overlap with a small offset in the NS direction. Therefore, SSR is measured only during PLT, when the special MESO image pairs can be commanded. IPATS produces three levels of assessment reports: 24-h statistics, scene statistics, and single location assessments. The 24-h statistics are computed from 18:00 UTC to 17:59 UTC the next day. In the 24-h and scene assessment reports, the statistics of the measured and screened EW and NS errors, including the mean, STD, minimum, maximum, and metric (absolute value of the mean plus 3 STD) values, are reported for each measured channel or channel pair together with the number of samples. Reports of single location measurements include the measured errors, measurement location, channel number, measurement time, and aMU2 values. The assessment reports are produced for MESO, CONUS, and FD images separately. In this section, FD reports are presented to demonstrate ABI INR performance.
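The report statistics described above can be sketched as follows. This is a minimal illustration, not the IPATS implementation: the function and class names are hypothetical, timestamps are assumed to be naive datetimes in UTC, and the sample standard deviation is assumed because the text does not say which estimator is used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, stdev


@dataclass
class ErrorStats:
    n: int
    mean: float
    std: float
    minimum: float
    maximum: float
    metric: float  # |mean| + 3 * STD


def summarize(errors_urad):
    """Statistics of screened EW or NS errors for one channel or channel pair."""
    if len(errors_urad) < 2:
        raise ValueError("need at least two measurements to compute an STD")
    m = mean(errors_urad)
    s = stdev(errors_urad)  # assumption: sample STD
    return ErrorStats(len(errors_urad), m, s, min(errors_urad),
                      max(errors_urad), abs(m) + 3 * s)


def report_window_start(ts: datetime) -> datetime:
    """Start (18:00 UTC) of the 24-h reporting window that contains ts."""
    start = ts.replace(hour=18, minute=0, second=0, microsecond=0)
    return start if ts >= start else start - timedelta(days=1)


stats = summarize([1.0, 2.0, 3.0])  # synthetic errors in microradians
print(stats.metric)
print(report_window_start(datetime(2019, 10, 28, 17, 59)))
```

A measurement taken at 17:59 UTC on October 28, 2019 therefore falls in the window that starts at 18:00 UTC on October 27, 2019.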
The 24-h statistics report is suited to monitoring long-term trends and diagnosing small-amplitude but long-lasting disturbances. The scene statistics report is usually used to assess short-term abnormal INR performance. The single location measurements report is a resource for the analysis of intrascene anomalies.
[...] spacecraft's Earth orientation calculations. The missing term drifts slowly over time and led to the gradual increase in the EW error that was observed. Figure 6 shows the long-term NAV trend of GOES-17. The NAV errors dropped to around 1 μrad in the VNIR channels and around 2 μrad in the IR channels after the INR updates in July 2018. The EW errors have increased by about 0.5 μrad in the VNIR channels and 2 μrad in the MWIR/LWIR channels since the yaw flip on September 9, 2019. This indicates that there is room to fine-tune the INR parameters to restore the NAV accuracy to the pre-yaw-flip level.
It took 3 months for GOES-17 to reach a NAV accuracy of about 1 to 2 μrad and 11 months for GOES-16 to achieve a similar improvement. The faster improvement of GOES-17 is due to the lessons learned from GOES-16. Currently, the NAV errors of GOES-17 are about 1 μrad larger than those of GOES-16.

Long-Term CCR Record
The CCR performance of both ABIs is presented in Figs. 7 and 8. The channel 1 and 2 pair and the channel 7 and 13 pair represent the visible and IR channel pairs, respectively. For the GOES-16 ABI, the CCR of the visible channels achieved excellent accuracy, with both mean and σ within 1 μrad, within a month of the beginning of PLT. The CCR of the IR channels improved significantly from the start of PLT to provisional status and then to operational status: the mean error dropped from more than 10 μrad to around 1 μrad and then to less than 0.5 μrad. The STD of the error also dropped from more than 5 μrad to less than 0.5 μrad.
The CCR performance of the GOES-17 ABI improved to <0.5 μrad in about 5 months after the start of PLT, which is much faster than for the GOES-16 ABI. The INR algorithm and parameters were continually updated after operational status was reached. The impact of the updates on CCR is shown clearly in Fig. 8. The latest CCR change occurred after the yaw flip on September 9, 2019, and the specific cause is being investigated, as is the prior jump near the end of July 2019. The variation of CCR between channels 1 and 2 increased in the NS direction.
For both ABIs, the CCR performance in the NS direction has been consistently better than in the EW direction since the start of PLT.

Long-Term FFR Record
The mean of the 24-h FFR assessments of both ABIs has been close to zero since the start of PLT. The variation of FFR, represented by the STD of the 24-h FFR measurements, dropped through PLT. The STD of both ABIs improved to about 1 μrad in the VNIR channels and 2.5 μrad in the IR channels after reaching operational status. GOES-17 has slightly better FFR performance in the VNIR channels than GOES-16 (Figs. 9 and 10).
Tables 4 and 5 show the 24-h INR performance of both GOES-R ABIs from 18:00 UTC October 27 to 17:59 UTC October 28, 2019. This day was selected randomly. The INR performance of both ABIs has been stable since they were relocated to their operational locations (Figs. 5 and 6), so these 24-h statistics are representative of the current ABI INR status. The water vapor-sensitive channels (channels 4, 8, 9, and 10) are not included in these metrics because these channels capture little ground surface information. Ground surface information, especially land and water boundaries, provides the key features for the IPATS image registration algorithms in the context of GOES-R INR.

INR Performance Summary
The requirement for each INR metric is expressed as a 99.73% (3σ) absolute error (Tables 4 and 5). 20 The worst-performing channel (for NAV and FFR) and channel pair (for CCR) are listed in Tables 4 and 5. The performance is calculated as the absolute value of the mean plus 3 STD. All metrics, except direct CCR between VNIR and MWIR/LWIR, are well within the mission requirements.
For this 24-h time period, GOES-16 has slightly better NAV performance, especially in the EW direction (15.0 versus 18.1 μrad). GOES-17 results are better for FFR and CCR of the VNIR channels, but about 1.5 μrad higher than GOES-16 for CCR of the MWIR/LWIR channels in both directions. The CCR of VNIR and MWIR/LWIR being larger than the requirements does not necessarily indicate a truly large CCR misregistration but may be due to the high measurement error between VNIR and MWIR/LWIR channels, which results from the significant difference in spectral response. An alternative method that reduces the CCR measurement error is to apply one or more bridging channels to estimate CCR indirectly using the transitive property. Equation (21) shows how the mean indirect, or bridged, CCR between channels A and D is calculated from the means of the direct CCR of channels A, B, C, and D:
$$\mathrm{CCR}_{\mathrm{bridged\_A\_D}} = \mathrm{CCR}_{\mathrm{A\_B}} + \mathrm{CCR}_{\mathrm{B\_C}} + \mathrm{CCR}_{\mathrm{C\_D}}, \tag{21}$$
where CCR bridged_A_D is the mean bridged CCR between A and D, and CCR A_B , CCR B_C , and CCR C_D are the means of the direct CCR measurements. The variation of the bridged CCR, measured by the STD, is calculated as
$$\mathrm{STD}_{\mathrm{bridged\_A\_D}} = \sqrt{\mathrm{STD}_{\mathrm{A\_B}}^{2} + \mathrm{STD}_{\mathrm{B\_C}}^{2} + \mathrm{STD}_{\mathrm{C\_D}}^{2}}, \tag{22}$$
where STD bridged_A_D is the STD of the bridged CCR, and STD A_B , etc., are the STDs of the direct CCR measurements. The transitive path chosen for the bridged CCR estimate for a given channel pair is the path with no more than two intermediary channels that minimizes the bridged STD, where each step along the path is required to have at least 1000 valid CCR measurements. Determining and documenting a permanent set of bridging paths is a candidate for future study. Table 5 shows the assessments of CCR performance using this bridging approach. The CCR performance estimated with the bridging method improves on that of the direct CCR measurements.
The most significant improvement is for the CCR of the VNIR to MWIR and LWIR channels, where the values drop from over 30 μrad to about 10 μrad and from about 15 μrad to about 10 μrad in the EW and NS directions, respectively. The bridged CCR of the VNIR to MWIR and LWIR channels of the GOES-17 ABI is slightly outside the mission requirements in the NS direction. This is due to the poor performance of channel 16 of the GOES-17 ABI. 21 When channel 16 of GOES-17 is excluded, the bridged CCR of VNIR-MWIR/LWIR is 11.0 and 8.1 μrad in EW and NS, respectively, and all values are then within the mission requirements.
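The bridging procedure described above can be sketched in a few lines. The sketch assumes that the bridged mean is the sum of the direct means along the path and that the bridged STD is the root-sum-square of the direct STDs, which treats the direct measurements as independent. It also assumes that a direct CCR measured in the reverse order has the opposite sign, and that the direct pair itself (zero intermediary channels) is a valid candidate. All function names and numbers are hypothetical.

```python
import math
from itertools import permutations


def step(direct, i, j):
    """Direct (mean, std, count) from channel i to j, or None if unavailable."""
    if (i, j) in direct:
        return direct[(i, j)]
    if (j, i) in direct:  # assumption: reversing the pair flips the sign of the mean
        m, s, n = direct[(j, i)]
        return (-m, s, n)
    return None


def bridge(steps):
    """Bridged mean (sum of means) and STD (root-sum-square of STDs)."""
    mean = sum(m for m, _, _ in steps)
    std = math.sqrt(sum(s * s for _, s, _ in steps))
    return mean, std


def best_bridged_path(direct, a, d, min_count=1000, max_intermediaries=2):
    """Path from a to d with the smallest bridged STD, as (path, mean, std)."""
    channels = {c for pair in direct for c in pair} - {a, d}
    best = None
    for k in range(max_intermediaries + 1):
        for mids in permutations(sorted(channels), k):
            path = (a, *mids, d)
            steps = [step(direct, i, j) for i, j in zip(path, path[1:])]
            if any(s is None or s[2] < min_count for s in steps):
                continue  # missing pair or too few valid measurements
            mean, std = bridge(steps)
            if best is None or std < best[2]:
                best = (path, mean, std)
    return best


# Synthetic direct CCR statistics: (mean in urad, STD in urad, valid count).
direct = {
    (2, 7): (5.0, 10.0, 5000),
    (2, 5): (1.0, 3.0, 4000),
    (5, 7): (2.0, 4.0, 3000),
    (2, 3): (0.5, 1.0, 500),   # rejected: fewer than 1000 measurements
    (3, 7): (1.0, 1.0, 6000),
}
print(best_bridged_path(direct, 2, 7))
```

With these synthetic inputs, the path through channel 5 is selected because its bridged STD (5 μrad) is smaller than the direct STD (10 μrad), while the path through channel 3 is rejected for having too few valid measurements on one step.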

Anomaly Detection and In-Depth Analysis
Besides assessing the overall INR performance, IPATS also played an important role in anomaly detection and in-depth analysis. The following two sections show examples of each.

NAV during eclipse
During the eclipse season, fast thermal deformations in the sensor around penumbral times lead to abnormally large image navigation errors for a short time period. 22
For GOES-17, the impact of the eclipse is similar to that for GOES-16 but with larger amplitude. At first, the EW error changed from about −1 to 12 μrad around 8:20 UTC and then swung in the opposite direction, to about −10 μrad. The error gradually shifted from −10 to −5 μrad from 8:40 to 9:30 UTC, followed by another excursion to −20 μrad. Then, the EW error gradually returned to its normal level, about −1 μrad, over ∼1 h. As with GOES-16, the eclipse impact in the NS direction is not significant. In general, the eclipse effect is more significant and lasts longer on GOES-17 than on GOES-16.
The observed NAV abnormality during the eclipse is due not only to the thermal contraction of the hardware but also to how the software, in particular the Kalman filter, handles the situation. The eclipse transients have been reduced since the beginning of PLT through INR algorithm and tuning improvements (not discussed here). There should be room to further minimize the eclipse transient, at least for GOES-17.

INR measurements for in-depth analysis
In addition to directly assessing INR accuracy, the measured INR errors are also useful for evaluating and understanding more in-depth information, such as focal plane misalignment, scan encoder misalignment, etc. Figures 13 and 14 show the linear least-squares fits between the NAV errors and the image location of the measurements for GOES-16 and GOES-17, respectively.
For GOES-16, the mild slope of the lines in Figs. 13(a) and 13(d) indicates a scale error in the EW and NS directions, respectively. On average, the measured EW errors at the east and west boundaries of an LWIR scene differ by 2.5 μrad. Similarly, the measured NS errors at the north and south boundaries of an LWIR scene differ by 1.5 μrad. Systematic offsets are also observed in both directions, as the lines in Figs. 13(a) and 13(d) all lie on one side of the zero-error line. This is consistent with the conclusions of the 24-h NAV assessment reports. The dependency of the NS errors on the EW image location is negligible [Fig. 13(b)]. However, the EW errors are about −1.2 and 1.7 μrad at the north and south boundaries of an LWIR scene.
The scale error of the GOES-17 ABI in both the EW and NS directions is more significant than that of the GOES-16 ABI. On average, the measured EW errors at the east and west boundaries of an LWIR scene differ by 3.8 μrad [Fig. 14(a)]. The measured NS errors at the north and south boundaries of an LWIR scene differ by 10.3 μrad [Fig. 14(d)]. Systematic offsets are also observed in both directions [Figs. 14(a) and 14(d)]. The dependency of the NS errors on the EW image location is weak [Fig. 14(b)]: the NS error difference is <0.5 μrad between the east and west boundaries of an LWIR scene. However, the EW error difference is about 12.1 μrad between the north and south boundaries of an LWIR scene [Fig. 14(c)]. There are two possible reasons for this effect: the first is a potential orthogonality issue, and the second is the impact of the EW scale error [Fig. 14(a)] due to the unbalanced spatial distribution of the GOES-17 Landsat chips [Fig. 2(b)].
A thorough analysis is needed to confirm which reason, or the combination of the two reasons, led to this effect. Channel 16 of GOES-17 is not included due to poor performance. 20 We have repeated similar analysis for GOES-17 data acquired in June and September 2019 (not shown) and the same effect was observed with minor variations. Further study is needed to determine the causes of these phenomena.
Fig. 13 The linear least-squares fit between NAV errors and the image location of the measurement. The statistics are based on the LWIR data of GOES-16 from April 1 to 30, 2019. Besides the individual measured LWIR channels (channels 12 to 16), the average of all measured LWIR channels is also plotted and marked as "LWIR" in the plots.
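The location-dependence analysis above rests on an ordinary linear least-squares fit of the NAV error against the image location. The sketch below shows one way to obtain the slope (interpreted in the text as a scale error), the intercept (a systematic offset), and the resulting error difference between two scene boundaries. The data are synthetic, the location units are arbitrary, and the function names are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length, at least 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def boundary_difference(slope, x_min, x_max):
    """Fitted error difference between the two boundaries of a scene."""
    return slope * (x_max - x_min)


# Synthetic example: EW location (arbitrary units) versus EW error (urad).
locations = [-8.0, -4.0, 0.0, 4.0, 8.0]
errors = [-1.0, 0.0, 1.0, 2.0, 3.0]

slope, intercept = linear_fit(locations, errors)
print(slope, intercept, boundary_difference(slope, -8.0, 8.0))
```

In this synthetic case, the nonzero intercept corresponds to a systematic offset, and the nonzero slope produces the difference between the fitted errors at the two boundaries.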

Summary
IPATS has been used to continuously monitor ABI INR performance since the start of GOES-16 PLT. The INR metrics produced by IPATS have been used to help tune both ABI INR systems to achieve excellent INR accuracy. IPATS is not a static system. The algorithm processing modules and the module-specific configuration parameters are all user selectable. The processing sequence and the postprocessing screenings are customizable for each metric. Additional screenings and subprocedures were developed as the need emerged; e.g., the STAND and VZA screening approaches were developed during the PLT of GOES-16 and GOES-17, respectively. The INR accuracy of the GOES-R ABIs improved with updates and tuning during PLT. The IPATS measurements show that, in general, both GOES-R ABIs are in compliance with the mission requirements. There is still room for further ABI INR improvements, such as NAV performance during eclipse, CCR discontinuities after algorithm and parameter updates, scale errors, and orthogonality issues.