Comparison of consecutive and restained sections for image registration in histopathology

Significance: Although the registration of restained sections allows nucleus-level alignment that enables a direct analysis of interacting biomarkers, consecutive sections only allow the transfer of region-level annotations. The latter can be achieved at low computational cost using coarser image resolutions. Purpose: In digital histopathology, virtual multistaining is important for diagnosis and biomarker research. Additionally, it provides accurate ground truth for various deep-learning tasks. Virtual multistaining can be obtained using different stains for consecutive sections or by restaining the same section. Both approaches require image registration to compensate for tissue deformations, but little attention has been devoted to comparing their accuracy. Approach: We compared affine and deformable variational image registration of consecutive and restained sections and analyzed the effect of the image resolution that influences accuracy and required computational resources. The registration was applied to the automatic nonrigid histological image registration (ANHIR) challenge data (230 consecutive slide pairs) and the hyperparameters were determined. Then without changing the parameters, the registration was applied to a newly published hybrid dataset of restained and consecutive sections (HyReCo, 86 slide pairs, 5404 landmarks). Results: We obtain a median landmark error after registration of 6.5 μm (HyReCo) and 24.1 μm (ANHIR) between consecutive sections. Between restained sections, the median registration error is 2.2 and 0.9 μm in the two subsets of the HyReCo dataset. We observe that deformable registration leads to lower landmark errors than affine registration in both cases (p < 0.001), though the effect is smaller in restained sections. Conclusion: Deformable registration of consecutive and restained sections is a valuable tool for the joint analysis of different stains.


Introduction
In histopathology, much insight into disease subtyping, biomarker discovery, and tissue organization is gained by analyzing differently stained histological sections. For this procedure, a fixed tissue is transferred into a paraffin block and cut into 2-5 µm thin slices. These slices are subsequently stained, e.g., by immunohistochemistry, and, in a digital workflow, scanned to obtain a digital whole slide image (WSI) [1]. The resulting image can be used for digital analysis, e.g., in biomarker discovery by combining two or more different stains [2]. Deep-learning models are increasingly used to analyze histopathology slides, and the first methods have been cleared for clinical use [3]. These methods require a large number of annotated images to learn specific tissue properties. Image registration is used to automatically create annotations as training data in order to reduce the time spent on manually annotating slide images [4], [5].
Enabled by digital slide scanners, a re-staining approach that was initially used in fluorescence microscopy and is known as tissue-based cyclic immunofluorescence (t-CyCIF) [6], [7] is gaining popularity in bright-field imaging [4], [8], [9]. Instead of staining consecutive sections and scanning them later, a section is stained and scanned first. In a second step, the stain is washed or bleached and another stain is applied. After re-scanning, both images contain the same tissue with different staining, so that it is possible to compare the same cell with respect to different antibodies or markers. However, we still observe nonlinear deformations in the tissue, which are most likely due to the chemical reactions during the re-staining process.
Independently of the sectioning method, researchers face a number of questions when applying image registration to a new dataset. These include:
• Which image resolution is best suited to obtain the best accuracy while keeping the computational cost as low as possible?
• Is deformable registration required or is an affine registration sufficient, especially for the registration of re-stained sections, where only small differences are expected between both images?
As we show in this work, image registration is required for both consecutive and re-stained image pairs. We further compare the accuracy of the registration for both types of image pairs, which, to our knowledge, has not been analyzed before. In the case of registration of re-stained sections, accuracy is achieved at the nucleus level. In the case of consecutive sections, this level of accuracy cannot usually be reached due to the lack of corresponding objects at the appropriate resolution caused by the slice thickness or distance. Here, a good registration of structures with a size above the nucleus level can be achieved on the basis of images with relatively low resolution and the use of a nonlinear deformation model.
We compare the two types of image pairs using an optimization-based image registration method that is based on minimizing an energy functional consisting of a distance measure and a regularizer [10]. This class of optimization-based methods is widely used in medical imaging [11], [12] and has also been applied to problems in pathology [13]-[17].
This class of energy-minimizing methods makes explicit model assumptions through the choice of distance measure and regularization scheme. When applying a method to a new dataset, model refinements can be made by adjusting its parameters. For example, when a new dataset contains larger deformations, the weight that balances image distance and regularization can be adapted to allow for larger displacements.
Another class of methods that has gained popularity in recent years is based on training a deep learning model to estimate the deformation in problems in medical imaging [18] and specifically pathology [19]. Here, the model assumptions are made implicitly by the training data. This in turn makes generalization and adaptation to unseen datasets more challenging, although recent work [20]-[22] addresses this issue.
Below, we first describe the registration method and its application to re-stained and consecutive slide images. We then describe an evaluation framework based on landmark accuracy on two datasets: the "automatic non-rigid histological image registration" (ANHIR) challenge dataset [15] and a new dataset, "HyReCo" [23], which contains both consecutive and re-stained slides and which we make publicly available. Finally, we analyze the accuracy of the image registration method with respect to image resolution and sectioning in both datasets.

Fully-automatic image registration
We compare the registration of the two sectioning methods based on a 3-step, energy-minimizing registration pipeline. It consists of 1) a robust pre-alignment, 2) an affine registration computed on coarse resolution images, and 3) a curvature-regularized deformable registration. The method is based on the variational image registration framework first described by Fischer and Modersitzki [10], [24], which has been applied to many clinical fields from histology [25] to radiology [26], [27].
Given a so-called reference image $R: \mathbb{R}^2 \to \mathbb{R}$ and a so-called template image $T: \mathbb{R}^2 \to \mathbb{R}$, the goal of image registration is to find a reasonable spatial transformation $y: \mathbb{R}^2 \to \mathbb{R}^2$ such that $R(x) \approx T(y(x))$, i.e., $R$ and the deformed template $T \circ y$ are similar in an adequate sense. Following [10], we formulate the image registration as the optimization problem

$$J(R, T, y) \xrightarrow{\;y\;} \min$$

of an appropriate objective function $J$ with respect to the desired spatial transformation. A key component of the objective function is a so-called distance or image similarity measure that quantifies the quality of the alignment. We use the Normalized Gradient Fields (NGF) distance measure [28] as it has been shown to be robust to different stains and is suitable for multi-modal image registration of histological images [4]. For the discretization, 2D images with extents $n_1$-by-$n_2$ are assumed, correspondingly consisting of $N = n_1 \cdot n_2$ pixels with uniform size $h \in \mathbb{R}$ in each dimension and pixel centers $x_1, \ldots, x_N$; $x_i \in \mathbb{R}^2$. The NGF distance measure is given by

$$\mathrm{NGF}(R, T, y) = \frac{h^2}{2} \sum_{i=1}^{N} \left( 1 - \left( \frac{\langle \nabla T(y(x_i)), \nabla R(x_i) \rangle_\varepsilon}{\| \nabla T(y(x_i)) \|_\varepsilon \, \| \nabla R(x_i) \|_\varepsilon} \right)^2 \right)$$

with $\langle x, y \rangle_\varepsilon = x^\top y + \varepsilon^2$, $\| x \|_\varepsilon := \sqrt{\langle x, x \rangle_\varepsilon}$, and the edge parameter $\varepsilon$, which controls the sensitivity to edges in contrast to noise. This image distance becomes minimal if intensity gradients, and thus edges, are aligned, which therefore leads to the alignment of morphological structures.
The NGF distance measure is used in all three steps of the registration pipeline: pre-alignment, affine registration, and deformable (non-linear) registration. In addition, we use a multilevel optimization scheme that starts with the registration of images at low resolution levels and then refines the transformation to higher image resolutions to reduce the risk of converging too early to local minima and to speed up the optimization process [29]. The per-level optimization is performed using a Gauss-Newton type method (affine registration) and an L-BFGS quasi-Newton method (deformable registration); see, e.g., [10] or [30], [31] for a more detailed discussion and additional strategies. All three of the following registration steps rely on the edge parameter $\varepsilon$, the number of levels $N_{\text{level}}$ of the image pyramid, and the image resolution at the finest level. The parameters are set independently for each step such that the registration error is minimal and the deformation grid is regular in the sense that it is not folded in the image domain. These parameters are shown in Table 1.
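The coarse-to-fine idea can be sketched with a simple image pyramid. The example below assumes that each level halves the resolution by averaging 2 × 2 pixel blocks; the function name `build_pyramid` and this particular downsampling scheme are assumptions made for illustration, and the actual pipeline may construct its levels differently.

```python
import numpy as np

def build_pyramid(image, n_levels):
    """Return images ordered from coarsest to finest; each level halves the resolution."""
    levels = [image.astype(float)]
    for _ in range(n_levels - 1):
        fine = levels[-1]
        # Crop to an even size, then average each 2x2 block into one coarse pixel.
        rows, cols = (fine.shape[0] // 2) * 2, (fine.shape[1] // 2) * 2
        blocks = fine[:rows, :cols].reshape(rows // 2, 2, cols // 2, 2)
        levels.append(blocks.mean(axis=(1, 3)))
    return levels[::-1]
```

A multilevel scheme would register the first (coarsest) entry, then use the result as the initial guess on the next finer entry, and so on up to the finest level.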

Step 1: Automatic rotation alignment (ARA)
Before histological images are scanned, the tissue is cut, preprocessed, and stained in a pathology lab.
After this manual process, neighboring tissue slices can end up in arbitrary positions on the object slide (such as upside down or turned in various ways). In general, no assumptions can be made on the initial tissue positioning, so in a first step we aim to find a rigid alignment that corrects for global translation and global rotation.
Images are assumed to be available in a multilevel image data format to reduce the time and memory requirements to load the image data at a given resolution.
The NGF distance measure is based on structural changes expressed through the image gradient, so color information is of limited value. To reduce the amount of image data to be handled, all images are converted from color to gray scale and inverted to obtain a black background while loading from disk.
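The preprocessing described above can be sketched as follows, assuming 8-bit RGB input and a plain channel average as the gray-value conversion. The text does not specify the conversion formula, so both the averaging and the function name `to_inverted_gray` are assumptions.

```python
import numpy as np

def to_inverted_gray(rgb):
    """Convert an 8-bit RGB image (H x W x 3) to inverted gray values in [0, 255]."""
    # Plain channel average; a weighted luminance formula would work as well.
    gray = rgb.astype(float).mean(axis=2)
    # Invert so that the bright slide background becomes black (zero).
    return 255.0 - gray
```

With a black background, empty slide regions carry zero weight in intensity-weighted quantities such as a center of mass.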
Automatic Rotation Alignment (ARA) first determines the center of mass [32] of both images, using the gray values of the pixels as the weights. Let $(t_1, t_2)$ be the vector pointing from the center of mass of the reference image to the center of mass of the template image, and let $\varphi_k = 2\pi(k - 1)/(N_{\text{rotations}} - 1)$, $k = 1, \ldots, N_{\text{rotations}}$, be equidistant rotation angles sampling the interval $[0, 2\pi)$. For each angle, a rigid registration is computed, optimizing $J(R, T, y_{\text{rigid}}) = \mathrm{NGF}(R, T, y_{\text{rigid}}) \to \min$, with initial parameters $(\varphi_k, t_1, t_2)$, $k = 1, \ldots, N_{\text{rotations}}$. Among all $N_{\text{rotations}}$ rigid registration results, the minimizer $y^*_{\text{rigid}}$ with the smallest image distance is selected as an initial guess for the subsequent affine registration.
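The construction of the ARA starting points can be sketched as below. The sketch only generates the initial parameters $(\varphi_k, t_1, t_2)$ from the intensity-weighted centers of mass and the angle formula as stated in the text; the rigid NGF optimization that would be started from each guess is not shown. The function names are hypothetical. Note that with the denominator $N_{\text{rotations}} - 1$, the last angle equals $2\pi$ and therefore describes the same orientation as the first.

```python
import numpy as np

def center_of_mass(gray):
    """Intensity-weighted center of mass (row, column) of a gray-value image."""
    rows, cols = np.indices(gray.shape)
    total = gray.sum()
    return np.array([(rows * gray).sum(), (cols * gray).sum()]) / total

def ara_initial_guesses(reference, template, n_rotations):
    """Initial rigid parameters (phi_k, t1, t2) for k = 1, ..., n_rotations."""
    # Translation from the reference center of mass to the template center of mass.
    t1, t2 = center_of_mass(template) - center_of_mass(reference)
    angles = [2 * np.pi * (k - 1) / (n_rotations - 1)
              for k in range(1, n_rotations + 1)]
    return [(phi, t1, t2) for phi in angles]
```

Each returned tuple would seed one rigid registration, and the result with the smallest NGF value would be passed on to the affine step.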

Step 2: Affine registration
In a second step, again an NGF-based image registration is computed. To allow for additional degrees of freedom, the registration is optimized with respect to an affine transformation $y_{\text{affine}}$ and based on a finer image resolution, with the previously computed $y^*_{\text{rigid}}$ as an initial guess. The resulting transformation is then used as the initial guess for a subsequent deformable registration.
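For orientation, the additional degrees of freedom can be made explicit in one common parameterization of the two models. The symbols $Q(\varphi)$, $t$, $A$, and $b$ are introduced here for illustration only; the center of rotation and the exact parameterization used in the pipeline are not specified in the text.

```latex
\begin{aligned}
y_{\text{rigid}}(x)  &= Q(\varphi)\,x + t, &
Q(\varphi) &= \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix},\;
t = (t_1, t_2)^\top
&& \text{(3 parameters)} \\
y_{\text{affine}}(x) &= A\,x + b, &
A &\in \mathbb{R}^{2 \times 2},\; b \in \mathbb{R}^{2}
&& \text{(6 parameters)}
\end{aligned}
```

Setting $A = Q(\varphi)$ and $b = t$ recovers the rigid case, so the rigid result is a valid starting point for the affine optimization, which can additionally model scaling and shearing.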

Step 3: Deformable registration
The final step is a deformable image registration. Here, the transformation $y$ is given by

$$y(x) = x + u(x)$$

with a displacement $u: \mathbb{R}^2 \to \mathbb{R}^2$. In contrast to an affine registration, the deformation is not restricted to a particular parameterizable deformation model, and the nonlinear transformation is controlled by introducing a regularization term into the objective function that measures the deformation energy and penalizes unwanted transformations. Here we use the so-called curvature regularization, which penalizes second-order derivatives of the displacement [33] and which has been shown to work very well in combination with the NGF distance measure [26], [27]. As with the NGF distance, we evaluate the displacements in the pixel centers $x_1, \ldots, x_m$ with uniform grid spacing $h$ and use finite differences to approximate the derivatives. Thus, the discretized curvature regularizer is defined as

$$\mathrm{CURV}(y) = \frac{h^2}{2} \sum_{i=1}^{m} \left\| \Delta^h u(x_i) \right\|^2,$$

where $\Delta^h$ is the common 5-point finite difference approximation of the 2D Laplacian $\Delta = \partial_{xx} + \partial_{yy}$ with Neumann boundary conditions, applied to each component of $u$. In summary, for deformable registration, we minimize the objective function

$$J(R, T, y) = \mathrm{NGF}(R, T, y) + \alpha \, \mathrm{CURV}(y)$$

with respect to the deformation $y$, with the previously computed optimal $y_{\text{affine}}$ as initial guess. The parameter $\alpha > 0$ is a regularization parameter that controls the smoothness of the computed deformation. It is chosen manually to achieve a smooth deformation and avoid topological changes (lattice folds), while being flexible enough to correct for local changes that improve image similarity. The resolution of the control point grid is independent of the image resolution and is typically chosen to be coarser than the image resolution (see also Table 1). A higher number of grid points allows for a more accurate representation of local deformations. Linear interpolation is used to evaluate the deformation between its grid nodes.
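A minimal sketch of a discretized curvature regularizer is given below. It assumes the displacement is stored as an array of shape (2, m1, m2), applies the 5-point Laplacian to each component, and realizes the Neumann boundary condition by replicating border values. The scaling factor $h^2/2$ and the function names are assumptions for illustration.

```python
import numpy as np

def laplacian_neumann(field, h=1.0):
    """5-point finite-difference Laplacian with homogeneous Neumann boundaries."""
    # Replicating the border values makes the normal derivative vanish there.
    p = np.pad(field.astype(float), 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1]) / h**2

def curvature_energy(u, h=1.0):
    """Discretized curvature energy of a displacement u with shape (2, m1, m2)."""
    # Sum the squared Laplacian over all grid points and both components.
    return 0.5 * h**2 * sum(np.sum(laplacian_neumann(u_d, h)**2) for u_d in u)
```

A constant displacement (pure translation) has zero energy, whereas an isolated displacement spike is penalized strongly; this is the mechanism by which larger values of α lead to smoother deformations.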

Evaluation
We compare both image acquisition methods with respect to the accuracy of the registration in a new, previously unpublished dataset, HyReCo [23], that combines re-stained and consecutive sections. To relate to previous work on the registration of consecutive sections, we additionally evaluate the registration accuracy in the training part of the ANHIR challenge data [34].
We report the distribution of the target registration error (TRE) $\| r_k - t_k \|_2$, $k = 1, \ldots, N_{\text{images}}$, and its median (MTRE) over all $N_{\text{images}}$ image pairs and over all available landmarks $r_k, t_k \in \mathbb{R}^2$ in both datasets. Multiple parameterizations were tested systematically and the parameter set with the lowest MTRE was selected (Table 1). Moderate modifications of the NGF parameter $\varepsilon$, the regularization parameter $\alpha$, the size of the deformation grid, and the number of levels have only a small influence on the accuracy when registering coarse image resolutions (up to approx. 4 µm/px). At higher image resolutions, the parameter choice seems to have a larger impact. We choose the parameters reported in Table 1, which lead to the best results across all datasets. While this parameterization was optimal in the median across all images, single registrations can be improved by determining an individual set of parameters.
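The landmark-based error measure can be sketched as follows, assuming that reference landmarks and transformed template landmarks are given as (n, 2) arrays in µm. The function names are hypothetical, and pooling all landmarks before taking the median is one possible aggregation; per-image medians could be aggregated in a second step instead.

```python
import numpy as np

def target_registration_errors(ref_landmarks, warped_landmarks):
    """Euclidean distance per landmark pair; inputs are (n, 2) arrays in µm."""
    diff = np.asarray(ref_landmarks, float) - np.asarray(warped_landmarks, float)
    return np.linalg.norm(diff, axis=1)

def median_tre(ref_landmarks, warped_landmarks):
    """Median target registration error (MTRE) over all given landmark pairs."""
    return float(np.median(target_registration_errors(ref_landmarks, warped_landmarks)))
```

The median is less sensitive than the mean to a few badly aligned landmarks, so the two can differ noticeably on datasets that contain outliers.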
The registration was applied to consecutive and re-stained sections in the HyReCo and ANHIR datasets.

Hybrid Re-stained and Consecutive Data (HyReCo)
The HyReCo dataset was acquired at the Radboud University Medical Center, Nijmegen, the Netherlands. It consists of two subsets of slides.

HyReCo subset A (re-stained & consecutive)
Subset A consists of nine sets of consecutive sections, each containing four slides stained with H&E, CD8, CD45RO, and Ki67, respectively (Fig. 1). In addition, PHH3-stained slides have been produced by removing the cover slip from the respective H&E-stained slide, bleaching the H&E stain, re-staining the same section with PHH3 and scanning it again, similar to the t-CyCIF technique [6], [7] that is well established in fluorescence imaging. For each of these sections, 11-19 landmarks (138 per stain, 690 in total) have been placed manually on corresponding structures and verified by two experienced researchers.
Finding the same points across several consecutive slides is quite difficult, because care must be taken to locate a similar point in all slides of the stack simultaneously. In contrast to these consecutive sections, an image pair of re-stained sections contains the same cells and nuclei such that a one-to-one correspondence can be found for most structures.

HyReCo subset B (re-stained)
To overcome the limitations in annotation accuracy imposed by the simultaneous annotation of consecutive and re-stained slides, a second subset (B) of re-stained slides without corresponding consecutive sections was scanned and annotated. An additional 2303 annotations were produced for 54 additional image pairs of H&E-PHH3 (approx. 43 annotations per pair). These have again been verified by two experienced researchers.
All images have been digitized with a resolution of 0.24 µm/px and are approximately 95000 × 220000 pixels in size at their highest magnification level.
To estimate a lower bound for landmark accuracy, two researchers annotated the same structures (approx. 20 landmarks each) in the same and in one consecutive slide, independently of each other. In this setting, the inter-observer error on the same section was 0.57 µm ± 0.36 (mean ± standard deviation), corresponding to 2.3 pixels ± 1.5, and the intra-observer error was in a similar range (0.53 µm ± 0.32). In two consecutive sections (H&E and Ki67), the inter-observer error was 1.1 µm ± 0.6 (4.7 pixels ± 2.6). In the consecutive sections, the landmark positions were selected such that a corresponding structure was available in both images. For many structures this is not the case in consecutive sections, such that the inter-observer error likely overestimates the possible alignment accuracy.
The dataset including the landmarks has been made available at [23] under the Creative Commons Attribution-ShareAlike 4.0 International license.

ANHIR Dataset
The accuracy of the registration of serial sections depends on the distance between the sections and on the quality of the tissue sectioning. To broaden the scope of the analysis and to make the results comparable to previous work on the registration of serial sections, we additionally evaluate the accuracy of the registration of the ANHIR challenge data [34].
The public part of the ANHIR challenge dataset consists of 230 image pairs from 8 different tissue types (lung lesions, whole mouse lung lobes, mammary glands, mouse kidney, colon adenocarcinoma, gastric mucosa and adenocarcinoma, human breast, human kidney) with 18 different stains. An example is shown in Fig. 2.
In the following sections, we measure the accuracy of deformable and affine registration with respect to image resolution on both datasets. We distinguish re-stained and consecutive sectioning and determine the possible alignment accuracies in the different datasets.

Results
We apply the 3-step registration pipeline to the HyReCo datasets and to the ANHIR training dataset.

Experiment 1: Image Resolution
Histological images are typically stored at different image resolutions in a pyramidal image format to accommodate the large size of the images and to make the different scales of tissue structures easily accessible. The registration can be computed at any of these scales and the result can be interpolated to apply it to higher image resolutions. We measure the registration accuracy with respect to the resolution used for registration and compare affine and deformable registration on consecutive and re-stained sections.
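How a deformation computed at one scale can be applied at another is sketched below, under the assumption that the displacement grid is stored in physical units (µm) with node (i, j) located at (i, j) times the grid spacing. Landmarks given in pixels of any pyramid level are first converted to µm and then mapped through y(x) = x + u(x) with bilinear interpolation between grid nodes. All names and the grid layout are illustrative assumptions.

```python
import numpy as np

def displacement_at(u, spacing_um, point_um):
    """Bilinearly interpolate a displacement grid u of shape (2, m1, m2) at a point."""
    pos = np.asarray(point_um, float) / spacing_um
    # Clamp so that the 2x2 interpolation stencil stays inside the grid.
    upper = np.array(u.shape[1:]) - 1
    pos = np.clip(pos, 0, upper)
    i0, j0 = np.minimum(pos.astype(int), upper - 1)
    a, b = pos[0] - i0, pos[1] - j0
    return ((1 - a) * (1 - b) * u[:, i0, j0] + a * (1 - b) * u[:, i0 + 1, j0]
            + (1 - a) * b * u[:, i0, j0 + 1] + a * b * u[:, i0 + 1, j0 + 1])

def transform_landmark(u, spacing_um, landmark_px, pixel_size_um):
    """Map a landmark given in pixels of any pyramid level through y(x) = x + u(x)."""
    x_um = np.asarray(landmark_px, float) * pixel_size_um
    return x_um + displacement_at(u, spacing_um, x_um)
```

Because everything is expressed in µm, the same displacement grid yields the same physical result whether the landmark is specified at 0.25 µm/px or at 1 µm/px.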

Consecutive sections
The resulting landmark errors after applying the full 3-step registration to the consecutive HyReCo subset A and to the ANHIR dataset are shown in Fig. 3.
Comparing different image resolutions, a smaller pixel size is correlated with a smaller registration error up to a level of saturation that differs between datasets. This saturation level is likely influenced by the quality of the slide and the similarity of the slide pairs. The similarity is reduced with a growing distance between two consecutive sections, and small structures can no longer be aligned if their counterpart is not present in the other slide. At pixel sizes below 2 µm/px (i.e., at finer image resolutions) we even observe a small increase in TRE in some datasets. This is likely due to the larger influence of smaller structures that lack a correspondence because of the differences from slide to slide and that are invisible at coarser image resolutions.
Comparing the HyReCo to the ANHIR cases, the overall MTRE is larger in the ANHIR dataset, where the larger average landmark errors (denoted by ♦ in Fig. 3) indicate a higher number of badly aligned landmark outliers. This is likely due to the larger structural differences between the slides in some of the ANHIR subsets (Fig. 2).
Fig. 4 shows one of the ANHIR image pairs after pre-alignment, affine registration, and deformable registration.

Re-stained sections compared to consecutive sections on the same tissue block
Re-stained sections show very little differences and allow a nucleus-level alignment that can be used for a multiplexed analysis of the finest structures in the image (Fig. 5).
The TRE in the re-stained images in HyReCo subset A reaches 2.3 µm and is approximately two to four times lower than between consecutive sections (Fig. 6, Table 2). As expected, the deformation between consecutive image pairs shows stronger non-linear components than between re-stained sections. No foldings were detected in the deformations in any of the re-stained image pairs. A visual comparison of the deformations after re-stained and after consecutive registrations is shown in Fig. 7.
From the landmark errors in the consecutive sections we are able to derive the likely section order (H&E, Ki67, CD8, CD45RO): as the distance between two sections in the stack grows, the landmark error increases as well. The registration accuracy in consecutive sections largely depends on the quality and similarity of the sections. We again observe a slight decrease in accuracy at pixel sizes below 2 µm/px, which is only present in the consecutive but not in the re-stained subset.

Experiment 2: Deformable Compared to Affine Registration
The two images of the re-stained section pair show the same tissue specimen before and after an additional chemical processing and scanning. We show that deformable registration leads to superior results despite the tissue being fixed to the glass slide during re-staining. To this end, we compare the MTRE after affine and deformable registration in all datasets.

Improved accuracy of deformable registration in all datasets
Deformable registration outperforms affine registration except for image resolutions coarser than 64 µm/px in all datasets (Fig. 8, Table 3). Compared to consecutive sections, the difference between affine and deformable registration is lower in the re-stained dataset, which is due to the smaller mechanical deformation in the processing. The lower difference in the ANHIR dataset compared to the consecutive subset of HyReCo is likely due to the larger proportion of artifacts and structures without correspondence in this dataset. In the separate subset B of re-stained slides (H&E-PHH3), where no consecutive sections are available, the MTRE is lower and reaches 0.86 µm, which is at the same level as the intra-observer error. The difference to subset A is likely influenced by the pairwise landmark setup.
In subset B, the deformable registration again lowers the landmark error compared to affine registration (Fig. 9), but to a lower degree than between consecutive sections (0.86 µm compared to 1.60 µm). A visualization of the deformation field after re-stained section registration shows a small non-linear component, which is consistent with the lower landmark error (Fig. 7).
We note that purely re-stained sections are easier to annotate than consecutive sections because the corresponding structures can easily be identified. This leads to a lower TRE in HyReCo subset B compared to subset A.
The better correspondence of the two sections leads to an additional advantage of re-stained sections that cannot be measured in terms of landmark error: since landmarks in consecutive sections have only been placed on corresponding structures, areas without correspondence are not reported and are therefore not part of the TRE. This is a limitation of the current approach but could at least partly be mitigated by resorting to a different measurement of alignment, such as the difference of segmentations or larger structures.

Computation Times of Deformable Compared to Affine Registration
The computation time of an image registration algorithm depends largely on the implementation and on the size of the input images but also on other factors such as CPU and RAM performance and disk access. We report the measurements of our setup (Intel(R) Core(TM) i7-7700K CPU (4.20GHz, four cores) with 32 GB of RAM) in order to give a relative comparison with respect to the size of the images.
For an affine registration on the HyReCo data, the average computation time ranges from 0.5 seconds (image size 400 × 800, 62.1 µm/pixel) to 58 seconds (image size 12800 × 25600, 1.94 µm/pixel). The majority of the computation time for large images is spent on the deformable registration. Previous analyses [27] have shown that doubling the image resolution (a four-fold increase in the number of pixels) leads to a four-fold increase in the computation time. In other words, the computation time is roughly linearly dependent on the number of pixels in the image. Together with the contributions from pre-alignment and affine registration, we see a similar trend in the computation times in Table 4. For deformable registration on the HyReCo data, the average computation time ranges from 2.6 seconds to 30 minutes.
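The relation between pixel size and pixel count that underlies the linear-scaling statement can be checked with a short calculation on the two image sizes quoted above. The helper name is hypothetical, and the sketch only quantifies the growth in pixel count; it makes no claim about the measured run times.

```python
import math

def pixel_count_factor(coarse_um_per_px, fine_um_per_px):
    """Factor by which the pixel count grows when refining the pixel size (2D)."""
    return (coarse_um_per_px / fine_um_per_px) ** 2

# Going from 62.1 µm/px to 1.94 µm/px corresponds to about five doublings.
doublings = math.log2(62.1 / 1.94)
factor_from_resolution = pixel_count_factor(62.1, 1.94)
# The same factor follows directly from the quoted image sizes.
factor_from_sizes = (12800 * 25600) / (400 * 800)
```

Five doublings of the resolution thus correspond to roughly a 1000-fold increase in the number of pixels, which is the factor a strictly linear cost model would apply to the deformable step.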

Discussion
We compared the accuracy of numerical image registration in re-stained and consecutive sections in histopathology. The median landmark error in re-stained sections goes down to 0.86 µm. When compared on the same tissue block, the registration error between re-stained sections is smaller by a factor of two to five compared to the corresponding consecutive sections (2.3 µm compared to 7.1 µm). In consecutive sections, the accuracy largely depends on the section quality and image resolution. The difference in the alignment quality between re-stained and consecutive sections is relevant for applications where small structures or single nuclei are of interest. An MTRE of 1.0 µm allows nucleus-level alignment, which is infeasible in serial sections where the same nucleus is often not present on the next slide. For comparison, the size of an average mammalian nucleus is approx. 6 µm [35], while tissue sections typically measure 2-5 µm in thickness. The increased accuracy comes at the price of the loss of the physical stained glass slide and an increased processing time due to the de-staining. Only the staining that is applied last can be conserved physically. Especially in clinical settings, long-term storage of the glass slides and short time to diagnosis are important.
Smaller nonlinear deformations occur in the re-staining process, likely due to the mechanical and chemical manipulation and tile stitching during scanning. These nonlinear components can also be observed in the deformation fields resulting from the registration of re-stained images. When aiming at a high registration accuracy in re-stained images, deformable image registration further decreases the landmark error at fine image resolutions.
The accuracy of the registration depends on the employed image resolution. As histological images are typically organized in a pyramidal structure, lower-resolution representations can be extracted without additional computational effort. Otherwise, loading the image into memory in order to produce a low-resolution representation further extends the computation time.
In our experiments, the impact of the image resolution was highest in re-stained sections, where optimal results could be reached at 0.97 µm/px. Registration at finer image resolutions even exceeded 180 GB of RAM on a more powerful computer. In consecutive sections, the gain in accuracy of the registration stagnates between 7.8 µm/px and 3.89 µm/px, such that these registrations can be computed based on a smaller image size and hence require less memory and time. We assume that fine structures that lack correspondence have a negative impact on registration accuracy at finer image resolutions.
Our analysis is limited by the focus on landmarks as the only measurement of accuracy. Since the landmarks were placed at positions that can be re-identified by a human observer, these locations likely have a superior contrast and thus have a higher impact on the distance measure. This could lead to a bias in the evaluation that underestimates the registration error in low-contrast regions.
The purely landmark-based approach also ignores the quality of the alignment of larger structures. This could be included by segmenting corresponding areas in multiple slides and evaluating the alignment of these segmentations.
The regularity or smoothness of the deformation is another quality criterion for an image registration. We automatically analyzed the deformed grid for folds (one occurrence in 36 cases for HyReCo subset A, zero occurrences in subset B) but otherwise did not systematically evaluate the smoothness of the deformation except for visual inspection.
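One common criterion for detecting folds is a non-positive Jacobian determinant of the transformation. The sketch below applies it to y(x) = x + u(x) on a regular grid, assuming the displacement is stored as an array of shape (2, m1, m2) and using `np.gradient` for the derivatives. Whether the analysis in this work uses exactly this criterion is not stated, so this is an illustrative assumption.

```python
import numpy as np

def count_folds(u, h=1.0):
    """Count grid points where det(grad y) <= 0 for y(x) = x + u(x)."""
    du0_d0, du0_d1 = np.gradient(u[0].astype(float), h)
    du1_d0, du1_d1 = np.gradient(u[1].astype(float), h)
    # The Jacobian of y is the identity plus the Jacobian of the displacement.
    det = (1.0 + du0_d0) * (1.0 + du1_d1) - du0_d1 * du1_d0
    return int(np.sum(det <= 0.0))
```

A return value of zero indicates that the deformed grid preserves its orientation everywhere at the sampled points.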

Conclusion
In conclusion, re-stained sections allow an accurate registration of differently stained structures, with an error below the level required to align single nuclei. Registrations of consecutive sections result in a higher alignment error that increases with the distance between the slides. Consecutive sections are better suited to align larger areas such as tumor or inflammatory areas based on a second stain. We recommend deformable registration, which was always more accurate, and the use of re-stained sections, if possible. Higher image resolutions benefit the accuracy, as long as the increase in image detail leads to an increase in corresponding structures.

Figure 1: One of nine sets of the HyReCo data. Slides A-D are consecutive stains (H&E, CD8, Ki67, CD45RO). Slide E is washed and re-stained from slide A. In the cropped images a, b, and e, corresponding nuclei can be found only between the re-stained slide pair a and e.

Figure 2: Two consecutive slides from the ANHIR dataset (image set COAD 03). Structures that are only present in one image and that cannot be aligned by image registration are indicated by arrows.

Figure 3: TRE after consecutive deformable registration at different image resolutions for the HyReCo (left group, blue) and the ANHIR dataset (right group, yellow, logarithmic plot). The HyReCo dataset shows overall smaller registration errors. The boxes denote the interquartile range and the whiskers extend this range by a factor of 1.5. The diamond (♦) denotes the mean TRE.

Figure 4: Spy-view of an image pair from the ANHIR dataset after pre-alignment, affine and deformable registration (left to right). Arrows indicate tissue regions with misalignment.

Figure 5: Checkerboard plot after registration of a re-stained image pair. Nucleus correspondences are visible at the borders of the checkerboard tiles.

Figure 6: TRE after consecutive and re-stained deformable registration at different image resolutions of the HyReCo subset A for different staining pairs (logarithmic plot). The TRE between re-stained sections (right group, red) is lower than between consecutive sections (three left groups). The accuracy after registration of consecutive sections depends on their distance (section order: H&E, Ki67, CD8, CD45RO). The boxes denote the interquartile range and the whiskers extend this range by a factor of 1.5.

Figure 7: Deformation of a consecutive (left) and a re-stained registration (right) based on the same H&E-stained slide (image stack 361 of the HyReCo dataset). In each image the deformation is applied to a regular grid (background, gray) and plotted in blue (foreground). The consecutive pair shows a larger nonlinear component, but small non-linear effects are also visible between the two re-stained images.

Figure 8: Ratio $\mathrm{MTRE}_{\text{affine}} / \mathrm{MTRE}_{\text{deformable}}$ for ANHIR and the consecutive and re-stained HyReCo datasets. Deformable registration outperforms affine registration except for image resolutions coarser than 64 µm/px.

Figure 9: Median TRE of re-stained image pairs after affine and deformable registration at different image resolutions (HyReCo subset B). Deformable registration further reduces the landmark error compared to affine registration.

Table 1: Parameters used in the registration pipeline for all datasets.

Table 2: Best median TRE obtained and required image resolution. The MTRE between re-stained sections is lower by a factor of approx. 2. MTRE increases with the distance between the sections for consecutive sections.

Table 3: Best MTRE obtained after affine and deformable registration.

Table 4: Execution time for a deformable registration of consecutive sections with respect to image size (and resolution). Larger image sizes require a larger computation time.