## 1.

## Introduction

Remote sensing (RS) is developing toward high spatial resolution, high time resolution, and high spectral resolution. Multispectral images of line scanners are the main type of data acquired by RS sensors in Earth observation. Unlike space mission applications, platforms of aerial applications suffer more effects such as pitch, yaw, and roll. Boresight error consists of yaw error, pitch error, and roll error. For an airborne line scanner, i.e., pushbroom scanner, pitch error will make the whole line data see either forward or backward instead of nadir. Roll error will shift the whole line data either to the left or the right. Yaw error may cause the whole line data to rotate along the flight direction. If the boresight error is constant during the flight, it is easy to correct. However, in reality the boresight error varies from time to time caused by air turbulence, wind, and platform vibration. These irregular effects can bring a problem for airborne multispectral line scanners: raw RS multispectral bands may be not innately registered. Alignment accuracy of multispectral bands from airborne scanners is a criterion impacting product quality in RS. Many scholars have been investigating to solve the band-to-band registration problem.

There are dozens of intensity-based image registration methods, but none of them can be used to solve the multispectral image registration problem in a straightforward manner. The reason is that different spectral characteristics of ground lead to intensity variation between bands. Especially, visible light images and infrared light images of the same scene will be regarded as totally different images in term of intensity. There are some algorithms^{1}2.^{–}^{3} to solve the band-to-band registration problem for the Enhanced Thematic Mapper Plus and moderate-resolution imaging spectroradiometer sensors. In general, the scale-invariant feature transform (SIFT) algorithm^{4} gives excellent matching results for visible image pairs. For infrared and visible image pairs, however, the number of mismatched SIFT keypoints increases rapidly, and this decreases the SIFT-based registration accuracy drastically. Therefore, authors in Ref. 5 proposed a new scale restriction criteria for keypoint matching with better performance. Moreover, there are also some algorithms based on mutual information^{6} and statistical learning techniques^{7} to solve the band-to-band rigid/global registration problem. A correlation and Hough transform-based method of automatic image registration was proposed in Ref. 8. The authors claimed that the method can globally register a pair of images with different spectral contents. Common elastic image registration methods^{9}^{,}^{10} are based on the intensity of the images. Since intensity of the same region varies according to different spectral characteristics of ground, those elastic methods often fail. Moreover, high computational cost of Periaswamy’s^{9}^{,}^{10} method limits its application for RS images. Therefore, a new and efficient elastic band-to-band registration is needed to solve the problem.

Recently, a new type of multispectral line scanner developed by Changchun Institute of Optics in China is composed of four off-axis aspheric reflectors in order to achieve a wide field of view (FOV) with a short focal length. The new multispectral scanner can achieve an FOV of 61.93 and IFOV of 0.18 mrad with four bands: blue (420 to 520 nm), green (520 to 600 nm), red (630 to 690 nm), and near-infrared (760 to 900 nm). A picture of the multispectral line scanner is shown in Fig. 1. The physical dimensions of the scanner are about $150\times 220\times 200\text{\hspace{0.17em}}\text{\hspace{0.17em}}(\mathrm{mm})$. This kind of scanner is vulnerable to the problem of different nonlinear warping and inconsistent local transformation between bands. This is caused by air turbulence, self-propelled vibration of airborne platforms, and the systematic design of the multispectral line scanner. As mentioned earlier, common global image registration methods are not suitable for this problem. One example image taken by this scanner is shown in Fig. 2. We can see that each band suffers different nonlinear warping, although the flight was carried out under good weather conditions. Based on our survey results, most of the band-to-band registration methods can only solve the misalignment globally, and they fail to register bands with local warps shown in Fig. 2. This is problem is not brought only by this particular multispectral line scanner. Another new type of spectrometer was proposed and the technique is abbreviated as IRIS.^{11}^{,}^{12} Based on the IRIS design scheme, all multispectral bands are projected on a single conventional detector array, and this kind of spectrometer may also suffer the above problem when it is used for airborne applications.

In this paper, a new elastic band-to-band registration method is proposed to solve the above problem. The rest of this paper is organized as follows. Feature images are introduced in Sec. 1, then a new elastic image registration method is proposed in Sec. 2. In Sec. 4, we demonstrate the high accuracy of registration between bands with the new method for some real flight data. Finally, conclusions are provided in Sec. 5.

## 2.

## Methodology

## 2.1.

### Feature Images

Nonrigid warps between multispectral bands require elastic registration methods considering both light intensity variation and local warping. Human eyes are very sensitive to edges rather than smooth areas of a scene. This is the reason humans can easily recognize the same region, although the region of intensity is different across different wavelength ranges. Considering the characteristic of human eyes, we mimic its edge detection by introducing a gradient operator to explore the similarities within multispectral images. We define feature images as follows:

## (1)

$${F}^{k}(x,y)=\sqrt{{\left(\frac{\partial {I}^{k}(x,y)}{\partial x}\right)}^{2}+{\left(\frac{\partial {I}^{k}(x,y)}{\partial y}\right)}^{2}},$$## 2.2.

### Feature Images-Based Elastic Image Registration

Because the feature images summarize the major transition of the regions by exploring the similarity within bands, we align the corresponding feature images instead of multispectral images with a new elastic image registration method. We denote the reference and source feature images as $T(x,y)$ and $S(u,v)$, respectively.

In this paper, the idea of the inverse compositional algorithm^{13}^{,}^{14} is adopted and expanded in dealing with both rigid and local warping with high efficiency. To improve the efficiency, we warp the target image toward the source image so that a Hessian matrix can be obtained outside of the iterations, which avoids the clumsy step-by-step warping with a Taylor series expansion as in Ref. 9.

## 2.2.1.

#### Global affine model

We use a seven-dimensional vector to describe the warp and the brightness variations between the source and reference images. The motion between them can be modeled by an affine transform

## (2)

$$T(x,y)=S(u,v)+{m}_{7}\phantom{\rule{0ex}{0ex}}=S({m}_{1}x+{m}_{2}y+{m}_{5},{m}_{3}x+{m}_{4}y+{m}_{6})+{m}_{7},$$Equation (2) can also be denoted as $T(x,y)=S[W(x,y;M)]+{m}_{7}$, where

## (3)

$$W(x,y;M)=\left[\begin{array}{ccc}{m}_{1}& {m}_{2}& {m}_{5}\\ {m}_{3}& {m}_{4}& {m}_{6}\\ 0& 0& 1\end{array}\right]\left[\begin{array}{c}x\\ y\\ 1\end{array}\right]$$In general, the process of image registration is to keep the source image aligned with the reference image; therefore, warping the source image toward the reference image step-by-step in each iteration is very common. In contrast, in our method, the reference image is warped toward the source image step-by-step and the inverse of the final warping parameters is applied to the source image to complete the registration.

We set $\overrightarrow{\tilde{m}}={[\tilde{{m}_{1}},\tilde{{m}_{2}},\tilde{{m}_{3}},\tilde{{m}_{4}},\tilde{{m}_{5}},\mathrm{\Delta}{m}_{6},{m}_{7}]}^{\mathrm{T}}$ and

## (4)

$$T[W(u,v;\mathrm{\Delta}M)]=T(\tilde{{m}_{1}}u+\tilde{{m}_{2}}v+\tilde{{m}_{5}},\tilde{{m}_{3}}u+\tilde{{m}_{4}}v+\tilde{{m}_{6}})\phantom{\rule{0ex}{0ex}}=S(u,v)+{m}_{7}.$$The cost function is as follows:

## (5)

$$E(\overrightarrow{\tilde{m}})=\sum _{u,v\in \mathrm{\Omega}}{[T(\tilde{{m}_{1}}u+\tilde{{m}_{2}}v+\tilde{{m}_{5}},\tilde{{m}_{3}}u+\tilde{{m}_{4}}v+\tilde{{m}_{6}})-S(u,v)-{m}_{7}]}^{2},$$The goal is to minimize the above cost function to calculate $\mathrm{\Delta}M$, which is the inverse of the affine parameter matrix $M$. If $T[W(u,v;\mathrm{\Delta}M)]$ is expanded using a first-order truncated Taylor series, then

## (6)

$$E(\overrightarrow{\tilde{m}})\approx \sum _{u,v\in \mathrm{\Omega}}{[T+(\tilde{{m}_{1}}u+\tilde{{m}_{2}}v+\tilde{{m}_{5}}-u){T}_{u}+(\tilde{{m}_{3}}u+\tilde{{m}_{4}}v+\tilde{{m}_{6}}-v){T}_{v}-S-{m}_{7}]}^{2}.$$$T$ and $S$ are symbolized as lexicographically ordered vectors of $T(x,y)$ and $S(x,y)$ of length ${N}^{2}$, where ${N}^{2}$ denotes the size of images. ${T}_{u}$ and ${T}_{v}$ are the spatial derivatives of $T(u,v)$ and are symbolized as lexicographically ordered vectors of length ${N}^{2}$. Then the error function can be simplified as follows:

## (7)

$$E(\overrightarrow{\tilde{m}})={\parallel k-{\overrightarrow{c}}^{\mathrm{T}}\overrightarrow{\tilde{m}}\parallel}_{2}^{2},$$Then the error function in Eq. (7) can be minimized by differentiating $E(\overrightarrow{\tilde{m}})$ with respect to the unknowns $\overrightarrow{\tilde{m}}$ as

## (10)

$$\frac{\mathrm{d}E(\overrightarrow{\mathrm{\Delta}m})}{\mathrm{d}\overrightarrow{\mathrm{\Delta}m}}=-2\overrightarrow{c}[k-{\overrightarrow{c}}^{\mathrm{T}}\overrightarrow{\mathrm{\Delta}m}].$$Then

## (11)

$$\overrightarrow{\mathrm{\Delta}m}={[\overrightarrow{c}{\overrightarrow{c}}^{\mathrm{T}}]}^{-1}[\overrightarrow{c}k].$$Note that $\overrightarrow{\mathrm{\Delta}m}$ in Eq. (11) is the warping parameter set for $T(x,y)$ to be aligned with $S(u,v)$, so the affine transform matrix $M$ for $S(u,v)$ is obtained as follows:

The initial $M$ is a $3\times 3$ identity matrix. Initially, ${m}_{7}$ is set to 0 and is updated at each iteration. After several iterations, the source image will be warped by $M$ to match the reference image.

Note that, compared with Periaswamy’s method, $[\overrightarrow{c}{\overrightarrow{c}}^{\mathrm{T}}]$ (the Hessian matrix of the reference image) is a constant matrix, because there is nothing in this matrix that depends on iteratively warped $S$, so it can be precomputed. The additional step is to calculate Eq. (12) at each iteration. Compared with the calculation of $[\overrightarrow{c}{\overrightarrow{c}}^{\mathrm{T}}]$, the computational cost of the additional step, i.e., $M=\mathrm{\Delta}{M}^{-1}$ is very small, because $\mathrm{\Delta}M$ is a $3\times 3$ matrix only. ${N}^{2}$ is set to be the number of pixels in the reference image, and then the saved cost in each iteration is $O({7}^{2}{N}^{2})$. In this way, the efficiency of Periaswamy’s algorithm has been improved in processing the global affine warp.

In dealing with large distortion between the source image and the target image, a coarse to fine scheme is adopted. A Gaussian pyramid as shown in Fig. 4 is built for both the source image and the reference image, and the global affine parameters are calculated from the coarsest level.

## 2.2.2.

#### Local shift model

Two images can be coarsely aligned after the global affine transform mentioned above. Now, a local warping between local regions needs to be considered. As in many other methods, the two images are divided into many boxes to deal with the local warping individually. The smaller the box size, the better the complex nonrigid warping can be approximated by translation. Therefore, a three-dimensional vector is adopted to describe the translation and brightness variation between the source image box and the reference image box.

Let $BT(x,y)$ and $BS(u,v)$ denote the reference and source image boxes, respectively. The motion between them can be modeled by a shift transform

where $u=x+{m}_{5}$ and $v=y+{m}_{6}$. Then, the above equation can be rewritten asWe can either shift the source box a particular amount to align it with the reference box, or shift the reference box in the inverse direction by the same amount to keep them aligned. The latter strategy is adopted to improve the efficiency as described in Sec. 2.2.1. Then, the cost function is defined by

## (15)

$${E}_{\text{box}}(\overrightarrow{m})=\sum _{u,v\in \mathrm{\Omega}}{[BT(u-{m}_{5},v-{m}_{6})-BS(u,v)-{m}_{7}]}^{2},$$The goal is to minimize this expression to calculate the translation parameters and the intensity variation. $BT(u-{m}_{5},v-{m}_{6})$ is expanded using a first-order truncated Taylor series, thus

## (16)

$${E}_{\mathrm{box}}(\overrightarrow{m})\approx {[BT-{m}_{5}B{T}_{u}-{m}_{6}B{T}_{v}-BS-{m}_{7}]}^{2},$$## (17)

$${E}_{\mathrm{box}}(\overrightarrow{m})={\parallel k-{\overrightarrow{c}}^{\mathrm{T}}\overrightarrow{m}\parallel}_{2}^{2},$$The cost function in Eq. (17) can be minimized by differentiating ${E}_{\mathrm{box}}(\overrightarrow{m})$ with respect to the unknowns $\overrightarrow{m}$ by

## (20)

$$\frac{\mathrm{d}{E}_{\mathrm{box}}(\overrightarrow{m})}{\mathrm{d}\overrightarrow{m}}=\sum _{u,v\in \mathrm{\Omega}}-2\overrightarrow{c}[k-{\overrightarrow{c}}^{\mathrm{T}}\overrightarrow{m}].$$Then

## (21)

$$\overrightarrow{m}={[\overrightarrow{c}{\overrightarrow{c}}^{\mathrm{T}}]}^{-1}[\overrightarrow{c}k].$$As we can see, the Hessian matrix ${[\overrightarrow{c}{\overrightarrow{c}}^{\mathrm{T}}]}^{-1}$ is a constant matrix, because there is nothing in this matrix that depends on the shifted source box, so it can be precomputed. Once it is calculated, it can be saved in memory for use in each iteration. On the contrary, the vector $[\overrightarrow{c}k]$ has to be calculated in each iteration. After a few iterations, the best $\overrightarrow{m}$ is obtained for Eq. (17) by making ${E}_{\mathrm{box}}(\overrightarrow{m})$ minimum between two boxes.

## 2.2.3.

#### Smoothness constraint

Usually, the motion parameters vary smoothly in most parts of the image. We assume that neighbor pixel locations may have constant/smooth motion parameters. This provides a very useful constraint for obtaining meaningful local alignment parameters. During every local registration iteration introduced above, to avoid a large difference between neighbor parameter sets and to express the movement parameters with intensity variation only, a smoothness constraint is applied to these parameters.

As in Refs. 9 and 15, a similar smoothness constraint is set up. The error function of every pixel is described as follows:

## (22)

$${E}_{\mathrm{pix}}(\overrightarrow{m})={E}_{xy}(\overrightarrow{m})+{E}_{\mathrm{sm}}(\overrightarrow{m}),$$^{9}

## (24)

$${E}_{\mathrm{sm}}(\overrightarrow{m})=\sum _{e=5}^{7}{\zeta}_{e}[{\left(\frac{\partial {m}_{e}}{\partial x}\right)}^{2}+{\left(\frac{\partial {m}_{e}}{\partial y}\right)}^{2}],$$## (25)

$$\frac{\mathrm{d}{E}_{\mathrm{pix}}(\overrightarrow{m})}{\mathrm{d}\overrightarrow{m}}=\frac{\mathrm{d}{E}_{xy}(\overrightarrow{m})}{\mathrm{d}\overrightarrow{m}}+\frac{\mathrm{d}{E}_{\mathrm{sm}}(\overrightarrow{m})}{\mathrm{d}\overrightarrow{m}}=0.$$The derivative of ${E}_{xy}(\overrightarrow{m})$ is

## (26)

$$\frac{\mathrm{d}{E}_{xy}(\overrightarrow{m})}{\mathrm{d}\overrightarrow{m}}=-2\overrightarrow{c}[k-{\overrightarrow{c}}^{\mathrm{T}}\overrightarrow{m}].$$To compute the derivative of $[\mathrm{d}{E}_{\mathrm{sm}}(\overrightarrow{m})]/(\mathrm{d}\overrightarrow{m})$, the method used in Refs. 9 and 15 is chosen by using the horizontal and vertical derivatives of ${m}_{e}(x,y)$ as

## (27)

$$\frac{\partial {m}_{e}(x,y)}{\partial x}={m}_{e}(x,y)-{m}_{e}(x+1,y),\phantom{\rule{0ex}{0ex}}\frac{\partial {m}_{e}(x,y)}{\partial y}={m}_{e}(x,y)-{m}_{e}(x,y+1),$$Then, Eq. (24) can be written as

## (28)

$${E}_{\mathrm{sm}}[{m}_{e}(x,y)]={\zeta}_{e}\{{[{m}_{e}(x,y)-{m}_{e}(x+1,y)]}^{2}\phantom{\rule{0ex}{0ex}}+{[{m}_{e}(x,y)-{m}_{e}(x,y+1)]}^{2}\}.$$Therefore, $\{\mathrm{d}{E}_{\mathrm{sm}}[{m}_{e}(x,y)]\}/[\mathrm{d}{m}_{e}(x,y)]$ can be approximated as

## (29)

$$\frac{\mathrm{d}{E}_{\mathrm{sm}}[{m}_{e}(x,y)]}{\mathrm{d}{m}_{e}(x,y)}=2{\zeta}_{e}\{{m}_{e}(x,y)-{m}_{e}(x+1,y)+{m}_{e}(x,y)-{m}_{e}(x,y+1)\}\phantom{\rule{0ex}{0ex}}=4{\zeta}_{e}[{m}_{e}(x,y)-\overline{{m}_{e}}(x,y)],$$Using vector notation, the derivative of ${E}_{\mathrm{sm}}(\overrightarrow{m})$ is denoted by discrete approximations^{9}

## (30)

$$\frac{\mathrm{d}{E}_{\mathrm{sm}}(\overrightarrow{m})}{\mathrm{d}\overrightarrow{m}}=2L(\overrightarrow{m}-\overline{\overrightarrow{m}}),$$## (31)

$${\overrightarrow{m}}^{(j+1)}={(\overrightarrow{c}{\overrightarrow{c}}^{\mathrm{T}}+L)}^{-1}(\overrightarrow{c}k+L{\overline{\overrightarrow{m}}}^{(j)}).$$At each iteration $j$, ${\overline{\overrightarrow{m}}}^{(j)}$ can be estimated from the current ${\overrightarrow{m}}^{(j)}$. The initial ${\overrightarrow{m}}^{(0)}$ can be estimated from Eq. (21).

Here, ${\zeta}_{e}=1$ in our experimental tests, and the selection of the size of local box $\mathrm{\Omega}$ depends on particular images. The smaller the box size, the more the elastic warping is fixed; however, the weaker the constraint that is applied. A larger box size means that more pixels provide constraints on the mean square error minimization. Therefore, it shows more rigid characteristics. A smaller box size brings more freedom with less constraint, but it also brings more error because the matrix $[\overrightarrow{c}{\overrightarrow{\overrightarrow{m}}}^{\mathrm{T}}]$ may not be invertible. There is a natural tradeoff in choosing the box size.

## 3.

## Overview and Implementation Details

Here, the overall structure of the new elastic image registration for multispectral images will be reviewed. First, feature images of the multispectral bands are created. One of the feature images is selected as a reference image and other feature images as source images. Then, the image pyramid is constructed from the two input feature (source and reference) images. The new image registration algorithm starts from the coarsest layer images. The global affine image registration is applied to these two coarsest source and reference feature images. After a certain number of global iterations, a parameter set $\mathrm{\Delta}M$ will be calculated. Because $\mathrm{\Delta}M={M}^{-1}$, $M$ can be calculated, which is used to warp the source image to the reference image for this current layer. Then it brings this warped source image and the reference image into the local shift registration procedure. In each local iteration, two images are divided into boxes, and the shift parameters are calculated for each coordinate position.

After each local shift registration iteration, a number of smoothness constraint iterations follow to avoid some unexpected errors. Once the local shift registration procedure and the smoothing procedure finish, the warping parameters within that layer are interpolated by a factor of 2 for the next finer-layer image until reaching the resolution of the original source and reference images. In order to avoid the artifacts of the bilinear interpolation in each warping procedure for the source image pyramid, an accumulated warp is finally applied to the original source image pyramid layer at each level. For example, if 15 global iterations were to be adopted, there would be 15 steps in warping the source image layer toward the reference image layer at each scale. Therefore, 15 times bilinear interpolation would be sequentially applied to the same source image layer. Then, the blur artifacts would be extremely severe. Therefore, rather than continuously warping and interpolating the same source image layer, the warping parameter set is accumulated step-by-step and the overall warping parameter set is applied to the original source image layer in a single warp at the end for each source layer image.

Based on the feature images of both the reference and the source, the pixel translation of each position: ${m}_{5}$ and ${m}_{6}$ can be calculated between the reference feature image and the source feature image. Finally, we warp (i.e., a grid step plus a bilinear interpolation step) the source band toward the reference band with the precalculated translation values ${m}_{5}$ and ${m}_{6}$. The above operation should be routinely carried out for each band. Eventually, the whole multispectral band-to-band registration is completed.

To help the readers better understand the whole process of our method, a flow chart is shown in Fig. 5. Moreover, MATLAB code and other details of our algorithm can be found at http://code.google.com/p/b2bregistration.

## 4.

## Experimental Results

In order to evaluate the performance of the proposed method, we tested our method with real flight data. By reforming an unmanned aerial vehicle (UAV), multipayloads such as the new type of wide-view multispectral line scanner introduced above, a hyperspectral camera and a panchromatic array camera are integrated into the vehicle. The flight was conducted in a testing site (for calibration and validation) near BaoTou city of China on September 3, 2011. The total flight time was over 10 h with three different flight altitudes. One of the test data sets acquired by the new multispectral line scanner without the geometric correction procedure is shown in Fig. 2. The true color image shown in Fig. 6(a) indicates the misalignment between bands. By setting the red band image Fig. 2(c) as a reference, we register the blue band Fig. 2(a) and the green band Fig. 2(b) to the reference, respectively. The true color image after the elastic registration is shown in Fig. 6(b). We can see that the true color image shows no misalignment between bands. Note that the true color image still suffers from some geometric distortion. This is because geometric distortion exists in the red band, which is the reference.

To decrease the geometric distortion in the reference image, we took images captured by the array camera in another flight track as the reference. Moreover, a geometric correction step was used to achieve the goal. In our flight test, payloads were equipped on an airborne photoelectric stabilized platform. Given the position and orientation system (POS) parameter and the compensation parameters from the airborne photoelectric stabilized platform, a geometric correction step can be applied to the raw image data to decrease the geometric error brought by the UAV attitude. Images captured by the panchromatic array camera generally suffer less local warps caused by the vibration of the platform than those from line scanner cameras. The reason is that each line data may suffer different POS and photoelectric stabilized platform parameters during the exposure period. A further test image captured by the array camera on the same day but in another track of the flight is shown in Fig. 7(a) with size of $1024\times 1024\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{pixels}$. An enlarged region of (a) is shown in Fig. 7(b). Images of the same region of the multispectral data after geometric correction are shown in Figs. 7(c)–7(f) of blue, green, red, and near-infrared, respectively. From Figs. 7(c)–7(f), we can see that there are still some nonlinear local warps between bands, although they are much less than those in the data shown in Fig. 2. By setting the image from the array camera as the reference band, the same region of the registered images of blue, green, red, and near-infrared are shown in Figs. 7(g)–7(j), respectively. Moreover, to help the readers have a visual comparison, the true color image of the same region before and after the elastic registration procedure are shown in Figs. 7(k) and 7(l), respectively. We can see that the true color image of Fig. 7(l) contains less misalignment between registered bands and less geometric distortions. In this experiment, $\mathrm{\Omega}=40\times 40$ was selected as the box size for solving local warps.

To demonstrate the performance of our band-to-band registration method quantitatively, we carried out a numerical comparison by evaluating the misalignment between the registered bands and the panchromatic image before and after registrations. We randomly selected 11 testing points on edges or corners in the whole images to ensure that these points are identifiable for all bands. This is because testing points from smooth regions cannot be uniquely identified by either human eyes or evaluation methods. After selecting the testing points, a paraboloid surface fit method^{16} was adopted to evaluate the registration accuracy in subpixels. The mean of the horizontal and vertical misalignments of the 11 points between bands and the reference (the panchromatic image captured by the onboard array camera) is shown in Table 1. Both visual and numerical comparison show that our elastic band-to-band registration method works well for solving different nonlinear warping and local translations between bands.

## Table 1

Numerical comparison results before and after the registration.

Bands | Horizontal shift average value in pixels | Vertical shift average value in pixels | ||
---|---|---|---|---|

Before registration | After registration | Before registration | After registration | |

Blue band | 4.79 | 0.20 | 0.71 | 0.09 |

Green band | 3.55 | 0.11 | 1.02 | 0.08 |

Red band | 3.32 | 0.09 | 0.66 | 0.07 |

Near-infrared band | 4.03 | 0.21 | 0.59 | 0.25 |

## 5.

## Conclusions

In this paper, a new elastic band-to-band registration method is proposed for solving local warps between bands. Gradient-based feature images are constructed as intermediate products to assist the registration. Instead of registering multispectral bands straightforwardly, we apply a new intensity-based elastic registration method to register those feature images to calculate their corresponding warping parameters. Experimental results of real data have shown that the proposed method can correct local warps in band-to-band registration with satisfactory performance. With a growing number of real applications of the new multispectral line scanner of large FOV and short focal length, this band-to-band registration method will certainly attract more attention.

## Acknowledgments

Feng Li acknowledges the support of NSFC under Grant 41371415. Yi Guo is supported by the CSIRO Minerals Down Under Research Flagship and Capability Development Fund.

## References

## Biography

**Feng Li** received his PhD degree from the University of New South Wales in 2009. He joined CSIRO in the same year and developed algorithms for the applications of compressive sensing to radio astronomy. Currently, he is working at the Academy of Opto-electronics, Chinese Academy of Sciences. His research interests are in general image processing, image restoration, and compressive sensing.

**ChuanRong Li** is a professor at the Academy of Opto-electronics, Chinese Academy of Sciences. His research is mainly on remote sensing data processing, geometric correction, image processing, and remote sensing data calibration and validation.

**LingLi Tang** is a professor at the Academy of Opto-electronics, Chinese Academy of Sciences. Her research is mainly on remote sensing data preprocessing, SAR data processing, image processing, and remote sensing data calibration and validation.

**Yi Guo** obtained a BE degree in electrical engineering from North China University of Technology in 1998 and a PhD degree from University of New England in 2008. He has been with the Commonwealth Science and Industry Organisation (CSIRO) as a research scientist since 2008. His research focuses on machine learning and computational statistics for big data.