## 1. Introduction

In aerial imagery, estimating the global motion, that is, the apparent motion caused by the moving airborne observer, is a key part of many aerial surveillance applications,^{1} such as image stabilization, motion detection, mosaicking, and coding. Generally, the global motion of aerial imagery can be represented by a 2-D parametric model, such as an affine or planar model, and can be estimated directly by a parametric optimization process.^{2} Block-based methods, feature-based algorithms, and frequency-domain methods have also been widely used for global motion estimation.^{3, 4} However, many of these methods are either time-consuming or not robust enough to cope with uncertainties or outliers. RANSAC is well suited to eliminating outliers, but the random nature of the algorithm makes its direct use inefficient.^{5}

Since global motion usually dominates aerial imagery, compared with small independent or local motions and other distracting ones, it is natural to obtain the global motion model indirectly by fitting the optical flow field of the imagery. However, because the standard linear regression technique used in the fitting is sensitive to outliers, either uncertainties^{6} or independent motions^{7} in the optical flow field can ruin the final fitting result. Therefore, instead of fitting the whole optical flow field, we select a small set of reliable flow components for fitting, so that the negative effect of flow outliers is greatly reduced.

## 2. Optical Flow Estimation and Motion Model Fitting

Our optical flow computation is based on two well-known assumptions: brightness constancy and flow smoothness. To save computation, we run the algorithm within the coarse-to-fine hierarchical framework proposed in Ref. 2. The multiscale implementation allows for both computational efficiency and the estimation of large motions. In addition, if the algorithm proceeds only to a middle layer of the image pyramid, e.g., layer $n/2+1$ of an $n$-layered pyramid, distracting motions and noise are also partly filtered out, which benefits global motion model fitting.

Now we discuss how to use the estimated optical flow field to fit a global motion model. Assume that the motion model has the form

$$(u,v)=\Phi\,(a_{u},a_{v}), \tag{1}$$

where $(u,v)$ is the motion vector. For an affine motion model, we have $\Phi=(x,y,1)$, $a_{u}^{T}=(a_{1},a_{2},a_{3})$, and $a_{v}^{T}=(a_{4},a_{5},a_{6})$. Using the least-squares regression technique, the motion model parameters can be derived as follows:

$$(a_{u},a_{v})=\Bigl(\sum \Phi^{T}\Phi\Bigr)^{-1}\sum \Phi^{T}(u,v),\quad (x,y)\in D_{I}. \tag{2}$$

While this approach provides a simple mechanism for global motion estimation, it is unfortunately of limited use because it is sensitive to outliers, which correspond to either uncertainties or independent motions in the optical flow field. Therefore, to get a more accurate motion model, the outliers have to be removed, which leads to the following refined method.
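As an illustration of the least-squares fit in Eq. (2) and of its sensitivity to outliers, the following NumPy sketch fits an affine model to a synthetic flow field. The function name `fit_affine`, the grid size, and the parameter values are hypothetical choices made for this example only, and `numpy.linalg.lstsq` is used in place of the explicit inverse in Eq. (2) because it solves the same problem more stably.

```python
import numpy as np

def fit_affine(x, y, u, v):
    """Least-squares affine fit of a flow field, in the spirit of Eq. (2).

    x, y, u, v are 1-D arrays of equal length: pixel coordinates and the
    flow components measured there. Returns (a_u, a_v), each of shape (3,).
    """
    # Each row of Phi is (x, y, 1), so that u ~ Phi @ a_u and v ~ Phi @ a_v.
    Phi = np.column_stack([x, y, np.ones(len(x))])
    # Solve both least-squares problems at once.
    params, *_ = np.linalg.lstsq(Phi, np.column_stack([u, v]), rcond=None)
    return params[:, 0], params[:, 1]

# Synthetic flow field on a 32x32 grid generated from known parameters.
ys, xs = np.mgrid[0:32, 0:32]
x, y = xs.ravel().astype(float), ys.ravel().astype(float)
a_u_true = np.array([0.02, -0.01, 1.5])
a_v_true = np.array([0.01, 0.03, -2.0])
Phi = np.column_stack([x, y, np.ones(len(x))])
u, v = Phi @ a_u_true, Phi @ a_v_true

# With clean data the fit recovers the generating parameters.
a_u, a_v = fit_affine(x, y, u, v)

# A small independently moving patch (outliers) biases the global fit.
u_bad = u.copy()
u_bad[(x < 8) & (y < 8)] += 5.0
a_u_bad, _ = fit_affine(x, y, u_bad, v)

print("max error, clean flow:   ", np.abs(a_u - a_u_true).max())
print("max error, with outliers:", np.abs(a_u_bad - a_u_true).max())
```

In this toy setting only about 6% of the flow vectors are corrupted, yet the fitted parameters shift noticeably, which is the behavior that motivates removing outliers before fitting.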

## 3. Optical Flow Valuation and Outlier Removal

To evaluate the optical flows and remove flow outliers that are unfit for motion model fitting, we first divide the optical flow field into an array of non-overlapping regions and derive a set of motion hypotheses by fitting each region separately. Suppose that $a_{ui}^{T}$, $a_{vi}^{T}$ is the $i$th hypothesis; then for each region $R_{i}$ we have

$$(a_{ui},a_{vi})=\Bigl(\sum \Phi^{T}\Phi\Bigr)^{-1}\sum \Phi^{T}(u_{i},v_{i}),\quad (x,y)\in R_{i}, \tag{3}$$

with the residual error of the fit given by

$$\sigma_{i}^{2}=\Bigl(\sum \lVert V_{i}-V_{ai}\rVert^{2}\Bigr)/N_{i},\quad (x,y)\in R_{i}. \tag{4}$$

## 4. Selection of Optical Flows for Fitting
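The selection procedure in this section operates on the per-region residual errors of Eq. (4), so it may help to see how they could be computed from a dense flow field. The NumPy sketch below fits one affine hypothesis per region as in Eq. (3) and then evaluates Eq. (4). It assumes that $V_{i}$ is the estimated flow vector, $V_{ai}$ is the flow predicted by the $i$th hypothesis, and $N_{i}$ is the number of flow vectors in region $R_{i}$; this reading of the symbols, the function name `region_residuals`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def region_residuals(u, v, region_h, region_w):
    """Fit one affine hypothesis per non-overlapping region and return the
    mean squared residual of each region as a 2-D array.

    u, v are 2-D flow components of identical shape (H, W). Rows and columns
    that do not fill a whole region are ignored.
    """
    H, W = u.shape
    rows, cols = H // region_h, W // region_w
    sigma2 = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            ys, xs = np.mgrid[r * region_h:(r + 1) * region_h,
                              c * region_w:(c + 1) * region_w]
            # Rows of Phi are (x, y, 1) for every pixel of the region.
            Phi = np.column_stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
            V = np.column_stack([u[ys, xs].ravel(), v[ys, xs].ravel()])
            A, *_ = np.linalg.lstsq(Phi, V, rcond=None)   # region hypothesis
            # Mean squared distance between measured and predicted flow.
            sigma2[r, c] = np.sum((V - Phi @ A) ** 2) / V.shape[0]
    return sigma2

# Synthetic 64x64 flow field: an affine global motion everywhere ...
ys, xs = np.mgrid[0:64, 0:64].astype(float)
u = 0.02 * xs - 0.01 * ys + 1.5
v = 0.01 * xs + 0.03 * ys - 2.0
# ... plus a small moving object that straddles two 16x16 regions.
u[20:28, 40:56] += 3.0

sigma2 = region_residuals(u, v, 16, 16)
print(np.round(sigma2, 3))
```

Only the two regions touched by the moving object, at grid positions (1, 2) and (1, 3), receive a clearly non-zero residual. An object that fills a whole region with a motion that is itself affine would fit its own hypothesis well, so this measure flags regions whose flow is internally inconsistent.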

A troublesome problem in optical flow valuation and outlier removal is how to choose an appropriate threshold, which varies with the region size. In fact, if the region size is larger, we tend to choose a higher threshold, since the residual error defined in Eq. (4) will increase, and vice versa. Empirically, for aerial imagery with a dominant global motion, selecting a small proportion (e.g., 5%) of reliable optical flows is sufficient to arrive at an accurate motion model. Therefore, instead of trying to find a proper threshold for outlier removal, we prefer to select a small set of reliable flows with relatively small residual errors, so that the flow outliers are excluded. Since the same proportion criterion can usually be kept for different aerial imagery, the bothersome threshold selection is avoided. The flow valuation and selection algorithm is outlined below:

(i) Set an initial threshold $\Omega=\Omega_{0}$ (usually assigned a very small value) and a searching step $\Delta\Omega$. The proportion criterion is set to $\beta$.

(ii) Divide the optical flow field into non-overlapping regions and compute the residual errors, i.e., $\sigma_{i}^{2}$, $i=1,2,\dots,M$, according to Eq. (4), where $M$ is the number of regions.

(iii) Find the flow regions that satisfy $\sigma_{i}^{2}<\Omega$; the number of selected regions is denoted by $N$.

(iv) If $N/M\geq\beta$, stop searching and take $\Omega$ as the final threshold; otherwise set $\Omega=\Omega+\Delta\Omega$ and go to step (iii).

Once the threshold $\Omega$ is determined, the reliable flow regions are extracted and the flow outliers are rejected at the same time. Finally, by fitting the selected optical flows again, a more accurate global motion model is obtained. As only a small number of flows take part in the fitting, the efficiency of the algorithm is further improved.
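Steps (i) to (iv) can be sketched as a short threshold search over precomputed residual errors. The function name `select_regions`, the default values of `omega0` and `delta`, and the example residuals below are hypothetical and chosen only to make the sketch runnable.

```python
import numpy as np

def select_regions(sigma2, beta, omega0=1e-3, delta=1e-3):
    """Raise the threshold in steps of delta until at least a fraction beta
    of the regions has a residual error below it.

    Returns the final threshold and a boolean mask of the selected regions.
    """
    sigma2 = np.asarray(sigma2, dtype=float)
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    if delta <= 0 or not np.all(np.isfinite(sigma2)):
        raise ValueError("delta must be positive and residuals finite")
    M = sigma2.size
    k = 0
    while True:
        omega = omega0 + k * delta      # step (iv): Omega += delta
        selected = sigma2 < omega       # step (iii)
        if selected.sum() / M >= beta:  # step (iv): stopping rule
            return omega, selected
        k += 1

# Ten made-up residual errors; beta = 0.2 asks for at least two regions.
sigma2 = np.array([0.0125, 0.40, 0.0035, 1.5, 0.08,
                   0.90, 0.025, 2.2, 0.60, 0.30])
omega, selected = select_regions(sigma2, beta=0.2)
print(omega, np.flatnonzero(selected))
```

The flows inside the selected regions would then be pooled and fitted once more to obtain the final model. Because the search stops at the first threshold that admits enough regions, the result depends mainly on the proportion `beta`; the step size only controls how finely the threshold is resolved.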

## 5. Experimental Results

We first test the performance of our algorithm by registering the two aerial images shown in Fig. 1. There are clearly both large scale and rotation changes between Fig. 1a and Fig. 1b. Figures 1c and 1d show two optical flow selection results (the areas outlined in black) obtained with the proposed algorithm under two different proportion criteria. The optical flow field is calculated on a five-layered Gaussian pyramid, and the computation proceeds only to the fourth layer, which means that the size of the optical flow field is actually $128\times 128$. The same region size, $16\times 16$, is used in both cases, and the proportion criteria are set to 5% and 30%, respectively. Accordingly, the final threshold is determined as 0.0965 in Fig. 1c, where three flow regions are selected, while 19 regions are selected in Fig. 1d with the threshold determined as 1.5084. By fitting the selected optical flows to an affine model, we get $a_{u}^{T}=(1.2791,-0.2644,-8.2940)$, $a_{v}^{T}=(0.2769,1.2855,-78.5839)$ in the first case and $a_{u}^{T}=(1.1744,-0.1175,-9.6717)$, $a_{v}^{T}=(0.1739,1.1751,-52.0925)$ in the second. It is evident from the registration results in Fig. 1e and Fig. 1f that the global motion estimate in Fig. 1e is much more accurate, because fewer but more precise optical flows were chosen in Fig. 1c for motion model fitting, whereas in Fig. 1d the inclusion of more, and probably less precise, flows leads to the inaccurate fitting result in Fig. 1f.

Figure 2 shows another example, in which our algorithm is applied to independent motion detection. To detect the moving truck from the two consecutive frames shown in Fig. 2a and Fig. 2b, the apparent background motion, i.e., the global motion, has to be compensated. For optical flow selection and outlier removal, the region size is set to $16\times 12$, the proportion criterion is 5%, and the final threshold is determined as 0.0409 with five regions selected. It is clear from Fig. 2c that the regions containing the independent motion of the truck have been successfully excluded. Here the multiscale optical flow computation proceeds only to the third layer of a four-layered Gaussian pyramid, so that the actual size of the optical flow field in Fig. 2c is $160\times 120$. By fitting the selected optical flows to an affine motion model, we get $a_{u}^{T}=(1.0019,0.0026,-7.6282)$, $a_{v}^{T}=(-0.0044,1.0002,5.5330)$. Figure 2d shows the residual motion image^{8} after the global motion has been compensated, from which the moving truck can be easily detected.

## 6. Conclusions

Exploiting the dominance of global motion in aerial imagery, this letter proposes a robust global motion estimation algorithm based on fitting the optical flow field. Since optical flow outliers are almost completely removed by choosing a small proportion of reliable flows for motion model fitting, both the accuracy and the robustness of global motion estimation are considerably improved. Experimental results on aerial image registration and independent motion detection show the effectiveness of our algorithm.

## References

*Tech. Report* IRIS-03-421, Institute for Robotics and Intelligent Systems, University of Southern California (2003).