Open Access
1 April 2007 Fast motion estimation with adaptive search range adjustment
Si-Woong Lee, Seong-Mo Park, Hyun-Soo Kang
Author Affiliations +
Abstract
A fast motion estimation algorithm based on the adaptive search range adjustment (ASRA) is presented. The size of search range for each block is adaptively controlled according to the initial cost at the search center, and then the full search algorithm (FSA) is performed to find the optimal matching block in the adjusted search range. The proposed method is implemented in H.264/AVC reference software (JM) to demonstrate its efficiency over the conventional FSA.

1.

Introduction

Block matching motion estimation is the core process in various video coding standards because of its efficiency in removing the temporal redundancy. However, motion estimation is very computationally intensive, if the exhaustive fast search algorithm (FSA) is employed to find the optimal matched block in the reference frame. To alleviate the heavy computation, many fast algorithms, such as the well-known three-step search (3SS),1 the four-step search (4SS),2 the diamond search (DS),3 and the hexagon-based search (HEXBS),4 have been proposed. Except for the successive elimination algorithm (SEA),5 these algorithms achieve fewer computations by sampling the search points in the search range, i.e., searching over a subset of possible candidate points with certain search patterns. Although algorithms in this category considerably reduce the computational complexity, they occasionally produce ambiguous search directions, and hence result in a local minimum.

This letter proposes a new fast motion estimation algorithm, which focuses on reducing computation not by sampling the search points but by reducing the size of the search range. Based on the assumption of convexity of the unimodal error surface, the proposed method makes use of the fact that the search center is probably close to the optimal matching point when the initial cost at the search center is efficiently low. Thus, in such cases, it’s possible to reduce the size of the search range for increased speed without much increase in matching error, which is the basic concept of the proposed method.

2.

Proposed Method

In general, motion vectors of spatially adjacent blocks are highly correlated if they belong to the same moving area. Having a predictive motion vector (MVp) for a current block as the search center, as in the H.264/AVC JM,6, 7 the initial cost at the search center J(MVp) is a good estimate for correlation between the motion vector of the current block (MVc) and MVp . If J(MVp) is low, it means that MVp is probably close to MVc , and hence the size of the search range can be reduced to a proper size without missing the global minimum. On the other hand, if J(MVp) is high, it means that MVp is not so correlated with MVc , and thus the size of the search range should be larger in order to include the global minimum in the search range. Based on this background, the block matching process for each block is performed as follows:

  • 1. The MVp for a current block is determined in the same way as H.264/AVC, and the search center is set to MVp .

  • 2. Using MVp , J(MVp) is measured according to the definition of the cost.

  • 3. With J(MVp) , the size of the search range for a current block, SRvar , is adjusted as

    Eq. 1

    SRvar=SRfixed2Lk,ifTk<J(MVp)Tk+1,k=0,1,,L,
    where SRfixed denotes the size of the fixed search range used in JM, and L+1 represents the number of search range sizes in the adaptive search range adjustment (ASRA)–based method. Note that SRvar reduces to the power of 2 as J(MVp) decreases. For SRfixed=16 and L=2 , there are three search range sizes of ±16 , ±8 , and ±4 .

  • 4. The FSA is performed to estimate MVc in the adjusted search range.

3.

Threshold Decision

With SRfixed=16 and L=2 , two thresholds T1 and T2 (T1<T2) must be given to discriminate the ranges of J(MVp) for the three kinds of search range such as

Eq. 2

SRvar={4,ifJ(MVP)<T18,elseifJ(MVP)<T216,else.
Taking advantage of the similarity of the minimum costs among the neighboring blocks in the same moving region, we suggest determining the thresholds using the costs of already coded neighboring blocks as follows;

Eq. 3

T1=αMed[JA,JB,JC],

Eq. 4

T2=αMax[JA,JB,JC],
where α is a scaling constant, and JA , JB , and JC are stored minimum costs of the left (A), the upper (B), and the upper-right (C) blocks of the current block, respectively. Med[ a , b , c ] and Max[ a , b , c ] are functions returning the median and maximum values among a , b , and c . Figure 1 illustrates the relative locations of the neighboring blocks to the current block (X). For all block partitioning modes including 16×8 and 8×16 modes, the locations follow the definition used in JM to find the predictive motion vector by median prediction. If block C is unavailable, the upper-left block D replaces C. Note that when the partitioning mode of a current block is different from that of neighboring blocks, the stored costs of neighboring blocks are scaled in accordance to the size of the current block.

Fig. 1

Determination of the neighboring blocks.

040504_1_1.jpg

4.

Experimental Results

To evaluate the performance of the proposed technique, Foreman, Stefan, and Soccer sequences with the CIF format were tested with H.264/AVC JM9.4. The encoding parameters are as follows: number of frames—200, frame rate— 15Hz , target bit rate— 256kbits ; RD optimization off; 16×16 , 16×8 , 8×16 , and 8×8 block modes on; number of reference frames—1.

Table 1 shows the performance comparison between the proposed algorithm and the conventional FSA. Note that the spiral search method with MVp as the initial search point is used in both methods. The computational complexity is measured relative to the FSA with a ±16 search range. The results are summarized as follows:

  • In terms of computational complexity, the proposed method has a significant gain. When α=2.0 , the computational complexity is approximately 30% compared to the FSA with a ±16 search range and nearly equivalent to the FSA with a ±8 search range.

  • In terms of reconstructed image quality, the proposed method keeps up with the FSA with a ±16 search range. The average PSNR decreases 0.2dB at most. In contrast, the FSA with a ±8 search range exhibits severe degradation from 0.4dB to 1.1dB . This is because those blocks having large motion cannot be accurately motion-compensated with a small fixed search range. On the contrary, for such cases, the proposed ASRA-based approach may provide the full search range of ±16 , so the accuracy of the motion estimation can be maintained. Table 2 shows the distributions of selected search ranges in the proposed method.

Table 1

Performance comparison between the proposed approach and the conventional FSA.

ForemanStefanSoccer
MethodPSNRComplexityPSNRComplexityPSNRComplexity
FSA(±16) 35.34128.39131.521
FSA(±8) 34.980.26527.070.26530.950.265
ASRA(α=1.0) 35.290.49028.290.50331.430.506
ASRA(α=1.5) 35.270.34328.200.34831.360.333
ASRA(α=2.0) 35.250.27328.180.26131.320.275

Table 2

Distribution of selected search ranges in the proposed approach.

Foreman (%)Stefan (%)Soccer (%)
Method ±16 ±8 ±4 ±16 ±8 ±4 ±16 ±8 ±4
ASRA(α=1.0) 34.436.129.526.137.336.636.235.728.1
ASRA(α=1.5) 22.023.055.016.223.760.121.620.358.1
ASRA(α=2.0) 16.514.868.710.715.473.917.112.370.6

Figure 2 shows PSNR and complexity graphs for a range of α . We can observe that as α grows, the reduction ratio of the complexity decreases, and it is nearly saturated over α=2.0 , whereas the PSNR drops continually, although the degree of drop is negligible. From this result, a value about 2.0 seems proper as the empirical value for α .

Fig. 2

PSNR and complexity versus α : (a) PSNR; (b) complexity.

040504_1_2.jpg

5.

Conclusions

This letter has presented a fast motion estimation algorithm based on the ASRA. The size of the search range is adaptively adjusted according to the initial cost at the search center. Experimental results demonstrate that the proposed method saves a significant part of computations while providing comparable performance to the conventional FSA. If we combine our method with a fast FSA such as SEA, additional reduction in computation can be simply obtained. In addition, the ASRA-based approach has a strong point with respect to the hardware implementation, since it has a regular structure like the FSA.

Acknowledgments

This research was supported by the Ministry of Information and Communication (MIC), Korea, under the Information Technology Research Center (ITRC) support program supervised by the Institute of Information Technology Advancement (IITA), IITA-2006-(C1090-0603-0002).

References

1. 

J. Jain and A. Jain, “Displacement measurement and its application in interframe image coding,” IEEE Trans. Commun., 29 (12), 1799 –1808 (1981). https://doi.org/10.1109/TCOM.1981.1094950 0090-6778 Google Scholar

2. 

L.-M. Po and W.-C. Ma, “A novel four-step search algorithm for fast block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., 9 (2), 313 –317 (1996). 1051-8215 Google Scholar

3. 

S. Zhu and K.-K. Ma, “A new diamond search algorithm for fast block matching motion estimation,” IEEE Trans. Image Process., 9 (2), 287 –290 (2000). 1057-7149 Google Scholar

4. 

C. Zhu, X. Lin, and L. P. Chau, “Hexagon-based search pattern for fast block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., 12 (5), 349 –355 (2002). https://doi.org/10.1109/TCSVT.2002.1003474 1051-8215 Google Scholar

5. 

W. Li and E. Salari, “Successive elimination algorithm for motion estimation,” IEEE Trans. Image Process., 4 105 –107 (1995). https://doi.org/10.1109/83.350809 1057-7149 Google Scholar

6. 

ITU-T Recommendation H.264Draft text of final draft standard for advanced video coding,” (2003) Google Scholar

7. 

ISO/IEC JTC1/SC29/WG11, “Working draft of reference software for advanced video coding,” (2003) Google Scholar
©(2007) Society of Photo-Optical Instrumentation Engineers (SPIE)
Si-Woong Lee, Seong-Mo Park, and Hyun-Soo Kang "Fast motion estimation with adaptive search range adjustment," Optical Engineering 46(4), 040504 (1 April 2007). https://doi.org/10.1117/1.2721767
Published: 1 April 2007
Lens.org Logo
CITATIONS
Cited by 16 scholarly publications.
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Motion estimation

Information technology

Computer engineering

Optical engineering

Video coding

Communication engineering

Diamond

Back to Top