Block matching motion estimation is the core process in various video coding standards because of its efficiency in removing the temporal redundancy. However, motion estimation is very computationally intensive, if the exhaustive fast search algorithm (FSA) is employed to find the optimal matched block in the reference frame. To alleviate the heavy computation, many fast algorithms, such as the well-known three-step search (3SS),1 the four-step search (4SS),2 the diamond search (DS),3 and the hexagon-based search (HEXBS),4 have been proposed. Except for the successive elimination algorithm (SEA),5 these algorithms achieve fewer computations by sampling the search points in the search range, i.e., searching over a subset of possible candidate points with certain search patterns. Although algorithms in this category considerably reduce the computational complexity, they occasionally produce ambiguous search directions, and hence result in a local minimum.
This letter proposes a new fast motion estimation algorithm, which focuses on reducing computation not by sampling the search points but by reducing the size of the search range. Based on the assumption of convexity of the unimodal error surface, the proposed method makes use of the fact that the search center is probably close to the optimal matching point when the initial cost at the search center is efficiently low. Thus, in such cases, it’s possible to reduce the size of the search range for increased speed without much increase in matching error, which is the basic concept of the proposed method.
In general, motion vectors of spatially adjacent blocks are highly correlated if they belong to the same moving area. Having a predictive motion vector for a current block as the search center, as in the H.264/AVC JM,6, 7 the initial cost at the search center is a good estimate for correlation between the motion vector of the current block and . If is low, it means that is probably close to , and hence the size of the search range can be reduced to a proper size without missing the global minimum. On the other hand, if is high, it means that is not so correlated with , and thus the size of the search range should be larger in order to include the global minimum in the search range. Based on this background, the block matching process for each block is performed as follows:
1. The for a current block is determined in the same way as H.264/AVC, and the search center is set to .
2. Using , is measured according to the definition of the cost.
3. With , the size of the search range for a current block, , is adjusted asdenotes the size of the fixed search range used in JM, and represents the number of search range sizes in the adaptive search range adjustment (ASRA)–based method. Note that reduces to the power of 2 as decreases. For and , there are three search range sizes of , , and .
4. The FSA is performed to estimate in the adjusted search range.
With and , two thresholds and must be given to discriminate the ranges of for the three kinds of search range such asis a scaling constant, and , , and are stored minimum costs of the left (A), the upper (B), and the upper-right (C) blocks of the current block, respectively. Med[ , , ] and Max[ , , ] are functions returning the median and maximum values among , , and . Figure 1 illustrates the relative locations of the neighboring blocks to the current block (X). For all block partitioning modes including and modes, the locations follow the definition used in JM to find the predictive motion vector by median prediction. If block C is unavailable, the upper-left block D replaces C. Note that when the partitioning mode of a current block is different from that of neighboring blocks, the stored costs of neighboring blocks are scaled in accordance to the size of the current block.
To evaluate the performance of the proposed technique, Foreman, Stefan, and Soccer sequences with the CIF format were tested with H.264/AVC JM9.4. The encoding parameters are as follows: number of frames—200, frame rate— , target bit rate— ; RD optimization off; , , , and block modes on; number of reference frames—1.
Table 1 shows the performance comparison between the proposed algorithm and the conventional FSA. Note that the spiral search method with as the initial search point is used in both methods. The computational complexity is measured relative to the FSA with a search range. The results are summarized as follows:
• In terms of computational complexity, the proposed method has a significant gain. When , the computational complexity is approximately 30% compared to the FSA with a search range and nearly equivalent to the FSA with a search range.
• In terms of reconstructed image quality, the proposed method keeps up with the FSA with a search range. The average PSNR decreases at most. In contrast, the FSA with a search range exhibits severe degradation from to . This is because those blocks having large motion cannot be accurately motion-compensated with a small fixed search range. On the contrary, for such cases, the proposed ASRA-based approach may provide the full search range of , so the accuracy of the motion estimation can be maintained. Table 2 shows the distributions of selected search ranges in the proposed method.
Performance comparison between the proposed approach and the conventional FSA.
Distribution of selected search ranges in the proposed approach.
|Foreman (%)||Stefan (%)||Soccer (%)|
Figure 2 shows PSNR and complexity graphs for a range of . We can observe that as grows, the reduction ratio of the complexity decreases, and it is nearly saturated over , whereas the PSNR drops continually, although the degree of drop is negligible. From this result, a value about 2.0 seems proper as the empirical value for .
This letter has presented a fast motion estimation algorithm based on the ASRA. The size of the search range is adaptively adjusted according to the initial cost at the search center. Experimental results demonstrate that the proposed method saves a significant part of computations while providing comparable performance to the conventional FSA. If we combine our method with a fast FSA such as SEA, additional reduction in computation can be simply obtained. In addition, the ASRA-based approach has a strong point with respect to the hardware implementation, since it has a regular structure like the FSA.
This research was supported by the Ministry of Information and Communication (MIC), Korea, under the Information Technology Research Center (ITRC) support program supervised by the Institute of Information Technology Advancement (IITA), IITA-2006-(C1090-0603-0002).