## 1.

## Introduction

A block-based full search algorithm (FSA) has been widely used for motion estimation (ME) in video encoding, but it has the serious problem of significant computation requirements.^{1} In order to solve this problem, many fast algorithms have been developed, and partial difference elimination (PDE) algorithms^{1, 2, 3, 4} are good examples. PDE algorithms reduce computations by eliminating impossible candidates before the complete matching error calculation between matching and candidate blocks. PDE algorithms have also been improved by selecting a low initial block matching error^{1, 2} or arranging the line/sub-block matching order in a candidate block.^{3, 4} In this letter, we propose a fast PDE algorithm for reducing additional computational costs based on the property of block matching error. The proposed algorithm predicts a total block matching error from a partially computed matching error between the blocks and removes impossible candidates earlier than conventional algorithms. Simulation results show that the proposed algorithm can save a large amount of computations with ignorable image degradation in comparison with conventional PDE algorithms.

## 2.

## Conventional PDE Algorithms

The FSA finds the most similar block to a matching block within a given search range of the reference frame. The similarity between blocks is measured by the block matching error and is often computed by the sum of absolute difference (SAD) between matching and candidate blocks. In the PDE algorithm, the partial SAD
$\left(p\mathit{SAD}\right)$
, which is the matching error accumulated for every period, such as a block’s line^{3} or a sub-block,^{4} is computed and compared with the minimum block SAD
$(\mathit{min}\u0331b\mathit{SAD})$
. Once the
$p\mathit{SAD}$
is larger than the
$\mathit{min}\u0331b\mathit{SAD}$
at each period, the candidate block cannot be the most similar block regardless of the rest of the incomplete matching computations. Therefore, the PDE algorithm can find and remove impossible candidates before the complete matching error calculation of the candidate block. In this letter, we follow the line-based SAD comparison. The accumulated matching error at the
$k$
’th line of a candidate block
$\left(p{\mathit{SAD}}^{k}\right)$
in the
$(x,y)$
position of the given search range can be expressed by

## 1

$$p{\mathit{SAD}}^{k}(x,y)=\sum _{i=0}^{k}\sum _{j=0}^{N-1}\mid {f}_{t}(i,j)-{f}_{t-1}(i+x,j+y)\mid ,\phantom{\rule{1em}{0ex}}k=0,1,\dots ,N-1,$$The efficiency of the PDE algorithm can be improved when a small initial block matching error is selected or large line/sub-block matching errors occur early. A spiral PDE algorithm^{3} and a sorting-based PDE algorithm^{4} are typical examples, respectively, and they are compared with the proposed PDE algorithm for performance evaluation.

## 3.

## Proposed Fast PDE Algorithm

The disadvantage of the PDE scheme is that the algorithm has to compute the $p{\mathit{SAD}}^{k}$ of a candidate block for $k=0,\dots ,N-1$ , which will be eliminated last, until the $p{\mathit{SAD}}^{k}$ reaches $\mathit{min}\u0331b\mathit{SAD}$ . It is obvious that predetermination of the elimination will help in reducing the computational costs and increasing the process speed for the ME.

A fast PDE algorithm is proposed by predicting a total block SAD $\left(b\mathit{SAD}\right)$ , which is equivalent to the $p{\mathit{SAD}}^{N-1}$ . The proposed algorithm is based on the property that a $p{\mathit{SAD}}^{k}$ is an intermediate value and is gradually increased to reach the $b\mathit{SAD}$ because the $p{\mathit{SAD}}^{k}$ is continuously accumulated as $k$ goes from 0 to $N-1$ . The prediction at the $k$ ’th line of a block $(p\u0331b{\mathit{SAD}}^{k})$ is performed, whenever the $p{\mathit{SAD}}^{k}$ for $k<N-1$ is less than the $\mathit{min}\u0331b\mathit{SAD}$ by using the $p{\mathit{SAD}}^{k}$ , as follows:

## 2.

## 3

$$w=\{\begin{array}{ll}0.8,& \text{if}\phantom{\rule{0.3em}{0ex}}\mathit{avg}\u0331b\mathit{SAD}\u2a7d300,\\ 0.8-\frac{0.7}{600}(\mathit{avg}\u0331b\mathit{SAD}-300)& \text{if}\phantom{\rule{0.3em}{0ex}}300<\mathit{avg}\u0331b\mathit{SAD}<900,\\ 0.1& \text{if}\phantom{\rule{0.3em}{0ex}}\mathit{avg}\u0331b\mathit{SAD}\u2a7e900.\end{array}\phantom{\}}$$The proposed approach computes a $p\u0331b{\mathit{SAD}}^{k}$ to predict the $b\mathit{SAD}$ from the $p{\mathit{SAD}}^{k}$ if the $p{\mathit{SAD}}^{k}$ for $k<N-1$ is less than the $\mathit{min}\u0331b\mathit{SAD}$ and compares the predicted block SAD $(p\u0331b{\mathit{SAD}}^{k})$ with the $\mathit{min}\u0331b\mathit{SAD}$ before the next line computation. Therefore, the proposed algorithm has a high possibility to eliminate impossible candidates earlier than the conventional approach because the $p\u0331b{\mathit{SAD}}^{k}$ always exceeds the $p{\mathit{SAD}}^{k}$ . It is worth noting that the $p\u0331b{\mathit{SAD}}^{k}$ is adjusted adaptively according to the contents of each matching block because the accuracy of the prediction is affected by complexity of the block. We provide a practical example to show the procedures of the proposed algorithm in Fig. 1. First, a $p{\mathit{SAD}}^{1}$ is computed and compared with a $\mathit{min}\u0331b\mathit{SAD}$ (see ① in Fig. 1). A $p\u0331b{\mathit{SAD}}^{1}$ is estimated because the $p{\mathit{SAD}}^{1}$ is less than the $\mathit{min}\u0331b\mathit{SAD}$ and is compared with the $\mathit{min}\u0331b\mathit{SAD}$ . The $p\u0331b{\mathit{SAD}}^{1}$ (see ② in Fig. 1) is still smaller than the $\mathit{min}\u0331b\mathit{SAD}$ , and the algorithm goes to the next line. Similarly, a $p{\mathit{SAD}}^{2}$ and a $p\u0331b{\mathit{SAD}}^{2}$ are calculated at the second line because the $p{\mathit{SAD}}^{2}$ is less than the $\mathit{min}\u0331b\mathit{SAD}$ . However, the $p\u0331b{\mathit{SAD}}^{2}$ (see ③) is larger than the $\mathit{min}\u0331b\mathit{SAD}$ . Therefore, the proposed scheme skips the rest of the matching procedures for the current candidate block and goes to the next candidate, while the conventional PDE keeps performing the same procedures until the $p{\mathit{SAD}}^{k}$ exceeds the $\mathit{min}\u0331b\mathit{SAD}$ at the seventh line (see ⑥). Consequently, the proposed algorithm requires only two line matching events, while the conventional algorithm requires seven line matching events.

## 4.

## Simulation Results

The proposed PDE algorithm
$(P\u0331\mathit{PDE})$
is simulated with various video sequences—Foreman, Stefan, Akiyo, Mobile, Container, Silent voice, News, and Table tennis—and they consist of 300 frames at
$30\phantom{\rule{0.3em}{0ex}}\mathrm{Hz}$
in the format of QCIF. The matching block size is
$16\times 16$
, and the search range is
$\pm 16$
. The simulation results are compared with the spiral PDE algorithm
$(s\u0331\mathit{PDE})$
^{3} and the sorting-based PDE algorithm
$(S\u0331\mathit{PDE})$
.^{4} It is noted that both the
$P\u0331\mathit{PDE}$
and the
$S\u0331\mathit{PDE}$
also employ a spiral scanning searching scheme, as the
$s\u0331\mathit{PDE}$
does. Table 1 shows the performance of the proposed algorithm with respect to PSNR and matching ability. It can be seen that the degradations in PSNR and matching ability are only
$0.0012\phantom{\rule{0.3em}{0ex}}\mathrm{dB}$
and 0.66% (0.65 block in 99 matching blocks) average values, respectively. In Table 1, mismatched MV indicates the number of mismatched motion vectors in the proposed algorithm versus the corresponding motion vectors of the conventional PDE algorithm.

## Table 1

Comparisons with respect to image quality and prediction ability.

s̱PDE, S̱PDE | P̱PDE | |||
---|---|---|---|---|

PSNR | PSNR | Difference | Mismatched MV | |

Foreman | 32.2338 | 32.2304 | $-0.0051$ | 1.311 |

Stefan | 24.6757 | 24.6733 | $-0.0032$ | 2.599 |

Akiyo | 45.8185 | 45.8175 | 0.0006 | 0.043 |

Mobile | 26.0088 | 26.0088 | $-0.0001$ | 0.027 |

Container | 43.0933 | 43.0944 | 0.0000 | 0.395 |

Silent voice | 35.0830 | 35.0809 | $-0.0031$ | 0.221 |

News | 36.2600 | 36.2596 | 0.0014 | 0.171 |

Table tennis | 31.2572 | 31.2560 | $-0.0024$ | 0.472 |

Average | 34.3038 | 34.3026 | $-0.0012$ | 0.6547 |

In order to evaluate the computational performance, the average checked line numbers per candidate block to determine its validity for three algorithms are summarized in Table 2. It is noted that the proposed scheme requires 1 addition, 1 division, and 2 multiplications more to estimate a total block matching error if necessary. The results in Table 2 show that the proposed method reduces computational cost on average 40% and 30% compared with the
$s\u0331\mathit{PDE}$
^{3} and
$S\u0331\mathit{PDE}$
,^{4} respectively. The efficiency value of the proposed approach versus the conventional PDE algorithm is defined by

## Table 2

Average checked line numbers per candidate block.

Average checked lines | |||
---|---|---|---|

s̱PDE | S̱PDE | P̱PDE | |

Foreman | 3.309 | 2.688 | 1.874 |

Stefan | 4.236 | 3.827 | 2.529 |

Akiyo | 1.236 | 1.107 | 1.032 |

Mobile | 3.429 | 2.964 | 1.676 |

Container | 2.169 | 1.691 | 1.094 |

Silent voice | 1.914 | 1.804 | 1.314 |

News | 1.723 | 1.420 | 1.238 |

Table tennis | 3.306 | 2.709 | 2.015 |

Average | 2.665 | 2.276 | 1.596 |

Efficiency (%) | 40.110 | 29.880 | 0.000 |

## 5.

## Conclusion

In this letter, we proposed a fast PDE algorithm to reduce the computational cost of the conventional PDE algorithms. We predict a block matching error from a partial block matching error and remove the impossible candidates earlier using the predicted block matching error. The simulation results show that this can considerably reduce the computational complexity at the cost of negligible image quality degradation. We believe that the proposed approach can be applied to the conventional PDE algorithms without significant modifications and can play an important role as a significant tool for fast motion estimation.