## 1. Introduction

H.264/AVC^{1} achieves higher coding efficiency than any other video coding standard because it provides many complex modes for both intraframe and interframe coding. Rate distortion optimization (RDO)^{2} is used to choose the best mode when coding a block. This work investigates the intraframe coding of H.264/AVC. Motivated by the observation that different prediction methods can give similar predicted values, a new intraprediction method is proposed that both decreases the coding complexity and enhances the coding efficiency of H.264/AVC.

## 2. Proposed Intraprediction Mode

In H.264/AVC, a picture is divided into macroblocks (MBs). Each MB can be partitioned into blocks of different sizes, such as $4\times 4$, $8\times 8$, or $16\times 16$. For each block, the different intraprediction methods can all be modeled by one mathematical expression, as shown in Eq. 1,

## Eq. 1

$${t}_{y}=\frac{\sum _{x}\left({\omega}_{\gamma ,x}\times {s}_{x}\right)}{\sum _{x}{\omega}_{\gamma ,x}}.$$

In Eq. 1, when all the ${s}_{x}$ are the same, all the ${t}_{y}$ are the same, and even if the ${s}_{x}$ are different, the ${t}_{y}$ may still be similar, because the coefficients ${\omega}_{\gamma}$ are also different. Therefore, different prediction modes may give similar predicted values. In that case, the residual signals derived by the different prediction modes are similar, and so are the distortions. Accordingly, the RDO process can be skipped, and there is no need to spend so many bits on labeling the prediction mode.
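As a small illustration of Eq. 1, the sketch below computes one predicted sample as a normalized weighted sum of reference samples. The function name, the weights, and the sample values are hypothetical and are not taken from the H.264/AVC mode definitions; they only show that two different weight sets can yield nearly the same prediction when the reference samples are close to each other.

```python
def weighted_prediction(weights, samples):
    """Eq. 1: t = sum(w[x] * s[x]) / sum(w[x]) for one predicted sample."""
    if len(weights) != len(samples):
        raise ValueError("weights and samples must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * s for w, s in zip(weights, samples)) / total


# Hypothetical reference samples and two hypothetical weight sets.
samples = [100, 102, 101, 99]
mode_a_weights = [1, 2, 1, 0]
mode_b_weights = [0, 1, 2, 1]

pred_a = weighted_prediction(mode_a_weights, samples)
pred_b = weighted_prediction(mode_b_weights, samples)
print(pred_a, pred_b)  # the two values differ by only half a gray level
```

When all reference samples are equal, any weight set returns that common value, which is the first case described above.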

To determine whether different predicted blocks are similar, features of the predicted blocks must be extracted; for ease of implementation, means and variances are used. First, the predicted blocks of all the prediction modes are obtained. The mean and variance of each predicted block are then derived by Eqs. 2 and 3,

## Eq. 2

$${\mathrm{avg}}_{i}=\frac{1}{W\times H}\sum _{m=1}^{W}\sum _{n=1}^{H}{\mathrm{pred}}_{i}(m,n),$$

## Eq. 3

$${\mathrm{var}}_{i}=\frac{1}{W\times H}\sum _{m=1}^{W}\sum _{n=1}^{H}{[{\mathrm{pred}}_{i}(m,n)-{\mathrm{avg}}_{i}]}^{2},$$

## Eq. 4

$${\mathrm{VAR}}_{\mathrm{avg}}=\frac{1}{M}\sum _{i=0}^{M-1}{\left({\mathrm{avg}}_{i}-\frac{1}{M}\sum _{j=0}^{M-1}{\mathrm{avg}}_{j}\right)}^{2},$$

## Eq. 5

$${\mathrm{VAR}}_{\mathrm{var}}=\frac{1}{M}\sum _{i=0}^{M-1}{\left({\mathrm{var}}_{i}-\frac{1}{M}\sum _{j=0}^{M-1}{\mathrm{var}}_{j}\right)}^{2},$$

For a given picture, the number of DSPMD blocks increases as $\mathit{TH}$ increases, and the ${\mathrm{PRED}}_{\text{final}}$ of the DSPMD blocks moves farther from the optimal predicted values determined by the RDO process. However, when the quantization step $\left({Q}_{\text{step}}\right)$ is large, $\mathit{TH}$ should be increased, because even if the predicted values are slightly farther from each other, the quantized signals (obtained by applying the discrete cosine transform to the residual signals and then quantizing) are still similar. To adapt to ${Q}_{\text{step}}$, $\mathit{TH}$ is empirically set to ${Q}_{\text{step}}/2$.
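The statistics in Eqs. 2 to 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, blocks are plain lists of rows, and the decision rule in `is_dspmd` (both ${\mathrm{VAR}}_{\mathrm{avg}}$ and ${\mathrm{VAR}}_{\mathrm{var}}$ below $\mathit{TH}$) is an assumed reading, because the exact DSPMD criterion is not reproduced in the text above. The averaging in `final_prediction` follows the description given in the Conclusions.

```python
def block_mean(pred):
    """Eq. 2: mean of one predicted block (a list of rows)."""
    count = len(pred) * len(pred[0])
    return sum(sum(row) for row in pred) / count


def block_variance(pred):
    """Eq. 3: population variance of one predicted block."""
    avg = block_mean(pred)
    count = len(pred) * len(pred[0])
    return sum((p - avg) ** 2 for row in pred for p in row) / count


def spread(values):
    """Eqs. 4 and 5: population variance of a per-mode statistic over M modes."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def is_dspmd(pred_blocks, q_step):
    """Classify a block from the predicted blocks of all M modes."""
    th = q_step / 2  # TH = Q_step / 2, as set empirically in the text
    var_avg = spread([block_mean(b) for b in pred_blocks])      # Eq. 4
    var_var = spread([block_variance(b) for b in pred_blocks])  # Eq. 5
    # ASSUMPTION: the exact comparison is not given in the text above;
    # requiring both spreads to stay below TH is one plausible reading.
    return var_avg < th and var_var < th


def final_prediction(pred_blocks):
    """Average the predicted blocks of all modes, sample by sample."""
    m = len(pred_blocks)
    rows, cols = len(pred_blocks[0]), len(pred_blocks[0][0])
    return [[sum(b[r][c] for b in pred_blocks) / m for c in range(cols)]
            for r in range(rows)]


# Hypothetical 2x2 predicted blocks from three modes.
similar = [[[10, 10], [10, 10]], [[10, 11], [10, 11]], [[11, 10], [11, 10]]]
different = [[[0, 0], [0, 0]], [[100, 100], [100, 100]]]
print(is_dspmd(similar, q_step=10), is_dspmd(different, q_step=10))
```

Because $\mathit{TH}$ scales with ${Q}_{\text{step}}$, the same set of predicted blocks can be accepted at a coarse quantizer and rejected at a fine one.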

Figure 1 shows the proportions of DSPMD blocks in the Foreman sequence. The proportions of luminance and chrominance DSPMD blocks can reach as high as 60% and 80%, respectively. Therefore, detecting DSPMD blocks can save a large share of both the RDO process and the prediction mode information.

In an H.264/AVC bit stream, the mode information of all the blocks of an MB is followed by the coded block pattern (CBP) and the coefficient bits, as shown in Fig. 2, which means that the mode information of all the blocks must be decoded first, regardless of whether a block is a DSPMD block. To ensure that DSPMD blocks can be decoded, the bit stream of an MB is reordered as shown in Fig. 2: the CBP of the MB is decoded first, and then each block is checked to determine whether it is a DSPMD block. For a DSPMD block, ${\mathrm{PRED}}_{\text{final}}$, shown in Eq. 6, is used as the prediction result, and the following bits are decoded as coefficients.

## 3. Experimental Results

In the experiments, the H.264/AVC reference software Joint Model (JM) 15.1 was used to encode ten sequences with various representative contents. Each sequence was encoded at four quantization parameters (QP): 24, 28, 32, and 36. The corresponding ${Q}_{\text{step}}$ values are 10, 16, 26, and 40. Coding complexity and coding efficiency are evaluated by the percentage of average encoding time saved and by the percentage of average bit rate saved at the same PSNR of the reconstructed videos (BD bit rate),^{3} respectively. Statistical results for all the sequences are shown in Table 1. With the proposed method integrated into JM15.1, 39.25% of the coding time is saved and the BD bit rate is $-2.62\mathrm{\%}$ on average.
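The QP to ${Q}_{\text{step}}$ values quoted above follow the standard H.264/AVC relation in which ${Q}_{\text{step}}$ doubles for every increase of 6 in QP. The sketch below reproduces the four values and the resulting $\mathit{TH}={Q}_{\text{step}}/2$; the function and constant names are illustrative only.

```python
# Q_step for QP 0..5 in H.264/AVC; the step size doubles every 6 QP.
_BASE_QSTEP = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qp_to_qstep(qp):
    """Map an H.264/AVC quantization parameter (0..51) to its step size."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return _BASE_QSTEP[qp % 6] * (1 << (qp // 6))


for qp in (24, 28, 32, 36):
    q_step = qp_to_qstep(qp)
    print(qp, q_step, q_step / 2)  # QP, Q_step, TH = Q_step / 2
```

For the four test QPs this gives thresholds of 5, 8, 13, and 20.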

## Table 1

Coding efficiency of different test sequences.

| Sequences | BD-BitRate (%) | ΔTime (%) | Sequences | BD-BitRate (%) | ΔTime (%) |
|---|---|---|---|---|---|
| Carphone (cif) | $-2.86$ | $-36.82$ | Bigship (720p) | $-5.35$ | $-59.62$ |
| Claire (cif) | $-1.87$ | $-27.41$ | City (720p) | $-1.93$ | $-30.59$ |
| Container (cif) | $-1.02$ | $-20.76$ | Crew (720p) | $-2.68$ | $-44.56$ |
| Foreman (cif) | $-3.72$ | $-45.01$ | Night (720p) | $-1.85$ | $-30.35$ |
| PeopleOnStreet (1080p) | $-2.45$ | $-41.54$ | Traffic (1080p) | $-2.51$ | $-55.83$ |
| Average of all the sequences | $-\mathbf{2.62}$ | $-\mathbf{39.25}$ | | | |
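As a quick arithmetic check, the averages in the last row of Table 1 can be recomputed from the ten per-sequence entries. The dictionary below simply copies the table values; the variable names are illustrative.

```python
from statistics import mean

# (BD-BitRate %, delta-time %) per sequence, copied from Table 1.
TABLE_1 = {
    "Carphone": (-2.86, -36.82),
    "Claire": (-1.87, -27.41),
    "Container": (-1.02, -20.76),
    "Foreman": (-3.72, -45.01),
    "PeopleOnStreet": (-2.45, -41.54),
    "Bigship": (-5.35, -59.62),
    "City": (-1.93, -30.59),
    "Crew": (-2.68, -44.56),
    "Night": (-1.85, -30.35),
    "Traffic": (-2.51, -55.83),
}

avg_bd_rate = mean(bd for bd, _ in TABLE_1.values())
avg_time = mean(t for _, t in TABLE_1.values())
print(f"{avg_bd_rate:.2f} {avg_time:.2f}")
```

Both recomputed means round to the reported averages.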

Figure 3 compares the bits spent on each component by the proposed method and by JM15.1. The gains in coding efficiency come from bit reductions in both the mode information and the coefficients, at the same coding quality. The bit reductions in the coefficients result from combining all the predicted values, as shown in Eq. 6. To show the results clearly, RD curves for the Bigship sequence are presented in Fig. 4, where the proposed method outperforms JM15.1.

Moreover, the proposed method was compared with the estimation-based method in Ref. 4 and the pixel-based direction detection (PDD) method in Ref. 5. Compared with the method in Ref. 4, the BD bit rate of the proposed method is $-1.05\mathrm{\%}$, and 73.47% of the coding time is saved. Compared with the PDD method, the average coding time increases by 23.11%, but the BD bit rate of the proposed method reaches $-5.33\mathrm{\%}$. Thus, although the proposed method takes longer than the PDD method, its coding efficiency is better.

## 4. Conclusions

A fast and efficient intraprediction method is proposed. Each block is first tested to determine whether it is a DSPMD block, and then all the predicted values of a DSPMD block are averaged to obtain the final predicted value. The RDO process can be skipped for DSPMD blocks, and the identifiers of the prediction methods can be omitted as well. Experimental results show that the coding complexity is decreased and the coding efficiency is improved.

## Acknowledgments

This work was supported by the National Natural Science Foundation Research Program of China (numbers 60772134, 60902081, and 60902052), the 111 Project (B08038), the Natural Science Basic Research Plan in Shaanxi Province of China (program number SJ08F03), and the Basic Science Research Fund in Xidian University (72105457). We would like to thank the editors and anonymous reviewers for their valuable comments.

## References

[1] “Series H: audio visual and multimedia systems, Infrastructure of audiovisual services–coding of moving video,” (2009).

[2] “Rate-constrained coder control and comparison of video coding standards,” IEEE Trans. Circuits Syst. Video Technol., 13 (7), 688–703 (2003). https://doi.org/10.1109/TCSVT.2003.815168

[4] “Estimation-based intra prediction algorithm for H.264/AVC,” Opt. Eng., 48 (3), 030506-1–3 (2009). https://doi.org/10.1117/1.3101375

[5] “Effective subblock-based and pixel-based fast direction detections for H.264 intra prediction,” IEEE Trans. Circuits Syst. Video Technol., 18 (7), 975–982 (2008). https://doi.org/10.1109/TCSVT.2008.920742