H.264/AVC (Ref. 1) can give higher coding efficiency than any other video coding standard because it provides many complex modes for both intraframe and interframe coding. Rate distortion optimization (RDO) (Ref. 2) is used to choose the best mode when coding a block. In this work, intraframe coding of H.264/AVC is investigated. Motivated by the observation that different prediction methods can give similar predicted values, a new intraprediction method is proposed that decreases the coding complexity and enhances the coding efficiency of H.264/AVC.
Proposed Intraprediction Mode
In H.264/AVC, a picture is divided into many macroblocks (MBs). Each MB can be partitioned into blocks of different sizes, such as $16\times16$, $8\times8$, or $4\times4$. For each block, the different intraprediction methods can all be modeled by one mathematical expression, as shown in Eq. 1,

$$p_i = \sum_{j} c_{ij}^{k}\, q_j, \qquad p_i \in P,\; q_j \in Q,\; c_{ij}^{k} \in C_k, \tag{1}$$

where $P$ and $Q$ represent the predicted and prediction pixels, $p_i$ and $q_j$ represent the $i$'th predicted pixel and the $j$'th prediction pixel, respectively, $c_{ij}^{k}$ is the corresponding weighted coefficient of the $j$'th prediction pixel under the $k$'th prediction mode, and $C_k$ represents the set of coefficients of the $k$'th prediction mode.
In Eq. 1, when the prediction pixels are all the same, the predicted pixels are the same under every prediction mode, and even if the prediction pixels are different, the predicted pixels may still be similar, because the coefficients are also different. Therefore, different prediction modes may give similar predicted values. In that case, the residual signals derived by the different prediction modes are similar, and so are the distortions. Accordingly, the RDO process can be skipped, and fewer bits are needed to label the prediction modes.
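The weighted-sum view of intraprediction can be illustrated with a small sketch. It assumes a $4\times4$ block, only three simplified modes (vertical, horizontal, and DC), real-valued weights without the integer rounding of the standard, and hypothetical function names; it is not the authors' implementation.

```python
# Illustrative sketch only: three simplified 4x4 modes written as weight matrices.
# q holds 8 prediction pixels: q[0..3] above the block, q[4..7] to its left.

def weights_vertical():
    # Pixel (row r, col c) copies the neighbor above column c.
    return [[1.0 if j == c else 0.0 for j in range(8)]
            for r in range(4) for c in range(4)]

def weights_horizontal():
    # Pixel (row r, col c) copies the neighbor to the left of row r.
    return [[1.0 if j == 4 + r else 0.0 for j in range(8)]
            for r in range(4) for c in range(4)]

def weights_dc():
    # Every pixel is the plain average of all 8 neighbors (no integer rounding).
    return [[1.0 / 8.0] * 8 for _ in range(16)]

def predict(weights, q):
    # Weighted sum: one output pixel per row of the weight matrix.
    return [sum(w * x for w, x in zip(row, q)) for row in weights]

def max_spread(q):
    # Largest per-pixel gap between the predictions of the three modes.
    modes = (weights_vertical, weights_horizontal, weights_dc)
    preds = [predict(make_weights(), q) for make_weights in modes]
    return max(max(vals) - min(vals) for vals in zip(*preds))

print(max_spread([100] * 8))                                  # 0.0
print(max_spread([100, 101, 102, 103, 100, 101, 102, 103]))   # 3.0
```

With identical neighbors every mode yields the same block, and with a gentle ramp the modes differ by at most 3 gray levels, which is the situation the proposed method exploits.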
To determine whether different predicted blocks are similar, features of the predicted blocks must be extracted; for ease of implementation, means and variances are used. First, the predicted blocks of all the prediction modes are obtained. The mean and variance of each predicted block are then derived by Eqs. 2 and 3,

$$\mu_k = \frac{1}{WH}\sum_{x=0}^{W-1}\sum_{y=0}^{H-1} p_k(x,y), \tag{2}$$

$$\sigma_k^2 = \frac{1}{WH}\sum_{x=0}^{W-1}\sum_{y=0}^{H-1} \left[p_k(x,y)-\mu_k\right]^2, \tag{3}$$

where $W$ and $H$ are the width and height of the block, respectively, $p_k(x,y)$ is the predicted pixel of the $k$'th prediction mode at position $(x,y)$ of the predicted block, and $\mu_k$ and $\sigma_k^2$ are the mean and variance of all the predicted pixels in the predicted block under the $k$'th prediction mode. Eqs. 4 and 5 are then used to compute the variance of $\mu_k$ and $\sigma_k^2$, respectively,

$$V_\mu = \frac{1}{N}\sum_{k=1}^{N}\left(\mu_k-\frac{1}{N}\sum_{l=1}^{N}\mu_l\right)^2, \tag{4}$$

$$V_\sigma = \frac{1}{N}\sum_{k=1}^{N}\left(\sigma_k^2-\frac{1}{N}\sum_{l=1}^{N}\sigma_l^2\right)^2, \tag{5}$$

where $N$ is the number of prediction modes of the block, and $V_\mu$ and $V_\sigma$ represent the variance of $\mu_k$ and $\sigma_k^2$, respectively. Different prediction modes can give similar predicted values when $V_\mu$ and $V_\sigma$ are less than some threshold $T$. A block that has similar predicted values under different prediction modes is named a decoder side prediction method derivable (DSPMD) block, because the decoder can derive the prediction method without labeling information.

For a DSPMD block, any of the predicted values can be chosen as the final value. However, one cannot guarantee which prediction mode is best. Therefore, to get a better predicted value, the average of the different predicted values is used as the final predicted value of the block, as shown in Eq. 6,

$$\bar{p}(x,y) = \frac{1}{N}\sum_{k=1}^{N} p_k(x,y), \tag{6}$$

where $\bar{p}(x,y)$ is the average of the different predicted values at position $(x,y)$ of the current block.
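The detection and averaging steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: variances use population ($1/N$) normalization, the comparison with the threshold is strict, results are kept as floats rather than rounded to integer pixels, and all function names are hypothetical.

```python
# Minimal sketch of DSPMD detection and prediction averaging (assumptions above).
# A predicted block is a list of rows; each row is a list of pixel values.

def mean_and_variance(block):
    # Mean and population variance of all pixels in one predicted block.
    pixels = [p for row in block for p in row]
    mu = sum(pixels) / len(pixels)
    var = sum((p - mu) ** 2 for p in pixels) / len(pixels)
    return mu, var

def population_variance(values):
    # Variance of a list of per-mode statistics (means or variances).
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def is_dspmd(predicted_blocks, threshold):
    # True when both the means and the variances agree closely across modes.
    stats = [mean_and_variance(b) for b in predicted_blocks]
    v_mu = population_variance([s[0] for s in stats])
    v_sigma = population_variance([s[1] for s in stats])
    return v_mu < threshold and v_sigma < threshold

def average_prediction(predicted_blocks):
    # Pixel-wise average of the predicted blocks of all modes.
    n = len(predicted_blocks)
    height = len(predicted_blocks[0])
    width = len(predicted_blocks[0][0])
    return [[sum(b[y][x] for b in predicted_blocks) / n for x in range(width)]
            for y in range(height)]

blocks = [[[100, 100], [100, 100]], [[102, 102], [102, 102]]]
print(is_dspmd(blocks, threshold=10))   # True
print(average_prediction(blocks))       # [[101.0, 101.0], [101.0, 101.0]]
```

Because the test uses only predicted blocks, which depend on already reconstructed neighbors, the same check can be repeated at the decoder without any side information.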
For a given picture, the number of DSPMD blocks increases as the threshold increases, but the predicted values of the DSPMD blocks also move farther from their optimal predicted values, which are determined by the RDO process. When the quantization step is large, however, the threshold should be increased. That is because even if the predicted values are a little farther from each other, the quantized signals (derived by applying the discrete cosine transform to the residual signals, followed by quantization) are still similar. To adapt to the quantization step, the threshold is set empirically as a function of it.
Figure 1 shows the proportions of DSPMD blocks in the Foreman sequence. The proportions of luminance and chrominance DSPMD blocks can reach as high as 60% and 80%, respectively. Therefore, detecting DSPMD blocks removes a large share of the RDO computation and of the prediction mode information.
In an H.264/AVC bit stream, the mode information of all the blocks of an MB is followed by the coded block pattern (CBP) and the coefficient bits, as shown in Fig. 2, which means that the mode information of all the blocks must be decoded first, regardless of whether a block is a DSPMD block. To ensure that DSPMD blocks can be decoded, the bit stream of an MB is rearranged as shown in Fig. 2: the CBP of the MB is decoded first, and then each block is tested to determine whether it is a DSPMD block. For a DSPMD block, the average predicted value given by Eq. 6 is used as the prediction result, and the following bits are decoded as coefficients.
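The modified parsing order can be sketched as below. The sketch assumes a hypothetical layout in which the syntax elements have already been entropy decoded into a flat sequence, a non-DSPMD block carries its mode immediately before its coefficients, and every block carries coefficients; the exact layout of Fig. 2 is not reproduced here, and the function and parameter names are illustrative.

```python
# Hypothetical sketch of the rearranged macroblock parsing order.
# symbols: already entropy-decoded syntax elements of one MB, in stream order.
# is_dspmd_block: callable deciding, from decoder-side data, whether block n
#                 is a DSPMD block (stands in for the mean/variance test).

def decode_macroblock(symbols, num_blocks, is_dspmd_block):
    stream = iter(symbols)
    cbp = next(stream)                  # CBP is read first
    blocks = []
    for n in range(num_blocks):
        if is_dspmd_block(n):
            mode = None                 # no mode bits: use the averaged prediction
        else:
            mode = next(stream)         # explicit prediction mode
        coefficients = next(stream)     # the following bits are coefficients
        blocks.append((mode, coefficients))
    return cbp, blocks

cbp, blocks = decode_macroblock([0x2F, "coef0", 3, "coef1"], 2, lambda n: n == 0)
print(cbp, blocks)   # 47 [(None, 'coef0'), (3, 'coef1')]
```

The key point is that the DSPMD decision is made per block during parsing, so mode bits are consumed only for blocks that actually need them.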
In the experiments, the H.264/AVC reference software Joint Model (JM) 15.1 was used to encode ten sequences with various representative contents. Each sequence was encoded at four different quantization parameters (QPs), i.e., 24, 28, 32, and 36. The corresponding quantization steps are 10, 16, 26, and 40. The coding complexity and efficiency are evaluated by the percentage of average encoding time savings and the percentage of average bit rate savings under the same PSNR of the reconstructed videos (BD bit rate, Ref. 3), respectively. Statistical results for all the sequences are shown in Table 1. By integrating the proposed method into JM15.1, 39.25% of the coding time can be saved; the average BD bit rate is listed in the last row of Table 1.
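The values 10, 16, 26, and 40 quoted above match the H.264/AVC quantization step sizes for QP 24, 28, 32, and 36: the step size doubles for every increase of 6 in QP, starting from six base values. A minimal sketch of that mapping follows; the function name is hypothetical, and the sketch does not model how the threshold is derived from the step size.

```python
# Quantization step size of H.264/AVC as a function of QP (0..51).
# The six base values cover QP 0..5; each further 6 QP doubles the step.
_QSTEP_BASE = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)

def qstep(qp):
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return _QSTEP_BASE[qp % 6] * (1 << (qp // 6))

print([qstep(qp) for qp in (24, 28, 32, 36)])   # [10.0, 16.0, 26.0, 40.0]
```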
Table 1 Coding efficiency of different test sequences.
|Sequences|BD-BitRate (%)|ΔTime (%)|Sequences|BD-BitRate (%)|ΔTime (%)|
|---|---|---|---|---|---|
|Carphone (cif)| | |Bigship (720p)| | |
|Claire (cif)| | |City (720p)| | |
|Container (cif)| | |Crew (720p)| | |
|Foreman (cif)| | |Night (720p)| | |
|PeopleOnStreet (1080p)| | |Traffic (1080p)| | |
|Average of all the sequences| | | | | |
Figure 3 compares the bits spent on each component by the proposed method and by JM15.1. The gains in coding efficiency come from bit reductions in both the mode information and the coefficients, at the same coding quality. The bit reductions in the coefficients result from combining all the predicted values, as shown in Eq. 6. To show the results clearly, RD curves for the Bigship sequence are presented in Fig. 4, where the proposed method outperforms JM15.1.
Moreover, the proposed method was compared with the estimation-based method in Ref. 4 and the pixel-based direction detection (PDD) method in Ref. 5. Compared with the method in Ref. 4, the BD bit rate of the proposed method is ; meanwhile, 73.47% of the coding time is saved. Compared with the PDD method, the average coding time increases by 23.11%. However, the BD bit rate of the proposed method can achieve . Although the coding time of the proposed method is longer than that of the PDD method, its coding efficiency is better.
A fast and efficient intraprediction method is proposed. Each block is first tested to determine whether it is a DSPMD block, and all the predicted values of a DSPMD block are then averaged to get the final predicted value. The RDO process can be skipped for DSPMD blocks, and the identifiers of their prediction methods can be omitted. Experimental results show that the coding complexity is decreased and that the coding efficiency is improved.
The work was supported by the National Natural Science Foundation Research Program of China numbers 60772134, 60902081, and 60902052, the 111 Project (B08038), the Natural Science Basic Research Plan in Shaanxi Province of China (program number SJ08F03), and the Basic Science Research Fund in Xidian University (72105457). We would like to thank the editors and anonymous reviewers for their valuable comments.