1 December 2006 Improved H.264 frame-layer rate control for low bit rate video with high motion
Optical Engineering, 45(12), 120502 (2006). doi:10.1117/1.2401154
Abstract
In the existing frame-layer rate control for H.264, buffer status and content complexity are used improperly, causing quality fluctuations of high-motion video at low bit rates and high frame rates (under 19.2 kbps at 30 fps). We propose an improved H.264 frame-layer rate control scheme to obtain steady video quality and stable buffer management. Experimental results showed that the proposed scheme performed better than existing schemes.
Lee, Lee, Oh, and Kim: Improved H.264 frame-layer rate control for low bit rate video with high motion

1. Introduction

Rate control is important for rate-constrained video coding. Good rate control results in high video quality, low fluctuation of video quality, stable buffer status, and low mismatch between the target bit rate and the encoded bit rate. H.264/AVC is the state-of-the-art video coding standard (Ref. 1), and its existing rate control is based on the JVT scheme (Ref. 2). According to Refs. 2, 3, rate control generates a quantization parameter (QP) in three steps: estimation of the number of target bits (target-bit), computation of a QP, and adjustment of the QP. Because the JVT method does not consider the content complexity within a sequence when a target-bit is estimated, it cannot perform optimally for high-motion video. In Ref. 4, the quality fluctuations of high-motion video are reduced by multiplying one part of the target-bit estimation by the relative content complexity (RCC). The RCC is measured as the ratio of the predicted mean absolute difference (MAD) of the $i$th frame to the average actual MAD over all previously encoded P (forward-predicted) frames. However, that scheme still suffers from unstable buffer management and high mismatch between bit rates for two reasons. First, when the RCC is high, the target-bit is increased regardless of the buffer status. Second, a QP obtained from the modified target-bit may not maintain a stable ratio of buffer occupation. Therefore, buffer status and RCC must be considered simultaneously, and it is reasonable to use them in the adjustment of the QP rather than in the estimation of a target-bit.

When the number of available bits per frame (avail-bit) is low (e.g., low target bit rate and high target frame rate) and the RCC stays high over consecutive frames, the probability of a target-bit dropping below zero is very high. In such cases, the QP for the current frame is forced to be larger than that of the previous frame by two (Refs. 2, 3, 4), producing poor video quality. When target-bits are frequently negative, video quality fluctuates severely. Thus, it is important to prevent negative target-bits to avoid such fluctuations.
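To give a sense of scale, the avail-bit can be read as the target bit rate divided by the frame rate. This reading is an assumption based on the $br/fr$ term of Eq. 1, and it takes 1 kbps as 1000 bits per second. For the two test conditions used in the experiments at 30 fps:

```latex
\frac{br}{fr} = \frac{9600\ \text{bits/s}}{30\ \text{frames/s}} = 320\ \text{bits/frame},
\qquad
\frac{br}{fr} = \frac{19200\ \text{bits/s}}{30\ \text{frames/s}} = 640\ \text{bits/frame}.
```

With only a few hundred bits per frame, a modest excess of buffer fullness over the target buffer level is enough to push the target-bit below zero.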

Based on the preceding observations, QPs were adjusted using buffer status and RCC to maintain positive target-bits for a low avail-bit application and high-motion video.

2. Proposed Scheme

It was assumed, without loss of generality, that the group-of-pictures (GOP) structure was IPPP...P, where I and P denote an intracoded picture and a forward-predicted picture, respectively. The ultimate aim of rate control is to obtain an appropriate QP for high performance within permitted rates. Rate control is composed of three steps: estimation of a frame target-bit, computation of a QP, and adjustment of the QP.

To estimate a frame target-bit, the number of remaining bits is needed, and both the buffer fullness and the target buffer level are used to avoid overflow or underflow. According to Refs. 2, 3, the target-bit $T_{b,i}$ is estimated before encoding the $i$th frame:

$$T_{b,i} = \beta\,\frac{R_{b,i}}{N_{Pr,i}} + (1-\beta)\left[\frac{br}{fr} - \Gamma\,(CBF_{i-1} - TBL_i)\right], \tag{1}$$

where $R_{b,i}$ and $N_{Pr,i}$ denote the total number of remaining bits for all noncoded P frames and the number of noncoded P frames for the $i$th frame, respectively, and $br$ and $fr$ are the target bit rate and frame rate, respectively. $CBF_i$ and $TBL_i$ indicate the current buffer fullness and the target buffer level for the $i$th frame. $\beta$ and $\Gamma$ are constants with typical values of 0.5 and 0.75, respectively. After estimating a target-bit, a QP was computed and adjusted in the two following cases.
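The target-bit estimate of Eq. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JM reference code: the function and argument names are hypothetical, bit counts are plain numbers in bits, and the sample values are chosen for the 9.6 kbps, 30 fps condition.

```python
def estimate_target_bits(remaining_bits, remaining_p_frames, bit_rate, frame_rate,
                         prev_buffer_fullness, target_buffer_level,
                         beta=0.5, gamma=0.75):
    """Frame target-bit T_b,i following Eq. 1 (names are illustrative)."""
    if remaining_p_frames <= 0:
        raise ValueError("at least one noncoded P frame is required")
    # First term: equal share of the remaining bits, R_b,i / N_Pr,i.
    share_of_remaining = remaining_bits / remaining_p_frames
    # Second term: per-frame budget br/fr corrected by the buffer excess
    # (CBF_{i-1} - TBL_i), weighted by Gamma.
    buffer_term = bit_rate / frame_rate - gamma * (prev_buffer_fullness - target_buffer_level)
    return beta * share_of_remaining + (1.0 - beta) * buffer_term


# Buffer exactly at its target level: the estimate equals the per-frame budget.
on_target = estimate_target_bits(28800, 90, 9600, 30, 2000, 2000)
# Buffer 1200 bits above its target level: the estimate becomes negative.
over_full = estimate_target_bits(28800, 90, 9600, 30, 3200, 2000)
print(on_target, over_full)
```

The second call illustrates why negative target-bits are likely at low avail-bit: the buffer correction easily outweighs the 320-bit per-frame budget.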

3. Case 1: Positive Target-Bit

When the target-bit was positive, the QP was computed using the quadratic rate-distortion (R-D) model (Ref. 5):

$$\frac{T_{tb,i}}{PMAD_i} = \frac{x_1}{Q_{c,i}} + \frac{x_2}{Q_{c,i}^{2}}, \tag{2}$$

where $T_{tb,i}$ is the texture-bit for the $i$th frame, that is, the difference between the target-bit and the number of previously encoded header bits (header-bit). $PMAD_i$ and $Q_{c,i}$ denote the predicted MAD and the computed QP for the $i$th frame, respectively, and $x_1$ and $x_2$ are the first- and second-order coefficients, respectively. The texture-bit can fall below zero when the previously encoded header-bit is comparable in size to the current target-bit in a low avail-bit application. In that case, the texture-bit was limited to one. Since the computed QP ($Q_{c,i}$) may oscillate noticeably for sequences with rapid changes in content complexity, changes in QP were limited to no more than ±2 units between pictures using

$$Q_{lm,i} = \mathrm{MAX}\left[Q_{i-1} - 2,\ \mathrm{MIN}\left(Q_{i-1} + 2,\ Q_{c,i}\right)\right], \tag{3}$$

where $Q_i$ is the final QP for encoding the $i$th frame. The limited QP $Q_{lm,i}$ was adjusted using both buffer status and the RCC measure. Then the final QP was obtained as

$$Q_i = \begin{cases} Q_{lm,i} - 1, & [(Q_{i-1} - Q_{lm,i}) < 2] \text{ and } [CM_i > 1.09] \text{ and } [(CBF_{i-1} - TBL_i) < br/(fr \times \Gamma)] \\ Q_{lm,i} + 1, & [CM_i < 0.99] \text{ and } [(CBF_{i-1} - TBL_i) > br/(fr \times \Gamma)] \end{cases} \tag{4}$$

where $CM_i$ indicates the RCC measure for the $i$th frame. The MAD ratio was used as $CM_i$ for a fair comparison with the existing scheme (Ref. 4); that is, $CM_i$ is the ratio of the predicted MAD of the $i$th frame to the average actual MAD over all previously encoded P frames. When the buffer status was low enough to keep the target-bit positive and the RCC measure was high, the QP was decreased by one to reduce the fluctuation in video quality. In the experiments, when all frames were encoded with a constant QP, except for a single high-complexity frame encoded with a QP smaller by one, the over-bit (the number of extra bits caused by the decrease in QP) accounted for less than 1% of the resource-bit (the total number of bits given for encoding). This amount was not enough to make even the right side of Eq. 1 negative. Figure 1 shows the effect of over-bit with respect to negative target-bit. The near-negative target-bit line indicates where the right side of Eq. 1 is negative; there is enough margin to maintain a positive target-bit. Figure 2 shows the percentage of over-bit in resource-bit and the decreased fluctuation in video quality. In brief, the test conditions were as follows: the sequence was “Foreman,” and the constant QP was 41 over 100 frames, except for a single high-complexity frame that was quantized with a QP of 40. In contrast, when the buffer status was high and the RCC measure was low, the QP was increased by one to lower the buffer fullness and prevent the target-bit of the next picture from becoming negative.
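The positive target-bit path (Eqs. 2 to 4) can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions that the text does not state: the function names, the closed-form positive root used to solve Eq. 2, the fallback when that root is unavailable, rounding the computed QP to an integer, and leaving the limited QP unchanged when neither condition of Eq. 4 holds.

```python
import math


def rcc_measure(predicted_mad, previous_actual_mads):
    """CM_i: predicted MAD of frame i over the mean actual MAD of earlier P frames."""
    mean_actual = sum(previous_actual_mads) / len(previous_actual_mads)
    return predicted_mad / mean_actual


def qp_from_quadratic_model(target_bits, header_bits, predicted_mad, x1, x2):
    """Solve Eq. 2 for the computed QP, Q_c,i (returned as a float)."""
    # Texture-bit = target-bit minus header-bit, limited to one as in the text.
    texture_bits = max(target_bits - header_bits, 1.0)
    r = texture_bits / predicted_mad  # left side of Eq. 2
    disc = x1 * x1 + 4.0 * r * x2
    if x2 == 0.0 or disc < 0.0:
        # Assumed fallback: drop the second-order term and solve r = x1 / Q.
        return x1 / r
    # Positive root of r*Q^2 - x1*Q - x2 = 0, which is Eq. 2 multiplied by Q^2.
    return (x1 + math.sqrt(disc)) / (2.0 * r)


def limit_qp(prev_qp, computed_qp):
    """Eq. 3: keep the QP within +/-2 of the previous frame's QP."""
    return max(prev_qp - 2, min(prev_qp + 2, computed_qp))


def adjust_qp(limited_qp, prev_qp, cm, prev_buffer_fullness, target_buffer_level,
              bit_rate, frame_rate, gamma=0.75):
    """Eq. 4: adjust the limited QP using buffer status and the RCC measure."""
    threshold = bit_rate / (frame_rate * gamma)
    excess = prev_buffer_fullness - target_buffer_level
    if prev_qp - limited_qp < 2 and cm > 1.09 and excess < threshold:
        return limited_qp - 1  # complex frame, buffer has room: spend more bits
    if cm < 0.99 and excess > threshold:
        return limited_qp + 1  # simple frame, buffer is filling: save bits
    return limited_qp  # assumption: unchanged when neither case applies


# Illustrative numbers only (not taken from the experiments).
cm = rcc_measure(6.0, [4.0, 6.0])
q_c = qp_from_quadratic_model(340.0, 40.0, 100.0, x1=80.0, x2=1600.0)
q_lm = limit_qp(37, round(q_c))  # rounding to an integer QP is an assumption
q_final = adjust_qp(q_lm, 37, cm, 2100.0, 2000.0, 9600.0, 30.0)
print(cm, q_c, q_lm, q_final)
```

Note that the buffer threshold $br/(fr \times \Gamma)$ in Eq. 4 is exactly the buffer excess at which the bracketed term of Eq. 1 changes sign, so the first branch only fires while that term is still positive.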

Fig. 1 The effect of over-bit with respect to negative target-bit.

Fig. 2 The percentage of over-bit in resource-bit and the improvement of fluctuation in video quality, where the constant QP is 41 over 100 frames except for a single high-complexity frame that is quantized with a QP of 40.

4. Case 2: Negative Target-Bit

As seen in Eq. 1, a negative target-bit occurs when $CBF_{i-1}$ exceeds $TBL_i$ by at least a certain amount. Generally, $CBF_{i-1}$ is large in a high-complexity picture and $TBL_i$ is small in a low avail-bit application. In these conditions, there is a high probability that the target-bit will be negative. Once a target-bit becomes negative, the current buffer fullness must be reduced to make the next target-bit positive. To do so, the current QP was made larger than the previous one, as shown in Eq. 5. When the RCC measure was low, the current QP was increased by a larger amount, so the buffer fullness fell more rapidly. Hence, even if the next picture had a high RCC, the buffer had more room to avoid a negative target-bit for the following picture:

$$Q_i = \begin{cases} Q_{i-1} + 2, & CM_i > 1.09 \\ Q_{i-1} + 3, & \text{otherwise.} \end{cases} \tag{5}$$

After encoding a frame by a final QP, a linear regression method like Ref. 5 was used to update the parameters of the linear prediction model for MAD as well as $x_1$ and $x_2$ of the quadratic R-D model [Eq. 2] for the next frame.
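The negative target-bit rule of Eq. 5 reduces to a single comparison, sketched below. The function name is hypothetical, and the clipping to a maximum QP of 51 (the upper end of the H.264 QP range for 8-bit video) is an added assumption that the text does not mention.

```python
def qp_for_negative_target(prev_qp, cm, high_rcc_threshold=1.09, max_qp=51):
    """Eq. 5: raise the QP by 2 for a high-RCC frame, by 3 otherwise."""
    step = 2 if cm > high_rcc_threshold else 3
    # Clipping to max_qp is an assumption, not part of Eq. 5.
    return min(prev_qp + step, max_qp)


print(qp_for_negative_target(40, 1.20))  # high-complexity frame: smaller increase
print(qp_for_negative_target(40, 1.00))  # lower complexity: larger increase
```

The smaller step for high-RCC frames limits the quality drop on complex content, while the larger step on simpler frames drains the buffer faster.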

5. Experimental Results

Experiments were conducted on high-motion sequences (“Carphone” and “Foreman”) under low avail-bit conditions (9.6 kbps and 19.2 kbps at 30 fps). The sequences were in QCIF 4:2:0 format. H.264 reference software version JM6.1 was the test platform, and the proposed scheme was compared with the existing H.264 rate control schemes (Refs. 3, 4). For all tests, parameters were set as follows: rate-distortion optimization (RDO) was enabled, the search range for motion estimation was 16, the number of reference frames was one for lower computational load and three or five for better performance, a Hadamard transform was used, and the entropy coding method was CABAC. All other parameters were carefully selected to make the three schemes equivalent. The peak signal-to-noise ratio (PSNR) curves of the two sequences are plotted in Fig. 3. The PSNR fluctuation of the proposed scheme was much smaller than that of Refs. 3, 4, while a good PSNR was maintained. Detailed results are shown in Table 1. The proposed scheme considerably improved performance with respect to PSNR fluctuation and bit rate mismatch. Measured by the standard deviation of the Y-PSNR in Table 1, the PSNR fluctuation of the proposed scheme was reduced by 53.3% and 43.7% for “Carphone” (one reference picture) and by 42.9% and 24.2% for “Foreman” (five reference pictures) compared with Refs. 3 and 4, respectively. Figure 4 shows the buffer fullness at each frame. The buffer size was set to half of the target bit rate. According to Fig. 4, the proposed rate control maintained a more stable buffer fullness level than Refs. 3, 4.

Fig. 3 PSNR curves for sequences (a) “Carphone” at 9.6 kbps, 30 fps, 1 reference picture, and (b) “Foreman” at 19.2 kbps, 30 fps, 1 reference picture.

Fig. 4 Current buffer fullness curves for sequences (a) “Carphone” at 9.6 kbps, 30 fps, 1 reference picture, and (b) “Foreman” at 19.2 kbps, 30 fps, 1 reference picture.

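Table 1 Performance comparisons of the proposed scheme with Refs. 3, 4 at 30 fps (#: number of reference pictures).

| Sequence | Target bit rate (kbps) | # | Average Y-PSNR (dB), Ref. 3 | Average Y-PSNR (dB), Ref. 4 | Average Y-PSNR (dB), Proposed | Standard deviation, Ref. 3 | Standard deviation, Ref. 4 | Standard deviation, Proposed | Encoded bit rate (kbps), Ref. 3 | Encoded bit rate (kbps), Ref. 4 | Encoded bit rate (kbps), Proposed |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Carphone | 9.60 | 1 | 25.85 | 25.98 | 25.95 | 1.05 | 0.87 | 0.49 | 9.89 | 9.68 | 9.57 |
| Carphone | 9.60 | 3 | 25.90 | 26.02 | 26.02 | 0.85 | 0.68 | 0.55 | 9.71 | 9.70 | 9.66 |
| Foreman | 19.20 | 1 | 27.04 | 27.24 | 27.20 | 1.54 | 1.39 | 1.07 | 19.33 | 19.46 | 19.21 |
| Foreman | 19.20 | 5 | 26.99 | 27.29 | 27.13 | 1.70 | 1.28 | 0.97 | 19.32 | 19.35 | 19.25 |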
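The quoted improvements in PSNR fluctuation can be checked against the standard deviations in Table 1. The short sketch below assumes the improvement is the relative reduction in standard deviation. The text does not spell out that definition, but it reproduces the reported figures for “Carphone” with one reference picture and “Foreman” with five.

```python
def relative_reduction(reference, proposed):
    """Percentage by which `proposed` is smaller than `reference`."""
    return 100.0 * (reference - proposed) / reference


# Standard deviations from Table 1: (Ref. 3, Ref. 4, Proposed).
std_dev = {
    ("Carphone", 1): (1.05, 0.87, 0.49),
    ("Foreman", 5): (1.70, 1.28, 0.97),
}

for (sequence, n_refs), (ref3, ref4, proposed) in std_dev.items():
    vs_ref3 = round(relative_reduction(ref3, proposed), 1)
    vs_ref4 = round(relative_reduction(ref4, proposed), 1)
    print(sequence, n_refs, vs_ref3, vs_ref4)
```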

6. Conclusion

An improved H.264 frame-layer rate control scheme for high-motion video at low bit rates and high frame rates was proposed. The scheme yielded much smaller fluctuations in video quality, lower mismatches between bit rates, and more stable buffer fullness levels compared with existing rate control schemes.

Acknowledgments

This research was supported by the University ITRC Project and partially by the TN R&D Center, Samsung Electronic Co., Ltd.

References

1.  T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. J. Sullivan, “Rate-constrained coder control and comparison of video coding standards,” IEEE Trans. Circuits Syst. Video Technol. 13(7), 688–703 (2003). doi:10.1109/TCSVT.2003.815168

2.  Z. G. Li, F. Pan, K. P. Lim, G. Feng, X. Lin, and S. Rahardja, “Adaptive basic unit layer rate control for JVT,” in 7th Meeting, Pattaya II, JVT-G012-r1 (2003).

4.  M. Jiang, X. Yi, and N. Ling, “Improved frame-layer rate control for H.264 using MAD ratio,” in IEEE Intl. Symp. Circuits Syst., Vol. III, pp. 813–816 (2004).

5.  H.-J. Lee, T. Chiang, and Y.-Q. Zhang, “Scalable rate control for MPEG-4 video,” IEEE Trans. Circuits Syst. Video Technol. 10(6), 878–894 (2000). doi:10.1109/76.867926

Chang-Hyun Lee, Seongjoo Lee, Yunje Oh, Jaeseok Kim, "Improved H.264 frame-layer rate control for low bit rate video with high motion," Optical Engineering 45(12), 120502 (1 December 2006). http://dx.doi.org/10.1117/1.2401154