Rate control is important for rate-constrained video coding. Good rate control yields high video quality, small fluctuations in video quality, a stable buffer status, and a small mismatch between the target bit rate and the encoded bit rate. H.264/AVC is the state-of-the-art video coding standard (Ref. 1), and its existing rate control is based on JVT (Ref. 2). According to Refs. 2, 3, rate control generates a quantization parameter (QP) in three steps: estimation of the number of target bits (target-bit), computation of a QP, and adjustment of the QP. Because the JVT method does not consider the content complexity within a sequence when a target-bit is estimated, it cannot perform optimally for high-motion video. In Ref. 4, the quality fluctuations of high-motion video are reduced by multiplying one part of the target-bit estimation by the relative content complexity (RCC). The RCC is measured by the ratio of the predicted mean absolute difference (MAD) of the current frame to the averaged actual MAD over all (forward-predicted) frames encoded previously. However, that scheme still suffers from unstable buffer management and a high mismatch between bit rates, for two reasons. First, when the RCC is high, the target-bit is increased regardless of the buffer status. Second, a QP obtained from the modified target-bit may not maintain a stable ratio of buffer occupation. Therefore, buffer status and RCC must be considered together, and it is reasonable to use them in the adjustment of the QP rather than in the estimation of a target-bit.
When the number of available bits per frame (avail-bit) is low (e.g., a low target bit rate and a high target frame rate) and the RCC stays high over consecutive frames, the probability of a target-bit dropping below zero is very high. In such cases, the QP for the current frame is forced to be larger than that of the previous frame by two (Refs. 2, 3, 4), producing poor video quality. When target-bits are frequently negative, video quality fluctuates severely. Thus, it is important to prevent negative target-bits to avoid such fluctuations.
Based on the preceding observations, QPs were adjusted using buffer status and RCC to maintain positive target-bits for a low avail-bit application and high-motion video.
It was assumed, without loss of generality, that the GOP structure was an I picture followed only by P pictures, where I and P denote an intracoded picture and a forward-predicted picture, respectively. The ultimate aim of rate control is to obtain an appropriate QP for high performance within permitted rates. Rate control is composed of three steps: estimation of a frame target-bit, computation of a QP, and adjustment of the QP.
To estimate a frame target-bit, the number of remaining bits is needed, and both the buffer fullness and the target buffer level are used to avoid overflow or underflow. According to Refs. 2, 3, the target-bit is estimated before each frame is encoded (Eq. 1). The estimate depends on the total number of remaining bits for all noncoded frames, the number of noncoded frames, the target bit rate, the frame rate, the current buffer fullness, and the target buffer level for that frame, together with two constants whose typical values are 0.5 and 0.75, respectively. After estimating a target-bit, a QP was computed and adjusted in the two following cases.
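The target-bit estimate described above can be sketched in a few lines. This is a minimal illustration, assuming the frame-layer form of the JVT scheme of Refs. 2, 3: a weighted sum of the per-frame share of the remaining bits and a buffer-driven term. The function name, the argument names, and the assignment of 0.5 and 0.75 to `beta` and `gamma` are assumptions made here, and the numbers are illustrative only, not experimental values.

```python
def estimate_target_bits(remaining_bits, remaining_frames, bit_rate, frame_rate,
                         buffer_fullness, target_buffer_level,
                         beta=0.5, gamma=0.75):
    """Frame target-bit sketch (assumed form, hypothetical names).

    target = beta * (remaining_bits / remaining_frames)
           + (1 - beta) * (bit_rate / frame_rate
                           + gamma * (target_buffer_level - buffer_fullness))
    """
    if remaining_frames <= 0:
        raise ValueError("remaining_frames must be positive")
    # Share of the remaining bit budget that falls on this frame.
    share_of_remaining = remaining_bits / remaining_frames
    # Per-frame channel bits, corrected toward the target buffer level.
    buffer_driven = bit_rate / frame_rate + gamma * (target_buffer_level - buffer_fullness)
    return beta * share_of_remaining + (1.0 - beta) * buffer_driven


# Illustrative numbers only: 32 kbit/s at 30 frames/s.
balanced = estimate_target_bits(32000, 30, 32000, 30, 8000, 8000)
overfull = estimate_target_bits(32000, 30, 32000, 30, 13000, 8000)
print(round(balanced, 2), round(overfull, 2))
```

With the buffer at its target level the estimate equals the per-frame budget; once the buffer fullness exceeds the target level by enough, the estimate turns negative, which is the situation the negative target-bit case addresses.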
Case 1: Positive Target-Bit
When the target-bit was positive, the QP was computed using the quadratic rate-distortion (R-D) model (Ref. 5, Eq. 2). In this model, the texture-bit for the current frame is the difference between the target-bit and the number of previously encoded header bits (header-bit), and it is related to the predicted MAD and the computed QP of the frame through a first-order and a second-order coefficient. The texture-bit can fall below zero when the previously encoded header-bit is comparable in size to the current target-bit in a low avail-bit application. In that case, the texture-bit was limited to one.
Since the computed QP may oscillate noticeably for sequences with rapid changes in content complexity, the change in QP between consecutive pictures was bounded relative to the final QP used to encode the previous frame (Eq. 3). The limited QP was then adjusted using both the buffer status and the RCC measure to obtain the final QP (Eq. 4). The MAD ratio was used as the RCC measure for a fair comparison with the existing scheme (Ref. 4); it is the ratio of the predicted MAD of the current frame to the averaged actual MAD over all frames encoded previously. When the buffer status was low enough to keep the target-bit positive and the RCC measure was high, the QP was decreased by one to reduce the fluctuation in video quality.
In the experiments, when all frames were encoded with a constant QP (except for a single high-complexity frame that was encoded with a QP smaller by one), the over-bit (the number of extra bits caused by the decrease in QP) accounted for less than 1% of the resource-bit (the total number of bits given for encoding). This amount was too small to make the right side of Eq. 1 negative. Figure 1 shows the effect of over-bit with respect to a negative target-bit. The line of near-negative target-bit marks where the right side of Eq. 1 becomes negative. There is enough margin to maintain a positive target-bit.
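The QP computation and limiting steps described above can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: the quadratic model is taken in its common form, texture bits = x1 * MAD / Q + x2 * MAD / Q^2, applied directly to the QP as the text describes; the bound `max_delta=2` is assumed from the QP step of two mentioned for the existing schemes; all function and parameter names are hypothetical.

```python
import math

QP_MIN, QP_MAX = 0, 51  # valid QP range in H.264/AVC


def solve_quadratic_rd(target_bits, header_bits, predicted_mad, x1, x2):
    """Solve texture = x1*MAD/Q + x2*MAD/Q**2 for Q (assumed model form)."""
    texture_bits = target_bits - header_bits
    if texture_bits <= 0:
        texture_bits = 1.0  # the text limits a non-positive texture-bit to one
    a = x1 * predicted_mad
    b = x2 * predicted_mad
    disc = a * a + 4.0 * texture_bits * b
    if b == 0 or disc < 0:
        # Fall back to the first-order model when the quadratic term is unusable.
        return a / texture_bits
    # Positive root of texture*Q**2 - a*Q - b = 0.
    return (a + math.sqrt(disc)) / (2.0 * texture_bits)


def limit_qp(computed_qp, previous_qp, max_delta=2):
    """Keep the QP within max_delta of the previous frame's final QP."""
    qp = round(computed_qp)
    qp = max(previous_qp - max_delta, min(previous_qp + max_delta, qp))
    return max(QP_MIN, min(QP_MAX, qp))


# Illustrative coefficients only.
q = solve_quadratic_rd(500, 100, 4.0, 1000.0, 2000.0)
print(q, limit_qp(q, previous_qp=30))
```

Because the solved value is clipped against the previous QP, a single frame with an unusually small texture-bit budget cannot move the quantizer by more than the assumed bound.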
Figure 2 shows the percentage of over-bit in the resource-bit and the reduced fluctuation in video quality. In brief, the test conditions were as follows: the sequence was “Foreman,” and the QP was held at 41 over 100 frames, except for a single high-complexity frame that was quantized with a QP of 40. In contrast, when the buffer status was high and the RCC measure was low, the QP was increased by one to lower the buffer fullness and prevent the target-bit of the next picture from becoming negative.
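The adjustment rule for a positive target-bit can be sketched as below. The thresholds that define a "low" or "high" buffer status and a "low" or "high" RCC are not given in the text, so the four threshold values here are placeholders chosen purely for illustration, and the function names are hypothetical.

```python
QP_MIN, QP_MAX = 0, 51  # valid QP range in H.264/AVC


def mad_ratio(predicted_mad, previous_actual_mads):
    """RCC measure: predicted MAD over the mean actual MAD of earlier frames."""
    if not previous_actual_mads:
        return 1.0  # assumption: neutral complexity before any frame is coded
    mean_mad = sum(previous_actual_mads) / len(previous_actual_mads)
    return predicted_mad / mean_mad if mean_mad > 0 else 1.0


def adjust_qp(limited_qp, buffer_ratio, rcc,
              low_buffer=0.4, high_buffer=0.6, low_rcc=0.8, high_rcc=1.2):
    """Adjust the limited QP by one step (all thresholds are placeholders).

    buffer_ratio is the buffer fullness divided by the buffer size.
    """
    if buffer_ratio < low_buffer and rcc > high_rcc:
        return max(QP_MIN, limited_qp - 1)  # complex frame, room in the buffer
    if buffer_ratio > high_buffer and rcc < low_rcc:
        return min(QP_MAX, limited_qp + 1)  # simple frame, buffer filling up
    return limited_qp


rcc = mad_ratio(6.0, [2.0, 4.0])
print(rcc, adjust_qp(30, buffer_ratio=0.2, rcc=rcc))
```

The two conditions are mutually exclusive, so a frame is moved by at most one QP step in either direction and is left unchanged when buffer status and complexity point the same way.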
Case 2: Negative Target-Bit
As seen in Eq. 1, a negative target-bit occurs when the current buffer fullness exceeds the target buffer level by at least a certain amount. Generally, the buffer fullness is large after a high-complexity picture, and the per-frame bit budget is small in a low avail-bit application. Under these conditions, there is a high probability that the target-bit will be negative. Once a target-bit becomes negative, the current buffer fullness must be reduced to make the next target-bit positive. To do so, the current QP was made larger than the previous one (Eq. 5). When the RCC measure was low, the increase in the current QP over the previous QP was greater, and the buffer fullness fell more rapidly. Hence, even if the next picture had a high RCC, the buffer had more room to avoid a negative target-bit for the following picture. After encoding, the parameters of the linear prediction model for MAD and the first- and second-order coefficients of the quadratic R-D model (Eq. 2) were updated for the next frame (Ref. 5).
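The handling of a negative target-bit can be sketched as follows. The text states only that the QP is raised above the previous one and that the increase is greater when the RCC measure is low; the step sizes and the RCC threshold below are placeholders chosen for illustration, and the function name is hypothetical.

```python
QP_MIN, QP_MAX = 0, 51  # valid QP range in H.264/AVC


def qp_for_negative_target(previous_qp, rcc,
                           rcc_threshold=1.0, small_step=1, large_step=2):
    """Raise the QP when the target-bit is negative (placeholder step sizes).

    A low-complexity frame (rcc below the threshold) takes the larger step,
    so the buffer drains faster on frames where the quality loss is smaller.
    """
    step = large_step if rcc < rcc_threshold else small_step
    return max(QP_MIN, min(QP_MAX, previous_qp + step))


print(qp_for_negative_target(30, rcc=0.5), qp_for_negative_target(30, rcc=1.5))
```

Taking the larger step on low-complexity frames spends the quality penalty where it is least visible and leaves buffer headroom for a later high-complexity frame.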
Experiments were conducted on high-motion sequences (“Carphone” and “Foreman”) under low avail-bit conditions (low target bit rates at 30 fps, as listed in Table 1). The sequences were in QCIF 4:2:0 format. H.264 reference software version JM6.1 was the test platform, and the proposed scheme was compared with the existing H.264 rate control schemes (Refs. 3, 4). For all tests, parameters were set as follows: RDO was enabled, the search range for motion estimation was 16, the number of reference frames was one for a lower computational load and three and five for better performance, a Hadamard transform was used, and the entropy coding method was CABAC. All other parameters were carefully selected to make the three schemes equivalent. The peak signal-to-noise ratio (PSNR) curves of the two sequences are plotted in Fig. 3. The PSNR fluctuation of the proposed scheme was much smaller than that of Refs. 3, 4, while a good PSNR was maintained. Detailed results are shown in Table 1. The proposed scheme considerably enhanced performance with respect to PSNR fluctuation and bit rate mismatch. The PSNR fluctuation of the proposed scheme was improved by 53.3% and 43.7% in “Carphone” and by 42.9% and 24.2% in “Foreman” compared to Refs. 3, 4, respectively. Figure 4 shows the buffer fullness at each frame. The buffer size was set to half of the target bit rate. According to Fig. 4, the proposed rate control maintained a more stable buffer fullness level than Refs. 3, 4.
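The evaluation quantities used above can be computed as in the sketch below. It assumes that PSNR fluctuation is the standard deviation of the per-frame Y-PSNR (the population form is used here; the source does not say which form was used), that bit rate mismatch is the relative deviation of the encoded bit rate from the target, and that the reported improvement is a relative reduction. The function names are hypothetical and the sample values are illustrative, not experimental results.

```python
import statistics


def psnr_fluctuation(psnr_per_frame):
    """Standard deviation of per-frame PSNR in dB (population form assumed)."""
    return statistics.pstdev(psnr_per_frame)


def bit_rate_mismatch_percent(target_kbps, encoded_kbps):
    """Relative deviation of the encoded bit rate from the target, in percent."""
    return abs(encoded_kbps - target_kbps) / target_kbps * 100.0


def relative_improvement_percent(reference_value, proposed_value):
    """Reduction of a metric relative to a reference scheme, in percent."""
    return (reference_value - proposed_value) / reference_value * 100.0


# Illustrative values only.
print(psnr_fluctuation([30.0, 32.0, 34.0]))
print(bit_rate_mismatch_percent(32.0, 32.8))
print(relative_improvement_percent(2.0, 1.0))
```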
Table 1. Performance comparisons of the proposed scheme with Refs. 3, 4 at 30 fps (#: number of reference pictures).
|Sequence|Target bit rate|#|Average Y-PSNR (dB): Ref. 3|Average Y-PSNR (dB): Ref. 4|Average Y-PSNR (dB): Proposed|Standard deviation: Ref. 3|Standard deviation: Ref. 4|Standard deviation: Proposed|Encoded bit rate (kbps): Ref. 3|Encoded bit rate (kbps): Ref. 4|Encoded bit rate (kbps): Proposed|
An improved H.264 frame-layer rate control scheme for high-motion video at low bit rates and high frame rates was proposed. The scheme yielded much smaller fluctuations in video quality, lower mismatches between bit rates, and more stable buffer fullness levels compared with existing rate control schemes.
This research was supported by the University ITRC Project and partially by the TN R&D Center, Samsung Electronics Co., Ltd.