Layered coding combined with unequal error protection (UEP) is a promising method for video transmission over error-prone channels. Data partitioning [1] in the H.264/AVC standard [2] is an effective layered coding technique and is used in many UEP schemes, owing to its lower overhead and better error resilience [3, 4]. With data partitioning, the partitions of one slice differ in importance for the reconstructed video quality because of their different dependency relationships.
In one group of pictures (GOP), which consists of one intraframe [I-frame or instantaneous decoder refresh (IDR) frame] and a set of interframes (P-frames and B-frames), the intraframe is the prediction reference for the subsequent interframes, so it is more important than the interframes. Furthermore, earlier P-frames are more important than later P-frames, and B-frames are the least important when hierarchical B-frames are not used. Based on this fact, some video transmission schemes provide unequal error protection to frames with different encoding types [5].
From the prior analysis, measuring importance at different coded levels, e.g., slice level, frame level, or GOP level, is a prerequisite for designing an effective UEP scheme. In this work, we develop an algorithm that derives distortion as a metric of importance, jointly considering the unequal importance of the partitions in one slice and of frames at different positions in one GOP. Moreover, a priority sorting method that exploits the characteristics of wireless channels is proposed.
According to the definition of data partitioning [1, 2], the encoded video data of one slice is rearranged into three partitions, namely, partitions A, B, and C. Without loss of generality, we adopt a simple slice mode, i.e., one slice per frame, so the encoded data of one frame is divided into three partitions. The average distortion over one GOP is selected as the metric to assess the importance of different partitions in different frames. A GOP is composed of one I-frame followed by P-frames, where denotes the total number of frames in one GOP. Due to its importance, the I-frame is assumed to be perfectly protected and correctly received.
Let denote the th pixel in frame , and and denote its corresponding reconstructed pixel at the encoder and the decoder, respectively. Moreover, let denote the distortion in terms of the mean squared error (MSE) of , which can be calculated by:
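As an illustration of this definition, the following Python sketch computes the squared error between the encoder and decoder reconstructions of one pixel and averages it over a frame. The function names and the flat-list frame layout are assumptions made for the example. A single fixed loss pattern is also assumed, so no expectation over channel realizations is taken.

```python
def pixel_distortion(enc_pixel, dec_pixel):
    """Squared error between encoder and decoder reconstructions of one pixel."""
    return (enc_pixel - dec_pixel) ** 2


def frame_distortion(enc_frame, dec_frame):
    """Mean squared error over one frame.

    enc_frame, dec_frame: flat sequences of pixel values of equal length
    (hypothetical layout chosen for this sketch).
    """
    if len(enc_frame) != len(dec_frame) or len(enc_frame) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    total = sum(pixel_distortion(e, d) for e, d in zip(enc_frame, dec_frame))
    return total / len(enc_frame)
```

For example, `frame_distortion([10, 20, 30], [10, 22, 26])` averages the squared errors 0, 4, and 16.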
To analyze the importance of different partitions in frame , we derive the distortion of each partition in frame under the assumption that only the partition under discussion is lost and other partitions in the current frame and all partitions in subsequent frames are error-free. As for the error concealment scheme, the modified temporal replacement is adopted, in which the intracoded macroblock (MB) or intercoded MB with lost motion information is replaced by the colocated MB in the previous frame.
To compute using Eq. 1, is required. For the determination of , three cases are considered, depending on which partition in frame is lost. We refer to an intracoded MB as I-MB, and an intercoded MB as P-MB.
Partition A is lost
When partition A in frame is lost, the whole frame is corrupted, because partition B and partition C are dependent on it. Hence, for all pixels in the current frame, the reconstructed pixel can be expressed as:
Partition B is lost
When partition B in frame is lost, the pixels in an I-MB are affected, but the reconstructed pixels in a P-MB can be obtained by motion compensated prediction with the correct motion vector, reference frame, and residual data. Let denote the pixel from which is predicted, and refers to the quantized prediction error. Accordingly, can be formulated as:
Partition C is lost
When partition C in frame is lost, the reconstructed pixels in an I-MB will not be influenced when constrained intraprediction is utilized. Therefore, the reconstructed pixel can be computed as:
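The three cases can be summarized in a short Python sketch of how the decoder would reconstruct a single pixel of the frame in which the loss occurs. This is an illustrative reading of the text, not the authors' formulas: the function and argument names are hypothetical, clipping to the valid pixel range is omitted, and for case 3 it is assumed that a P-MB whose residual is lost is reconstructed from its motion compensated prediction alone.

```python
def decoder_pixel(lost, is_intra, enc_pixel, colocated_prev, mc_reference, residual):
    """Decoder-side reconstruction of one pixel when one partition is lost.

    lost           -- "A", "B", or "C": the lost partition of the current frame
    is_intra       -- True if the pixel belongs to an I-MB, False for a P-MB
    enc_pixel      -- encoder reconstruction of the pixel (error-free value)
    colocated_prev -- colocated pixel in the previous decoded frame
    mc_reference   -- decoded reference pixel addressed by the motion vector
    residual       -- quantized prediction error of the pixel
    """
    if lost == "A":
        # Whole frame is corrupted: temporal replacement for every pixel.
        return colocated_prev
    if lost == "B":
        # Intra data is lost; P-MBs still have motion vector and residual.
        return colocated_prev if is_intra else mc_reference + residual
    if lost == "C":
        # I-MBs are intact with constrained intraprediction;
        # P-MBs keep their motion vector but lose the residual (assumption).
        return enc_pixel if is_intra else mc_reference
    raise ValueError("lost must be 'A', 'B', or 'C'")
```

In subsequent frames the same rules apply with all partitions received, but `mc_reference` then carries the propagated error from the damaged frame.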
As analyzed before, in frame and successive frames within one GOP can be obtained for each of the cases above. Moreover, let , , and denote the average distortion for cases 1, 2, and 3, respectively, and stand for the total number of pixels in one frame. Substituting into Eq. 1 yields , and the corresponding average distortions, which are used to assess the importance of the different partitions, are given by:
Based on the analysis of , , and for frames from frame 2 to frame , we present a priority sorting method jointly taking into consideration the unequal importance of partitions in one frame and P-frames at different positions in one GOP. That is, all partitions of the first P-frames and partition A of the middle P-frames following the preceding P-frames are labeled as high priority (HP), and partition B and partition C of the P-frames and all partitions of the remaining P-frames in one GOP are labeled as lower priority (LP).
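The labeling rule can be sketched in Python as follows. The parameter names `n_first` and `n_middle` are hypothetical stand-ins for the sizes of the first and middle groups of P-frames, and P-frames are indexed from 1 within the GOP for the purposes of this example.

```python
def label_priorities(num_p_frames, n_first, n_middle):
    """Assign HP/LP labels to partitions A, B, C of each P-frame in one GOP.

    The first n_first P-frames are fully HP, the next n_middle P-frames have
    only partition A in HP, and the remaining P-frames are fully LP.
    """
    if n_first < 0 or n_middle < 0 or n_first + n_middle > num_p_frames:
        raise ValueError("group sizes must be non-negative and fit in the GOP")
    labels = {}
    for k in range(1, num_p_frames + 1):  # k-th P-frame of the GOP
        if k <= n_first:
            labels[k] = {"A": "HP", "B": "HP", "C": "HP"}
        elif k <= n_first + n_middle:
            labels[k] = {"A": "HP", "B": "LP", "C": "LP"}
        else:
            labels[k] = {"A": "LP", "B": "LP", "C": "LP"}
    return labels
```

With `label_priorities(5, 2, 2)`, P-frames 1 and 2 are entirely HP, P-frames 3 and 4 contribute only partition A to HP, and P-frame 5 is entirely LP.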
For efficient delivery of the coded video data over wireless channels, the coded video data with HP must be delivered to the decoder with small probability of being affected by bit errors, providing basic reconstruction video quality. For each subcarrier in wireless channels employing orthogonal frequency division multiplexing (OFDM) technology, the extent of attenuation subject to fading is different, which can be utilized to categorize the subcarriers into two groups. The subcarriers with the signal-to-noise ratio (SNR) above are assigned to the high quality (HQ) subchannel group, otherwise the subcarriers are assigned to the low quality (LQ) subchannel group. Thus, the threshold used for grouping subchannels is determined by the bit error rate (BER) requirement of the coded video data with HP. Let and denote the available transmission capacity provided by HQ and LQ subchannel groups during P-frame intervals, respectively. And refers to the summation of all partition A for total P-frames. Moreover, let , , and signify the total coded bits for partitions A, B, and C in frame , respectively, and be the summation of and for simplicity. Therefore, , and for determining the priority of the total coded video data in all P-frames should satisfy the following constraints:, i.e., , is the total number of P-frames in one GOP.
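A minimal Python sketch of the subcarrier grouping is given below. The function name, the use of decibel values, and the list-of-indices return format are assumptions for illustration. How the threshold is derived from the BER requirement depends on the modulation and channel model and is not shown.

```python
def group_subcarriers(snr_db, threshold_db):
    """Split subcarrier indices into HQ and LQ groups by an SNR threshold.

    snr_db       -- per-subcarrier SNR values (hypothetical input, in dB)
    threshold_db -- grouping threshold set by the BER requirement of HP data
    Subcarriers with SNR strictly above the threshold go to the HQ group,
    all others to the LQ group.
    """
    hq = [i for i, snr in enumerate(snr_db) if snr > threshold_db]
    lq = [i for i, snr in enumerate(snr_db) if snr <= threshold_db]
    return hq, lq
```

For instance, `group_subcarriers([12.0, 3.5, 8.0, 8.1], 8.0)` places subcarriers 0 and 3 in the HQ group and subcarriers 1 and 2 in the LQ group.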
Given , , and video encoding parameters, , , and can be determined by a heuristic algorithm summarized as follows:
1. Calculate and subtract it from , and then initialize and to one.
2. Compare with the remaining transmission capacity supported by the HQ subchannel group, i.e., . If is less than or equal to the latter, increase by one and repeat step 2; otherwise go to step 3.
3. Similarly, compare with . If is less than or equal to the latter, increase by one and repeat step 3; otherwise go to step 4.
4. Increase by one and compare with the rest of the transmission capacity provided by the HQ subchannel group. If is less than or equal to the latter, increase by one and repeat step 4; otherwise the procedure is finished.
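The idea behind the heuristic can be illustrated with a simplified greedy sketch in Python. It is not a transcription of the four steps above: it is one possible reading under the stated capacity constraints, in which the HQ capacity is first filled with complete P-frames and then with partition A of the following P-frames. All names (`choose_split`, `bits_a`, `cap_hq`, and so on) are hypothetical.

```python
def choose_split(bits_a, bits_b, bits_c, cap_hq, cap_lq):
    """Greedy choice of the first/middle/last group sizes for one GOP.

    bits_a, bits_b, bits_c -- coded bits of partitions A, B, C per P-frame
    cap_hq, cap_lq         -- capacities of the HQ and LQ subchannel groups
    Returns (n_first, n_middle, n_last, lp_fits), where lp_fits tells whether
    the resulting LP data fits into the LQ capacity.
    """
    n = len(bits_a)
    remaining = cap_hq

    # Fill HQ capacity with all partitions of the leading P-frames.
    n_first = 0
    while n_first < n:
        need = bits_a[n_first] + bits_b[n_first] + bits_c[n_first]
        if need > remaining:
            break
        remaining -= need
        n_first += 1

    # Use the remaining HQ capacity for partition A of the following P-frames.
    n_middle = 0
    while n_first + n_middle < n:
        need = bits_a[n_first + n_middle]
        if need > remaining:
            break
        remaining -= need
        n_middle += 1

    n_last = n - n_first - n_middle

    # Everything not labeled HP must be carried by the LQ group.
    split = n_first + n_middle
    lp_bits = sum(bits_b[k] + bits_c[k] for k in range(n_first, split))
    lp_bits += sum(bits_a[k] + bits_b[k] + bits_c[k] for k in range(split, n))
    return n_first, n_middle, n_last, lp_bits <= cap_lq
```

With four P-frames of 10, 5, and 5 bits for partitions A, B, and C, an HQ capacity of 55, and an LQ capacity of 40, the sketch returns `(2, 1, 1, True)`.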
We first examine the performance of the distortion derivation algorithm. The sequences “Grandma” and “Silent” in QCIF format with a frame rate were encoded using the H.264/AVC reference software JM15.1. The GOP consisted of one I-frame and P-frames without B-frames, where was set to 15. In addition, the UseConstrainedIntraPred option was enabled, and the quantization parameter (QP) was fixed at 28 for all experiments.
Figure 1 shows the distortion of all subsequent frames versus frame index when partition A of the first P-frame is lost for “Silent.” From Fig. 1, we can observe that the derived distortion agrees closely with the simulated distortion, so the distortion determined by Eq. 5 can be used as the metric to evaluate the importance of different partitions of P-frames in one GOP.
Next, the performance of the priority sorting method is verified. Five video sequences of 300 frames, “Mother&Daughter,” “Grandma,” “Salesman,” “Silent,” and “Foreman,” were encoded with the same parameters used in the first set of experiments. Assume that the BER of HP coded video data is , the BER of LP data is and , and and satisfy . For comparison, the ratios of , , and to were fixed as , , and , respectively.
Comparing the proposed priority sorting scheme with data partitioning in H.264/AVC at the frame level, the average peak signal-to-noise ratio (PSNR) for all sequences and the PSNR versus the frame number for sequence “Silent” are given in Table 1 and Fig. 2, respectively. From Table 1 and Fig. 2, we can see that the proposed method can improve the reconstructed quality due to its better error resilience and adaptability, especially for sequences with medium motion.
Table 1. Average PSNR (dB) comparison of the proposed scheme against data partitioning.
|Sequence||BER for LP data is 10^-3||BER for LP data is 10^-2|
|Mother and Daughter||36.66||36.89||35.62||36.29|
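For reference, the PSNR values reported here follow from the MSE through the standard definition, which can be sketched in Python as follows. The 8-bit peak value of 255 and the choice to average per-frame PSNR values (rather than to convert an average MSE) are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def psnr_db(mse, peak=255.0):
    """PSNR in dB for a given mean squared error (8-bit video assumed)."""
    if mse < 0:
        raise ValueError("MSE must be non-negative")
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def average_psnr_db(frame_mses):
    """Average of per-frame PSNR values over a sequence."""
    values = [psnr_db(m) for m in frame_mses]
    return sum(values) / len(values)
```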
A distortion derivation model is proposed by jointly considering the unequal importance of different partitions in one slice and each frame in one GOP. Furthermore, a heuristic algorithm is designed to determine priority sorting. Simulation results show that the distortion derivation model matches the actual distortion well, and the introduced priority sorting method improves the reconstructed video quality under error-prone environments.
This research was supported by the National Natural Science Foundation Research Program of China, numbers 60772134 and 60902081, and the 111 project (B08038). We would like to thank the editors and anonymous reviewers for their valuable comments and suggestions.