The low resolution video extracted from a motion-compensated 3-D subband/wavelet scalable video coder is unnecessarily sharp and sometimes contains significant aliasing, compared to that produced by the MPEG-4 lowpass filter. In this paper, we propose a content-adaptive method for aliasing reduction in subband/wavelet scalable video coding. We make the low resolution frame (LL subband) visually similar to the output of the MPEG-4 decimation filter through frequency roll-off. Scaling of the subbands is introduced to make the subband variances comparable in the two cases. Thanks to the embedded properties of the EZBC coder, we can achieve the needed scaling of energies in each subband by sub-bitplane shift in the extractor and
coefficient scaling in the decoder. An analysis is presented of the relationship between sub-bitplane shift and scaling, which shows that our selected sub-bitplane shift works well at high to medium bit rates. Two energy-matching structures, a dyadic decomposition and a non-dyadic decomposition, are proposed. The
first, dyadic method has low complexity but limited energy-matching accuracy, while the second, non-dyadic method matches energies more accurately at the cost of higher analysis/synthesis complexity. The first scheme introduces no PSNR loss for full resolution video, while the second introduces a slight PSNR loss at full resolution, due to our omission of interband context modeling in that case. Both methods offer substantial PSNR gains at lower spatial resolutions, as well as a substantial reduction in visible aliasing.
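To make the extractor/decoder division of labor concrete, here is a minimal sketch of the correspondence between a sub-bitplane shift and an equivalent coefficient scale factor. It assumes a fractional-bitplane coder with a fixed number of coding passes per bitplane (S = 3 is an illustrative assumption, not necessarily EZBC's actual pass count), under the convention that delaying a subband's passes attenuates its effective energy.

```python
import math

def shift_to_scale(n_subbitplanes: int, passes_per_bitplane: int = 3) -> float:
    """Approximate coefficient scale factor implied by shifting a subband's
    coding passes down by n sub-bitplanes in the extractor.  A full-bitplane
    shift (n = passes_per_bitplane) halves the effective magnitude."""
    return 2.0 ** (-n_subbitplanes / passes_per_bitplane)

def scale_to_shift(scale: float, passes_per_bitplane: int = 3) -> int:
    """Nearest sub-bitplane shift realizing a desired attenuation, e.g. an
    energy-matching target derived from the MPEG-4 reference filter."""
    return round(-passes_per_bitplane * math.log2(scale))
```

With this convention, a shift of 3 sub-bitplanes corresponds to a scale of 0.5, which the decoder then inverts by coefficient scaling; intermediate shifts give finer-grained attenuation steps of 2^(1/3).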
The scalable video coder MC-EZBC did not provide scalability for motion vectors, which greatly impacts its performance when scaling down to very low bit rates and resolutions. Here we enhance MC-EZBC with scalable motion vector coding using the Context-based Adaptive Binary Arithmetic Coder (CABAC). Both a layered structure for motion vector coding and an alphabet general partition (AGP) of the motion vector symbols are employed to provide SNR and resolution scalability of the motion vector bitstream. With these two new features and a careful arrangement of the motion vector bitstream output by the existing MC-EZBC, we obtain temporal, SNR, and resolution scalability for motion vectors. This significantly improves both visual and objective performance at low bit rates and resolutions, with only a slight PSNR loss (about 0.05 dB), and no detectable visual loss, at high bit rates.
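The idea of a layered motion vector bitstream can be sketched with a hypothetical two-layer split of a quarter-pel motion component into an integer-pel base layer and a sub-pel enhancement layer; this is an illustrative layering only, not MC-EZBC's exact AGP symbol partition.

```python
def split_mv_layers(mv_qpel: int) -> tuple[int, int]:
    """Split a quarter-pel motion component into a coarse base layer
    (integer-pel part) and a 2-bit sub-pel enhancement layer.
    Hypothetical layering for illustration."""
    base = mv_qpel >> 2      # integer-pel part (floor division by 4)
    enh = mv_qpel & 0b11     # sub-pel refinement bits
    return base, enh

def merge_mv_layers(base: int, enh: int = 0) -> int:
    """Reassemble a motion component; dropping the enhancement layer
    (enh = 0) yields the coarser integer-pel vector the extractor would
    keep at low bit rates or resolutions."""
    return (base << 2) | enh
```

An extractor targeting a low rate would transmit only the base layers, and the decoder would reconstruct coarser motion; note the split round-trips correctly even for negative components because Python's right shift floors.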
In previous work, a performance bound for multi-hypothesis motion-compensated prediction (MCP) was derived based on a video signal model with independent Gaussian displacement errors; a simplified form of that result is derived here. A performance bound for optimal motion-compensated temporal filtering (MCTF) has also been proposed, based on a signal model with correlated Gaussian displacement errors. In that work, the optimal MCTF (KLT) was found to perform better than one-hypothesis MCP but not better than infinite-hypothesis MCP. In this work, we re-derive the performance of multi-hypothesis MCP based on the signal model with correlated Gaussian displacement errors. With this common signal model, we find that optimal MCTF achieves the same performance as infinite-hypothesis MCP.
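Why correlation of the displacement errors matters can be seen in a toy model (this is a simplified illustration, not the paper's spectral bound): if the n hypotheses have equal-variance prediction errors with common pairwise correlation rho, the variance of their average has a nonzero floor as n grows.

```python
def mh_error_variance(sigma2: float, rho: float, n: int) -> float:
    """Variance of the mean of n equal-variance prediction errors with
    common pairwise correlation rho (jointly Gaussian toy model):
        Var = sigma2 * ((1 - rho)/n + rho).
    As n -> infinity the variance floors at rho * sigma2 rather than
    vanishing, which is the qualitative reason infinite-hypothesis MCP
    has a nonzero performance bound under correlated displacement errors."""
    return sigma2 * ((1.0 - rho) / n + rho)
```

With independent errors (rho = 0) the variance decays as sigma2/n, recovering the classic multi-hypothesis gain; with rho > 0 both infinite-hypothesis MCP and the optimal MCTF are limited by the same rho * sigma2 floor in this toy view.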