Layered video coding is used for adaptive transmission over channels having variable bandwidths. In the two well-known methods of data partitioning (DP) and fine granularity scalability (FGS), a base layer contains essential information and one or more enhancement layers contain fine detail. FGS is continuously scalable above the base layer by successive DCT coefficient bit planes of lower significance, but suffers losses in coding efficiency at low base layer rates. DP, on the other hand, only provides a base partition for header information and low-frequency coefficients and one or more enhancement partitions for higher-frequency coefficients. This results in degraded quality when the enhancement layer is lost but offers performance near single-layer video as the transmission rate approaches the encoding rate. DP is thus suited to bandwidths that vary over a narrow range, whereas FGS performs robustly over a wider range but not as well as single-layer or DP at bandwidths near the full rate. A combination of the two methods can provide higher quality than FGS alone, over a greater bandwidth range than DP alone. This is achieved by using DP on an FGS base layer, which can now have a sufficiently high rate to improve the FGS coding efficiency. Such a combination has been investigated for one form of DP, known as Rate-Distortion optimal Data Partitioning (RDDP), which attempts to provide the best possible base partition quality for a given rate. A method for combining FGS and DP is described, along with expected and computed performances for different rates.
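The continuous scalability of FGS comes from transmitting successive bit planes of the enhancement-layer DCT coefficients, so the stream can be truncated at any point. The following sketch illustrates that idea on plain integer magnitudes; it is illustrative only (real FGS adds run-length and entropy coding of each plane), and all function names here are assumptions, not from the paper.

```python
# Sketch of FGS-style bit-plane coding of quantized coefficient magnitudes.
# Truncating the list of planes at any point still decodes to a coarser,
# valid approximation -- the essence of fine granularity scalability.

def encode_bit_planes(coeffs, num_planes):
    """Split absolute coefficient values into bit planes, most significant first."""
    planes = []
    for p in range(num_planes - 1, -1, -1):  # MSB plane first
        planes.append([(abs(c) >> p) & 1 for c in coeffs])
    return planes

def decode_bit_planes(planes, num_planes):
    """Reconstruct magnitudes from however many planes were received."""
    coeffs = [0] * len(planes[0])
    for i, plane in enumerate(planes):
        p = num_planes - 1 - i  # significance of this plane
        for j, bit in enumerate(plane):
            coeffs[j] |= bit << p
    return coeffs
```

Decoding all planes recovers the magnitudes exactly; dropping the least significant plane yields values rounded down to the next multiple of two, i.e. graceful degradation with bandwidth.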
In this paper, we consider the problem of transmitting compressed video over a wireless LAN under transmission power constraints, where retransmission is adopted as the error control scheme. For different retry limits, the transmitter energy per bit is adjusted to keep a constant video quality at the receiver, which results in different transmission power. We propose an algorithm to minimize transmission power by carefully choosing the maximum number of retransmissions and the transmission energy level based on the quality and delay requirements. We also examine the effect of distance on the choice of the optimal operating points.
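The trade-off described above can be sketched as a small search: for each candidate retry limit, find the per-attempt energy that keeps the residual loss (and hence video quality) fixed, then compare the resulting expected energy per packet. This is a minimal sketch under an assumed exponential link model `p = exp(-E/N0)`; the model, the names `N0` and `loss_target`, and the search itself are illustrative assumptions, not the paper's actual algorithm.

```python
import math

def min_power_retry_limit(loss_target, max_retry_limit, N0=1.0):
    """For each retry limit R, compute the per-attempt energy E that keeps
    the residual packet loss at loss_target (i.e. p**R == loss_target under
    the assumed model p = exp(-E/N0)), then the expected total energy per
    packet, and return the (R, E, expected_energy) minimizing that energy."""
    best = None
    for R in range(1, max_retry_limit + 1):
        p = loss_target ** (1.0 / R)                 # required per-attempt loss
        E = -N0 * math.log(p)                        # energy needed per attempt
        expected_tx = sum(p ** k for k in range(R))  # mean number of attempts
        expected_energy = E * expected_tx
        if best is None or expected_energy < best[2]:
            best = (R, E, expected_energy)
    return best
```

Under this model, allowing more retransmissions always lowers the energy needed per attempt faster than it raises the expected attempt count, so the delay requirement is what caps `max_retry_limit` in practice, consistent with the abstract's joint quality/delay formulation.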
Video decoding at reduced resolution with resizing embedded in the decoding loop saves computational resources such as memory, memory bandwidth and CPU cycles. Key to such embedded resizing is proper filtering/scaling of DCT data, and motion compensation at the reduced resolution. Although MPEG-2 video decoding with embedded resizing has been investigated in the past, little work has been reported on solving problems associated with interlaced video undergoing decoding with embedded resizing. In particular, annoying artifacts may occur in moving areas of interlaced video due to improper scaling or motion compensation. In this paper, we introduce the notion of the Local Interlacing Property for interlaced moving areas and propose algorithms to detect and process data with the Local Interlacing Property properly in the context of decoding with embedded resizing. Specifically, we demonstrate that 1) vertical high frequency in interlaced moving areas should be preserved during downscaling, and 2) phase shift must be added for motion compensation in interlaced moving areas under certain circumstances. Experimental results show that our method effectively removes artifacts in interlaced moving areas, making MPEG-2 video decoding with embedded resizing a practical tradeoff for interlaced video.
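A block exhibiting the Local Interlacing Property can be recognized by its vertical structure: in interlaced moving areas, adjacent lines belong to different fields captured at different times, so line-to-line differences dwarf the differences between lines of the same field. The following heuristic detector is a minimal sketch of that observation; the threshold `ratio` and the function name are assumptions for illustration, not the paper's actual criterion.

```python
def has_local_interlacing(block, ratio=2.0):
    """Heuristic test for the Local Interlacing Property on one 8x8 luma
    block (a list of 8 rows of 8 samples): flag the block when inter-field
    (adjacent-line) activity strongly exceeds intra-field (same-field,
    two-lines-apart) activity."""
    inter_field = sum(abs(block[y][x] - block[y + 1][x])
                      for y in range(7) for x in range(8))
    intra_field = sum(abs(block[y][x] - block[y + 2][x])
                      for y in range(6) for x in range(8))
    return inter_field > ratio * intra_field
```

Blocks flagged this way would then receive the special handling the paper proposes: preservation of vertical high frequencies during downscaling, and a phase-shifted motion compensation where required.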
Video transmission over bandwidth-varying networks is becoming increasingly important due to emerging applications such as streaming of video over the Internet. The fundamental obstacle in designing such systems resides in the varying characteristics of the Internet (i.e., bandwidth variations and packet-loss patterns). In MPEG-4, a new SNR scalability scheme called Fine-Granular-Scalability (FGS) is currently under standardization; it is able to adapt in real time (i.e., at transmission time) to Internet bandwidth variations. The FGS framework consists of a non-scalable motion-predicted base layer and an intra-coded fine-granular scalable enhancement layer. For example, the base layer can be coded using a DCT-based, MPEG-4 compliant, highly efficient video compression scheme. Subsequently, the difference between the original and the decoded base layer is computed, and the resulting FGS-residual signal is intra-frame coded with an embedded scalable coder. In order to achieve high coding efficiency when compressing the FGS enhancement layer, it is crucial to analyze the nature and characteristics of residual signals common to the SNR scalability framework (including FGS). In this paper, we present a thorough analysis of SNR residual signals by evaluating their statistical properties, compaction efficiency and frequency characteristics. The signal analysis revealed that the energy compaction of the DCT and wavelet transforms is limited and that the frequency characteristics of SNR residual signals decay rather slowly. Moreover, the blockiness artifacts of the low bit-rate coded base layer result in artificial high frequencies in the residual signal. Subsequently, a variety of wavelet and embedded DCT coding techniques applicable to the FGS framework are evaluated and their results are interpreted based on the identified signal properties.
As expected from the theoretical signal analysis, the rate-distortion performances of the embedded wavelet and DCT-based coders are very similar. However, improved results can be obtained for the wavelet coder by deblocking the base layer prior to the FGS residual computation. Based on the theoretical analysis and our measurements, we can conclude that for an optimal complexity versus coding-efficiency trade-off, only a limited wavelet decomposition (e.g. 2 stages) needs to be performed for the FGS-residual signal. Also, it was observed that the good rate-distortion performance of a coding technique for a certain image type (e.g. natural still images) does not necessarily translate into similarly good performance for signals with different visual characteristics and statistical properties.
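The two steps above, computing the FGS residual as original minus decoded base layer and then applying only a limited number of wavelet stages, can be sketched in one dimension with an orthonormal Haar transform. This is a minimal illustration assuming a Haar filter bank (the paper evaluates several wavelet and embedded DCT coders, not this specific one); being orthonormal, the transform preserves the residual's energy, which is the property the compaction analysis measures.

```python
def haar_step(x):
    """One orthonormal Haar stage: split a signal into approximation and
    detail halves, each half the input length (input length must be even)."""
    s = 2 ** 0.5
    approx = [(x[2 * i] + x[2 * i + 1]) / s for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) / s for i in range(len(x) // 2)]
    return approx, detail

def limited_haar(residual, stages=2):
    """Decompose a residual line with only `stages` wavelet stages, as the
    complexity/coding-efficiency trade-off above suggests for FGS residuals."""
    approx = list(residual)
    details = []
    for _ in range(stages):
        approx, detail = haar_step(approx)
        details = detail + details  # finer-scale details go last
    return approx + details
```

Because SNR residual spectra decay slowly, deeper decompositions buy little extra compaction here, which is why stopping at two stages is a reasonable operating point.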
In this paper, a simple yet highly effective video compression technique is presented. The zerotree method of Said, an improved version of Shapiro's original algorithm, is applied and extended to three dimensions to encode image sequences. A three-dimensional subband transformation is first performed on the image sequences, and the transformed coefficients are then encoded using the zerotree coding scheme. The algorithm achieves results comparable to MPEG-2, without the complexity of motion compensation. The reconstructed image sequences show no blocking artifacts at very low rates, and the transmission is progressive.