Due to the growing popularity of portable multimedia display devices and wide availability of high-definition video
content, the transcoding of high-resolution videos into lower resolution ones with different formats has become a crucial
challenge for PC platforms. This paper presents our study on leveraging the Unified Video Decoder (UVD)
provided by the graphics processing unit (GPU) to achieve high-speed video transcoding with low CPU usage. Our
experimental results show that off-loading video decoding and video scaling to the GPU can double transcoding speed with
only half the CPU usage compared to in-box software decoders for transcoding 1080p (1920x1080) video content on an
AMD Vision processor with an integrated graphics unit.
The video coding scheme defined by the MPEG-4 standard offers several content-based functionalities, demanding a description of the scene in terms of so-called video objects. The separate coding of the video objects may enrich user interaction in several multimedia services, owing to flexible access to the bit-stream and easy manipulation of the video information. In this framework, the coder may perform a locally defined pre-processing step aimed at the automatic identification of the objects appearing in the sequence. Hence, video segmentation is a key issue in efficiently applying the MPEG-4 coding scheme. This paper presents a segmentation algorithm based on the watershed algorithm and optical-flow motion estimation. Our simulation results show that this method is able to solve complex segmentation tasks according to luminance-homogeneity and motion-coherence criteria.
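The two criteria above can be illustrated with a toy sketch: merge adjacent blocks into one region only when both their luminance gap and their motion-vector gap are small. The block layout, thresholds, and union-find merge are illustrative assumptions, not the paper's actual watershed/optical-flow pipeline.

```python
# Toy region merging driven by luminance homogeneity and motion
# coherence. This is a hedged sketch, not the paper's algorithm.

def find(parent, i):
    # Union-find root lookup with path compression.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def segment_blocks(luma, motion, luma_thresh=10.0, motion_thresh=1.0):
    """Merge horizontally adjacent blocks into regions when both the
    luminance gap and the motion-vector gap are below threshold."""
    n = len(luma)
    parent = list(range(n))
    for i in range(n - 1):
        dy = abs(luma[i] - luma[i + 1])
        dvx = abs(motion[i][0] - motion[i + 1][0])
        dvy = abs(motion[i][1] - motion[i + 1][1])
        if dy < luma_thresh and dvx + dvy < motion_thresh:
            parent[find(parent, i + 1)] = find(parent, i)
    return [find(parent, i) for i in range(n)]

# Two coherent background blocks, then a moving object with
# different luminance: the algorithm should produce two regions.
labels = segment_blocks([100, 104, 180, 182],
                        [(0, 0), (0, 0), (3, 1), (3, 1)])
```

A real implementation would apply this merging over a 2-D block grid seeded by watershed catchment basins, with motion vectors estimated by optical flow rather than given.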
This paper presents a new region-based multi-scale video coding algorithm, which employs an efficient spatial and frequency decomposition scheme within each frame. Spatial decomposition is implemented through a simple region-growing block segmentation method, yielding arbitrarily shaped, (relatively) homogeneous regions. Frequency decomposition is implemented by finding the optimum wavelet packet for each of the spatial regions by pruning a 3-level full wavelet tree. To adapt the coding to the motion within the sequence, that is, to exploit the temporal redundancy between frames, a differential pulse code modulation (DPCM) loop is used. The optimum coding is based on a rate-distortion criterion. The bit allocation to all subbands of the wavelet packet for each of the regions of a frame is done such that the overall distortion D is minimized under a bit-rate constraint R. The constrained optimization problem is transformed into an unconstrained minimization of the Lagrangian cost function J = D + λR. It then follows that the optimum bit-rate allocation is achieved when all subbands of all regions operate at points of identical slope on their respective R-D curves.
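The Lagrangian allocation just described can be sketched concretely: for a given λ, every subband independently picks the operating point minimizing J = D + λR, and a bisection on λ drives the total rate to the budget, approximating the equal-slope condition. The toy R-D curves below are illustrative assumptions.

```python
# Hedged sketch of Lagrangian bit allocation over discrete R-D curves.
# The curves and budget are toy numbers, not from the paper.

def allocate(rd_curves, rate_budget):
    """rd_curves: one list of (rate, distortion) points per subband.
    Returns the chosen operating points fitting the rate budget."""
    def pick(lam):
        # Independent per-subband minimization of J = D + lam * R.
        return [min(curve, key=lambda p: p[1] + lam * p[0])
                for curve in rd_curves]

    lo, hi = 0.0, 1e6              # bisection bounds on lambda
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        total_rate = sum(r for r, _ in pick(lam))
        if total_rate > rate_budget:
            lo = lam               # too many bits: penalize rate harder
        else:
            hi = lam
    return pick(hi)

# Two convex toy R-D curves (rate in bits, distortion in MSE).
curves = [[(0, 100), (2, 40), (4, 10)],
          [(0, 80), (1, 50), (3, 15)]]
chosen = allocate(curves, rate_budget=5)
```

At the converged λ the selected points sit at (approximately) equal slopes on their respective curves, which is exactly the optimality condition the abstract states.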
In this paper we propose an adaptive region-based, multi-scale, motion-compensated video compression algorithm designed for transmission over hostile communication channels. Our codec extracts spatial information from video frames to create video regions that are then decomposed into sub-bands of different perceptual importance before being compressed and transmitted independently. This allows the system to apply unequal error protection, prioritized transmission, and 'lego-reconstruction' to guarantee a minimum spatial and temporal resolution at the receiver. Furthermore, the region-segmented frames bound both spatial and temporal error propagation within frames and, when combined with our novel connection-level inter-region statistical multiplexing scheme, ensure optimal utilization of the reserved transmission bandwidth. Simulation results demonstrate that in the presence of severe time-varying error conditions and severe bandwidth constraints, our video codec exhibits better error concealment, better temporal resolution, and better bandwidth utilization than the popular video coding standards of the International Telecommunication Union and the International Organization for Standardization.
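The prioritized-transmission idea can be sketched as a greedy channel fill: sub-bands are sent in priority order so that under congestion the perceptually important base layer always survives. The band names, sizes, and priorities below are illustrative assumptions; the paper's actual scheme is a connection-level statistical multiplexer.

```python
# Hedged sketch: priority-ordered transmission under a byte budget.

def schedule(subbands, budget_bytes):
    """subbands: dicts with 'name', 'priority' (0 = most important),
    and 'size'. Fill the channel highest priority first."""
    sent, used = [], 0
    for sb in sorted(subbands, key=lambda s: s["priority"]):
        if used + sb["size"] <= budget_bytes:
            sent.append(sb["name"])
            used += sb["size"]
    return sent

bands = [
    {"name": "LL-base", "priority": 0, "size": 400},    # coarse image
    {"name": "LH-detail", "priority": 1, "size": 300},
    {"name": "HH-detail", "priority": 2, "size": 500},  # finest detail
]
sent = schedule(bands, budget_bytes=800)  # congested channel
```

Dropping only the lowest-priority detail band is what lets the receiver guarantee a minimum spatial and temporal resolution, as the abstract claims.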
H.263 is one of the most efficient video compression standards available today, and it is expected to replace H.261 in many applications. This paper analyzes the computational complexity of H.263 and presents a set of methods for maximizing the performance of this codec on Digital's 64-bit Alpha processor. The optimization problem is approached from two directions: algorithmic enhancement and efficient software implementation. Performance comparisons are presented for the default mode and the full-option mode. Finally, this paper provides a brief description of the multimedia instructions supported by a new generation of Alpha CPUs, which is designed to provide the most cost-effective solution for software-only video compression and other multimedia applications.
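One classic algorithmic enhancement of the kind such codec optimizations evaluate is early-terminated SAD (sum of absolute differences) in block motion search: abandon a candidate block as soon as its partial SAD exceeds the best score so far, skipping most of the arithmetic. The block contents below are illustrative; the paper does not specify this particular technique.

```python
# Hedged sketch: SAD with early termination for motion search.

def sad_early_exit(cur, ref, best_so_far):
    """Accumulate |cur[i] - ref[i]|, bailing out as soon as the
    running total can no longer beat the current best match."""
    total = 0
    for a, b in zip(cur, ref):
        total += abs(a - b)
        if total >= best_so_far:
            return best_so_far  # candidate rejected early
    return total

cur_block = [10, 12, 11, 13]
# A near-perfect match sets a tight bound ...
best = sad_early_exit(cur_block, [10, 12, 11, 14], best_so_far=1 << 30)
# ... which lets a poor candidate be rejected after one pixel.
worse = sad_early_exit(cur_block, [90, 90, 90, 90], best_so_far=best)
```

On a wide machine like the Alpha, the same loop would additionally be vectorized with the CPU's multimedia instructions, combining the algorithmic and implementation directions the abstract names.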
This paper presents an efficient temporal, spatial and frequency decomposition method for improving the performance of the existing methods for 3D subband coding of video signals. In this method, a given video sequence is first partitioned into constant-sized 'groups' of frames. Within each group, every two consecutive frames are decomposed into low and high temporal subbands by a two-tap temporal filter. The redundancy in the low temporal subbands is removed by a closed loop DPCM unit. The low and high subbands of the first pair of frames, the difference of the low subbands from the DPCM loop and the high subbands of the subsequent pairs of frames in the 'group' are then divided into image blocks of equal size, such that each block can be efficiently decomposed by an adaptive wavelet packet based on a rate-distortion criterion. The subbands of wavelet packets for all blocks are quantized by a hybrid scalar/pyramidal lattice vector quantizer. This scheme achieves minimum distortion under a rate constraint specified for the 'groups' of frames and under the structural constraints of the algorithm. In this scheme, the motion compensation is not used explicitly, but the effects of motion are accounted for through the low-high temporal subbanding and the DPCM procedure. The results compare favorably with those of traditional video coding techniques. Although the computational complexity of this scheme is higher than that of the existing 3D subband method, it is suitable for parallel implementation.
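The two-tap temporal split above can be sketched with the normalized Haar pair low = (f1 + f2)/2, high = (f1 - f2)/2, an illustrative choice of coefficients; the abstract specifies only that the filter is two-tap.

```python
# Hedged sketch of a two-tap temporal subband split (Haar-style).

def temporal_split(f1, f2):
    """Decompose two consecutive frames (flat pixel lists) into a
    low temporal subband (average) and a high one (difference)."""
    low = [(a + b) / 2 for a, b in zip(f1, f2)]
    high = [(a - b) / 2 for a, b in zip(f1, f2)]
    return low, high

def temporal_merge(low, high):
    # Perfect reconstruction: f1 = low + high, f2 = low - high.
    f1 = [l + h for l, h in zip(low, high)]
    f2 = [l - h for l, h in zip(low, high)]
    return f1, f2

frame1 = [100, 102, 98, 101]
frame2 = [101, 103, 97, 100]   # small motion: high band near zero
low, high = temporal_split(frame1, frame2)
r1, r2 = temporal_merge(low, high)
```

With slowly moving content the high band carries little energy, which is why the scheme can account for motion through subbanding and DPCM without explicit motion compensation.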
This paper presents the design of an improved image compression algorithm based on an optimal spatial and frequency decomposition of images. The use of spatially varying wavelet packets for a generalized wavelet decomposition of images was recently introduced by Asai, Ramchandran and Vetterli. They use a `double-tree' algorithm to obtain the optimal set of bases for a given image, through a joint optimization with respect to frequency decomposition by a wavelet packet and spatial decomposition based on a quad-tree structure. In this paper, we present a `double-tree' frequency and spatial decomposition algorithm that extends the existing algorithm in three areas. First, instead of the quad-tree structure, our algorithm uses a more flexible merging scheme for the spatial decomposition of the image. Second, instead of a scalar quantizer, we use a pyramidal lattice vector quantizer to represent each subband of each wavelet packet, which improves the coding efficiency of the representation. Both of these extensions yield improved rate-distortion (R-D) performance. Finally, our algorithm uses a scheme that gives a good initial value for the slope of the R-D curve, reducing the total computation needed to obtain the optimum decompositions.
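The rate-distortion tree pruning underlying both wavelet-packet and double-tree searches can be sketched as a bottom-up rule: keep a split only if the children's combined Lagrangian cost J = D + λR beats the parent's. The node costs below are illustrative toy numbers, not from either paper.

```python
# Hedged sketch of bottom-up R-D tree pruning.

def prune(node, lam):
    """node: (rate, distortion, children), children a list of nodes
    or []. Returns (best_cost, True if the split is kept)."""
    rate, dist, children = node
    parent_cost = dist + lam * rate
    if not children:
        return parent_cost, False
    split_cost = sum(prune(c, lam)[0] for c in children)
    if split_cost < parent_cost:
        return split_cost, True     # keep the finer decomposition
    return parent_cost, False       # prune back to the parent

# A root whose two children together code cheaper than the root alone,
# so the optimal tree keeps the split.
leaf_a = (2, 10, [])
leaf_b = (2, 12, [])
root = (3, 60, [leaf_a, leaf_b])
cost, split = prune(root, lam=1.0)
```

The "good initial value for the slope" contribution amounts to starting the search for λ near its optimum, so fewer such pruning passes are needed.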