The blocking effect is one of the most annoying artifacts in digital video processing and is especially visible in low-bitrate
applications, such as mobile video. To alleviate this problem, we propose an adaptive quantization method for inter
frames that can reduce the visible blocking effect in DCT-based video coding. In the proposed method, a set of quantization
matrices is constructed before processing the video data by exploiting the temporal frequency
limitations of the human visual system. The method adapts to motion information and selects an appropriate
quantization matrix for each inter-coded block. Based on the experimental results, the proposed scheme can achieve
better subjective video quality than conventional flat quantization, especially in low-bitrate applications.
Moreover, it introduces no extra computational cost in a software implementation. The method does not change the
standard bitstream syntax, so it can be applied directly to many DCT-based video codecs. Potential applications include
mobile phones and other digital devices with low-bitrate requirements.
Proc. SPIE. 6507, Multimedia on Mobile Devices 2007
KEYWORDS: Digital signal processing, 3D acquisition, Video acceleration, Detection and tracking algorithms, Video, Denoising, Signal processing, Video processing, Algorithm development, Motion estimation
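The motion-adaptive matrix selection described in the abstract above could be sketched as follows. All function names, the matrix construction rule, and the thresholds here are our own hypothetical choices for illustration, not the paper's actual design: the sketch builds a small bank of quantization matrices that grow coarser toward high frequencies as motion increases, then picks one per inter-coded block from the block's motion vector.

```python
import math

def build_matrix_bank(n_levels=4, size=8, base=16):
    """Build a bank of hypothetical quantization matrices.

    Higher motion levels get coarser quantization of high-frequency
    coefficients, mimicking the reduced temporal-frequency sensitivity
    of the human visual system for fast-moving content.
    """
    bank = []
    for level in range(n_levels):
        m = [[base + (u + v) * (2 + level * 2) for v in range(size)]
             for u in range(size)]
        bank.append(m)
    return bank

def select_matrix(bank, mv_x, mv_y, step=4.0):
    """Map a block's motion-vector magnitude to a matrix in the bank."""
    speed = math.hypot(mv_x, mv_y)
    idx = min(int(speed // step), len(bank) - 1)
    return bank[idx]
```

A static block (zero motion vector) gets the finest matrix, while a fast-moving block gets the coarsest one; because the bank is built once before processing, selection itself is just an index lookup.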
Recent developments in the field of embedded systems have enabled mobile devices with significant computational
power and long battery life. However, there are still only a limited number of video applications for such platforms. Due to
the high computational requirements of video processing algorithms, intensive assembler optimization or even a hardware
design is required to meet the resource constraints of mobile platforms. One example of such a challenging video
processing problem is video denoising.
In this paper, we present a software implementation of a state-of-the-art video denoising algorithm on a mobile
computational platform. The chosen algorithm is based on the three-dimensional discrete cosine transform (3D DCT)
and block matching. Apart from its architectural simplicity, the algorithm allows computational scalability due to its
"sliding window"-style processing. In addition, the main components of this algorithm are the 8-point DCT and block
matching, which can be calculated efficiently with the hardware acceleration of a modern DSP.
Our target platform is the OMAP Innovator development kit, a dual-processor environment including an ARM925 RISC
general-purpose processor (GPP) and a TMS320C55x digital signal processor (DSP). The C55x DSP offers hardware
acceleration support for computing the DCT and block matching intensively used in the chosen denoising algorithm.
Hardware acceleration can offer a significant speed-up compared to assembler optimization of the source code. The
results demonstrate the possibility of implementing an efficient video denoising algorithm on a mobile computational
platform with limited computational resources.
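The computational core mentioned above, an 8-point DCT followed by coefficient shrinkage, can be sketched in a few lines. This is a minimal scalar Python version for illustration only; the paper's implementation runs on the C55x with hardware acceleration and operates on 3D groups of matched blocks, and the hard-threshold rule below is a common generic choice rather than the paper's exact filter.

```python
import math

def dct8(x):
    """Orthonormal 8-point DCT-II (the kernel accelerated on the DSP)."""
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        out.append(s * math.sqrt((1 if k == 0 else 2) / N))
    return out

def idct8(X):
    """Inverse (DCT-III) of the orthonormal DCT-II above."""
    N = len(X)
    return [sum(X[k] * math.sqrt((1 if k == 0 else 2) / N)
                * math.cos(math.pi * (n + 0.5) * k / N) for k in range(N))
            for n in range(N)]

def denoise_1d(x, threshold):
    """Hard-threshold small transform coefficients, then invert.

    Small coefficients are assumed to carry mostly noise; zeroing
    them and inverting yields the filtered signal.
    """
    X = dct8(x)
    X = [c if abs(c) > threshold else 0.0 for c in X]
    return idct8(X)
```

In the sliding-window scheme, a kernel like this is applied to overlapping blocks and the filtered estimates are averaged, which is what makes the complexity scalable with the window step.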
Low-complexity video coding schemes aim to provide video encoding services also for devices with restricted
computational power. A video coding process based on the three-dimensional discrete cosine transform (3D DCT)
can offer a low-complexity video encoder by omitting the computationally demanding motion estimation operation.
In this coding scheme, an extended fast transform is used instead of motion estimation to decorrelate
the temporal dimension of the video data. Typically, the most complex part of the 3D DCT based coding process
is the three-dimensional transform. In this paper, we demonstrate methods that can be used in lossy coding
process to reduce the number of one-dimensional transforms required to complete the full 3D DCT or its inverse
operation. Because unnecessary computations can be omitted, fewer operations are required to complete the
transform. Results include the computational savings obtained for standard video test sequences. The savings
are reported in terms of computational operations. Generally, a reduced number of computational operations
also implies a longer battery lifetime for portable devices.
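The idea of skipping unnecessary one-dimensional transforms can be illustrated with a separable 3D DCT that simply omits the 1D transform of any all-zero input line, since such a line transforms to zero anyway. This is a simplified stand-in for the paper's methods; the helper names and the zero test are our own.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a length-N list."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            * math.sqrt((1 if k == 0 else 2) / N) for k in range(N)]

def _line(cube, axis, i, j):
    """Extract a 1D line of an NxNxN cube along the given axis."""
    N = len(cube)
    if axis == 0:
        return [cube[n][i][j] for n in range(N)]
    if axis == 1:
        return [cube[i][n][j] for n in range(N)]
    return [cube[i][j][n] for n in range(N)]

def _store(cube, axis, i, j, vals):
    """Write a transformed line back into the cube."""
    for n, v in enumerate(vals):
        if axis == 0:
            cube[n][i][j] = v
        elif axis == 1:
            cube[i][n][j] = v
        else:
            cube[i][j][n] = v

def dct3_skip_zero(cube):
    """Separable 3D DCT that skips all-zero input lines.

    Returns the number of 1D transforms actually evaluated; a full
    transform of an NxNxN cube would need 3*N*N of them.
    """
    N = len(cube)
    evaluated = 0
    for axis in range(3):
        for i in range(N):
            for j in range(N):
                line = _line(cube, axis, i, j)
                if any(abs(v) > 1e-12 for v in line):
                    _store(cube, axis, i, j, dct_1d(line))
                    evaluated += 1
    return evaluated
```

For a sparse 4x4x4 cube with a single nonzero sample, this sketch evaluates only 21 of the 48 one-dimensional transforms, which is the kind of saving the paper reports in terms of computational operations.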
In this paper, we propose an image coding scheme that uses an adaptive resizing algorithm to obtain a more compact coefficient representation in the block-DCT domain. Standard coding systems, e.g. JPEG baseline, utilize the block-DCT transform to reduce spatial correlation and to represent the image information with a small number of visually significant transform coefficients. Because neighboring coefficient blocks may include only a few low-frequency coefficients, we can use a downsizing operation to combine the information of two neighboring blocks into a single block.
Fast and elegant image resizing methods operating in the transform domain have been introduced previously. In this paper, we introduce a way to use these algorithms to reduce the number of coefficient blocks that need to be encoded. At the encoder, the downsizing operation must be performed carefully to gain compression efficiency. The information of neighboring blocks can be combined efficiently if the blocks do not contain significant high-frequency components and if the blocks share similar characteristics. Based on our experiments, the proposed method can offer from 0 to 4 dB PSNR gain for block-DCT based coding processes. The best performance can be expected for large images containing smooth homogeneous areas.
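A merge test along the lines described above could be sketched as follows. The energy measure, the 4x4 low-frequency corner, and both thresholds are hypothetical illustrations, not the paper's actual criteria.

```python
def can_merge(block_a, block_b, hf_limit=1.0, dc_gap=32.0):
    """Decide whether two neighboring 8x8 DCT blocks may be merged.

    Hypothetical rule: merge only if neither block carries significant
    energy outside its low-frequency 4x4 corner (so downsizing loses
    little) and the blocks have similar mean intensity (DC terms close).
    """
    def hf_energy(b):
        # energy of every coefficient outside the top-left 4x4 corner
        return sum(b[u][v] ** 2 for u in range(8) for v in range(8)
                   if u >= 4 or v >= 4)

    if hf_energy(block_a) > hf_limit or hf_energy(block_b) > hf_limit:
        return False
    return abs(block_a[0][0] - block_b[0][0]) <= dc_gap
```

Only pairs that pass a test like this would be fed to the transform-domain downsizing step; all other blocks are encoded unchanged, which is why the gain ranges from 0 dB (no mergeable blocks) upward.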
This paper describes how a scene cut detector can be utilized in a video codec based on the three-dimensional discrete cosine transform (3D DCT). In the 3D DCT based video codec, data is processed in 8x8x8 cubes; hence a set of 8 images needs to be available in memory at a time. A change of video scene may occur between any of the images stored in the memory. A rapid scene change within an 8x8x8 cube produces significant high-frequency coefficients in the temporal dimension of the DCT domain. If these important high-frequency coefficients are discarded, the information of the two scenes is mixed around the scene cut position, causing ghost artifacts in the reconstructed video sequence. Therefore, an approach to handle each of the eight possible scene change positions within an 8x8x8 cube is proposed. The proposed method utilizes the 8x8x4 DCT, forced-fill, repeat-previous-frame, and average-to-previous-frame techniques. By utilizing a scene cut detector in the 3D DCT based video codec, unnecessary quality drops can be avoided without reducing the compression ratio. Notable quality improvements can be achieved for images around a scene cut position.
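A minimal sketch of the detection step, together with the repeat-previous-frame handling option named above, might look as follows. The mean-absolute-difference detector, its threshold, and the flat-list frame representation are our own simplified assumptions, not the paper's detector.

```python
def find_scene_cut(frames, threshold=30.0):
    """Locate a scene cut inside a group of 8 frames.

    Returns the index of the first frame of the new scene, or None.
    Hypothetical detector: mean absolute difference between
    consecutive frames compared against a fixed threshold.
    """
    for t in range(1, len(frames)):
        prev, cur = frames[t - 1], frames[t]
        mad = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mad > threshold:
            return t
    return None

def repeat_previous_frame(frames, cut):
    """One handling option, sketched: pad the tail of the cube with the
    last frame of the old scene so that no cross-scene content reaches
    the temporal DCT and no ghosting is mixed into the old scene."""
    return frames[:cut] + [list(frames[cut - 1])] * (len(frames) - cut)
```

With the cut position known, the codec can instead pick whichever of the listed techniques (8x8x4 DCT, forced-fill, repeat previous frame, average to previous frame) suits that position best.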
In this paper, a simplified three-dimensional discrete cosine transform (3D DCT) based video codec is proposed. The computational complexity of the baseline 3D DCT based video codec is reduced by simplifying the transformation block. In video sequences with low motion activity, consecutive images are highly correlated in the temporal dimension, so the DCT does not usually produce significant coefficient values at the higher temporal frequencies. Therefore, it is possible to use a simple averaging operation and the 2D DCT, instead of the full 3D DCT operation, for some of the cubes being processed. Furthermore, some of the resulting cubes can be combined to achieve a more efficient binary representation.
Based on our results, the simplifications considerably improved the compression efficiency of the 3D DCT based codec for video sequences with low motion activity, while the compression efficiency for sequences with high motion activity was maintained. At the same time, the coding speed of the simplified 3D DCT based video codec increased compared to the original. Although the compression efficiency of the H.263 video codec was not reached, the 3D DCT based video encoder was many times faster than the H.263 encoder.
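The averaging simplification could be sketched as follows; the activity measure, the threshold value, and the flat-list frame representation are hypothetical choices for illustration rather than the paper's exact decision rule.

```python
def temporal_activity(cube):
    """Mean absolute frame-to-frame difference inside a cube.

    cube: list of frames, each a flat list of pixel values.
    """
    total, count = 0.0, 0
    for t in range(1, len(cube)):
        for a, b in zip(cube[t - 1], cube[t]):
            total += abs(a - b)
            count += 1
    return total / count

def reduce_cube(cube, threshold=2.0):
    """Replace a low-motion cube by its average frame, to be coded with
    a single 2D DCT; keep high-motion cubes for the full 3D DCT."""
    if temporal_activity(cube) <= threshold:
        n = len(cube)
        mean = [sum(f[i] for f in cube) / n for i in range(len(cube[0]))]
        return ('2d', mean)
    return ('3d', cube)
```

For a nearly static cube this trades eight temporal 1D transforms per spatial position for one averaging pass, which is where both the speed-up and, for low-motion content, the compression gain come from.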