As video databases become increasingly important for the full exploitation of multimedia resources, this paper describes our recent feasibility studies towards building a content-based, high-level video retrieval and management system. The study focuses on constructing a semantic tree structure by combining low-level image processing techniques with high-level interpretation of visual content. Specifically, two separate algorithms were developed to organise input videos into two layers: the shot layer and the key-frame layer. While the shot layer is derived by multi-featured shot cut detection, the key-frame layer is extracted automatically by a genetic algorithm. This paves the way for applying pattern recognition techniques to analyse those key frames and thus extract high-level information to interpret the visual content or objects. Correspondingly, content-based video retrieval can be conducted in three stages. The first stage is to browse the digital video via the semantic tree at the structural level. The second stage is to match the key frames in terms of low-level features, such as colour, object shape, and texture. Finally, the third stage is to match the high-level information, such as a conversation with an indoor background or vehicles moving along a seaside road. Extensive experiments on shot cut detection and key frame extraction are reported in this paper, enabling the tree structure to be constructed.
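To make the shot layer concrete, the following is a minimal sketch of cut detection using a single feature, the grey-level histogram difference between consecutive frames. It is not the multi-featured detector described above, whose features are not specified here. The frame representation (a flat sequence of 8-bit grey values), the function names, the bin count, and the threshold are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical frame representation: a flat, non-empty sequence of
# 8-bit grey values (0..255).
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalised intensity histogram of one frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalised histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5,
                bins: int = 16) -> List[int]:
    """Indices i where a cut is declared between frame i-1 and frame i."""
    hists = [gray_histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_distance(hists[i - 1], hists[i]) > threshold]


def split_into_shots(num_frames: int, cuts: Sequence[int]) -> List[range]:
    """Turn cut positions into the frame ranges that form the shot layer."""
    bounds = [0] + list(cuts) + [num_frames]
    return [range(a, b) for a, b in zip(bounds, bounds[1:])]
```

For a sequence of two dark frames followed by three bright frames, `detect_cuts` reports a single cut at index 2, and `split_into_shots` yields the two shots covering frames 0 to 1 and 2 to 4. A key frame would then be chosen within each shot; the genetic algorithm used for that step is not sketched here.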
Block-based motion estimation is widely used in video compression because of its high processing
speed and competitive compression efficiency. In the chain of compression operations, however, motion estimation
remains the most time-consuming process. As a result, any improvement in fast motion estimation will make
practical applications of MPEG techniques more efficient and more sustainable in terms of both processing speed and
computing cost. To meet the requirements of real-time compression of videos and image sequences, in applications such as video
conferencing, remote video surveillance and video phones, we propose a new search algorithm that achieves fast
motion estimation for MPEG compression standards, building on existing algorithm developments. To evaluate the
proposed algorithm, we adopted MPEG-4 and the prediction line search algorithm as benchmarks in designing the
experiments. Performance is measured by (i) reconstructed video quality and (ii) processing time. The results reveal
that the proposed algorithm provides a competitive alternative to the existing prediction line search algorithm. In
comparison with MPEG-4, the proposed algorithm shows significant advantages in terms of processing speed and
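For readers unfamiliar with block-based motion estimation, the sketch below shows the exhaustive (full search) baseline that fast search algorithms aim to speed up. It is not the proposed search algorithm, nor the prediction line search, neither of which is specified in this abstract. The names `sad` and `full_search`, the block size, and the search range are assumptions for illustration; frames are plain 2-D lists of grey values, and the current block is assumed to lie fully inside both frames.

```python
from typing import List, Tuple

Image = List[List[int]]  # hypothetical frame type: rows of grey values


def sad(cur: Image, ref: Image, bx: int, by: int,
        dx: int, dy: int, n: int) -> int:
    """Sum of absolute differences between the n x n block at (bx, by)
    in cur and the block displaced by (dx, dy) in ref."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )


def full_search(cur: Image, ref: Image, bx: int, by: int,
                n: int = 8, search_range: int = 4) -> Tuple[int, int, int]:
    """Test every displacement in the search window and return
    (dx, dy, cost) for the candidate with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, bx, by, 0, 0, n))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Full search evaluates (2 * search_range + 1)^2 candidates per block, which is why motion estimation dominates encoding time; fast algorithms reduce the number of candidates that are tested, usually at some cost in matching accuracy.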
We propose a fast partial decoding algorithm for content access to MPEG compressed videos in applications where full decompression is not necessarily required, such as compressed video browsing, content analysis, and specific pattern search. The proposed decoding bypasses the inverse DCT via an approximation process that extracts average pixels directly from the compressed DCT coefficients. Although such extracted pixels may differ from their fully decompressed counterparts, extensive experiments show that the partially decoded video frames preserve their content very well and achieve reasonable perceptual quality under visual inspection.
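The simplest case of extracting average pixels without an inverse DCT can be sketched as follows. Under the 8 x 8 DCT definition used by MPEG, the DC coefficient of a block equals eight times the block's mean pixel value, so the block average is obtained by a single division. This sketch covers only that relationship for dequantised, intra-coded blocks; the approximation process of the proposed algorithm is not specified in this abstract, and the function names here are assumptions. The forward DCT is included only to check the relationship.

```python
import math
from typing import List

N = 8  # block size used by MPEG


def dct2_8x8(block: List[List[float]]) -> List[List[float]]:
    """Forward 2-D DCT of an 8 x 8 block, MPEG/JPEG normalisation.
    Included only to verify the DC relationship; a partial decoder
    would read the coefficients from the bitstream instead."""
    def c(k: int) -> float:
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = (2.0 / N) * c(u) * c(v) * total
    return coeffs


def block_average_from_dc(dc: float) -> float:
    """Mean pixel value of an 8 x 8 block, from its DC coefficient."""
    return dc / N


def dc_thumbnail(coeff_blocks: List[List[List[List[float]]]]) -> List[List[float]]:
    """Map a grid of 8 x 8 coefficient blocks to one average per block."""
    return [[block_average_from_dc(b[0][0]) for b in row]
            for row in coeff_blocks]
```

Applied to every block of an intra-coded frame, `dc_thumbnail` yields an image at one eighth of the original resolution in each dimension. Blocks in predicted frames carry residuals rather than pixel values, so they would need additional handling that is outside this sketch.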