An efficient video shot representation for fast video retrieval
31 July 2006
Proceedings Volume 5960, Visual Communications and Image Processing 2005; 59600P (2006) https://doi.org/10.1117/12.631564
Event: Visual Communications and Image Processing 2005, 2005, Beijing, China
Abstract
For video retrieval, a video is partitioned into a group of shots, which are then represented by either key frames or video shot representations. An optimal representation of a shot should include all the information about the frames concerned. In this paper, we propose an efficient representation scheme for a shot, which considers both the spatial frequency contents and the temporal statistics of the frames for video retrieval. In our scheme, each frame in a video shot is transformed into the frequency domain using the discrete cosine transform (DCT), and a number of values at each frequency are selected based on their probability of occurrence. This representation scheme allows retrieval to be carried out hierarchically, i.e. from low-frequency to high-frequency components. Experimental results show that our proposed scheme outperforms the alpha-trimmed average histogram method in terms of retrieval accuracy.
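The scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the quantization rule, the number of values kept per frequency, or the matching criterion, so the parameters `num_freqs`, `top_k`, `step`, and `tol`, the `u + v` frequency ordering, and the function names are all assumptions chosen for illustration. The code uses only the Python standard library and small square grayscale frames.

```python
import math
from collections import Counter


def dct_2d(frame):
    """Orthonormal 2-D DCT-II of a square grayscale frame (list of rows)."""
    n = len(frame)
    # basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [
        [
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    # Separable transform computed as C * X * C^T.
    tmp = [[sum(basis[u][i] * frame[i][j] for i in range(n)) for j in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]


def low_to_high(n):
    """Frequency indices ordered by u + v (assumed low-to-high ordering)."""
    return sorted(((u, v) for u in range(n) for v in range(n)),
                  key=lambda p: (p[0] + p[1], p[0]))


def shot_signature(frames, num_freqs=6, top_k=2, step=8.0):
    """For each of the lowest frequencies, keep the top_k most frequent
    quantized coefficient values across the shot, with their probabilities.
    The uniform quantization step is an assumption, not from the paper."""
    coeffs = [dct_2d(f) for f in frames]
    signature = []
    for u, v in low_to_high(len(frames[0]))[:num_freqs]:
        bins = Counter(round(c[u][v] / step) for c in coeffs)
        signature.append([(b, count / len(frames))
                          for b, count in bins.most_common(top_k)])
    return signature


def hierarchical_match(sig_a, sig_b, tol=1):
    """Compare level by level from low to high frequency, rejecting early."""
    for level_a, level_b in zip(sig_a, sig_b):
        closest = min(abs(a - b) for a, _ in level_a for b, _ in level_b)
        if closest > tol:
            return False  # mismatch at a low frequency: skip higher ones
    return True


# Two toy shots of three constant 4x4 frames each (hypothetical data).
dark = [[[10] * 4 for _ in range(4)] for _ in range(3)]
bright = [[[100] * 4 for _ in range(4)] for _ in range(3)]
same_shot = hierarchical_match(shot_signature(dark), shot_signature(dark))
other_shot = hierarchical_match(shot_signature(dark), shot_signature(bright))
```

Because the comparison starts at the DC term and stops at the first level that disagrees, most non-matching shots are rejected after inspecting only a few low-frequency entries, which is the practical benefit of the hierarchical ordering mentioned in the abstract.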
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Cheng Cai, Kin-Man Lam, Zheng Tan, "An efficient video shot representation for fast video retrieval", Proc. SPIE 5960, Visual Communications and Image Processing 2005, 59600P (31 July 2006); https://doi.org/10.1117/12.631564
PROCEEDINGS
9 PAGES

