1 January 2001 Multimodal pattern matching for audio-visual query and retrieval
A necessary capability for content-based retrieval is support for the query-by-example paradigm. In the past, there have been several attempts to use low-level features for video retrieval. None of these approaches, however, uses the multimedia information content of the video. We present an algorithm for matching multimodal patterns for the purpose of content-based video retrieval. The novel ability of our approach to use the information content in multiple media, coupled with a strong emphasis on temporal similarity, differentiates it from the state of the art in content-based retrieval. At the core of the pattern matching scheme is a dynamic programming algorithm, which leads to a significant improvement in performance. By coupling audio with video, this algorithm can be applied to grouping shots based on audio-visual similarity. This is much more effective in constructing scenes from shots than using visual content alone.
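As a rough illustration of how a dynamic programming scheme can compare two clips by temporal similarity, the sketch below uses generic dynamic time warping. This is an assumption made for illustration only: the abstract does not specify the authors' recurrence, features, or distance measure. The function names (`frame_distance`, `dtw_distance`, `rank_clips`) and the idea of representing each frame as one vector of concatenated audio and visual features are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors.

    Assumption: each vector concatenates audio and visual features.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, candidate):
    """Cost of the best monotonic alignment of two feature sequences.

    Generic dynamic time warping, not the paper's exact algorithm.
    """
    n, m = len(query), len(candidate)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning query[:i] with candidate[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], candidate[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_clips(query, clips):
    """Return clip names ordered from most to least similar to the query."""
    return sorted(clips, key=lambda name: dtw_distance(query, clips[name]))
```

Under these assumptions, a query-by-example search would call `rank_clips(query_features, {"clip_a": [...], "clip_b": [...]})`. A clip that shows the same pattern at a different pace still aligns at low cost, which is one way to reward temporal similarity.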
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Milind Ramesh Naphade, Roy R. Wang, and Thomas S. Huang "Multimodal pattern matching for audio-visual query and retrieval", Proc. SPIE 4315, Storage and Retrieval for Media Databases 2001, (1 January 2001); doi: 10.1117/12.410927

