Audio-visual query and retrieval: a system that uses dynamic programming and relevance feedback
1 October 2001
Abstract
Query by example is a necessary capability for content-based retrieval. Several earlier approaches use low-level features for video retrieval, but most of them support queries using image sequences only. We present an algorithm for matching multimodal (audio-visual) patterns for the purpose of content-based video retrieval. Our approach differs from the state of the art in content-based retrieval in two ways: it uses the information content of multiple media, and it places a strong emphasis on temporal similarity. At the core of the pattern-matching scheme is a dynamic programming algorithm, which leads to a significant improvement in performance. Because it couples audio with video, the algorithm can also be applied to grouping shots by audio-visual similarity. We also support relevance feedback: the user provides feedback to the system by choosing the clips that are closer to the desired target. The system then automatically adjusts the relative weights, or relevance, of the media and fetches different sets of target clips accordingly. We observe that a few iterations of such feedback are generally sufficient to retrieve the desired video clips.
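The abstract does not spell out the matching recurrence or the weight-update rule, so the following is only an illustrative sketch and not the authors' method. It assumes each clip is a list of per-frame (audio features, video features) pairs, substitutes a standard dynamic time warping recurrence for the dynamic programming matcher, and uses a hypothetical heuristic for relevance feedback. The names `frame_distance`, `dtw_cost`, `reweight`, and the `rate` parameter are all assumptions made for this example.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(query, target, w_audio=0.5, w_video=0.5):
    """Length-normalised alignment cost between two clips.

    Each clip is a list of (audio_features, video_features) pairs, one per
    frame. Standard dynamic time warping is assumed here as a stand-in for
    the paper's dynamic programming matcher.
    """
    n, m = len(query), len(target)
    if n == 0 or m == 0:
        raise ValueError("clips must be non-empty")
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i query frames
    # with the first j target frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        q_audio, q_video = query[i - 1]
        for j in range(1, m + 1):
            t_audio, t_video = target[j - 1]
            d = (w_audio * frame_distance(q_audio, t_audio)
                 + w_video * frame_distance(q_video, t_video))
            cost[i][j] = d + min(cost[i - 1][j],      # skip a query frame
                                 cost[i][j - 1],      # skip a target frame
                                 cost[i - 1][j - 1])  # match both frames
    return cost[n][m] / (n + m)


def reweight(query, chosen, w_audio, w_video, rate=0.5):
    """Hypothetical feedback rule: move weight toward the modality in which
    the user-chosen clips are closer to the query."""
    if not chosen:
        return w_audio, w_video
    audio_cost = sum(dtw_cost(query, c, 1.0, 0.0) for c in chosen) / len(chosen)
    video_cost = sum(dtw_cost(query, c, 0.0, 1.0) for c in chosen) / len(chosen)
    total = audio_cost + video_cost
    if total == 0.0:
        return w_audio, w_video  # chosen clips match in both media; keep weights
    # A small audio cost means audio explains the user's choice well.
    target_audio = video_cost / total
    new_audio = (1.0 - rate) * w_audio + rate * target_audio
    return new_audio, 1.0 - new_audio


# Toy clips with one-dimensional features per modality.
query_clip = [([0.0], [1.0]), ([1.0], [0.0])]
chosen_clip = [([0.0], [5.0]), ([1.0], [4.0])]  # same audio, different video
w_a, w_v = reweight(query_clip, [chosen_clip], 0.5, 0.5)
```

In this toy case the chosen clip matches the query in audio but not in video, so the audio weight grows. A retrieval loop under these assumptions would rank the database by `dtw_cost` with the current weights, collect the user's choices, call `reweight`, and rank again.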
© (2001) Society of Photo-Optical Instrumentation Engineers (SPIE)
Milind Ramesh Naphade, Roy R. Wang, Thomas S. Huang, "Audio-visual query and retrieval: a system that uses dynamic programming and relevance feedback," Journal of Electronic Imaging 10(4), (1 October 2001). https://doi.org/10.1117/1.1407822
JOURNAL ARTICLE
10 PAGES

