23 December 1999 Integrated approach to multimodal media content analysis
Author Affiliations +
Abstract
In this work, we present a system for the automatic segmentation, indexing and retrieval of audiovisual data based on the combination of audio, visual and textural content analysis. The video stream is demultiplexed into audio, image and caption components. Then, a semantic segmentation of the audio signal based on audio content analysis is conducted, and each segment is indexed as one of the basic audio types. The image sequence is segmented into shots based on visual information analysis, and keyframes are extracted from each shot. Meanwhile, keywords are detected from the closed caption. Index tables are designed for both linear and non-linear access to the video. It is shown by experiments that the proposed methods for multimodal media content analysis are effective. And that the integrated framework achieves satisfactory results for video information filtering and retrieval.
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tong Zhang, Tong Zhang, C.-C. Jay Kuo, C.-C. Jay Kuo, } "Integrated approach to multimodal media content analysis", Proc. SPIE 3972, Storage and Retrieval for Media Databases 2000, (23 December 1999); doi: 10.1117/12.373583; https://doi.org/10.1117/12.373583
PROCEEDINGS
12 PAGES


SHARE
Back to Top