Stochastic modeling of soundtrack for efficient segmentation and indexing of video
23 December 1999
Abstract
Tools for efficient and intelligent management of digital content are essential for digital video data management. An extremely challenging research area in this context is multimedia analysis and understanding. In particular, the capabilities of audio analysis for video data management have yet to be fully exploited. We present a novel scheme for segmenting and indexing video by analyzing its audio track, and we apply this analysis to the segmentation and indexing of movies. We build models for several events of interest in the motion picture soundtrack, namely music, human speech, and silence. We propose the use of hidden Markov models to model the dynamics of the soundtrack and to detect audio events. Using these models, we segment and index the soundtrack. A practical problem with motion picture soundtracks is that the audio is composite in nature: sounds from different sources are mixed together. Speech in the foreground with music in the background is a common example. The coexistence of multiple individual audio sources forces us to model such composite events explicitly. Experiments reveal that explicit modeling gives better results than modeling the individual audio events separately.
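As an illustration of the segmentation idea described in the abstract, the following minimal sketch decodes a per-frame sequence of audio-event labels with the Viterbi algorithm and collapses it into indexed segments. It is not the authors' model: the state list, the "sticky" transition matrix, the toy per-frame likelihoods, and all function names are assumptions made for this example. The composite event "speech+music" is given its own state, mirroring the explicit modeling of mixed audio that the abstract argues for.

```python
import math

# Hypothetical event labels; the composite "speech+music" event is its own state.
STATES = ["silence", "speech", "music", "speech+music"]


def sticky_transitions(n, stay=0.9):
    """Log transition matrix that favors staying in the current state (assumed values)."""
    move = (1.0 - stay) / (n - 1)
    return [[math.log(stay if i == j else move) for j in range(n)] for i in range(n)]


def viterbi(log_emissions, log_trans, log_init):
    """Return the most likely state-index sequence.

    log_emissions[t][s] is the log-likelihood of frame t under state s.
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emissions[0][s] for s in range(n_states)]
    back = []
    for frame in log_emissions[1:]:
        prev = score
        score = []
        pointers = []
        for s in range(n_states):
            # Best predecessor for state s at this frame.
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse a per-frame path into (label, start_frame, end_frame) runs."""
    out = []
    start = 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            out.append((STATES[path[start]], start, t - 1))
            start = t
    return out


def toy_emissions(favored, n_states, hit=0.7, miss=0.1):
    """Toy log-likelihoods: each frame favors one state (stand-in for real acoustic scores)."""
    return [[math.log(hit if s == f else miss) for s in range(n_states)] for f in favored]


# Frame 5 is a one-frame "music" glitch inside a speech run.
favored = [0, 0, 0, 1, 1, 2, 1, 1, 3, 3, 3]
n = len(STATES)
path = viterbi(toy_emissions(favored, n), sticky_transitions(n), [math.log(1.0 / n)] * n)
segs = segments(path)
print(segs)
```

In this toy setup the transition penalty outweighs a single mismatched frame, so the isolated glitch is absorbed into the surrounding speech segment rather than producing a spurious one-frame music segment. In a real system the emission scores would come from acoustic features of the soundtrack rather than from hand-set constants.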
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Milind Ramesh Naphade and Thomas S. Huang "Stochastic modeling of soundtrack for efficient segmentation and indexing of video", Proc. SPIE 3972, Storage and Retrieval for Media Databases 2000, (23 December 1999); doi: 10.1117/12.373546; https://doi.org/10.1117/12.373546
PROCEEDINGS
9 PAGES

