10 January 2003 Survey of compressed domain audio features and their expressiveness
Author Affiliations +
We give an overview of existing audio analysis approaches in the compressed domain and incorporate them into a coherent formal structure. After examining the kinds of information accessible in an MPEG-1 compressed audio stream, we describe a coherent approach to determine features from them and report on a number of applications they enable. Most of them aim at creating an index to the audio stream by segmenting the stream into temporally coherent regions, which may be classified into pre-specified types of sounds such as music, speech, speakers, animal sounds, sound effects, or silence. Other applications centre around sound recognition such as gender, beat or speech recognition.
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Silvia Pfeiffer, Silvia Pfeiffer, Thomas Vincent, Thomas Vincent, } "Survey of compressed domain audio features and their expressiveness", Proc. SPIE 5021, Storage and Retrieval for Media Databases 2003, (10 January 2003); doi: 10.1117/12.476300; https://doi.org/10.1117/12.476300

Back to Top