13 April 2018 Hierarchical vs non-hierarchical audio indexation and classification for video genres
Proceedings Volume 10696, Tenth International Conference on Machine Vision (ICMV 2017); 1069621 (2018) https://doi.org/10.1117/12.2309852
Event: Tenth International Conference on Machine Vision, 2017, Vienna, Austria
Abstract
In this paper, Support Vector Machines (SVMs) are used to segment and index video genres using only audio features extracted at block level, which has the advantage of capturing local temporal information. The main contribution of our study is to show how strongly classification accuracy is affected by using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. The classification covers three common video genres: sports videos, music clips, and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The approach was validated on more than 360 minutes of video, yielding a classification accuracy of over 99%.
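The contrast between a hierarchical and a flat categorization structure can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: block-level features are taken to be the per-coefficient mean and standard deviation of frame-level vectors such as MFCCs, a nearest-centroid classifier stands in for the SVMs so that the example needs only the Python standard library, and the sub-genre labels `sub_a` and `sub_b`, the helper names, and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def block_features(frames, block_size, hop):
    """Summarise frame-level vectors (e.g. MFCCs) block by block.

    Assumption: each block is described by the per-coefficient mean and
    standard deviation of its frames; the paper may use other statistics.
    """
    blocks = []
    for start in range(0, len(frames) - block_size + 1, hop):
        columns = list(zip(*frames[start:start + block_size]))
        blocks.append([fmean(c) for c in columns] + [pstdev(c) for c in columns])
    return blocks


class NearestCentroid:
    """Stand-in classifier (NOT an SVM): predicts the closest class mean."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        self.centroids = {
            label: [fmean(col) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


class HierarchicalClassifier:
    """Stage 1 picks the genre; stage 2 picks a sub-genre within that genre."""

    def fit(self, X, genres, subgenres):
        self.genre_model = NearestCentroid().fit(X, genres)
        self.sub_models = {}
        for genre in set(genres):
            rows = [(x, s) for x, g, s in zip(X, genres, subgenres) if g == genre]
            self.sub_models[genre] = NearestCentroid().fit(
                [x for x, _ in rows], [s for _, s in rows]
            )
        return self

    def predict_one(self, x):
        genre = self.genre_model.predict_one(x)
        return genre, self.sub_models[genre].predict_one(x)


def toy_data():
    """Hypothetical 2-D block features: three genres, two sub-genres each."""
    centres = {
        ("sports", "sub_a"): (0, 0), ("sports", "sub_b"): (2, 0),
        ("music", "sub_a"): (10, 0), ("music", "sub_b"): (12, 0),
        ("news", "sub_a"): (0, 10), ("news", "sub_b"): (2, 10),
    }
    offsets = [(-0.3, 0.2), (0.3, -0.2), (0.1, 0.3)]
    X, genres, subgenres = [], [], []
    for (genre, sub), (cx, cy) in centres.items():
        for dx, dy in offsets:
            X.append([cx + dx, cy + dy])
            genres.append(genre)
            subgenres.append(sub)
    return X, genres, subgenres


X, genres, subgenres = toy_data()
# Hierarchical: genre first, then a sub-genre model specific to that genre.
hierarchical = HierarchicalClassifier().fit(X, genres, subgenres)
# Flat (non-hierarchical): one model over all (genre, sub-genre) pairs.
flat = NearestCentroid().fit(X, list(zip(genres, subgenres)))
```

In this sketch both structures return a `(genre, sub_genre)` pair from `predict_one`, so they can be scored on the same held-out blocks. Replacing `NearestCentroid` with an SVM would leave the two-stage routing in `HierarchicalClassifier` unchanged.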
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Nouha Dammak and Yassine BenAyed "Hierarchical vs non-hierarchical audio indexation and classification for video genres", Proc. SPIE 10696, Tenth International Conference on Machine Vision (ICMV 2017), 1069621 (13 April 2018); https://doi.org/10.1117/12.2309852
KEYWORDS
Video
Feature extraction
Databases
Visualization
Information visualization
Fourier transforms
Classification systems