22 June 2012 Video genre categorization and representation using audio-visual information
Bogdan E. Ionescu, Christoph Rasche, Constantin Vertan, Patrick Lambert, Klaus Seyerlehner
Abstract
We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at the block level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors for genre classification. Validation was carried out on over 91 h of video footage encompassing 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. In addition, an experimental comparison conducted as part of the MediaEval 2011 benchmarking campaign demonstrated the efficiency of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus regroups them according to genre.
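The per-genre precision and recall figures and the overall correct classification rate quoted above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the genre labels, and the toy predictions below are all hypothetical, and the sketch assumes each video carries exactly one true genre and one predicted genre.

```python
from collections import Counter


def per_genre_precision_recall(true_labels, predicted_labels):
    """Return {genre: (precision, recall)} from paired label lists.

    Hypothetical helper for illustration; assumes single-label
    classification (one true and one predicted genre per video).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_pos = Counter()   # videos correctly assigned to a genre
    false_pos = Counter()  # videos wrongly assigned to a genre
    false_neg = Counter()  # videos of a genre that were missed

    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            true_pos[truth] += 1
        else:
            false_pos[pred] += 1
            false_neg[truth] += 1

    scores = {}
    for genre in sorted(set(true_labels) | set(predicted_labels)):
        n_predicted = true_pos[genre] + false_pos[genre]
        n_actual = true_pos[genre] + false_neg[genre]
        precision = true_pos[genre] / n_predicted if n_predicted else 0.0
        recall = true_pos[genre] / n_actual if n_actual else 0.0
        scores[genre] = (precision, recall)
    return scores


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of videos whose predicted genre matches the true genre."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy data with made-up genre names, not taken from the paper.
truth = ["news", "news", "sports", "sports", "music", "music"]
pred = ["news", "sports", "sports", "sports", "music", "news"]

scores = per_genre_precision_recall(truth, pred)
for genre, (precision, recall) in scores.items():
    print(f"{genre}: precision={precision:.2f} recall={recall:.2f}")
print(f"overall accuracy={overall_accuracy(truth, pred):.2f}")
```

Reporting precision and recall per genre, as the abstract does with its ranges, shows which genres are confused with each other, something a single overall rate would hide.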
© 2012 SPIE and IS&T 0091-3286/2012/$25.00
Bogdan E. Ionescu, Christoph Rasche, Constantin Vertan, Patrick Lambert, and Klaus Seyerlehner "Video genre categorization and representation using audio-visual information," Journal of Electronic Imaging 21(2), 023017 (22 June 2012). https://doi.org/10.1117/1.JEI.21.2.023017
Published: 22 June 2012
KEYWORDS
Video, Visualization, Information visualization, Binary data, Feature extraction, Visual analytics, 3D displays