22 June 2012 Video genre categorization and representation using audio-visual information
Abstract
We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at the block level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors in genre classification. Validation was carried out on over 91 h of video footage encompassing 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. In addition, an experimental comparison carried out as part of the MediaEval 2011 benchmarking campaign demonstrated the efficiency of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus regroups them according to genre.
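The evaluation measures named in the abstract (per-genre precision and recall, and overall correct classification) can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `per_genre_scores`, the genre labels, and the toy predictions are all hypothetical and chosen only to show how the ratios are formed.

```python
from collections import Counter


def per_genre_scores(true_labels, predicted_labels):
    """Return ({genre: (precision, recall)}, overall_accuracy).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_positive = Counter()    # videos correctly assigned to a genre
    predicted_count = Counter()  # videos the classifier assigned to a genre
    true_count = Counter()       # videos that actually belong to a genre

    for truth, guess in zip(true_labels, predicted_labels):
        true_count[truth] += 1
        predicted_count[guess] += 1
        if truth == guess:
            true_positive[truth] += 1

    scores = {}
    for genre in sorted(set(true_count) | set(predicted_count)):
        # Precision: fraction of videos labelled `genre` that really are.
        precision = (true_positive[genre] / predicted_count[genre]
                     if predicted_count[genre] else 0.0)
        # Recall: fraction of real `genre` videos that were found.
        recall = (true_positive[genre] / true_count[genre]
                  if true_count[genre] else 0.0)
        scores[genre] = (precision, recall)

    # Overall correct classification: share of videos labelled correctly.
    accuracy = (sum(true_positive.values()) / len(true_labels)
                if true_labels else 0.0)
    return scores, accuracy


# Toy labels, invented for this example only.
truth = ["news", "news", "sports", "music"]
guess = ["news", "sports", "sports", "music"]
scores, accuracy = per_genre_scores(truth, guess)
```

On this toy input, one of the two news videos is misassigned to sports, which lowers the recall for news and the precision for sports while leaving music unaffected.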
© 2012 SPIE and IS&T
Bogdan E. Ionescu, Christoph Rasche, Constantin Vertan, Patrick Lambert, Klaus Seyerlehner, "Video genre categorization and representation using audio-visual information," Journal of Electronic Imaging 21(2), 023017 (22 June 2012). https://doi.org/10.1117/1.JEI.21.2.023017
JOURNAL ARTICLE
18 PAGES

