7 March 2013 Sparse conditional mixture model: late fusion with missing scores for multimedia event detection
Abstract
Event detection in multimedia clips is typically handled by modeling each component modality independently, then combining the resulting detection scores in a late fusion approach. One problem with late fusion in the multimedia setting is that detection scores may be missing from one or more components for a given clip, e.g., when there is no speech in the clip or when there is no overlay text. Standard fusion techniques typically address this problem by assuming a default backoff score for a component whenever its detection score is missing for a clip. This can bias the fusion model, especially if many detections are missing from a given component. In this work, we present the Sparse Conditional Mixture Model (SCMM), which models only the observed detection scores for each example, thereby avoiding the assumptions about score distributions that backoff models make. Our experiments in multimedia event detection using the TRECVID-2011 corpus demonstrate that SCMM achieves statistically significant performance gains over standard late fusion techniques. The SCMM is very general and is applicable to fusion problems with missing data in any domain.
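The bias that a backoff score can introduce is easy to see in a toy setting. The sketch below is not the SCMM, whose formulation is not given in this abstract; it swaps in plain score averaging as the fusion rule and contrasts filling a missing score with a default against using only the observed scores. The function names, the backoff value of 0.0, and the example scores are all hypothetical.

```python
from typing import Optional, Sequence


def fuse_with_backoff(scores: Sequence[Optional[float]],
                      backoff: float = 0.0) -> float:
    """Average all component scores, substituting `backoff` for missing ones.

    Assumption: simple averaging stands in for a standard late fusion rule.
    """
    filled = [backoff if s is None else s for s in scores]
    return sum(filled) / len(filled)


def fuse_observed_only(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Average only the scores that were actually observed.

    Returns None when every component score is missing.
    This illustrates the idea of ignoring missing scores, not the SCMM itself.
    """
    observed = [s for s in scores if s is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


# Hypothetical clip: visual score, speech score (missing), overlay-text score.
clip_scores = [1.0, None, 0.5]

backoff_result = fuse_with_backoff(clip_scores)    # missing score counted as 0.0
observed_result = fuse_observed_only(clip_scores)  # missing score ignored
```

With these made-up scores, the backoff average is pulled toward the default value, while the observed-only average reflects just the two components that produced a detection.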
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ramesh Nallapati, Eric Yeh, Gregory Myers, "Sparse conditional mixture model: late fusion with missing scores for multimedia event detection", Proc. SPIE 8667, Multimedia Content and Mobile Devices, 866706 (7 March 2013); doi: 10.1117/12.2007463; https://doi.org/10.1117/12.2007463
PROCEEDINGS
7 PAGES

