Hybrid generative-discriminative human action recognition by combining spatiotemporal words with supervised topic models
1 February 2011
Abstract
We present a hybrid generative-discriminative learning method for human action recognition from video sequences. Our model combines a bag-of-words component with supervised latent topic models. A video sequence is represented as a collection of spatiotemporal words by extracting space-time interest points and describing these points using both shape and motion cues. The supervised latent Dirichlet allocation (sLDA) topic model, which employs discriminative learning on labeled data under a generative framework, is introduced to discover the latent topic structure that is most relevant to action categorization. The proposed algorithm retains most of the desirable properties of generative learning while increasing the classification performance through a discriminative setting. It has also been extended to exploit both labeled and unlabeled data to learn human actions under a unified framework. We test our algorithm on three challenging data sets: the KTH human motion data set, the Weizmann human action data set, and a ballet data set. Our results are either comparable to or significantly better than previously published results on these data sets and reflect the promise of hybrid generative-discriminative learning approaches.
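The bag-of-words representation described above can be illustrated with a minimal sketch: descriptors extracted at space-time interest points are quantized against a learned codebook, and each video becomes a normalized histogram of spatiotemporal words. The codebook and toy descriptors below are hypothetical placeholders; the paper's actual descriptors concatenate shape and motion cues, and its codebook would be learned (e.g., by clustering) from training data.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Assign each spatiotemporal descriptor to its nearest codeword (Euclidean)."""
    # pairwise distances: shape (n_descriptors, n_codewords)
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def bag_of_words(descriptors, codebook):
    """Represent one video as a normalized histogram of spatiotemporal words."""
    words = quantize(descriptors, codebook)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy example: 3 codewords in a 2-D descriptor space (real descriptors are
# much higher-dimensional and combine shape and motion information).
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [0.0, 0.8], [0.05, 0.9]])
print(bag_of_words(descriptors, codebook))  # histogram [0.25 0.25 0.5]
```

These per-video histograms are the word counts that the sLDA topic model consumes, with the action label supervising which latent topics are discovered.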
© (2011) Society of Photo-Optical Instrumentation Engineers (SPIE)
Hao Sun, Cheng Wang, and Boliang Wang, "Hybrid generative-discriminative human action recognition by combining spatiotemporal words with supervised topic models," Optical Engineering 50(2), 027203 (1 February 2011). https://doi.org/10.1117/1.3537969
JOURNAL ARTICLE
11 PAGES

