Bilayer model for cross-view human action recognition based on transfer learning
25 May 2019
Yandi Li, Xiping Xu, Jiahong Xu, Enyu Du
Abstract
In cross-view action recognition, an action representation learned in one view often fails to transfer when the feature space changes with the viewpoint. To address this problem, we propose a cross-view action recognition approach based on a bilayer discriminative model. We first extract key poses to capture the essence of each action sequence and represent each key pose by a bag of visual words (BoVW) in a single view. We then construct a bipartite graph between the heterogeneous poses and apply multipartitioning to cocluster the view-dependent visual words, yielding a cross-view bag-of-visual-words feature that is more discriminative in the presence of view changes. The novelty lies in a bilayer classifier consisting of a support vector machine (SVM) at the frame level and a hidden Markov model (HMM) at the sequence level, which compensates for the temporal information lost when a BoVW represents the whole action sequence. Finally, dynamic time warping (DTW) is used as a pruning algorithm to reduce the number of nodes searched for the Viterbi path. Extensive experiments are performed on two well-known multiview action datasets, IXMAS and N-UCLA, and a detailed performance comparison with existing view-invariant action recognition techniques indicates that the proposed method performs well in both accuracy and efficiency.
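As a rough illustration of the DTW step mentioned in the abstract, the following minimal sketch computes a standard DTW distance between two sequences of per-frame feature vectors and uses it to shortlist candidate action templates. It is not the authors' implementation: the abstract does not specify how DTW prunes the Viterbi search, so the function names `dtw_distance` and `prune_candidates`, the Euclidean frame distance, and the "keep the closest templates" rule are all assumptions made here for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping distance between two sequences.

    Each sequence is a list of equal-length feature vectors (tuples or lists).
    The Euclidean frame-to-frame distance is an assumption of this sketch.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal accumulated cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def prune_candidates(query, templates, keep):
    """Hypothetical pruning rule: keep the `keep` templates closest to the query.

    `templates` maps an action label to a reference sequence. A later,
    more expensive decoder (for example Viterbi over an HMM) would then
    only need to consider the returned labels.
    """
    ranked = sorted(templates, key=lambda name: dtw_distance(query, templates[name]))
    return ranked[:keep]


# Toy one-dimensional features; the labels and values are made up.
query = [(0.0,), (1.0,), (2.0,)]
templates = {
    "wave": [(0.0,), (1.0,), (1.0,), (2.0,)],
    "kick": [(5.0,), (5.0,)],
    "sit": [(0.0,), (2.0,), (2.0,)],
}
shortlist = prune_candidates(query, templates, keep=2)
```

In this toy example the query aligns exactly with the "wave" template despite the repeated frame, which is the property that makes DTW a plausible cheap filter before a sequence-level decoder.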
© 2019 SPIE and IS&T 1017-9909/2019/$25.00
Yandi Li, Xiping Xu, Jiahong Xu, and Enyu Du "Bilayer model for cross-view human action recognition based on transfer learning," Journal of Electronic Imaging 28(3), 033016 (25 May 2019). https://doi.org/10.1117/1.JEI.28.3.033016
Received: 20 November 2018; Accepted: 25 April 2019; Published: 25 May 2019
KEYWORDS
3D modeling
Visualization
Data modeling
Video
Detection and tracking algorithms
Target recognition
Cameras
