Spatiotemporal visual-semantic embedding network for zero-shot action recognition
Rongqiao An, Zhenjiang Miao, Qingyu Li, Wanru Xu, Qiang Zhang
Abstract
Zero-shot learning (ZSL) has recently attracted increasing attention in visual tasks such as action recognition. We propose a spatiotemporal visual-semantic embedding network (STVSEM) for zero-shot action recognition. First, motivated by the strong results of two-stream action recognition architectures in recent years, we incorporate a two-stream module into our network, using both spatial features (e.g., RGB appearance) and temporal optical-flow features as visual representations to significantly improve visual expressiveness. Second, to alleviate the semantic loss that typically occurs in embedding-based ZSL methods, an autoencoder is introduced to learn a better semantic representation and to transfer semantic relationship information from seen classes to unseen classes. Finally, a joint embedding mechanism that explores and exploits the relationships between visual data and semantic information in an intermediate space is employed to bridge the gap between vision and semantics. Experimental results on the Charades and UCF101 datasets show that the proposed method outperforms state-of-the-art methods in accuracy, further demonstrating its effectiveness.
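The pipeline described above can be sketched in a minimal, illustrative form. This is not the authors' implementation; all dimensions, the tied-weight autoencoder, the linear projections, and the cosine-similarity classifier are assumptions chosen to make the three stages (two-stream visual features, semantic autoencoder, joint embedding) concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 2048-d RGB and 2048-d flow features,
# 300-d class word vectors, 128-d joint embedding space.
d_rgb, d_flow, d_sem, d_joint = 2048, 2048, 300, 128

# Stage 1 -- two-stream visual feature: concatenate the spatial (RGB
# appearance) stream with the temporal (optical flow) stream.
v = np.concatenate([rng.standard_normal(d_rgb), rng.standard_normal(d_flow)])

# Stage 2 -- semantic autoencoder: encode class word vectors and decode
# them back, so the code retains semantic relations that transfer from
# seen to unseen classes (tied weights, as in SAE-style autoencoders).
W_enc = rng.standard_normal((d_sem, d_joint)) * 0.01
W_dec = W_enc.T
class_sem = rng.standard_normal((5, d_sem))   # 5 hypothetical class vectors
class_codes = class_sem @ W_enc               # encoded semantics
recon = class_codes @ W_dec                   # reconstruction term in the loss

# Stage 3 -- joint embedding: project the visual feature into the same
# intermediate space and classify by nearest class code (cosine similarity).
W_vis = rng.standard_normal((d_rgb + d_flow, d_joint)) * 0.01
z = v @ W_vis
sims = (class_codes @ z) / (
    np.linalg.norm(class_codes, axis=1) * np.linalg.norm(z) + 1e-8
)
pred = int(np.argmax(sims))
```

In the actual method, the projections would be learned networks trained jointly with a reconstruction loss on the semantics; the sketch only shows how an unseen-class label can be predicted once visual and semantic representations share one space.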
© 2019 SPIE and IS&T 1017-9909/2019/$25.00
Rongqiao An, Zhenjiang Miao, Qingyu Li, Wanru Xu, and Qiang Zhang "Spatiotemporal visual-semantic embedding network for zero-shot action recognition," Journal of Electronic Imaging 28(2), 023007 (8 March 2019). https://doi.org/10.1117/1.JEI.28.2.023007
Received: 10 November 2018; Accepted: 15 February 2019; Published: 8 March 2019
Cited by 3 scholarly publications.
KEYWORDS: Visualization, RGB color model, Visual process modeling, Data modeling, Digital signal processing, Classification systems, Video

