7 March 2019
Two-stream siamese network with contrastive-center losses for RGB-D action recognition
Chunxiao Fan, Zhengyuan Zhai, Yue Ming, Lei Tian
Abstract
Many fusion methods have been developed to improve the performance of action recognition with RGB and depth data, but learning a joint representation of these heterogeneous modalities with a single network has received little attention. We present an associated representation method for RGB-D action recognition that uses a siamese network with contrastive-center losses. First, some samples of each class and modality are selected as references to construct positive and negative pairs. Each positive pair consists of a training sample and the reference of its own class, whereas each negative pair involves a reference from a different class. These pairs are then fed to a two-stream siamese network to learn a collaborative representation of RGB and depth data. Two ranking losses, namely the intramodal and the cross-modal contrastive-center loss, are developed to impose a similarity/dissimilarity metric on those pairs. Specifically, the intramodal contrastive-center loss measures the relationship between samples and references within the RGB or the depth modality. The cross-modal contrastive-center loss measures the relationship between visual and depth features in the same low-dimensional space. Finally, the ranking losses and a softmax loss are jointly optimized for action recognition. The proposed method is evaluated on two large action datasets, LAP IsoGD and NTU RGB+D, and on a smaller dataset, Sheffield Kinect gesture. The experimental results demonstrate that the proposed method outperforms most state-of-the-art methods.
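The pairing scheme described in the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the contrastive-center loss, so the code below substitutes a generic margin-based contrastive loss as an assumption; the function names, the `margin` value, the class labels, and the 2-D embeddings are all hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_pair_loss(feature, reference, is_positive, margin=1.0):
    """Generic margin-based contrastive loss (assumed stand-in, not the paper's formula).

    Positive pairs are pulled together; negative pairs are pushed apart
    until their distance is at least `margin`.
    """
    d = euclidean(feature, reference)
    if is_positive:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def sample_loss(feature, label, references, margin=1.0):
    """Sum the pair losses between one sample and every class reference.

    The reference of the sample's own class forms the positive pair;
    references of all other classes form negative pairs.
    """
    return sum(
        contrastive_pair_loss(feature, ref, ref_label == label, margin)
        for ref_label, ref in references.items()
    )


# Hypothetical class references in a shared 2-D embedding space.
rgb_refs = {"wave": [0.0, 0.0], "clap": [2.0, 0.0]}
depth_refs = {"wave": [0.1, 0.0], "clap": [2.0, 0.1]}

# Hypothetical embedding of one RGB training sample labelled "wave".
rgb_feature = [0.0, 0.5]

# Intramodal term: RGB sample against RGB references.
intramodal = sample_loss(rgb_feature, "wave", rgb_refs)
# Cross-modal term: RGB sample against depth references in the same space.
cross_modal = sample_loss(rgb_feature, "wave", depth_refs)
```

In a full training setup these two terms would be combined with a softmax classification loss, as the abstract states; the weighting between them is not specified there and is left out of this sketch.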
© 2019 SPIE and IS&T 1017-9909/2019/$25.00
Chunxiao Fan, Zhengyuan Zhai, Yue Ming, and Lei Tian "Two-stream siamese network with contrastive-center losses for RGB-D action recognition," Journal of Electronic Imaging 28(2), 023004 (7 March 2019). https://doi.org/10.1117/1.JEI.28.2.023004
Received: 21 June 2018; Accepted: 11 February 2019; Published: 7 March 2019
KEYWORDS
RGB color model
Data modeling
Convolution
Network architectures
Distance measurement
Neural networks
Data fusion