18 July 2016 Multiview fusion for activity recognition using deep neural networks
Abstract
Convolutional neural networks (ConvNets) coupled with long short-term memory (LSTM) networks have recently been shown to be effective for video classification, as they combine the automatic feature extraction capabilities of a neural network with additional memory in the temporal domain. This paper shows how multiview fusion can be applied to such a ConvNet LSTM architecture. Two different fusion techniques are presented. The system is first evaluated in the context of a driver activity recognition system using data collected in a multicamera driving simulator. The results show a significant improvement in accuracy with multiview fusion and also show that deep learning performs better than a traditional approach using spatiotemporal features, even without requiring any background subtraction. The system is also validated on a second, publicly available multiview action recognition dataset that has 12 action classes and 8 camera views.
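The abstract does not name the two fusion techniques, so the following is only a generic illustration of what multiview fusion can look like, not the authors' method. It sketches score-level (late) fusion, assuming each camera view has already been passed through its own ConvNet LSTM to produce a probability vector over the same action classes. The function name `fuse_views` and all scores are hypothetical.

```python
# Illustrative score-level (late) fusion across camera views.
# Assumption: each view's network has already produced a probability
# vector over the same action classes. The numbers below are made up
# and this is not the fusion scheme described in the paper.

def fuse_views(view_scores):
    """Average per-class probabilities across views.

    Returns the fused probability vector and the index of the
    highest-scoring class.
    """
    if not view_scores:
        raise ValueError("need at least one view")
    num_classes = len(view_scores[0])
    if any(len(scores) != num_classes for scores in view_scores):
        raise ValueError("all views must score the same classes")

    num_views = len(view_scores)
    fused = [
        sum(scores[c] for scores in view_scores) / num_views
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=lambda c: fused[c])
    return fused, best


# Hypothetical scores for 3 action classes seen from 2 camera views.
scores = [
    [0.6, 0.3, 0.1],  # view 1 favors class 0
    [0.2, 0.7, 0.1],  # view 2 favors class 1
]
fused, predicted = fuse_views(scores)
```

Averaging is the simplest choice here; a weighted average or a learned fusion layer over per-view features would be other plausible options under the same assumptions.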
© 2016 SPIE and IS&T
Rahul Kavi, Vinod Kulathumani, Fnu Rohit, Vlad Kecojevic, "Multiview fusion for activity recognition using deep neural networks," Journal of Electronic Imaging 25(4), 043010 (18 July 2016). https://doi.org/10.1117/1.JEI.25.4.043010
JOURNAL ARTICLE
8 PAGES

