4 March 2022
Two-stream deep representation for human action recognition
Najla Bouarada Ghrab, Emna Fendri, Mohamed Hammami
Proceedings Volume 12084, Fourteenth International Conference on Machine Vision (ICMV 2021); 1208410 (2022) https://doi.org/10.1117/12.2623121
Event: Fourteenth International Conference on Machine Vision (ICMV 2021), 2021, Rome, Italy
Abstract
Human action recognition has received a lot of attention in the computer vision community because of its relevance to many real-world applications. In this paper, we propose a new method for human action recognition based on deep learning. The main contribution of the proposed method is an efficient combination of two Convolutional Neural Networks. The two-stream framework makes full use of the rich multimodal information in videos: we exploit the complementarity between appearance information and motion information to represent human actions. Specifically, a spatial Convolutional Neural Network is applied to individual still images to model spatial information. To exploit motion between frames, a second Convolutional Neural Network is applied to accumulated optical flow images, obtained by stacking the optical flow estimates between consecutive frames into a single image. The scores of the two Convolutional Neural Networks are then fused to determine the action class. To assess the performance of our method, we trained and evaluated our architecture on a standard human action benchmark, the Weizmann dataset.
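The two steps named in the abstract, accumulating optical flow into a single image and fusing the scores of the two streams, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the accumulation rule or the fusion rule, so element-wise summation of flow fields and a weighted average of softmax probabilities are assumptions made here, and the names `accumulate_flow`, `fuse_scores`, and `spatial_weight` are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def accumulate_flow(flows):
    """Combine per-pixel (dx, dy) flow fields from consecutive frame pairs.

    Assumption: accumulation is an element-wise sum over all flow fields;
    the paper may use a different stacking scheme.
    Each flow field is a nested list indexed as flow[y][x] -> (dx, dy).
    """
    height, width = len(flows[0]), len(flows[0][0])
    acc = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for flow in flows:
        for y in range(height):
            for x in range(width):
                ax, ay = acc[y][x]
                dx, dy = flow[y][x]
                acc[y][x] = (ax + dx, ay + dy)
    return acc


def fuse_scores(spatial_scores, temporal_scores, spatial_weight=0.5):
    """Late fusion of the two streams' class scores.

    Assumption: fusion is a weighted average of softmax probabilities;
    spatial_weight is a hypothetical parameter, not taken from the paper.
    Returns the predicted class index and the fused probabilities.
    """
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    fused = [
        spatial_weight * a + (1.0 - spatial_weight) * b
        for a, b in zip(p_spatial, p_temporal)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return predicted, fused
```

In this sketch the spatial stream would supply `spatial_scores` from a single frame and the temporal stream would supply `temporal_scores` from the accumulated flow image. Equal weighting is only a neutral default.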
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Najla Bouarada Ghrab, Emna Fendri, and Mohamed Hammami "Two-stream deep representation for human action recognition", Proc. SPIE 12084, Fourteenth International Conference on Machine Vision (ICMV 2021), 1208410 (4 March 2022); https://doi.org/10.1117/12.2623121
KEYWORDS
Video
Convolutional neural networks
Optical flow
Feature extraction
Video surveillance
Image fusion
Motion models