19 October 2023
Motion recognition of two-stream based on multi-attention
Bo He, Tianqing Zhang, Minghua Liu, HongBo Shao
Proceedings Volume 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023); 1270913 (2023) https://doi.org/10.1117/12.2684904
Event: Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 2023, Nanjing, China
Abstract
The traditional two-stream network has two limitations: it fuses spatial and temporal information ineffectively, and the temporal information it extracts struggles to capture complex motion patterns. Both lead to low recognition accuracy. To address them, an improved two-stream convolutional neural network method for human motion recognition is proposed, based on a multi-attention mechanism. First, in the temporal network, a C3D network replaces the original two-dimensional network so that temporal information can be extracted effectively. Second, in the spatial network, a multi-scale convolutional Transformer encoder is used with encoding based on contextual relative position, and features at different scales are integrated adaptively by an adaptive scale attention mechanism. Experiments on the UCF101 dataset indicate that the proposed approach performs better in terms of both the number of parameters and recognition accuracy.
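The abstract does not specify how the adaptive scale attention mechanism is computed. As a rough illustration of the general idea only, the sketch below fuses feature vectors from several scales with softmax attention weights. The scoring vector `scale_weights`, the function names, and the toy numbers are all hypothetical and are not taken from the paper; in a trained network the scoring parameters would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(features, scale_weights):
    """Fuse per-scale feature vectors with softmax attention over scales.

    features:      list of S feature vectors, each of the same length D
    scale_weights: hypothetical scoring vector of length D (an assumption,
                   standing in for whatever the network would learn)
    Returns (fused_vector, attention_weights).
    """
    # One scalar score per scale: dot product with the scoring vector.
    scores = [sum(f * w for f, w in zip(feat, scale_weights))
              for feat in features]
    attn = softmax(scores)
    dim = len(features[0])
    # Attention-weighted sum of the per-scale features, dimension by dimension.
    fused = [sum(attn[s] * features[s][d] for s in range(len(features)))
             for d in range(dim)]
    return fused, attn


# Toy input: three scales, three feature dimensions each.
features = [[1.0, 0.0, 2.0], [0.5, 1.5, 0.0], [2.0, 2.0, 1.0]]
scale_weights = [0.2, -0.1, 0.4]
fused, attn = fuse_scales(features, scale_weights)
```

Because the weights sum to one, the fused vector is a convex combination of the per-scale features, so scales with higher scores contribute more without any scale being discarded outright.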
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Bo He, Tianqing Zhang, Minghua Liu, and HongBo Shao "Motion recognition of two-stream based on multi-attention", Proc. SPIE 12709, Fourth International Conference on Artificial Intelligence and Electromechanical Automation (AIEA 2023), 1270913 (19 October 2023); https://doi.org/10.1117/12.2684904
Keywords
Convolution, Feature extraction, Transformers, Action recognition, Detection and tracking algorithms, Motion models, Video