Long-distance gesture detection based on deep learning for 3D spatial interaction
24 November 2021
Proceedings Volume 12066, AOPC 2021: Micro-optics and MOEMS; 1206618 (2021) https://doi.org/10.1117/12.2606391
Event: Applied Optics and Photonics China 2021, Beijing, China
Abstract
With the development of stereoscopic display technology, large immersive 3D spaces can be reconstructed more and more easily, and the need for a highly efficient 3D spatial interaction method is becoming increasingly urgent. Gesture interaction, as the most natural and efficient way of human-computer interaction, can convey information very quickly. However, the effective interaction distance of most existing gesture interaction methods is less than one meter, which cannot meet the demands of long-distance 3D spatial interaction. In this paper, an efficient network named Gesture YOLO is proposed for long-distance gesture detection, with the aim of detecting small gesture objects with improved accuracy. Gesture YOLO contains two modules: the Dual CSPDarknet53-tiny Backbone module, which fuses person features and gesture features, and the Progressive Multi-Scale Feature Fusion module, which enhances the output features. Experimental results on our test set show that Gesture YOLO achieves higher gesture detection accuracy than YOLOv4-tiny at distances ranging from 2 m to 5 m, and mitigates the significant drop in gesture detection accuracy that occurs as the distance increases.
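The abstract names the two modules but does not describe their internals. As a rough illustration of the general idea only, the sketch below fuses two hypothetical feature branches by channel concatenation and then merges scales coarse to fine with nearest-neighbour upsampling. Every function name, tensor size, and fusion operator here is an assumption chosen for illustration; none of it is taken from the paper, and the actual Gesture YOLO layers may differ substantially.

```python
# Illustrative sketch only: the layout below is an assumption, not the paper's architecture.

def make_map(channels, height, width, value):
    """Build a feature map as a list of channels, each a height x width grid."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]

def shape(fmap):
    """Return (channels, height, width) of a feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

def concat_channels(a, b):
    """Fuse two feature maps of equal spatial size by stacking their channels."""
    if shape(a)[1:] != shape(b)[1:]:
        raise ValueError("spatial sizes must match before fusion")
    return a + b

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of every channel."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out

def progressive_fusion(pyramid):
    """Fuse a coarse-to-fine list of feature maps one scale at a time."""
    fused = pyramid[0]
    for finer in pyramid[1:]:
        fused = concat_channels(upsample2x(fused), finer)
    return fused

# Hypothetical per-branch features at two scales (coarse 2x2, fine 4x4).
person_coarse = make_map(2, 2, 2, 1.0)
gesture_coarse = make_map(2, 2, 2, 2.0)
person_fine = make_map(1, 4, 4, 3.0)
gesture_fine = make_map(1, 4, 4, 4.0)

coarse = concat_channels(person_coarse, gesture_coarse)  # 4 channels, 2x2
fine = concat_channels(person_fine, gesture_fine)        # 2 channels, 4x4
fused = progressive_fusion([coarse, fine])               # 6 channels, 4x4
print(shape(fused))
```

Concatenation keeps the person and gesture channels separate so that later layers can weight them independently; an element-wise sum would be an equally plausible choice under these assumptions.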
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xiang Chen, Xinzhu Sang, Duo Chen, Peng Wang, Binbin Yan, Shuo Chen, and Zeyuan Yang "Long-distance gesture detection based on deep learning for 3D spatial interaction", Proc. SPIE 12066, AOPC 2021: Micro-optics and MOEMS, 1206618 (24 November 2021); https://doi.org/10.1117/12.2606391
KEYWORDS
3D displays
Feature extraction
Visualization
Large screens
3D image processing
Computer vision technology
Machine vision