13 May 2016 Video-based convolutional neural networks for activity recognition from robot-centric videos
In this evaluation paper, we discuss convolutional neural network (CNN)-based approaches for human activity recognition. In particular, we investigate CNN architectures designed to capture temporal information in videos and their application to the human activity recognition problem. Multiple previous works have used CNN features for videos. These include CNNs using 3-D XYT convolutional filters, CNNs using pooling operations on top of per-frame image-based CNN descriptors, and recurrent neural networks that learn temporal changes in per-frame CNN descriptors. We experimentally compare some of these representative CNNs using first-person human activity videos. We focus especially on videos from a robot's viewpoint, captured during its operations and human-robot interactions.
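As a rough illustration of the second family mentioned above (pooling on top of per-frame CNN descriptors), the following minimal sketch collapses a sequence of per-frame feature vectors into a single video-level descriptor. It is not taken from the paper: the function name `temporal_pool`, the `mode` parameter, and the toy descriptor values are all assumptions made for illustration, and the per-frame descriptors are assumed to have been extracted already by some image-based CNN.

```python
def temporal_pool(frame_descriptors, mode="max"):
    """Pool T per-frame descriptors (each of length D) into one length-D vector.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if not frame_descriptors:
        raise ValueError("need at least one frame descriptor")
    # Transpose so that each tuple holds one feature dimension across all frames.
    columns = list(zip(*frame_descriptors))
    if mode == "max":
        return [max(c) for c in columns]
    if mode == "mean":
        return [sum(c) / len(c) for c in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy per-frame descriptors (2 frames, 3 dimensions); the values are made up.
frames = [
    [0.0, 2.0, 1.0],
    [1.0, 0.0, 3.0],
]
video_max = temporal_pool(frames, "max")
video_mean = temporal_pool(frames, "mean")
```

The pooled vector could then be passed to any standard classifier. Note that max or mean pooling discards frame order, which is one reason 3-D XYT filters and recurrent networks are considered as alternatives.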
© (2016) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
M. S. Ryoo, Larry Matthies, "Video-based convolutional neural networks for activity recognition from robot-centric videos", Proc. SPIE 9837, Unmanned Systems Technology XVIII, 98370R (13 May 2016); doi: 10.1117/12.2229531; https://doi.org/10.1117/12.2229531
