Poster + Paper
Video-based complex human event recognition with a probabilistic transformer
15 June 2023
Conference Poster
Abstract
Complex human events are high-level human activities composed of a set of primitive human actions that interact over time. Complex human event recognition is important for many applications, including security surveillance, healthcare, sports, and games. It requires recognizing not only the constituent primitive actions but also, more importantly, their long-range spatiotemporal interactions. To meet this requirement, we exploit the self-attention mechanism of the Transformer to model and capture the long-range interactions among primitive actions. We further extend the conventional Transformer to a probabilistic Transformer in order to quantify the event recognition confidence and to detect anomalous events. Specifically, given a sequence of 3D human skeletons, the proposed model first performs primitive action localization and recognition. The recognized primitive actions and their features are then fed into the probabilistic Transformer for complex human event recognition. By using a probabilistic attention score, the probabilistic Transformer can not only recognize complex events but also quantify its prediction uncertainty. Using the prediction uncertainty, we further propose to detect anomalous events in an unsupervised manner. We evaluate the proposed probabilistic Transformer on the FineDiving dataset and the Olympics Sports dataset for both complex event recognition and abnormal event detection. The FineDiving dataset consists of complex events composed of primitive diving actions. The experimental results demonstrate the effectiveness of our method and its superiority over baseline methods.
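The abstract does not spell out how the probabilistic attention score yields an uncertainty estimate, so the following is only a minimal sketch of one common approach: Monte Carlo sampling of the attention scores followed by predictive entropy. The Gaussian form of the scores, the per-action class logits, the sample count, and the anomaly threshold are all illustrative assumptions, and none of the function names come from the paper.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sample_attention(score_mean, score_std, rng):
    """Draw one set of attention weights from Gaussian scores (assumed form)."""
    noisy = [rng.gauss(mu, sd) for mu, sd in zip(score_mean, score_std)]
    return softmax(noisy)


def predict_with_uncertainty(score_mean, score_std, action_logits,
                             n_samples=200, seed=0):
    """Average event-class probabilities over sampled attention weights.

    action_logits[i][c] is a hypothetical vote of primitive action i
    for event class c. Returns (mean class probabilities, entropy).
    """
    rng = random.Random(seed)
    n_classes = len(action_logits[0])
    mean_probs = [0.0] * n_classes
    for _ in range(n_samples):
        weights = sample_attention(score_mean, score_std, rng)
        # Attention-weighted pooling of the per-action class logits.
        pooled = [sum(w * logits[c] for w, logits in zip(weights, action_logits))
                  for c in range(n_classes)]
        probs = softmax(pooled)
        mean_probs = [m + p / n_samples for m, p in zip(mean_probs, probs)]
    return mean_probs, entropy(mean_probs)


def is_anomalous(uncertainty, threshold):
    """Flag an event whose predictive entropy exceeds a chosen threshold."""
    return uncertainty > threshold
```

In this sketch, an event whose sampled attention weights disagree about which primitive actions matter produces a flatter averaged class distribution, and therefore a higher entropy, than an event whose attention is stable. Thresholding that entropy requires no anomaly labels, which is one way to read "unsupervised" here; the threshold itself would have to be chosen from data.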
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Hongji Guo, Alexander Aved, Collen Roller, Erika Ardiles-Cruz, and Qiang Ji "Video-based complex human event recognition with a probabilistic transformer", Proc. SPIE 12525, Geospatial Informatics XIII, 125250J (15 June 2023); https://doi.org/10.1117/12.2664106
KEYWORDS
Transformers
Action recognition
Video
Data modeling
Performance modeling
Machine learning
Statistical modeling