27 April 2018 Going deeper with CNN in malicious crowd event classification
Terror attacks often target civilians gathered in one location (e.g., the Boston Marathon bombing). Distinguishing such 'malicious' scenes from 'normal' ones, which are semantically different, is a difficult task because both kinds of scene contain large groups of people and are visually very similar. To overcome this difficulty, previous methods exploited various kinds of contextual information, such as language-driven keywords or relevant objects. Although useful, these methods require additional human effort or datasets. In this paper, we show that more sophisticated and deeper Convolutional Neural Networks (CNNs) can achieve better classification accuracy without using any additional information from outside the image domain. We conducted a comparative study in which we trained and compared seven different CNN architectures (AlexNet, VGG-M, VGG16, GoogLeNet, ResNet-50, ResNet-101, and ResNet-152). Based on the experimental analyses, we found that deeper networks typically show better accuracy, and that GoogLeNet is the most favorable of the seven architectures for the task of malicious event classification.
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Sungmin Eum, Hyungtae Lee, and Heesung Kwon "Going deeper with CNN in malicious crowd event classification", Proc. SPIE 10646, Signal Processing, Sensor/Information Fusion, and Target Recognition XXVII, 1064616 (27 April 2018);
