Perimeter monitoring systems have become one of the most researched topics in recent years. Owing to the increasing demand for multiple sensor modalities, the data to be processed is becoming high dimensional, and such representations are often too complex to visualize and interpret. In this paper, we investigate the use of feature selection and dimensionality reduction strategies for the classification of targets using seismic and acoustic signatures. A time-slice classification approach with 43 features extracted from multi-domain transformations has been evaluated on the SITEX02 military vehicle dataset, which consists of tracked AAV and wheeled DW vehicles. For acoustic signals, SVM-RBF resulted in an accuracy of 93.4%, and for seismic signals, an ensemble of decision trees with bagging resulted in an accuracy of 90.6%. Further, principal component analysis (PCA) and neighborhood component analysis (NCA) based feature selection approaches have been applied to the extracted features. The NCA based approach retained only 20 features and obtained classification accuracies of ~94.7% for acoustic and ~90.5% for seismic signals. An increase of ~2% to 4% is observed for NCA when compared to the PCA based feature transformation approach. Fusing the posterior probabilities of the individual seismic and acoustic classifiers further increases the classification accuracy to 97.7%. In addition, the PCA and NCA based feature optimization strategies have been compared and validated on CSIO experimental datasets comprising moving civilian vehicles and anthropogenic activities.
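The posterior-level fusion mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule was used, so the product and mean rules below are assumptions, and the function name `fuse_posteriors` and the toy probabilities are hypothetical.

```python
def fuse_posteriors(p_acoustic, p_seismic, rule="product"):
    """Fuse two per-class posterior vectors into one normalized vector.

    The fusion rule is an assumption; the abstract only says that the
    posteriors of the two classifiers are fused.
    """
    if len(p_acoustic) != len(p_seismic):
        raise ValueError("posterior vectors must have the same length")
    if rule == "product":
        fused = [a * s for a, s in zip(p_acoustic, p_seismic)]
    elif rule == "mean":
        fused = [(a + s) / 2.0 for a, s in zip(p_acoustic, p_seismic)]
    else:
        raise ValueError("unknown fusion rule: " + rule)
    total = sum(fused)
    if total == 0:
        raise ValueError("fused scores sum to zero")
    # Renormalize so the fused scores again sum to one.
    return [f / total for f in fused]


CLASSES = ("AAV", "DW")            # class names taken from the abstract
p_acoustic = [0.60, 0.40]          # hypothetical acoustic posteriors
p_seismic = [0.30, 0.70]           # hypothetical seismic posteriors

fused = fuse_posteriors(p_acoustic, p_seismic)
label = CLASSES[max(range(len(fused)), key=fused.__getitem__)]
```

In this toy case the two classifiers disagree, and the more confident seismic posterior decides the fused label.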
Surveillance applications demand round-the-clock monitoring of regions in constrained illumination conditions. Thermal infrared cameras, which capture the heat emitted by the objects present in the scene, appear to be a suitable sensor technology for such applications. However, developing AI techniques for automatic detection of targets in monitoring applications is challenging due to high variability of targets within a class, variations in the pose of targets, widely varying environmental conditions, etc. This paper presents a real-time framework to detect and classify targets in a forest landscape. The system comprises two main stages: moving target detection and classification of the detected targets. For the first stage, Mixture of Gaussians (MoG) background subtraction is used to detect the Region of Interest (ROI) in individual frames of the IR video sequence. For the second stage, a pre-trained Deep Convolutional Neural Network with additional custom layers has been used for feature extraction and classification. A challenging thermal dataset was created using both experimentally generated thermal infrared images and images from the publicly available FLIR Thermal Dataset. This dataset is used for training and validating the proposed deep learning framework. The model demonstrated a preliminary testing accuracy of 95%. The framework is deployed in real time on an embedded platform with an 8-core ARM v8.2 64-bit CPU and a 512-core Volta GPU with Tensor Cores. The moving target detection and recognition framework achieved a frame rate of approximately 23 fps on this embedded computing platform, making it suitable for deployment in resource constrained environments.
Human action recognition in indoor environments can prove crucial in avoiding serious accidents or damage. The application domain spans from monitoring the actions of solitary elders or persons with disabilities to monitoring persons working alone in a chamber or in an isolated industrial environment. These scenarios demand automatic, near real-time activity recognition and alerts to save life and assets. Considering that the sensing modality should be capable of working round the clock in a non-intrusive manner, we have opted for a thermal infrared camera, which captures the heat emitted by objects in the scene and generates an image. Motivated by the recent success of convolutional neural networks (CNN) for human action recognition in IR images, we extend this work by incorporating one additional dimension, i.e. the temporal information. We have designed and implemented a 3D-CNN for learning the spatial as well as the sequential features in thermal IR videos. Eight action classes are considered: Walking, Standing, Falling, Lying, Sitting, Falling from chair, Sitting up (recovering from a fall from sitting posture) and Getting up (recovering from a fall from lying posture). To evaluate the proposed framework, infrared (IR) videos of different actions were generated in three diverse home environments: inside a study room, inside a bedroom and in the garden. The dataset comprised 2641 and 894 IR videos for training and testing respectively, each of half a second duration, performed by more than 50 volunteers. The 3D-CNN comprises two blocks, each with two convolution layers and one max-pool layer, and automatically constructs features from raw data, incorporating both spatial and temporal information to learn actions. Network parameters are learned using the back-propagation algorithm, and the learning is supervised. Experimental results indicate that the proposed spatio-temporal deep learning architecture achieves 85% classification accuracy on the 894 complex test videos of the IR action dataset.
Automatic vehicle type classification plays a significant role in security, traffic control and autonomous driving applications. Thermal infrared (IR) cameras, which operate even in complete darkness and adverse weather conditions, emerge as a potential sensing modality for such challenging outdoor applications. However, automated vehicle type classification in infrared imagery still poses significant challenges, because the high variability of vehicle signatures in the infrared band leads to high intra-class variation and low inter-class variation. To address these issues, we demonstrate the use of local features represented in a bag of words framework. In this work, we present a comparative analysis of two feature detectors: MSER, a sparse region based detector, and uniform dense sampling of points in the image across multiple scales (termed dense). A bag of features (BoF) framework based on the SURF feature descriptor and an SVM classifier for vehicle type classification is evaluated on a thermal infrared (TIR) vehicle dataset. A number of variations are present in the TIR vehicle dataset: scale variation, pose variation and partial visibility of vehicles captured under varied environmental conditions. The dataset contains four vehicle categories commonly plying on Indian roads: Bike, Autorickshaw, Car and Heavy vehicle. The performance of the designed vehicle type classification framework was evaluated using classification accuracy and the confusion matrix as performance metrics. The optimized sparse MSER and dense BoF frameworks demonstrated classification accuracies of 85.7% and 93% respectively for automatic vehicle type classification on the thermal infrared vehicle dataset.
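The encoding step of a bag of features framework can be sketched as follows: each local descriptor is assigned to its nearest visual word, and the image is represented by the normalized word histogram that would then be passed to a classifier. This is a minimal illustration and not the authors' implementation; the function names, the tiny 2-D descriptors and the three-word codebook are hypothetical stand-ins (real SURF descriptors are 64- or 128-dimensional, and a codebook is normally learned by clustering training descriptors).

```python
def nearest_word(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        # Squared Euclidean distance is enough for picking the nearest word.
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bof_histogram(descriptors, codebook):
    """Encode a set of local descriptors as a normalized word histogram."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical toy data: a 3-word codebook and 4 descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]

hist = bof_histogram(descriptors, codebook)
```

Sparse (MSER) and dense sampling differ only in where the descriptors are computed; the histogram encoding shown here is the same for both.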
Proc. SPIE. 9844, Automatic Target Recognition XXVI
KEYWORDS: Signal to noise ratio, Detection and tracking algorithms, Sensors, Error analysis, Fourier transforms, Interference (communication), Data acquisition, Data processing, Chemical elements, Acoustics
In this work, array processing techniques based on subspace decomposition of the signal have been evaluated for estimating the direction of arrival (DOA) of moving targets using acoustic signatures. Three subspace based approaches are considered: Incoherent Wideband Multiple Signal Classification (IWM), Least Squares Estimation of Signal Parameters via Rotational Invariance Techniques (LS-ESPRIT) and Total Least Squares ESPRIT (TLS-ESPRIT). Their performance is compared with conventional time delay estimation (TDE) approaches such as Generalized Cross Correlation (GCC) and Average Square Difference Function (ASDF). Performance evaluation has been conducted on experimentally generated data consisting of acoustic signatures of four different types of civilian vehicles moving along defined geometrical trajectories. The mean absolute error and standard deviation of the DOA estimates with respect to ground truth are used as performance evaluation metrics. Lower mean error values confirm the superiority of subspace based approaches over TDE based techniques. Among the compared methods, LS-ESPRIT performed best.
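The conventional TDE baseline can be illustrated with a minimal two-microphone sketch: the delay is taken as the lag that maximizes the plain cross-correlation (the unweighted special case of GCC), and the angle follows from the far-field relation sin(theta) = c * tau / d, measured from broadside. This is not the authors' implementation and does not cover the subspace methods; the function names, sampling rate, microphone spacing and test signal are assumptions chosen for illustration.

```python
import math


def estimate_delay_samples(x, y, max_lag):
    """Lag (in samples) maximizing sum of x[n] * y[n + lag].

    A positive result means y is a delayed copy of x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                score += x[n] * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def doa_degrees(delay_samples, fs, mic_spacing, speed_of_sound=343.0):
    """Far-field angle from broadside for a two-microphone pair."""
    s = speed_of_sound * (delay_samples / fs) / mic_spacing
    s = max(-1.0, min(1.0, s))  # guard against |s| > 1 from noisy delays
    return math.degrees(math.asin(s))


# Hypothetical test signal: a short burst, and a copy delayed by 3 samples.
x = [math.sin(0.3 * n) * math.exp(-((n - 30) / 8.0) ** 2) for n in range(64)]
y = [0.0, 0.0, 0.0] + x[:-3]

lag = estimate_delay_samples(x, y, max_lag=10)
angle = doa_degrees(lag, fs=8000.0, mic_spacing=0.5)  # assumed fs and spacing
```

With these assumed values the recovered delay of 3 samples corresponds to an angle of roughly 15 degrees from broadside.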