In the field of security and defense, it is extremely important to reliably detect moving objects, such as cars, ships, drones and missiles. Detection and analysis of moving objects in cameras near borders could be helpful to reduce illicit trading, drug trafficking, irregular border crossing, trafficking in human beings and smuggling. Many recent benchmarks have shown that convolutional neural networks are performing well in the detection of objects in images. Most deep-learning research effort focuses on classification or detection on single images. However, the detection of dynamic changes (e.g., moving objects, actions and events) in streaming video is extremely relevant for surveillance and forensic applications. In this paper, we combine an end-to-end feedforward neural network for static detection with a recurrent Long Short-Term Memory (LSTM) network for multi-frame analysis. We present a practical guide with special attention to the selection of the optimizer and batch size. The end-to-end network is able to localize and recognize the vehicles in video from traffic cameras. We show an efficient way to collect relevant in-domain data for training with minimal manual labor. Our results show that the combination with LSTM improves performance for the detection of moving vehicles.
|