Multi-category object detection in aerial images is an important task for many applications such as surveillance, tracking, and search and rescue. In recent years, deep learning approaches that use features extracted by convolutional neural networks (CNNs) have significantly improved detection accuracy on detection benchmark datasets compared to the traditional approaches based on hand-crafted features that are commonly used for object detection in aerial images. However, these approaches cannot be transferred directly to aerial images, because the feature maps of the network architectures they use have insufficient resolution for handling small instances. The network architectures were explored and optimized for datasets that differ considerably from aerial images, in particular in object size and in the fraction of the image occupied by an object, and this results in poor localization accuracy or missed detections. In this work, we propose a deep neural network derived from the Faster R-CNN approach for multi-category object detection in aerial images. We show how the detection accuracy can be improved by replacing the network architecture with one specifically designed for handling small objects. Furthermore, we investigate the impact of different parameters of the detection framework on the detection accuracy for small objects. Finally, we demonstrate the suitability of our network for object detection in aerial images by comparing it to traditional baseline approaches and to deep learning based approaches on the publicly available DLR 3K Munich Vehicle Aerial Image Dataset, which comprises multiple object classes such as car, van, truck, bus and camper.
Lars W. Sommer, Tobias Schuchert, and Jürgen Beyerer, "Deep learning based multi-category object detection in aerial images," Proc. SPIE 10202, Automatic Target Recognition XXVII, 1020209 (Presented at SPIE Defense + Security: April 10, 2017; Published: 1 May 2017); https://doi.org/10.1117/12.2262083.