Training deep convolutional networks for satellite or aerial image analysis often requires a large amount of training data. For a more robust algorithm, training data need to have variations not only in the background and target, but also radiometric variations in the image such as shadowing, illumination changes, atmospheric conditions, and imaging platforms with different collection geometry. Data augmentation is a commonly used approach to generating additional training data. However, this approach is often insufficient in accounting for real world changes in lighting, location or viewpoint outside of the collection geometry. Alternatively, image simulation can be an efficient way to augment training data that incorporates all these variations, such as changing backgrounds, that may be encountered in real data. The Digital Imaging and Remote Sensing Image Image Generation (DIRSIG) model is a tool that produces synthetic imagery using a suite of physics-based radiation propagation modules. DIRSIG can simulate images taken from different sensors with variation in collection geometry, spectral response, solar elevation and angle, atmospheric models, target, and background. Simulation of Urban Mobility (SUMO) is a multi-modal traffic simulation tool that explicitly models vehicles that move through a given road network. The output of the SUMO model was incorporated into DIRSIG to generate scenes with moving vehicles. The same approach was used when using helicopters as targets, but with slight modifications. Using the combination of DIRSIG and SUMO, we quickly generated many small images, with the target at the center with different backgrounds. The simulations generated images with vehicles and helicopters as targets, and corresponding images without targets. Using parallel computing, 120,000 training images were generated in about an hour. Some preliminary results show an improvement in the deep learning algorithm when real image training data are augmented with the simulated images, especially when obtaining sufficient real data was particularly challenging.
Sanghui Han, Alex Fafard, John Kerekes, Michael Gartley, Emmett Ientilucci, Andreas Savakis, Charles Law, Jason Parhan, Matt Turek, Keith Fieldhouse, and Todd Rovito, "Efficient generation of image chips for training deep learning algorithms," Proc. SPIE 10202, Automatic Target Recognition XXVII, 1020203 (Presented at SPIE Defense + Security: April 10, 2017; Published: 1 May 2017); https://doi.org/10.1117/12.2261702.
Conference Presentations are recordings of oral presentations given at SPIE conferences and published as part of the conference proceedings. They include the speaker's narration along with a video recording of the presentation slides and animations. Many conference presentations also include full-text papers. Search and browse our growing collection of more than 14,000 conference presentations, including many plenary and keynote presentations.