11 May 2015 Subset selection of training data for machine learning: a situational awareness system case study
Abstract
Recent advances in machine learning with big data sets have enabled significant improvements in the optimization of classification and recognition systems. However, for applications such as situational awareness systems, the entirety of the available data dwarfs the amount permissible for a training set with tractable machine learning optimization times. Furthermore, the performance of any optimized system is highly dependent on the training set correctly and completely representing the entire data space of scenarios. In this paper we present a technique to characterize the entire data space to ascertain the key factors for representation and subsequently select a subset that statistically represents the correct mix of scenarios. We demonstrate the effectiveness of these characterization and subset selection techniques by using a genetic algorithm to optimize the performance of a gunfire recognition system.
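The abstract does not give the details of the selection technique, so the following is only a minimal sketch of one common way to pick a subset that preserves the mix of scenarios: proportional stratified sampling. It is not the authors' method. The `scenario` field, the record layout, and the function name `select_representative_subset` are all hypothetical, chosen for illustration.

```python
import random
from collections import defaultdict


def select_representative_subset(records, key, subset_size, seed=0):
    """Sample roughly subset_size records so that each scenario keeps
    approximately the same share it has in the full data set.

    Assumption: every record can be mapped to a single scenario label
    by key(record). Rounding means the result may differ slightly
    from subset_size when shares are not whole numbers.
    """
    rng = random.Random(seed)

    # Group the records by their scenario label.
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)

    total = len(records)
    subset = []
    for label in sorted(groups):
        members = groups[label]
        # Proportional quota, with at least one record per scenario so
        # that rare scenarios are not dropped from the subset entirely.
        quota = max(1, round(subset_size * len(members) / total))
        quota = min(quota, len(members))
        subset.extend(rng.sample(members, quota))
    return subset


# Hypothetical data: 10% of records are labelled "gunfire".
records = [
    {"id": i, "scenario": "gunfire" if i % 10 == 0 else "background"}
    for i in range(1000)
]
subset = select_representative_subset(records, lambda r: r["scenario"], 100)
gunfire_count = sum(1 for r in subset if r["scenario"] == "gunfire")
```

In this toy case the subset keeps the 10% share of the hypothetical "gunfire" scenario. A real system would first need the characterization step the abstract describes in order to decide which factors define a scenario.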
© (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
M. McKenzie, S. C. Wong, "Subset selection of training data for machine learning: a situational awareness system case study", Proc. SPIE 9494, Next-Generation Robotics II; and Machine Intelligence and Bio-inspired Computation: Theory and Applications IX, 94940U (11 May 2015); doi: 10.1117/12.2176536; https://doi.org/10.1117/12.2176536
PROCEEDINGS
8 PAGES

