Object detection in 3D point clouds is essential in fields such as geospatial intelligence and autonomous driving. The common machine learning problem of scarce labeled training data is even more acute with 3D point cloud data. Active learning provides a framework for prioritizing which unlabeled training data to annotate manually. Most active learning methods for deep learning fall into one of two categories: uncertainty methods and diversity methods. Uncertainty methods select data by assessing the confidence and consistency of model outputs and are therefore dependent on the expected output of each deep learning task. These methods tend to select batches of informative yet highly similar samples to label. Diversity-based active learning aims to create a labeled dataset that is both varied and representative of the remaining unlabeled data. Diversity methods operate directly on the feature representations of the inputs and are thus more flexible with respect to the specifics of the deep learning task. Our current work explores applying diversity methods and uncertainty-diversity hybrid methods to 3D object detection. We evaluate various approaches to incorporating diversity, including K-Medoids clustering, core set selection, and furthest nearest neighbors. We address the high dimensionality of the features extracted from a VoxelNet-based object detector by varying the distance metric used in the active learning algorithms. Furthermore, we compare our results to those obtained using only uncertainty methods. We assess the performance and efficiency of each active learning method, as well as the representativeness and diversity of the labeled datasets produced. We find that hybrid uncertainty-diversity methods outperform the other methods in object detection AP50 throughout the active learning process, as well as in annotation efficiency and class balance.
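To make the diversity-based selection step concrete, the following is a minimal sketch of one common formulation of core set selection, greedy k-center, in which each step picks the unlabeled sample that lies furthest from its nearest already-labeled sample. It is an illustration under stated assumptions, not the implementation evaluated in this work: the function names `pairwise_distance` and `k_center_greedy`, the two metrics offered, and the toy 2D features are all hypothetical, and the features are assumed to be non-zero vectors that fit in memory.

```python
import numpy as np


def pairwise_distance(a, b, metric="euclidean"):
    """Return the (n, m) matrix of distances between rows of a and rows of b."""
    if metric == "euclidean":
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "cosine":
        # Assumes no all-zero feature vectors (the norm would be zero).
        a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
        b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
        return 1.0 - a_unit @ b_unit.T
    raise ValueError(f"unsupported metric: {metric}")


def k_center_greedy(features, labeled_idx, budget, metric="euclidean"):
    """Greedily select `budget` unlabeled indices for annotation.

    Each step picks the sample whose distance to its nearest labeled or
    previously selected sample is largest.
    """
    features = np.asarray(features, dtype=float)
    labeled_idx = list(labeled_idx)
    if not labeled_idx:
        raise ValueError("at least one labeled sample is required")
    if budget > len(features) - len(labeled_idx):
        raise ValueError("budget exceeds the number of unlabeled samples")

    # Distance from every sample to its nearest labeled sample.
    min_dist = pairwise_distance(features, features[labeled_idx], metric).min(axis=1)
    min_dist[labeled_idx] = -np.inf  # never re-select labeled samples

    selected = []
    for _ in range(budget):
        idx = int(np.argmax(min_dist))
        selected.append(idx)
        # The new pick now counts as labeled: update nearest distances.
        new_dist = pairwise_distance(features, features[[idx]], metric)[:, 0]
        min_dist = np.minimum(min_dist, new_dist)
        min_dist[idx] = -np.inf
    return selected


# Toy usage: four 2D "features", with sample 0 already labeled.
toy_features = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [0.0, 5.0]])
picked = k_center_greedy(toy_features, labeled_idx=[0], budget=2)
```

Exposing the metric as a parameter mirrors the idea of varying the distance metric for high-dimensional features. In a hybrid variant, the distance scores could be combined with a per-sample uncertainty score before the `argmax`, although the specific combination rule is not given here.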