Deep learning methods have driven impressive advances in computer vision over the past decades, largely because they extract non-linear features that are well adapted to the task at hand. For supervised approaches, data labeling is essential to reach a high level of performance; however, labeling can be so tedious or difficult in some contexts (e.g., detection of specific defects or unconventional data annotations) that even experts sometimes provide an incorrect ground-truth label. Focusing on classification problems, this paper addresses the handling of noisy labels in datasets. Specifically, we first detect the noisy samples of a dataset using set-valued labels and then improve their classification using Venn–Abers predictors. With a 40% two-class pair-flip noise ratio, the method reaches accuracies above 0.99 and 0.90 on noisified versions of two widely used image classification datasets, MNIST digits and CIFAR-10, respectively; with 40% 10-class uniform noise, it reaches 0.87 accuracy on CIFAR-10.
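To make the two noise settings concrete, the following minimal sketch shows one common way to inject synthetic label noise into a clean label vector. The abstract does not define the noise models, so the definitions here are assumptions: uniform noise replaces a label with a different class drawn uniformly at random, and pair-flip noise moves a label to the next class index. The function names and the random toy labels are hypothetical and are not taken from the paper.

```python
import numpy as np

def add_uniform_noise(labels, rate, num_classes, rng):
    """Assumed uniform noise: with probability `rate`, replace a label
    by a different class chosen uniformly at random."""
    labels = np.asarray(labels)
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rate
    # An offset in 1..num_classes-1 guarantees the new class differs.
    offsets = rng.integers(1, num_classes, size=labels.shape[0])
    noisy[flip] = (labels[flip] + offsets[flip]) % num_classes
    return noisy

def add_pair_flip_noise(labels, rate, num_classes, rng):
    """Assumed pair-flip noise: with probability `rate`, move a label
    to the next class index (wrapping around at the last class)."""
    labels = np.asarray(labels)
    noisy = labels.copy()
    flip = rng.random(labels.shape[0]) < rate
    noisy[flip] = (labels[flip] + 1) % num_classes
    return noisy

# Toy stand-in for a 10-class label vector (not real MNIST/CIFAR-10 labels).
rng = np.random.default_rng(0)
clean = rng.integers(0, 10, size=10_000)
noisy_uniform = add_uniform_noise(clean, 0.4, 10, rng)
noisy_pair = add_pair_flip_noise(clean, 0.4, 10, rng)
```

In both functions the fraction of corrupted labels is approximately `rate`, since every selected label is guaranteed to change class.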
Education and training
Calibration
Binary data
Machine learning
Matrices
Data analysis
Data modeling