4 February 2013 Optimal policy for labeling training samples
Confirming the labels of automatically classified patterns is generally faster than entering new labels or correcting incorrect labels. Most labels assigned by a classifier, even if trained only on relatively few pre-labeled patterns, are correct. Therefore the overall cost of human labeling can be decreased by interspersing labeling and classification. Given a parameterized model of the error rate as an inverse power law function of the size of the training set, the optimal splits can be computed rapidly. Projected savings in operator time are over 60% for a range of empirical error functions for hand-printed digit classification with ten different classifiers.
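The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or cost model: the error function e(n) = a·n^(-b), the constants a and b, the per-pattern operator costs (c_enter, c_confirm, c_correct), the fixed per-stage retraining overhead c_retrain, and all function names are assumptions made up for illustration. The sketch assumes a seed set is labeled by hand, after which the remaining patterns are classified in batches and the operator confirms or corrects each label; a small dynamic program then picks the cheapest retraining points from a list of candidates.

```python
from functools import lru_cache


def error_rate(n, a=0.5, b=0.5):
    """Assumed inverse power law: error rate after training on n labeled samples."""
    return min(1.0, a * n ** (-b))


def stage_cost(n_train, n_batch, c_confirm=1.0, c_correct=4.0):
    """Expected operator time to verify n_batch automatically assigned labels
    from a classifier trained on n_train samples (hypothetical unit costs)."""
    e = error_rate(n_train)
    return n_batch * ((1.0 - e) * c_confirm + e * c_correct)


def policy_cost(splits, total, c_enter=4.0, c_retrain=50.0):
    """Expected cost of a policy. `splits` are increasing training-set sizes at
    which the classifier is (re)trained; the first splits[0] samples are
    labeled entirely by hand, the rest are verified batch by batch."""
    cost = splits[0] * c_enter
    bounds = list(splits) + [total]
    for start, end in zip(bounds[:-1], bounds[1:]):
        cost += c_retrain + stage_cost(start, end - start)
    return cost


def best_policy(total, candidates, c_enter=4.0, c_retrain=50.0):
    """Dynamic program over candidate split points; returns (cost, splits)."""
    points = sorted(p for p in candidates if 0 < p < total)

    @lru_cache(maxsize=None)
    def tail(i):
        # Cheapest way to finish labeling once the classifier is trained
        # on points[i] samples: either verify everything that is left in
        # one batch, or verify up to a later split point and retrain there.
        start = points[i]
        best = (c_retrain + stage_cost(start, total - start), (start,))
        for j in range(i + 1, len(points)):
            sub_cost, sub_splits = tail(j)
            cost = c_retrain + stage_cost(start, points[j] - start) + sub_cost
            if cost < best[0]:
                best = (cost, (start,) + sub_splits)
        return best

    # Choose the size of the hand-labeled seed set.
    return min((points[i] * c_enter + tail(i)[0], tail(i)[1])
               for i in range(len(points)))


if __name__ == "__main__":
    total = 10_000
    candidates = [10, 20, 50, 100, 200, 500, 1000, 2000, 5000]
    cost, splits = best_policy(total, candidates)
    manual = total * 4.0  # every label entered by hand at c_enter
    print(f"splits={splits} cost={cost:.0f} manual={manual:.0f}")
```

Under these made-up constants the interleaved policy costs less than labeling everything by hand, but the size of the saving depends entirely on the assumed error curve and cost ratios; the figures reported in the abstract come from the authors' empirical error functions, not from this sketch.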
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lester Lipsky, Daniel Lopresti, George Nagy, "Optimal policy for labeling training samples", Proc. SPIE 8658, Document Recognition and Retrieval XX, 865809 (4 February 2013); doi: 10.1117/12.2005942; https://doi.org/10.1117/12.2005942

