4 February 2013 Optimal policy for labeling training samples
Confirming the labels of automatically classified patterns is generally faster than entering new labels or correcting incorrect ones. Most labels assigned by a classifier are correct, even when it has been trained on relatively few pre-labeled patterns. The overall cost of human labeling can therefore be decreased by interspersing labeling and classification. Given a parameterized model of the error rate as an inverse power-law function of the training set size, the optimal splits can be computed rapidly. Across a range of empirical error functions for hand-printed digit classification with ten different classifiers, projected savings in operator time exceed 60%.
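The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation. It assumes an error rate of the form e(n) = a * n^(-b), hypothetical per-sample costs for entering, confirming, and correcting a label, and free retraining at each split. All parameter values and function names below are illustrative assumptions.

```python
def error_rate(n, a=0.5, b=0.5):
    """Assumed inverse power-law error rate after training on n labeled samples."""
    return min(1.0, a * n ** (-b))


def labeling_cost(boundaries, total, c_enter=1.0, c_confirm=0.2,
                  c_correct=1.2, a=0.5, b=0.5):
    """Expected operator cost for a given sequence of retraining points.

    boundaries: increasing sample counts at which the classifier is
    (re)trained; the first boundaries[0] samples are labeled from scratch.
    The cost constants are hypothetical, in arbitrary time units.
    """
    cost = c_enter * boundaries[0]  # manual labeling of the seed set
    edges = list(boundaries) + [total]
    for start, end in zip(edges[:-1], edges[1:]):
        e = error_rate(start, a, b)  # classifier trained on `start` samples
        # Each sample in this batch is either confirmed or corrected.
        cost += (end - start) * ((1 - e) * c_confirm + e * c_correct)
    return cost


def best_single_split(total, **kwargs):
    """Brute-force search for the seed-set size that minimizes expected cost."""
    return min(range(1, total), key=lambda n0: labeling_cost([n0], total, **kwargs))


if __name__ == "__main__":
    total = 1000
    n0 = best_single_split(total)
    print("seed size:", n0)
    print("cost with one split:", labeling_cost([n0], total))
    print("cost with all-manual labeling:", labeling_cost([total], total))
```

Under these assumptions, adding further retraining points can only lower the expected cost, because the assumed error rate decreases with training set size and confirming is cheaper than correcting. A fuller treatment would search over multiple splits jointly rather than a single seed size.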
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lester Lipsky, Daniel Lopresti, George Nagy, "Optimal policy for labeling training samples", Proc. SPIE 8658, Document Recognition and Retrieval XX, 865809 (4 February 2013); doi: 10.1117/12.2005942; https://doi.org/10.1117/12.2005942
