Optical character recognition: an illustrated guide to the frontier
22 December 1999
We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of 'snippets' from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
George Nagy, Thomas A. Nartker, Stephen V. Rice, "Optical character recognition: an illustrated guide to the frontier", Proc. SPIE 3967, Document Recognition and Retrieval VII (22 December 1999); doi: 10.1117/12.373511; https://doi.org/10.1117/12.373511


