Optical character recognition: an illustrated guide to the frontier
22 December 1999
Abstract
We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of 'snippets' from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
George Nagy, Thomas A. Nartker, Stephen V. Rice, "Optical character recognition: an illustrated guide to the frontier", Proc. SPIE 3967, Document Recognition and Retrieval VII, (22 December 1999); doi: 10.1117/12.373511; https://doi.org/10.1117/12.373511
PROCEEDINGS
12 PAGES

