Optical character recognition: an illustrated guide to the frontier
22 December 1999
Proceedings Volume 3967, Document Recognition and Retrieval VII; (1999); doi: 10.1117/12.373511
Event: Electronic Imaging, 2000, San Jose, CA, United States
Abstract
We offer a perspective on the performance of current OCR systems by illustrating and explaining actual OCR errors made by three commercial devices. After discussing briefly the character recognition abilities of humans and computers, we present illustrated examples of recognition errors. The top level of our taxonomy of the causes of errors consists of Imaging Defects, Similar Symbols, Punctuation, and Typography. The analysis of a series of 'snippets' from this perspective provides insight into the strengths and weaknesses of current systems, and perhaps a road map to future progress. The examples were drawn from the large-scale tests conducted by the authors at the Information Science Research Institute of the University of Nevada, Las Vegas. By way of conclusion, we point to possible approaches for improving the accuracy of today's systems. The talk is based on our eponymous monograph, recently published in The Kluwer International Series in Engineering and Computer Science, Kluwer Academic Publishers, 1999.
© (1999) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
George Nagy, Thomas A. Nartker, Stephen V. Rice, "Optical character recognition: an illustrated guide to the frontier", Proc. SPIE 3967, Document Recognition and Retrieval VII, (22 December 1999); doi: 10.1117/12.373511; http://dx.doi.org/10.1117/12.373511
PROCEEDINGS, 12 PAGES


KEYWORDS
Optical character recognition, Printing, Error analysis, Scanners, Image processing, Visualization, Computing systems
