Software tools and test data for research and testing of page-reading OCR systems (17 January 2005)
Thomas A. Nartker, Stephen V. Rice, Steven E. Lumos
Proceedings Volume 5676, Document Recognition and Retrieval XII; (2005) https://doi.org/10.1117/12.587293
Event: Electronic Imaging 2005, San Jose, California, United States
Abstract
We announce the availability of the UNLV/ISRI Analytic Tools for OCR Evaluation together with a large and diverse collection of scanned document images with the associated ground-truth text. This combination of tools and test data will allow anyone to conduct a meaningful test comparing the performance of competing page-reading algorithms. The value of this collection of software tools and test data is enhanced by knowledge of the past performance of several systems using exactly these tools and this data. These performance comparisons were published in previous ISRI Test Reports and are also provided. In addition, the tools can be used to test the character accuracy of any page-reading OCR system for any language included in the Unicode standard. The paper concludes with a summary of the programs, test data, and documentation that are available and gives the URL where they can be located.
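As an illustration of what a character accuracy measurement involves, the following is a minimal sketch, not the ISRI tools' own implementation. It assumes the commonly used definition in which the number of errors is the minimum number of character insertions, deletions, and substitutions needed to turn the OCR output into the ground-truth text, and accuracy is (n - errors) / n for a ground truth of n characters. The function names `edit_distance` and `character_accuracy` are hypothetical. Strings are compared one Unicode code point at a time, so the same code applies to any script.

```python
def edit_distance(truth: str, ocr: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    # prev[j] holds the distance between the truth prefix processed so far
    # and the first j characters of the OCR output.
    prev = list(range(len(ocr) + 1))
    for i, t in enumerate(truth, start=1):
        curr = [i]  # distance from i truth characters to an empty OCR prefix
        for j, o in enumerate(ocr, start=1):
            cost = 0 if t == o else 1
            curr.append(min(
                prev[j] + 1,         # truth character missing from OCR output
                curr[j - 1] + 1,     # extra character in OCR output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Assumed definition: (n - errors) / n, where n is the ground-truth length."""
    if not truth:
        raise ValueError("ground-truth text must not be empty")
    errors = edit_distance(truth, ocr)
    return (len(truth) - errors) / len(truth)
```

Because the comparison is per code point, precomposed and decomposed forms of the same accented letter would count as different characters; normalizing both texts first (for example with `unicodedata.normalize`) is one way to avoid that, if it matches the evaluation's intent. Note also that this measure can go negative when the OCR output contains many spurious characters.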
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Thomas A. Nartker, Stephen V. Rice, and Steven E. Lumos "Software tools and test data for research and testing of page-reading OCR systems", Proc. SPIE 5676, Document Recognition and Retrieval XII, (17 January 2005); https://doi.org/10.1117/12.587293
KEYWORDS
Optical character recognition
Binary data
Legal
Standards development
Computer programming
Image compression
Nomenclature
