17 March 2017 Trigram-based algorithms for OCR result correction
Proceedings Volume 10341, Ninth International Conference on Machine Vision (ICMV 2016); 103410O (2017)
Event: Ninth International Conference on Machine Vision, 2016, Nice, France
In this paper we consider the task of improving optical character recognition (OCR) results for document fields on low-quality and average-quality images using N-gram models. Cyrillic fields of the Russian Federation internal passport are analyzed as an example. Two approaches are presented: the first is based on the hypothesis that a symbol depends on its two adjacent symbols, and the second is based on the calculation of marginal distributions and Bayesian network computation. A comparison of the algorithms and experimental results within a real document OCR system are presented; it is shown that document field OCR accuracy can be improved by more than 6% for low-quality images.
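To make the first idea concrete, the following is a minimal sketch of rescoring OCR alternatives under the assumption that each symbol depends on its two adjacent symbols. It is an illustration only, not the authors' implementation: the function names, the boundary symbol, the additive smoothing constant, and the representation of OCR output as per-position dictionaries of probabilities are all assumptions made here.

```python
from collections import Counter

BOUNDARY = "^"  # hypothetical padding symbol marking the edges of a field


def train_trigrams(words):
    """Count (left, middle, right) symbol triples over known field values."""
    counts = Counter()
    for word in words:
        padded = BOUNDARY + word + BOUNDARY
        for i in range(1, len(padded) - 1):
            counts[(padded[i - 1], padded[i], padded[i + 1])] += 1
    return counts


def middle_given_neighbours(counts, left, mid, right, alphabet, alpha=0.01):
    """Additively smoothed estimate of P(mid | left, right)."""
    total = sum(counts[(left, s, right)] for s in alphabet)
    return (counts[(left, mid, right)] + alpha) / (total + alpha * len(alphabet))


def rescore(alternatives, counts, alphabet):
    """Reweight the OCR alternatives of each position using both neighbours.

    alternatives: one dict {symbol: ocr_probability} per field position.
    Returns a list of the same shape with renormalised probabilities.
    """
    edge = {BOUNDARY: 1.0}
    result = []
    for i, cell in enumerate(alternatives):
        left = alternatives[i - 1] if i > 0 else edge
        right = alternatives[i + 1] if i + 1 < len(alternatives) else edge
        scores = {}
        for sym, p in cell.items():
            # Expected trigram probability over the neighbours' alternatives.
            context = sum(
                pl * pr * middle_given_neighbours(counts, l, sym, r, alphabet)
                for l, pl in left.items()
                for r, pr in right.items()
            )
            scores[sym] = p * context
        norm = sum(scores.values())
        result.append({s: v / norm for s, v in scores.items()})
    return result


# Toy usage: the OCR engine slightly prefers "Л" over "А" in third position.
counts = train_trigrams(["ИВАН", "ИВАНОВ", "ИВАНОВА"])
ocr = [{"И": 1.0}, {"В": 1.0}, {"А": 0.45, "Л": 0.55}, {"Н": 1.0}]
corrected = rescore(ocr, counts, "АВИЛНО")
best = "".join(max(cell, key=cell.get) for cell in corrected)
```

In this toy case the trigram statistics outweigh the small OCR preference for "Л", so the top-ranked string becomes "ИВАН". A real system would estimate the trigram counts from a large dictionary of field values rather than from three words.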
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Konstantin Bulatov, Temudzhin Manzhikov, Oleg Slavin, Igor Faradjev, and Igor Janiszewski "Trigram-based algorithms for OCR result correction", Proc. SPIE 10341, Ninth International Conference on Machine Vision (ICMV 2016), 103410O (17 March 2017).

Optical character recognition

Detection and tracking algorithms

Associative arrays

Analytical research

Visual process modeling

Error analysis

Image processing


