4 February 2013
Combining multiple thresholding binarization values to improve OCR output
Proceedings Volume 8658, Document Recognition and Retrieval XX; 86580R (2013) https://doi.org/10.1117/12.2006228
Event: IS&T/SPIE Electronic Imaging, 2013, Burlingame, California, United States
Abstract
For noisy historical documents, a high optical character recognition (OCR) word error rate (WER) can render the OCR text unusable. Since image binarization is often the method used to identify foreground pixels, a body of research seeks to improve image-wide binarization directly. Instead of relying on any one imperfect binarization technique, our method incorporates information from multiple simple thresholding binarizations of the same image to improve text output. Using a new corpus of 19th-century grayscale newspaper images for which the text transcription is known, we observe WERs of 13.8% and higher using current binarization techniques and a state-of-the-art OCR engine. Our novel approach combines the OCR outputs from multiple thresholded images by aligning the text output and producing a lattice of word alternatives from which a lattice word error rate (LWER) is calculated. Our results show an LWER of 7.6% when aligning two threshold images and an LWER of 6.8% when aligning five. From the word lattice we commit to one hypothesis by applying the methods of Lund et al. (2011), achieving an improvement over the original OCR output and an 8.41% WER on this data set.
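As a rough illustration of the idea described in the abstract, the sketch below thresholds a grayscale image at a given value and scores a word lattice against a reference transcription. It is not the authors' implementation. The function names (`binarize`, `build_lattice`, `lattice_wer`) are hypothetical. The OCR outputs are assumed to be already aligned word for word, so the paper's alignment step is not reproduced. The LWER definition used here, an edit distance in which a lattice column counts as correct if any of its alternatives matches the reference word, is an assumption.

```python
from typing import List, Sequence, Set


def binarize(gray: Sequence[Sequence[int]], threshold: int) -> List[List[int]]:
    """Simple global thresholding: pixels darker than `threshold` become foreground (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def build_lattice(aligned_outputs: Sequence[Sequence[str]]) -> List[Set[str]]:
    """Merge pre-aligned OCR outputs into one set of word alternatives per position.

    Assumption: every output has the same number of words and is already aligned.
    """
    lengths = {len(out) for out in aligned_outputs}
    if len(lengths) != 1:
        raise ValueError("outputs must be pre-aligned to equal length")
    return [set(column) for column in zip(*aligned_outputs)]


def lattice_wer(reference: Sequence[str], lattice: Sequence[Set[str]]) -> float:
    """Word-level edit distance to the lattice, divided by the reference length.

    A lattice column matches a reference word if any alternative equals it.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    m = len(lattice)
    prev = list(range(m + 1))  # distances for an empty reference prefix
    for i, ref_word in enumerate(reference, start=1):
        cur = [i] + [0] * m
        for j in range(1, m + 1):
            substitution = 0 if ref_word in lattice[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,                 # reference word missing from lattice
                cur[j - 1] + 1,              # extra lattice column
                prev[j - 1] + substitution,  # match or substitution
            )
        prev = cur
    return prev[m] / len(reference)


if __name__ == "__main__":
    reference = "the quick brown fox".split()
    # Made-up OCR outputs from two thresholds, each with one different error.
    ocr_low = "tbe quick brown fox".split()
    ocr_high = "the qnick brown fox".split()

    print(lattice_wer(reference, build_lattice([ocr_low])))            # one output alone
    print(lattice_wer(reference, build_lattice([ocr_low, ocr_high])))  # combined lattice
```

In this toy case each single output contains one wrong word, while the combined lattice contains the correct word at every position. This is the effect the abstract reports when comparing the WER of a single binarization with the LWER of several.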
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
William B. Lund, Douglas J. Kennard, and Eric K. Ringger "Combining multiple thresholding binarization values to improve OCR output", Proc. SPIE 8658, Document Recognition and Retrieval XX, 86580R (4 February 2013); https://doi.org/10.1117/12.2006228
CITATIONS
Cited by 28 scholarly publications.
KEYWORDS
Optical character recognition

RGB color model

Image processing

Optical alignment

Error analysis

Image segmentation

Machine learning
