23 March 1994 Document image binarization based on texture analysis
A new thresholding algorithm is presented to address strong noise, complex patterns, poor contrast, and variable modalities in gray-scale histograms. It is based on texture analysis of the document image and consists of three steps. First, candidate thresholds are produced by analyzing the gray-scale histogram. Then, texture features associated with each candidate threshold are computed from the corresponding run-length histogram. Finally, the optimal threshold is selected by a goodness evaluation so that the most desirable document texture features are produced. Only one pass through an image is required for optimal threshold selection. The test set consisted of 9000 machine-printed address blocks from an unconstrained U.S. mail stream. Over 99.6% of the images were visually well-binarized. In an objective test, a system run on 594 mail address blocks, including many difficult images, achieved an 8.1% higher character recognition rate than with a previous algorithm due to Otsu.
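The three steps can be illustrated with a minimal Python sketch. The abstract does not say how candidates are chosen, which texture features are used, or how goodness is defined, so everything concrete here is an assumption made for illustration: candidates are taken as valleys of a smoothed histogram, the only texture feature is the share of horizontal foreground runs whose length falls in a plausible stroke-width range, and the names `candidate_thresholds`, `goodness`, `min_run`, and `max_run` are hypothetical. The sketch also rescans the image for each candidate, so it does not reproduce the single-pass property claimed above.

```python
from collections import Counter


def gray_histogram(image):
    """Count pixels per gray level (0..255) in a list-of-rows image."""
    hist = [0] * 256
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    return hist


def candidate_thresholds(hist, radius=2):
    """Step 1 (assumed rule): valleys of a moving-average-smoothed histogram."""
    n = len(hist)
    smooth = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        smooth.append(sum(hist[lo:hi]) / (hi - lo))
    occupied = [level for level, count in enumerate(hist) if count]
    if not occupied:
        return []
    first, last = occupied[0], occupied[-1]
    candidates = []
    i = first + 1
    while i < last:
        # Extend over a flat stretch so a wide valley yields one candidate.
        j = i
        while j + 1 < last and smooth[j + 1] == smooth[i]:
            j += 1
        if smooth[i - 1] > smooth[i] and smooth[j + 1] > smooth[j]:
            candidates.append((i + j) // 2)
        i = j + 1
    return candidates


def run_length_histogram(image, threshold):
    """Step 2: histogram of horizontal run lengths of pixels <= threshold."""
    runs = Counter()
    for row in image:
        length = 0
        for pixel in row:
            if pixel <= threshold:
                length += 1
            elif length:
                runs[length] += 1
                length = 0
        if length:  # run that reaches the end of the row
            runs[length] += 1
    return runs


def goodness(runs, min_run=2, max_run=6):
    """Assumed score: share of runs with a stroke-like length."""
    total = sum(runs.values())
    if total == 0:
        return 0.0
    stroke_like = sum(c for length, c in runs.items()
                      if min_run <= length <= max_run)
    return stroke_like / total


def select_threshold(image):
    """Step 3: pick the candidate whose run-length texture scores best."""
    candidates = candidate_thresholds(gray_histogram(image))
    if not candidates:
        return None
    return max(candidates,
               key=lambda t: goodness(run_length_histogram(image, t)))
```

Under these assumptions, a threshold that admits isolated noise pixels produces many length-1 runs and scores lower than one that keeps only stroke-width runs, which is the kind of texture-based preference the abstract describes.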
© (1994) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ying Liu, Sargur N. Srihari, "Document image binarization based on texture analysis", Proc. SPIE 2181, Document Recognition, (23 March 1994); doi: 10.1117/12.171112; https://doi.org/10.1117/12.171112
