15 December 2003 Adaptive color document image binarization for text retrieval
Proceedings Volume 5296, Document Recognition and Retrieval XI; (2003)
Event: Electronic Imaging 2004, 2004, San Jose, California, United States
This paper presents a decision-tree-based adaptive binarization method for text retrieval in color document images. The method extends Niblack's windowed thresholding technique and employs hue (H), saturation (S) and value (V). First, an observation window is retrieved, and, based on the standard deviations of H, S and V, a pre-defined decision tree selects the variables that should be employed. Second, the Karhunen-Loeve Transform (KLT) is used to eliminate correlation and reduce dimension. Finally, the center point of the window is classified based on a 2-D standard normal distribution. The results show that our binarization method generates better results on color document images than Niblack's method and global thresholding methods such as Otsu's. A comparison using a commercial OCR system shows that our method can be used in various situations for high-quality text retrieval.
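For orientation, the following is a minimal sketch of two ingredients the abstract names: the grayscale Niblack baseline that the method extends, and the per-window standard deviations of H, S and V that feed the decision tree. It is not the authors' implementation. The decision tree, the KLT step and the 2-D normal classifier are not specified in the abstract and are left out. The function names, the window radius and the weight `k = -0.2` are illustrative assumptions, and hue is treated as a linear quantity even though it is circular.

```python
import colorsys
from statistics import mean, pstdev


def window(values, row, col, radius):
    """Collect the values in the square window centred on (row, col).

    The window is clipped at the image borders (an assumption; other
    border policies such as padding are equally plausible).
    """
    rows, cols = len(values), len(values[0])
    return [
        values[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]


def niblack_binarize(gray, radius=1, k=-0.2):
    """Niblack thresholding: T = local mean + k * local standard deviation.

    Returns 1 for pixels darker than the local threshold (assumed text)
    and 0 otherwise. k = -0.2 is a commonly used value, not one taken
    from the paper.
    """
    result = []
    for r in range(len(gray)):
        out_row = []
        for c in range(len(gray[0])):
            w = window(gray, r, c, radius)
            threshold = mean(w) + k * pstdev(w)
            out_row.append(1 if gray[r][c] < threshold else 0)
        result.append(out_row)
    return result


def hsv_window_std(rgb, row, col, radius=1):
    """Standard deviations of H, S and V inside one observation window.

    rgb is a 2-D list of (R, G, B) tuples with components in 0..255.
    Hue is treated linearly here, which is a simplification.
    """
    pixels = window(rgb, row, col, radius)
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    return tuple(pstdev(channel) for channel in zip(*hsv))


if __name__ == "__main__":
    gray = [[200, 200, 200], [200, 50, 200], [200, 200, 200]]
    print(niblack_binarize(gray))
    flat_red = [[(255, 0, 0)] * 3 for _ in range(3)]
    print(hsv_window_std(flat_red, 1, 1))
```

In a pipeline like the one described, the three standard deviations would presumably be compared against thresholds in the decision tree to decide which of H, S and V to pass on; those thresholds are not given in the abstract, so none are assumed here.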
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yi Li, Zhiyan Wang, and Haizan Zeng "Adaptive color document image binarization for text retrieval", Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003);


Location and recovery of text on oriented surfaces
Proceedings of SPIE (December 22 1999)
Translation lexicon acquisition from bilingual dictionaries
Proceedings of SPIE (December 18 2001)
Geometric image processing of stereo pairs
Proceedings of SPIE (June 22 2001)