24 March 2014 Robust binarization of degraded document images using heuristics
Abstract
Historically significant documents are often discovered with defects that make them difficult to read and analyze. This is particularly troublesome when the defects prevent software from performing an automated analysis. Image enhancement methods are used to remove or minimize document defects, improve software performance, and generally make images more legible. We describe an automated image enhancement method that is independent of the input page and requires no training data. The approach applies to color or greyscale images containing handwritten script, typewritten text, images, and mixtures thereof. We evaluated the image enhancement method against the test images provided by the 2011 Document Image Binarization Contest (DIBCO). Our method outperforms all 2011 DIBCO entrants in terms of average F1 measure, and does so with significantly lower variance than the top contest entrants. The capability of the proposed method is also illustrated using select images from a collection of historic documents stored at the Yad Vashem Holocaust Memorial in Israel.
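For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how a pixel-level F1 measure can be computed for a binarized image against a ground-truth image, and how per-image scores can be summarized by their mean and variance. It is not the authors' evaluation code. The function name `f1_measure`, the nested-list image representation, and the convention that 1 marks a text pixel are assumptions made purely for illustration.

```python
from statistics import mean, pvariance


def f1_measure(predicted, truth):
    """Pixel-level F1 for two equal-sized binary images.

    Images are nested lists of 0/1 values. Assumption for this sketch:
    1 marks a text (foreground) pixel and 0 marks background.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # text pixel correctly detected
            elif p and not t:
                fp += 1  # background wrongly marked as text
            elif t and not p:
                fn += 1  # text pixel missed
    if tp == 0:
        return 0.0  # no correct text pixels: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def summarize(scores):
    """Return (mean, population variance) of a list of per-image F1 scores."""
    return mean(scores), pvariance(scores)


# Hypothetical 2x3 images, used only to exercise the functions.
truth = [[1, 1, 0],
         [0, 1, 0]]
predicted = [[1, 0, 0],
             [0, 1, 1]]
score = f1_measure(predicted, truth)
```

Here the toy prediction has two true positives, one false positive, and one false negative, so precision and recall are both 2/3. Comparing methods by both the mean and the variance of per-image scores, as the abstract does, distinguishes a method that is consistently good from one that is good only on average.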
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jon Parker, Ophir Frieder, Gideon Frieder, "Robust binarization of degraded document images using heuristics", Proc. SPIE 9021, Document Recognition and Retrieval XXI, 90210U (24 March 2014); doi: 10.1117/12.2042581; https://doi.org/10.1117/12.2042581
PROCEEDINGS
12 PAGES

