24 March 2014 Form similarity via Levenshtein distance between ortho-filtered logarithmic ruling-gap ratios
Abstract
Geometric invariants are combined with edit distance to compare the ruling configuration of noisy filled-out forms. It is shown that gap-ratios used as features capture most of the ruling information of even low-resolution and poorly scanned form images, and that the edit distance is tolerant of missed and spurious rulings. No preprocessing is required and the potentially time-consuming string operations are performed on a sparse representation of the detected rulings. Based on edit distance, 158 Arabic forms are classified into 15 groups with 89% accuracy. Since the method was developed for an application that precludes public dissemination of the data, it is illustrated on public-domain death certificates.
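The idea of comparing forms through gap ratios and edit distance can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quantization step `step`, and the sample ruling positions are assumptions made for illustration, and the ortho-filtering stage named in the title is omitted. The sketch takes the positions of parallel rulings, forms the ratios of consecutive gaps (which do not change when the form is shifted or uniformly scaled), quantizes their logarithms into integer symbols, and compares two symbol sequences with the standard Levenshtein distance.

```python
import math


def log_gap_ratio_symbols(positions, step=0.25):
    """Turn ruling positions into quantized log gap-ratio symbols.

    `step` is a hypothetical quantization width, not a value from the paper.
    """
    pos = sorted(positions)
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    symbols = []
    for g1, g2 in zip(gaps, gaps[1:]):
        if g1 <= 0 or g2 <= 0:
            continue  # skip zero-width gaps from duplicate detections
        symbols.append(round(math.log(g2 / g1) / step))
    return symbols


def levenshtein(a, b):
    """Standard edit distance (unit-cost insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


# Made-up ruling positions (e.g. y-coordinates of horizontal rulings).
form_a = [100, 150, 250, 300, 500]
form_b = [2 * p + 30 for p in form_a]   # same layout, shifted and scaled
form_c = [100, 150, 300, 500]           # same layout with one ruling missed

sym_a = log_gap_ratio_symbols(form_a)
sym_b = log_gap_ratio_symbols(form_b)
sym_c = log_gap_ratio_symbols(form_c)

print(levenshtein(sym_a, sym_b))  # shifted and scaled copy
print(levenshtein(sym_a, sym_c))  # copy with a missed ruling
```

Because the symbols depend only on ratios of gaps, the shifted and scaled copy yields the same sequence as the original. A missed or spurious ruling changes only the symbols adjacent to it, which is why an edit distance, rather than a position-by-position comparison, suits this representation.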
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
George Nagy, Daniel Lopresti, "Form similarity via Levenshtein distance between ortho-filtered logarithmic ruling-gap ratios", Proc. SPIE 9021, Document Recognition and Retrieval XXI, 902106 (24 March 2014); doi: 10.1117/12.2041956; https://doi.org/10.1117/12.2041956
PROCEEDINGS
8 PAGES

