Segmentation and labeling of documents using conditional random fields
29 January 2007
Proceedings Volume 6500, Document Recognition and Retrieval XIV; 65000U (2007) https://doi.org/10.1117/12.704410
Event: Electronic Imaging 2007, San Jose, CA, United States
Abstract
This paper describes the use of Conditional Random Fields (CRFs), which exploit contextual information, to automatically label extracted segments of scanned documents as machine-print, handwriting, or noise. The resulting labels can serve as an indexing step for a context-based image retrieval system or a biometric signature verification system. A simple region-growing algorithm first segments the document into a number of patches, and a label for each patch is then inferred using a CRF model. The model is flexible enough to treat signatures as a type of handwriting and to isolate them from machine-print and noise. The model's robustness comes from the CRF's ability to model spatial dependencies between neighboring labels as well as the observed data. Maximum pseudo-likelihood estimates of the CRF parameters are learned using conjugate gradient descent. Labels are inferred by computing their probabilities under the model with Gibbs sampling. Experimental results show that this approach assigns correct labels to 95.75% of the data. The CRF-based model is shown to outperform neural networks and naive Bayes.
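The inference step mentioned in the abstract can be illustrated with a minimal sketch of Gibbs sampling over patch labels. This is not the authors' model: the abstract does not give the feature functions, so the per-patch scores (`unary`), the neighbor lists (`neighbors`), and the single agreement weight (`pair_weight`, a Potts-style pairwise term) are all assumptions made for illustration.

```python
import math
import random

LABELS = ("machine-print", "handwriting", "noise")


def gibbs_label_marginals(unary, neighbors, pair_weight=1.0,
                          sweeps=500, burn_in=100, seed=0):
    """Estimate per-patch label marginals by Gibbs sampling.

    unary[i][c]  -- hypothetical score of label c for patch i (higher is better)
    neighbors[i] -- indices of the patches spatially adjacent to patch i
    pair_weight  -- hypothetical reward when two neighboring patches agree
    """
    rng = random.Random(seed)
    n, k = len(unary), len(LABELS)
    # Start from the label each patch prefers when neighbors are ignored.
    state = [max(range(k), key=lambda c: unary[i][c]) for i in range(n)]
    counts = [[0] * k for _ in range(n)]

    for sweep in range(sweeps):
        for i in range(n):
            # Conditional score of each label given the current neighbor labels.
            scores = [
                unary[i][c]
                + pair_weight * sum(1 for j in neighbors[i] if state[j] == c)
                for c in range(k)
            ]
            top = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - top) for s in scores]
            state[i] = rng.choices(range(k), weights=weights)[0]
        if sweep >= burn_in:
            for i in range(n):
                counts[i][state[i]] += 1

    kept = sweeps - burn_in
    return [[c / kept for c in row] for row in counts]


def decode(marginals):
    """Pick the label with the highest estimated marginal for each patch."""
    return [LABELS[max(range(len(LABELS)), key=lambda c: row[c])]
            for row in marginals]
```

For three patches in a row, `decode(gibbs_label_marginals(unary, [[1], [0, 2], [1]]))` returns one label per patch. A larger `pair_weight` pulls neighboring patches toward the same label, which is the kind of contextual smoothing the abstract attributes to the CRF.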
© (2007) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shravya Shetty, Harish Srinivasan, Matthew Beal, and Sargur Srihari "Segmentation and labeling of documents using conditional random fields", Proc. SPIE 6500, Document Recognition and Retrieval XIV, 65000U (29 January 2007); https://doi.org/10.1117/12.704410
CITATIONS
Cited by 42 scholarly publications and 2 patents.
KEYWORDS
Image segmentation
Data modeling
Nickel
Neural networks
Feature extraction
Image retrieval
Lithium