29 January 2007 Interactive training for handwriting recognition in historical document collections
Abstract
We present a method of interactive training for handwriting recognition in collections of documents. As the user transcribes (labels) the words in the training set, words are automatically skipped if they appear to match words that are already transcribed. By reducing the amount of redundant training, better coverage of the data is achieved, resulting in more accurate recognition. Using word-level features for training and recognition in a collection of George Washington's manuscripts, the recognition ratio is approximately 2%-8% higher after training with our interactive method than after training the same number of words sequentially. Using our approach, less training is required to achieve an equivalent recognition ratio. A slight improvement in recognition ratio is also observed when using our method on a second data set, which consists of several pages from a diary written by Jennie Leavitt Smith.
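The skipping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the fixed `threshold`, the nearest-neighbor `recognize` function, and the `ask_user` callback are all assumptions made for illustration, and the abstract does not specify the word-level features or the matching criterion actually used.

```python
import math
from typing import Callable, List, Sequence, Tuple

Features = Sequence[float]          # hypothetical word-level feature vector
Labeled = List[Tuple[Features, str]]


def euclidean(a: Features, b: Features) -> float:
    """Distance between two feature vectors (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def interactive_training(
    words: List[Features],
    ask_user: Callable[[int], str],
    threshold: float,
    budget: int,
) -> Tuple[Labeled, List[int]]:
    """Ask the user to label words, skipping those that appear to match
    a word that is already transcribed.

    Stops once `budget` words have been labeled by the user.
    Returns the labeled examples and the indices of skipped words.
    """
    labeled: Labeled = []
    skipped: List[int] = []
    for idx, feats in enumerate(words):
        if len(labeled) >= budget:
            break
        # Skip the word if it is close to any already-transcribed word.
        if labeled and min(euclidean(feats, f) for f, _ in labeled) <= threshold:
            skipped.append(idx)
            continue
        labeled.append((feats, ask_user(idx)))
    return labeled, skipped


def recognize(feats: Features, labeled: Labeled) -> str:
    """Label a word with the transcription of its nearest labeled example."""
    return min(labeled, key=lambda item: euclidean(feats, item[0]))[1]


# Toy data: two near-duplicate pairs followed by one distinct word.
words = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
truth = ["the", "the", "and", "and", "of"]

labeled, skipped = interactive_training(
    words, ask_user=lambda i: truth[i], threshold=0.5, budget=3
)
```

With a budget of three labels, sequential training on this toy data would transcribe "the", "the", and "and" and never see "of". The sketch skips the two near-duplicates, so the same three labels cover all three distinct words, which is the coverage effect the abstract attributes to reduced redundant training.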
© (2007) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Douglas J. Kennard, William A. Barrett, "Interactive training for handwriting recognition in historical document collections", Proc. SPIE 6500, Document Recognition and Retrieval XIV, 65000E (29 January 2007); doi: 10.1117/12.703378; https://doi.org/10.1117/12.703378
PROCEEDINGS
8 PAGES

