Proc. SPIE. 5296, Document Recognition and Retrieval XI
KEYWORDS: Information fusion, Human-machine interfaces, Databases, Computing systems, Personal digital assistants, Telecommunications, Associative arrays, Optical character recognition, Systems modeling, Instrument modeling
Guiding a recognition task using a language model is commonly accepted as having a positive effect on accuracy and is routinely used in automated speech processing. This paper presents a quantitative study of the impact of word models in online handwriting recognition applied to form-filling tasks on handheld devices. Two types of word models are considered: a dictionary, typically ranging from a few thousand to a hundred thousand words; and a grammar or regular expression generating a language several orders of magnitude larger than the dictionary. It is reported that the improvement in accuracy obtained by the use of a grammar is comparable to the gain provided by the use of a dictionary. Finally, the impact of the word models on user acceptance of online handwriting recognition in a specific form-filling application is presented.
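As an illustration of the two kinds of word model described above, the following minimal sketch filters a recognizer's ranked candidates through either a dictionary or a regular expression. The function name `pick_candidate`, the candidate scores, the tiny lexicon, and the phone-number pattern are all hypothetical and are not taken from the paper; a real recognizer would integrate the word model during decoding rather than as a post-filter.

```python
import re

def pick_candidate(candidates, accepts):
    """Return the highest-scoring candidate accepted by the word model.

    candidates: list of (text, score) pairs from a hypothetical recognizer.
    accepts:    predicate implementing the word model (dictionary or grammar).
    Returns None when no candidate is accepted.
    """
    for text, _score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if accepts(text):
            return text
    return None

# Word model 1: a dictionary (here a tiny, made-up lexicon).
lexicon = {"clock", "cloak"}
in_dictionary = lambda text: text in lexicon

# Word model 2: a regular expression for a form field (assumed pattern).
phone_pattern = re.compile(r"\d{3}-\d{4}")
matches_grammar = lambda text: phone_pattern.fullmatch(text) is not None

word_result = pick_candidate(
    [("c1ock", 0.9), ("clock", 0.8), ("cloak", 0.5)], in_dictionary)
phone_result = pick_candidate(
    [("555-O123", 0.9), ("555-0123", 0.7)], matches_grammar)
```

The regular expression accepts ten million distinct strings without enumerating them, which is the sense in which a grammar can describe a language far larger than a dictionary.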
KEYWORDS: Lithium, Statistical analysis, Detection and tracking algorithms, Visualization, Error analysis, Silicon, Computer programming, Picosecond phenomena, Optical character recognition, System on a chip
In this paper the Damerau-Levenshtein string difference metric is generalized in two ways to more accurately compensate for the types of errors that are present in the script recognition domain. First, the basic dynamic programming method for computing such a measure is extended to allow for merges, splits and two-letter substitutions. Second, edit operations are refined into categories according to the effect they have on the visual 'appearance' of words. A set of recognizer-independent constraints is developed to reflect the severity of the information lost due to each operation. These constraints are solved to assign specific costs to the operations. Experimental results on 2,335 corrupted strings and a lexicon of 21,299 words show higher correction rates than with the original form of the metric.
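The first generalization described above can be sketched as a dynamic program that adds merge, split and two-letter substitution steps to the usual insert, delete, substitute and transpose steps. This is only an illustration under stated assumptions: the function name, the parameter names, and the uniform default costs are hypothetical, the transposition is the restricted (adjacent, non-overlapping) variant, and the paper's category-dependent costs are not reproduced here.

```python
def extended_edit_distance(source, target, unit=1,
                           merge_cost=1, split_cost=1, pair_cost=1):
    """Edit distance with merges, splits and two-letter substitutions.

    unit:       cost of insert, delete, substitute, adjacent transposition.
    merge_cost: two source letters read as one target letter ("cl" -> "d").
    split_cost: one source letter read as two target letters ("d" -> "cl").
    pair_cost:  two source letters replaced by two target letters.
    All costs are assumed placeholders, not the paper's values.
    """
    n, m = len(source), len(target)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * unit          # delete all of source[:i]
    for j in range(1, m + 1):
        d[0][j] = j * unit          # insert all of target[:j]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            best = min(
                d[i - 1][j] + unit,                      # deletion
                d[i][j - 1] + unit,                      # insertion
                d[i - 1][j - 1] + (0 if same else unit), # match / substitution
            )
            if i > 1:
                best = min(best, d[i - 2][j - 1] + merge_cost)  # merge
            if j > 1:
                best = min(best, d[i - 1][j - 2] + split_cost)  # split
            if i > 1 and j > 1:
                best = min(best, d[i - 2][j - 2] + pair_cost)   # pair subst.
                if (source[i - 1] == target[j - 2]
                        and source[i - 2] == target[j - 1]):
                    best = min(best, d[i - 2][j - 2] + unit)    # transposition
            d[i][j] = best
    return d[n][m]
```

With the default costs, `extended_edit_distance("clog", "dog")` charges a single merge, whereas raising `merge_cost`, `split_cost` and `pair_cost` to 2 makes those steps no cheaper than two single-letter edits, which recovers the behavior of the unextended metric.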
This paper explores different distance algorithms that can group connected components of a handwritten text line into words. A binarized handwritten text image normally consists of many connected components, where each component is a character fragment, an isolated character, or a group of characters. When the writing style is unconstrained, recognition of individual components is unreliable, so the components must be grouped into words before recognition algorithms (which may require dictionaries) can be used. Algorithms that compute the distance between connected components can indicate how the connected components should be clustered into words. We show that fast, straightforward distance algorithms (such as using the horizontal distance between the components' bounding boxes) have mediocre performance. Euclidean distance algorithms perform well but are computationally slow. This paper describes original methods of computing distances. These algorithms include combining a set of horizontal distances between components (applied to each pixel row) with the Euclidean and bounding box methods to achieve high performance and reasonable speed. We examine six distance algorithms and each is tested on unconstrained handwritten address images.
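The three families of distance mentioned above can be sketched as follows. This is a minimal illustration, not the paper's six algorithms: each component is assumed to be a set of `(row, col)` foreground pixels, the function names are hypothetical, and the per-row measure assumes the first component lies to the left of the second.

```python
import math

def bounding_box_gap(a, b):
    """Horizontal distance between bounding boxes (0 if they overlap)."""
    a_min, a_max = min(c for _, c in a), max(c for _, c in a)
    b_min, b_max = min(c for _, c in b), max(c for _, c in b)
    return max(0, b_min - a_max, a_min - b_max)

def euclidean_gap(a, b):
    """Smallest pixel-to-pixel Euclidean distance; O(|a| * |b|), hence slow."""
    return min(math.hypot(r1 - r2, c1 - c2)
               for r1, c1 in a for r2, c2 in b)

def row_run_gap(a, b):
    """Smallest horizontal distance over the pixel rows both components share.

    Assumes component a is to the left of component b. Returns None when
    the components share no pixel row.
    """
    right_of_a = {}
    for r, c in a:
        right_of_a[r] = max(right_of_a.get(r, c), c)
    left_of_b = {}
    for r, c in b:
        left_of_b[r] = min(left_of_b.get(r, c), c)
    shared = right_of_a.keys() & left_of_b.keys()
    if not shared:
        return None
    return min(max(0, left_of_b[r] - right_of_a[r]) for r in shared)

# Two small made-up components.
left = {(0, 0), (1, 0), (1, 1), (2, 0)}
right = {(1, 4), (2, 3), (3, 3)}
gaps = (bounding_box_gap(left, right),
        euclidean_gap(left, right),
        row_run_gap(left, right))
```

On this toy pair the three measures disagree (2, about 2.24, and 3), which hints at why combining them, as the abstract describes, could separate inter-word gaps from intra-word gaps better than any single measure.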