14 April 1993 Synchronous tracking of outputs from multiple OCR systems
The accuracies of OCR systems have increased in recent years due to improvements in preprocessing methods and recognition algorithms. It has been suggested that even higher accuracy can be attained by integrating the results of two or more OCR systems with uncorrelated errors. There are several methods for integrating outputs, some of which have already been published. However, prior to integration, the individual characters or symbols from the various OCR outputs have to be synchronously tracked in order to compare them. Whereas it is simple to determine character correspondence among strings containing only substitution errors, the matching of strings of unequal length, which results when an OCR system generates insertion and deletion errors, is more complicated. Detecting loss of synchronization is made more difficult when consecutive errors occur. The length of the error burst must be determined or upper bounded before the error can be classified or synchronicity restored. This paper focuses on the tracking problem, and uses a dynamic programming search in n dimensions, where n is the number of OCR systems. The algorithm models the error generation process at each of the OCR systems and looks for the most probable combination of synchronized OCR outputs from the beginning to the end of all strings. The final output of the process (which can be the input to an integrator) is a series of n-tuples, each one containing exactly one output character (including nulls or deletions) from each OCR system.
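As a rough illustration of the n-dimensional dynamic programming search described in the abstract, the following Python sketch aligns the outputs of n OCR systems and returns a series of n-tuples. It is not the authors' implementation. The abstract does not give the error model, so the sketch assumes a simple stand-in cost: each column is charged the number of systems that disagree with its majority symbol, in place of the per-system error probabilities the paper models. The names `synchronize`, `column_cost`, and `NULL` are hypothetical.

```python
from collections import Counter
from itertools import product

NULL = ""  # assumed marker: this system emitted no character at this position


def column_cost(column):
    """Assumed cost: how many systems disagree with the column's majority symbol."""
    return len(column) - Counter(column).most_common(1)[0][1]


def synchronize(outputs):
    """Align n strings with an n-dimensional DP; return a list of n-tuples."""
    n = len(outputs)
    lengths = tuple(len(s) for s in outputs)
    # A move advances a non-empty subset of the strings by one character.
    moves = [m for m in product((0, 1), repeat=n) if any(m)]
    start = (0,) * n
    best = {start: (0, None, None)}  # cell -> (cost, previous cell, column)

    # product() visits cells in lexicographic order, so every predecessor
    # of a cell is finalized before the cell itself is expanded.
    for cell in product(*(range(length + 1) for length in lengths)):
        cost = best[cell][0]
        for move in moves:
            nxt = tuple(c + m for c, m in zip(cell, move))
            if any(x > length for x, length in zip(nxt, lengths)):
                continue
            column = tuple(
                s[c] if m else NULL for s, c, m in zip(outputs, cell, move)
            )
            candidate = cost + column_cost(column)
            if nxt not in best or candidate < best[nxt][0]:
                best[nxt] = (candidate, cell, column)

    # Trace back from the end of all strings to the beginning.
    tuples = []
    cell = lengths
    while cell != start:
        _, previous, column = best[cell]
        tuples.append(column)
        cell = previous
    tuples.reverse()
    return tuples


if __name__ == "__main__":
    for column in synchronize(["recognition", "recogniton", "recognitiion"]):
        print(column)
```

Each returned tuple holds one symbol per system, with `NULL` where a system contributed nothing, so joining the i-th entries of all tuples reproduces the i-th input string. The table has one cell per combination of string positions and each cell tries 2^n - 1 moves, so this exhaustive form is practical only for short strings and a small number of systems.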
© (1993) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vicente P. Concepcion, Donald P. D'Amato, "Synchronous tracking of outputs from multiple OCR systems", Proc. SPIE 1906, Character Recognition Technologies, (14 April 1993); doi: 10.1117/12.143623; https://doi.org/10.1117/12.143623
