23 March 1994 Degraded text recognition using word collocation
Abstract
A relaxation-based algorithm is proposed that improves the performance of a text recognition technique by propagating the influence of word collocation statistics. Word collocation refers to the likelihood that two words co-occur within a fixed distance of one another. For example, in a story about water transportation, it is highly likely that the word 'river' will occur within ten words on either side of the word 'boat.' The proposed algorithm receives groups of visually similar decisions (called neighborhoods) for words in a running text; these neighborhoods are computed by a word recognition algorithm. The positions of decisions within the neighborhoods are modified based on how often they co-occur with decisions in the neighborhoods of other nearby words. This process is iterated a number of times, effectively propagating the influence of the collocation statistics across an input text. This improves on a strictly local analysis by allowing strong collocations to reinforce weak (but related) collocations elsewhere. An experimental analysis is discussed in which the algorithm is applied to improve text recognition results that are less than 60% correct. The correct rate is effectively improved to 90% or better in all cases.
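The re-ranking loop described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so the multiplicative update below is a generic relaxation-labeling step chosen for illustration, not the authors' formula. The names `relax`, `colloc`, `window`, and `iterations`, the score values, and the toy collocation table are all hypothetical.

```python
from typing import Dict, FrozenSet, List

# A neighborhood maps each candidate word to its current score (assumed format).
Neighborhood = Dict[str, float]


def _normalize(hood: Neighborhood) -> Neighborhood:
    """Scale scores to sum to 1; fall back to uniform if all scores are zero."""
    total = sum(hood.values())
    if total == 0:
        return {word: 1.0 / len(hood) for word in hood}
    return {word: score / total for word, score in hood.items()}


def relax(
    neighborhoods: List[Neighborhood],
    colloc: Dict[FrozenSet[str], float],
    window: int = 10,
    iterations: int = 5,
) -> List[List[str]]:
    """Re-rank candidates by collocation support from nearby neighborhoods.

    Assumed update (illustrative only): score * (1 + support), where support
    sums score-weighted collocation strengths over candidates at positions
    within `window` words on either side.
    """
    current = [_normalize(hood) for hood in neighborhoods]
    for _ in range(iterations):
        updated = []
        for i, hood in enumerate(current):
            lo, hi = max(0, i - window), min(len(current), i + window + 1)
            new_hood = {}
            for word, score in hood.items():
                support = 0.0
                for j in range(lo, hi):
                    if j == i:
                        continue
                    for other, other_score in current[j].items():
                        pair = frozenset((word, other))
                        support += other_score * colloc.get(pair, 0.0)
                new_hood[word] = score * (1.0 + support)
            updated.append(_normalize(new_hood))
        # All positions are updated from the previous iteration's scores.
        current = updated
    # Return each neighborhood as a list of words, best candidate first.
    return [sorted(hood, key=hood.get, reverse=True) for hood in current]


# Toy input: the recognizer initially prefers 'boot' over 'boat'.
example = [{"boat": 0.4, "boot": 0.6}, {"river": 0.5, "rivet": 0.5}]
toy_colloc = {frozenset(("boat", "river")): 1.0}
print(relax(example, toy_colloc))
```

In this toy input, 'boat' starts below 'boot', but its collocation with 'river' raises both words over successive iterations until each heads its neighborhood. This is the reinforcement effect the abstract describes, shown here only under the assumed update rule.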
© (1994) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tao Hong, Jonathan J. Hull, "Degraded text recognition using word collocation", Proc. SPIE 2181, Document Recognition, (23 March 1994); doi: 10.1117/12.171121; https://doi.org/10.1117/12.171121
PROCEEDINGS
8 PAGES

