30 March 1995 Character segmentation using visual interword constraints in a text page
Abstract
Character segmentation is a critical preprocessing step for text recognition. This paper presents a method that uses visual inter-word constraints available in a text image to split word images into smaller image pieces. The method is applicable to machine-printed text in which the same spacing is always used between identical pairs of characters. The visual inter-word constraints considered here include information about whether one word image is a sub-image of another. For example, consider two word images A and B, which are `mathematical' and `the.' If the short word image B is found to be a sub-image of the long word image A, then A is split into three pieces, A1, A2, and A3, where A2 matches B, A1 corresponds to `ma,' and A3 corresponds to `matical.' The image piece A1 can then be used to split A3 into two parts, `ma' and `tical.' The method is based purely on image processing using the visual context in a text page; no recognition is involved.
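The splitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary word images of equal height stored as lists of pixel rows, and it uses exact pixel equality where a practical system would presumably need a noise-tolerant match. The names `find_subimage`, `split_by_subimage`, and the toy `render` font (one 5-pixel code column plus one blank column per lowercase letter) are hypothetical and exist only to make the example self-contained.

```python
from typing import List, Optional

Image = List[List[int]]  # rows of 0/1 pixels; every row has the same length


def width(img: Image) -> int:
    return len(img[0]) if img else 0


def crop(img: Image, start: int, end: int) -> Image:
    """Return the columns start..end-1 of img."""
    return [row[start:end] for row in img]


def find_subimage(a: Image, b: Image) -> Optional[int]:
    """Leftmost column offset at which b occurs exactly in a, or None."""
    wb = width(b)
    if len(a) != len(b) or wb == 0 or wb > width(a):
        return None
    for x in range(width(a) - wb + 1):
        if all(ra[x:x + wb] == rb for ra, rb in zip(a, b)):
            return x
    return None


def split_by_subimage(a: Image, b: Image) -> Optional[List[Image]]:
    """Split a around the first occurrence of b; empty pieces are dropped."""
    x = find_subimage(a, b)
    if x is None:
        return None
    wb = width(b)
    pieces = [crop(a, 0, x), crop(a, x, x + wb), crop(a, x + wb, width(a))]
    return [p for p in pieces if width(p) > 0]


def render(word: str) -> Image:
    """Toy fixed-pitch 'font' (an assumption for this demo): each lowercase
    letter becomes one column holding its 5-bit code, then a blank column."""
    rows: Image = [[] for _ in range(5)]
    for ch in word:
        code = ord(ch) - ord("a") + 1
        for r in range(5):
            rows[r].append((code >> r) & 1)
            rows[r].append(0)
    return rows


if __name__ == "__main__":
    a = render("mathematical")
    b = render("the")
    a1, a2, a3 = split_by_subimage(a, b)      # pieces for 'ma', 'the', 'matical'
    print([width(p) for p in (a1, a2, a3)])
    print(len(split_by_subimage(a3, a1)))     # A1 splits A3 further
```

Because identical character pairs are assumed to be rendered with identical spacing, the pixel columns of a short word reappear unchanged inside a longer word that contains it, which is what makes a pure image comparison sufficient here.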
© (1995) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tao Hong, Jonathan J. Hull, "Character segmentation using visual interword constraints in a text page", Proc. SPIE 2422, Document Recognition II, (30 March 1995); doi: 10.1117/12.205820; https://doi.org/10.1117/12.205820
PROCEEDINGS
11 PAGES

