1 August 1990 Segmentation of document images
Document scanning is now an accepted part of office procedure, allowing the incorporation of digitized images into new documents and the conversion of scanned print into ASCII by optical character recognition (OCR). Document pages often contain more than one form of information: textual, graphical and/or pictorial. Segmentation of document images into these three categories is feasible with the aid of image processing. Projections of the thresholded document images, in conjunction with autocorrelation, are used to check text alignment. The edge-shifting properties of the rank filter are then used to coalesce image regions containing text into solid, near-rectangular blocks. Pyramidal reduction is combined with the filtering to ease the computational burden. Horizontal and vertical projections are used to segment whole pages recursively into homogeneous blocks whose properties are then analysed. Applications foreseen for the image segmentation include modified facsimile systems, achievement of artifact-free OCR and conversion of document images into files with separate formats for text, graphics and pictures.
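The recursive use of horizontal and vertical projections described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the page is already thresholded to a binary image (1 = ink), splits on the widest blank run in a projection profile, and omits the autocorrelation check, the rank filtering and the pyramidal reduction. The names `profile`, `trim`, `widest_gap` and `xy_cut`, and the `min_gap` parameter, are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]            # binary image, 1 = ink, 0 = background
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), bottom/right exclusive


def profile(img: Image, box: Box, axis: int) -> List[int]:
    """Projection profile inside box: ink per row (axis=0) or per column (axis=1)."""
    top, left, bottom, right = box
    if axis == 0:
        return [sum(img[r][left:right]) for r in range(top, bottom)]
    return [sum(img[r][c] for r in range(top, bottom)) for c in range(left, right)]


def trim(img: Image, box: Box) -> Optional[Box]:
    """Shrink box to the tightest rectangle containing ink; None if it is blank."""
    top, left, _, _ = box
    ink_rows = [i for i, v in enumerate(profile(img, box, 0)) if v]
    ink_cols = [i for i, v in enumerate(profile(img, box, 1)) if v]
    if not ink_rows:
        return None
    return (top + ink_rows[0], left + ink_cols[0],
            top + ink_rows[-1] + 1, left + ink_cols[-1] + 1)


def widest_gap(prof: List[int], min_gap: int) -> Optional[Tuple[int, int]]:
    """Longest run of zeros with length >= min_gap, as (start, end), else None."""
    best, start = None, None
    for i, v in enumerate(prof + [1]):          # sentinel closes a trailing run
        if v == 0 and start is None:
            start = i
        elif v != 0 and start is not None:
            if i - start >= min_gap and (best is None or i - start > best[1] - best[0]):
                best = (start, i)
            start = None
    return best


def xy_cut(img: Image, box: Optional[Box] = None, min_gap: int = 2) -> List[Box]:
    """Recursively split a page into blocks separated by blank rows or columns."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    box = trim(img, box)
    if box is None:
        return []
    top, left, bottom, right = box
    for axis in (0, 1):                          # try a horizontal cut, then a vertical one
        gap = widest_gap(profile(img, box, axis), min_gap)
        if gap:
            a, b = gap
            if axis == 0:
                first, second = (top, left, top + a, right), (top + b, left, bottom, right)
            else:
                first, second = (top, left, bottom, left + a), (top, left + b, bottom, right)
            return xy_cut(img, first, min_gap) + xy_cut(img, second, min_gap)
    return [box]                                 # no usable gap: treat as homogeneous


# Toy 10 x 12 page: one block on top, two side-by-side blocks below it.
page = [[0] * 12 for _ in range(10)]
for r0, r1, c0, c1 in [(1, 4, 1, 5), (6, 9, 1, 5), (6, 9, 8, 11)]:
    for r in range(r0, r1):
        for c in range(c0, c1):
            page[r][c] = 1

blocks = xy_cut(page)
print(blocks)
```

On the toy page, the first cut falls in the blank rows between the upper block and the lower pair, and the second cut falls in the blank columns between the two lower blocks. In a fuller pipeline, each resulting block would then be analysed to decide whether it holds text, graphics or a picture.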
© (1990) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Philip J. Bones, Todd C. Griffin, and Chris M. Carey-Smith, "Segmentation of document images", Proc. SPIE 1258, Image Communications and Workstations, (1 August 1990)
