21 December 2000 Text block segmentation using pyramid structure
Proceedings Volume 4307, Document Recognition and Retrieval VIII; (2000) https://doi.org/10.1117/12.410849
Event: Photonics West 2001 - Electronic Imaging, 2001, San Jose, CA, United States
Abstract
Text block segmentation is a necessary step in document layout analysis. This paper describes an algorithm, and its implementation, that uses a pyramid structure to segment a document image, e.g. a newspaper page, into text blocks, where a block is either a title or a paragraph. The pyramid structure is a multi-resolution image representation that is amenable to parallel processing. It also simulates how the human eye sees a document from afar, where the block structure of the document becomes visible. Using this structure, the block segmentation can identify titles and distinguish different paragraphs based on the indentation between them. Our implementation will be used in a news article retrieval project.
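The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of a multi-resolution pyramid, not the authors' method. It assumes a binary image stored as a list of rows (1 = ink, 0 = background), a 2x2 OR reduction between levels, and 4-connected component labeling to find blocks. The function names `reduce_level`, `build_pyramid`, and `find_blocks` are hypothetical and chosen for illustration.

```python
from collections import deque

def reduce_level(img):
    """Halve the resolution: a coarse pixel is ink if any pixel in its 2x2 cell is ink.
    (Assumed reduction rule; the paper may use a different one.)"""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h, 2):
        row = []
        for x in range(0, w, 2):
            cell = [img[yy][xx]
                    for yy in range(y, min(y + 2, h))
                    for xx in range(x, min(x + 2, w))]
            row.append(1 if any(cell) else 0)
        out.append(row)
    return out

def build_pyramid(img, levels):
    """Return [level 0 (full resolution), level 1, ..., level `levels`]."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid

def find_blocks(img):
    """Bounding boxes (top, left, bottom, right), inclusive, of 4-connected ink regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# Toy 16x8 "page": two paragraphs, each made of two one-pixel text lines.
page = [[1] * 8 if y in (0, 2, 8, 10) else [0] * 8 for y in range(16)]
pyramid = build_pyramid(page, 1)
lines_at_full_res = find_blocks(pyramid[0])   # four separate text lines
blocks_at_level_1 = find_blocks(pyramid[1])   # lines merge into two blocks
```

In this toy page the four text lines are separate regions at full resolution, while at the first coarser level the lines within each paragraph merge and only the wider gap between paragraphs survives. Multiplying the coarse-level box coordinates by 2 maps them back to approximate full-resolution positions.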
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Chew Lim Tan and Zheng Zhang "Text block segmentation using pyramid structure", Proc. SPIE 4307, Document Recognition and Retrieval VIII, (21 December 2000); https://doi.org/10.1117/12.410849
PROCEEDINGS
10 PAGES

