21 December 2000 Document text segmentation using multiband disc model
This paper proposes a multi-band disc model for document page segmentation that separates text blocks from graphic images. We first introduce the idea of the disc model and then discuss the improved multi-band version. The disc model takes a bottom-up segmentation approach: it establishes a local neighborhood for each object on a page and then traces the propagation of these neighborhoods until all objects in text blocks are reached. The significance of the disc model is the link it establishes between the sizes of the objects and their positional, and thus logical, relationships. Furthermore, the disc model is rotationally symmetric. It can therefore be applied to text with mixed typefaces and arbitrary outline shapes, and it is tolerant of skew or misalignment of the objects in the input images.
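The abstract gives no formulas, so the following is only an illustrative sketch of the general idea of size-dependent discs and neighborhood propagation, not the authors' algorithm. It assumes each page object (for example, a connected component) is reduced to a center and a single size value, that an object's disc radius is a hypothetical `scale` factor times its size, and that two objects are neighbors when their discs overlap. The function name `disc_groups`, the `scale` parameter, and the overlap rule are all assumptions made for illustration; the multi-band refinement is not modeled.

```python
from math import hypot


def disc_groups(objects, scale=1.5):
    """Group objects whose size-dependent discs overlap, directly or in a chain.

    objects: list of (cx, cy, size) tuples (center and a scalar size).
    scale:   hypothetical factor turning an object's size into a disc radius.
    Returns a list of groups, each a sorted list of object indices.
    """
    n = len(objects)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        # Start a new group and propagate through overlapping discs.
        seen[start] = True
        stack = [start]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            xi, yi, si = objects[i]
            for j in range(n):
                if seen[j]:
                    continue
                xj, yj, sj = objects[j]
                # Assumed neighbor rule: discs of radius scale*size overlap.
                # Only the center distance is used, so the rule does not
                # depend on page orientation.
                if hypot(xi - xj, yi - yj) <= scale * (si + sj):
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Three closely spaced small objects and one distant object.
page = [(0, 0, 5), (12, 0, 5), (24, 0, 5), (200, 200, 5)]
print(disc_groups(page))
```

Under these assumptions, the first and third objects end up in the same group only through the middle one, which is the propagation the abstract describes. Because the rule uses nothing but center distances and sizes, rotating the page leaves the grouping unchanged, and larger objects tolerate proportionally larger gaps than smaller ones.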
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Chew Lim Tan and Bo Yuan, "Document text segmentation using multiband disc model," Proc. SPIE 4307, Document Recognition and Retrieval VIII (21 December 2000).
