21 December 2000 Page segmentation and text extraction from gray-scale images in microfilm format
Proceedings Volume 4307, Document Recognition and Retrieval VIII; (2000)
Event: Photonics West 2001 - Electronic Imaging, 2001, San Jose, CA, United States
This paper describes a system designed to separate textual regions from graphics regions and to locate text against textured backgrounds. We present an edge-detection-based method that automatically locates text in noisy gray-scale newspaper images in microfilm format. The algorithm first finds the edges of textual regions using the Canny edge detector. It then merges edges and uses edge features to perform block segmentation and classification. Finally, feature-aided connected component analysis groups homogeneous textual regions together within the scope of their bounding boxes. Introducing the TLC yields an efficient block segmentation with reduced memory size. The proposed method has been used to locate text in a group of newspaper images with multiple page layouts. Initial results are encouraging. We plan to expand the experimental data to over 300 microfilm images with different layout structures, and we anticipate promising results once the prototype algorithm is modified to make it more robust and suitable for different cases.
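The grouping stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary edge map has already been produced (for example by a Canny detector), and it omits the edge-feature classification and the TLC entirely. The function names `connected_components` and `merge_boxes`, and the `max_gap` parameter, are hypothetical and chosen only for illustration.

```python
from collections import deque


def connected_components(edge_map):
    """Find 8-connected groups of edge pixels.

    Returns one bounding box (top, left, bottom, right) per group,
    in raster-scan order of each group's first pixel.
    """
    rows, cols = len(edge_map), len(edge_map[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not edge_map[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited edge pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and edge_map[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_boxes(boxes, max_gap=2):
    """Merge boxes that share rows and lie within max_gap empty columns.

    max_gap is a hypothetical tuning parameter, not a value from the paper.
    """
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                rows_overlap = a[0] <= b[2] and b[0] <= a[2]
                # Number of empty columns between the two boxes
                # (negative when they overlap horizontally).
                gap = max(a[1], b[1]) - min(a[3], b[3]) - 1
                if rows_overlap and gap <= max_gap:
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


# Toy edge map: two nearby "characters" and one distant blob.
edge_map = [
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1],
]
components = connected_components(edge_map)
blocks = merge_boxes(components, max_gap=2)
```

In this toy input, the two left-hand components are one column apart and merge into a single block, while the right-hand component stays separate. A real system would also need a classification step to decide which merged blocks are text.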
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Qing Yuan and Chew Lim Tan, "Page segmentation and text extraction from gray-scale images in microfilm format," Proc. SPIE 4307, Document Recognition and Retrieval VIII (21 December 2000).
