31 January 2017 Unified layout analysis and text localization framework
Abstract
A technique for extracting textual information from documents with complex layouts, such as newspapers and journals, is presented. It combines a foreground analysis with a text localization method. The former segments the page into text and nontext blocks, whereas the latter detects text that may be embedded inside images, charts, diagrams, tables, etc. Detailed experiments on two public databases showed that mixing layout analysis and text localization techniques can lead to improved page segmentation and text extraction results.
© 2017 SPIE and IS&T
Nikos Vasilopoulos, Ergina Kavallieratou, "Unified layout analysis and text localization framework," Journal of Electronic Imaging 26(1), 013009 (31 January 2017). https://doi.org/10.1117/1.JEI.26.1.013009
Received: 22 September 2016; Accepted: 10 January 2017; Published: 31 January 2017
JOURNAL ARTICLE
11 PAGES

