15 December 2003 Hierarchical logical structure extraction of book documents by analyzing tables of contents
Proceedings Volume 5296, Document Recognition and Retrieval XI; (2003); doi: 10.1117/12.528808
Event: Electronic Imaging 2004, 2004, San Jose, California, United States
Abstract
Extracting the logical structure of book documents is an important step in the automatic construction of electronic document databases. The tables of contents in a book play an important role in representing the overall logical structure and the reference information of the book. This paper proposes a new method that extracts the hierarchical logical structure of book documents, together with the reference information, by combining the spatial and semantic information of the tables of contents. Experimental results on a variety of book documents demonstrate the effectiveness and robustness of the proposed approach.
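The abstract does not give algorithmic details, so the following is only a minimal illustrative sketch of how spatial and semantic cues from a table of contents might be combined. It is not the authors' method. It assumes, hypothetically, that each OCR'd TOC line arrives as a `(text, x_left)` pair; the names `TocNode`, `split_page`, `semantic_level`, `cluster_indents`, and `build_toc` are invented for illustration. The semantic cue here is the numbering prefix (for example `1.2.3` or `Chapter 1`), the spatial cue is the left indentation, and the trailing page number stands in for the reference information.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class TocNode:
    title: str
    page: Optional[int]
    level: int
    children: List["TocNode"] = field(default_factory=list)

# Title followed by dot leaders or whitespace, then a trailing page number.
PAGE_RE = re.compile(r"^(?P<title>.*?)(?:\s*\.{2,}\s*|\s+)(?P<page>\d+)\s*$")

def split_page(text: str) -> Tuple[str, Optional[int]]:
    """Separate the entry title from its trailing page reference, if any."""
    m = PAGE_RE.match(text)
    if m:
        return m.group("title").strip(), int(m.group("page"))
    return text.strip(), None

def semantic_level(title: str) -> Optional[int]:
    """Semantic cue: infer depth from the numbering prefix (assumed conventions)."""
    if re.match(r"(?:chapter|part)\s+\w+", title, re.IGNORECASE):
        return 1
    m = re.match(r"(\d+(?:\.\d+)*)\.?\s", title)
    if m:
        return len(m.group(1).split("."))
    return None

def cluster_indents(xs: List[float], tol: float = 5.0) -> dict:
    """Spatial cue: map each left indent to a 1-based level; gaps > tol start a new level."""
    levels, rank, prev = {}, 0, None
    for x in sorted(set(xs)):
        if prev is not None and x - prev > tol:
            rank += 1
        levels[x] = rank + 1
        prev = x
    return levels

def build_toc(lines: List[Tuple[str, float]]) -> TocNode:
    """Build a tree, preferring the semantic level and falling back to indentation."""
    spatial = cluster_indents([x for _, x in lines])
    root = TocNode("ROOT", None, 0)
    stack = [root]
    for text, x in lines:
        title, page = split_page(text)
        level = semantic_level(title) or spatial[x]
        node = TocNode(title, page, level)
        while stack[-1].level >= level:  # climb back up to the parent level
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root

# Hypothetical sample input: (OCR text, left x-coordinate in pixels).
sample = [
    ("Chapter 1 Introduction ..... 1", 50),
    ("1.1 Background ..... 3", 70),
    ("1.1.1 History 4", 90),
    ("1.2 Scope ..... 7", 70),
    ("Chapter 2 Methods ..... 10", 50),
    ("Index 120", 51),
]
root = build_toc(sample)
```

In this sketch the unnumbered "Index" entry has no semantic cue, so its level comes from its indentation alone. A real system would also need to handle OCR errors, multi-line entries, and layouts where the two cues disagree.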
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Feng He, Xiaoqing Ding, Liangrui Peng, "Hierarchical logical structure extraction of book documents by analyzing tables of contents", Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.528808; https://doi.org/10.1117/12.528808
PROCEEDINGS
8 PAGES


KEYWORDS
Optical character recognition
Analytical research
Image processing
Databases
Intelligence systems
Statistical analysis
Digital libraries
