15 December 2003 Hierarchical logical structure extraction of book documents by analyzing tables of contents
Abstract
Extracting the logical structure of book documents is an important step in the automatic construction of electronic document databases. A book's table of contents plays a central role in representing both the overall logical structure of the book and its reference information. In this paper, a new method is proposed that extracts the hierarchical logical structure of book documents, together with the reference information, by combining the spatial and semantic information in the book's tables of contents. Experimental results obtained on a variety of book documents demonstrate the effectiveness and robustness of the proposed approach.
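To make the idea of combining spatial and semantic cues concrete, the following is a minimal illustrative sketch and not the authors' method, which the abstract does not describe in detail. It assumes already-recognized plain-text table-of-contents lines, treats section numbering such as "1.1" as the semantic cue and leading indentation as the spatial cue, and reads a trailing integer as the page reference. The names `TocEntry`, `parse_line`, `build_tree`, and the `indent_unit` parameter are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TocEntry:
    title: str
    page: Optional[int]   # reference information (page number), if found
    level: int            # depth in the hierarchy, 1 = top level
    children: List["TocEntry"] = field(default_factory=list)


def parse_line(line: str, indent_unit: int = 2) -> TocEntry:
    """Parse one TOC line (assumed shape: indent, optional numbering, title, leaders, page)."""
    stripped = line.rstrip()
    indent = len(stripped) - len(stripped.lstrip(" "))
    body = stripped.strip()

    # Reference information: a trailing integer preceded by spaces or dot leaders.
    page = None
    m = re.search(r"[\s.]+(\d+)$", body)
    if m:
        page = int(m.group(1))
        body = body[: m.start()]

    # Semantic cue: "1.2.3"-style numbering gives the level directly.
    n = re.match(r"(\d+(?:\.\d+)*)\.?\s+", body)
    if n:
        level = n.group(1).count(".") + 1
        title = body[n.end():]
    else:
        # Spatial cue: fall back to indentation (assumed indent_unit spaces per level).
        level = indent // indent_unit + 1
        title = body
    return TocEntry(title.strip(), page, level)


def build_tree(lines: List[str]) -> TocEntry:
    """Attach each entry to the nearest preceding entry with a smaller level."""
    root = TocEntry("ROOT", None, 0)
    stack = [root]
    for line in lines:
        if not line.strip():
            continue
        entry = parse_line(line)
        while stack[-1].level >= entry.level:
            stack.pop()
        stack[-1].children.append(entry)
        stack.append(entry)
    return root
```

With input lines such as "1 Introduction ..... 1" followed by "1.1 Background .... 3", `build_tree` nests "Background" under "Introduction" and keeps the page numbers as reference information. A real system working from scanned pages would derive the spatial cue from layout analysis rather than from leading spaces.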
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Feng He, Xiaoqing Ding, Liangrui Peng, "Hierarchical logical structure extraction of book documents by analyzing tables of contents", Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.528808; https://doi.org/10.1117/12.528808
PROCEEDINGS
8 PAGES

