15 December 2003 Automatic document navigation for digital content remastering
Abstract
This paper presents a novel method for automatically adding navigation capabilities to re-mastered electronic books. We first analyze the need for a generic and robust system that automatically constructs navigation links in re-mastered books. We then introduce the core algorithm, which builds the links by text matching. The proposed method uses a tree-structured dictionary and a directional graph of the table of contents to conduct the text matching efficiently. Information fusion further increases the robustness of the algorithm. We discuss experimental results on the MIT Press digital library project and illustrate the key functional features of the system. We also investigate how the quality of the OCR engine affects the linking algorithm. In addition, we point out the analogy between this work and Web link mining.
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xiaofan Lin, Steven J. Simske, "Automatic document navigation for digital content remastering", Proc. SPIE 5296, Document Recognition and Retrieval XI, (15 December 2003); doi: 10.1117/12.521991; https://doi.org/10.1117/12.521991
PROCEEDINGS
8 PAGES
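
The abstract mentions text matching driven by a tree-structured dictionary built from the table of contents. The paper's actual algorithm is not reproduced on this page, so the following is only a minimal sketch of the general idea, assuming a word-level trie over table-of-contents titles and longest-match scanning of each page's text. The function names (`normalize`, `build_trie`, `find_titles`, `link_toc`) and the sample data are hypothetical, and the directional graph and information fusion steps are not modeled.

```python
import re


def normalize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_trie(titles):
    """Build a word-level trie; the '$' key of a node stores a complete title.

    '$' can never collide with a word because normalize() only yields
    alphanumeric tokens.
    """
    root = {}
    for title in titles:
        node = root
        for word in normalize(title):
            node = node.setdefault(word, {})
        node["$"] = title
    return root


def find_titles(trie, page_text):
    """Return the titles found in page_text, taking the longest match at each start."""
    words = normalize(page_text)
    found = []
    i = 0
    while i < len(words):
        node, best, best_end = trie, None, i
        j = i
        # Walk the trie as long as the next page word continues some title.
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if "$" in node:
                best, best_end = node["$"], j
        if best is not None:
            found.append(best)
            i = best_end  # skip past the matched title
        else:
            i += 1
    return found


def link_toc(titles, pages):
    """Map each title to the page numbers whose text contains it (assumed data layout)."""
    trie = build_trie(titles)
    links = {}
    for page_number, text in pages.items():
        for title in find_titles(trie, text):
            links.setdefault(title, []).append(page_number)
    return links


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the paper.
    titles = ["Introduction", "Text Matching", "Text Matching Results"]
    pages = {
        1: "Contents Introduction 1 Text Matching 5",
        3: "1 Introduction\nThis book ...",
        7: "2 Text Matching Results are ...",
    }
    print(link_toc(titles, pages))
```

In this sketch a title can match several pages, including the contents page itself, so a real system would need a further step to pick one target per entry; the abstract attributes that kind of robustness to its graph structure and information fusion.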

