16 January 2006 Content-based text mapping using multi-dimensional projections for exploration of document collections
This paper presents a technique for generating maps of documents that places similar documents in the same neighborhood. Besides grouping (and separating) documents by their content, the technique runs at very manageable computational cost. Based on multi-dimensional projection techniques and an algorithm for projection improvement, it produces a surface map that allows the user to identify a number of important relationships between documents and sub-groups of documents through visualization and interaction. Visual attributes such as height, color, isolines, and glyphs, as well as aural attributes such as pitch, add dimensions for integrated visual analysis. A set of tools is provided for exploration and narrowing of focus. This novel text mapping technique, named IDMAP (Interactive Document Map), is fully described in this paper. Results are compared with those of dimensionality reduction and clustering techniques used for the same purpose. The maps are bound to support a large number of applications that rely on retrieval and examination of document collections, and to complement the type of information offered by current knowledge domain visualizations.
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Rosane Minghim, Fernando Vieira Paulovich, and Alneu de Andrade Lopes, "Content-based text mapping using multi-dimensional projections for exploration of document collections", Proc. SPIE 6060, Visualization and Data Analysis 2006, 60600S (16 January 2006); doi: 10.1117/12.650880; https://doi.org/10.1117/12.650880
