6 April 2000 Toward text understanding: classification of text documents by word map
Abstract
In many fields, for example business, engineering, and law, there is interest in searching and classifying text documents in large databases. Existing information retrieval methods are mainly based on keywords, so retrieval is problematic when keywords are lacking. One approach is to use the whole text document as a search key, and neural networks offer an adaptive tool for this purpose. This paper suggests a new adaptive approach to the problem of clustering and search in large text document databases. The approach is multilevel, based on word-, sentence-, and paragraph-level maps; only the word map level is reported here. The reported approach is based on smart encoding, Self-Organizing Maps, and document histograms. The results are very promising.
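The word map level described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding, the map size, or the training schedule, so everything below is an assumption made for illustration: `encode_word` is a hypothetical stand-in for the paper's "smart encoding", the map is a small one-dimensional Self-Organizing Map, and the sample documents are invented. The sketch only shows the general idea of mapping each word to its best-matching map unit and describing a document by the histogram of the units its words hit.

```python
import random
from collections import Counter

def encode_word(word, dim=8):
    """Hypothetical encoding (not the paper's): fold character codes into a fixed-length vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(word.lower()):
        vec[i % dim] += ord(ch) / 255.0
    return vec

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_unit(weights, vec):
    """Index of the map unit whose weight vector is closest to vec."""
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k], vec))

def train_word_map(vectors, n_units=6, epochs=30, seed=0):
    """Train a 1-D Self-Organizing Map with a shrinking learning rate and neighbourhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac
        radius = max(1, round(n_units / 2 * frac))
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass over the words
            bmu = best_unit(weights, vec)
            lo, hi = max(0, bmu - radius), min(n_units, bmu + radius + 1)
            for k in range(lo, hi):
                # Units farther from the winner move less.
                influence = lr / (1 + abs(k - bmu))
                weights[k] = [w + influence * (x - w) for w, x in zip(weights[k], vec)]
    return weights

def document_histogram(weights, text):
    """Normalised histogram of the map units hit by the words of a document."""
    words = text.split()
    if not words:
        return [0.0] * len(weights)
    counts = Counter(best_unit(weights, encode_word(w)) for w in words)
    return [counts.get(k, 0) / len(words) for k in range(len(weights))]

def histogram_distance(h1, h2):
    """L1 distance between two document histograms (smaller means more similar)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Invented sample documents, for illustration only.
docs = [
    "the contract defines liability and payment terms",
    "payment terms and liability are defined in the contract",
    "the turbine blade is machined from a titanium alloy",
]
vocab = sorted({w for d in docs for w in d.split()})
word_map = train_word_map([encode_word(w) for w in vocab])
hists = [document_histogram(word_map, d) for d in docs]
```

With histograms in hand, a whole document can act as the search key: compute its histogram and rank the stored documents by `histogram_distance`. How well this separates documents depends heavily on the encoding, which is why the stand-in above should not be read as the authors' method.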
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ari J. E. Visa, Jarmo Toivanen, Barbro Back, and Hannu Vanharanta "Toward text understanding: classification of text documents by word map", Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); https://doi.org/10.1117/12.381745
PROCEEDINGS
7 PAGES

