6 April 2000 Toward text understanding: classification of text documents by word map
Abstract
In many fields, for example business, engineering, and law, there is interest in searching and classifying text documents in large databases. Existing information retrieval methods are mainly based on keywords, so retrieval is problematic when keywords are lacking. One approach is to use the whole text document as a search key, and neural networks offer an adaptive tool for this purpose. This paper suggests a new adaptive approach to the problem of clustering and search in large text document databases. The approach is multilevel, based on word-, sentence-, and paragraph-level maps; only the word-map level is reported here. The reported approach is based on smart encoding, Self-Organizing Maps, and document histograms. The results are very promising.
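The pipeline named in the abstract (word encoding, a Self-Organizing Map over word vectors, and a per-document histogram of map units) can be illustrated with a minimal sketch. The abstract does not specify the encoding, the map topology, or the histogram comparison, so everything below is an assumption made for illustration: `encode_word` is a hypothetical stand-in for the "smart encoding", the map is one-dimensional for brevity, and Euclidean distance between histograms is an arbitrary choice. It is not the authors' implementation.

```python
import math
import random


def encode_word(word, dim=8):
    """Hypothetical stand-in for the 'smart encoding' (not the paper's scheme):
    fold character codes into a fixed-length vector and normalize it."""
    vec = [0.0] * dim
    for i, ch in enumerate(word.lower()):
        vec[i % dim] += ord(ch) / 128.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def best_matching_unit(som, vec):
    """Index of the map unit whose weight vector is closest to vec."""
    return min(
        range(len(som)),
        key=lambda k: sum((w - x) ** 2 for w, x in zip(som[k], vec)),
    )


def train_som(vectors, n_units=6, epochs=30, seed=0):
    """Train a one-dimensional SOM with a decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    som = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac + 0.01
        radius = max(1.0, (n_units / 2) * frac)
        for vec in vectors:
            bmu = best_matching_unit(som, vec)
            for k in range(n_units):
                # Gaussian neighbourhood around the best-matching unit.
                h = math.exp(-((k - bmu) ** 2) / (2 * radius ** 2))
                som[k] = [w + lr * h * (x - w) for w, x in zip(som[k], vec)]
    return som


def document_histogram(som, text):
    """Normalized histogram of best-matching units over a document's words."""
    counts = [0] * len(som)
    words = text.split()
    for word in words:
        counts[best_matching_unit(som, encode_word(word))] += 1
    total = len(words) or 1
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """Euclidean distance between two document histograms (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


# Toy documents, invented for this sketch.
docs = [
    "the contract binds both parties under law",
    "the court ruled on the contract dispute",
    "the engine torque exceeds the design limit",
]
vocab = sorted({w for d in docs for w in d.split()})
som = train_som([encode_word(w) for w in vocab])
hists = [document_histogram(som, d) for d in docs]
```

In this sketch a whole document acts as the search key: a query text is turned into a histogram with `document_histogram` and compared against stored histograms with `histogram_distance`, so no keywords are needed.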
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ari J. E. Visa, Jarmo Toivanen, Barbro Back, Hannu Vanharanta, "Toward text understanding: classification of text documents by word map", Proc. SPIE 4057, Data Mining and Knowledge Discovery: Theory, Tools, and Technology II, (6 April 2000); doi: 10.1117/12.381745; https://doi.org/10.1117/12.381745
PROCEEDINGS
7 PAGES


