25 November 2008 The Rocchio classifier and second generation wavelets
Classification and characterization of text is of ever-growing importance in defense and national security. Text classification is an instance of classification with sparse features in a high-dimensional feature space. Of the many algorithms available for this task, two standard ones are the naive Bayes classifier and the Rocchio linear classifier. Naive Bayes classifiers are widely applied; the Rocchio algorithm is used primarily in document classification and information retrieval. Both classifiers are popular because of their simplicity, ease of application, computational speed, and reasonable performance. One aspect of the Rocchio approach, inherited from its information retrieval origin, is that it explicitly uses both positive and negative models. Parameters have been introduced that make it adaptive to the particulars of the corpora of interest and thereby improve its performance. The ideas inherent in these classifiers and in second generation wavelets can be recombined into new algorithms for classification. An example is a classifier that uses second generation wavelet-like functions as class probes that mimic the Rocchio positive template/negative template approach.
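For readers unfamiliar with the Rocchio approach, the positive model/negative model idea can be illustrated with a minimal sketch. This is not the paper's implementation: the whitespace tokenizer, the L2-normalized term-frequency weighting, the toy corpus, and the weights `beta` and `gamma` (standing in for the adaptive parameters mentioned above) are all assumptions chosen for illustration. Each class prototype is the weighted centroid of its positive examples minus the weighted centroid of its negative examples, and a document is assigned to the class whose prototype it matches best.

```python
import math
from collections import Counter, defaultdict


def tf_vector(text):
    """Sparse, L2-normalized term-frequency vector (assumed weighting)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def centroid(vectors):
    """Mean of a list of sparse vectors stored as dicts."""
    total = defaultdict(float)
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    n = len(vectors)
    return {t: w / n for t, w in total.items()} if n else {}


def rocchio_prototype(positives, negatives, beta=16.0, gamma=4.0):
    """Positive template minus negative template.

    beta and gamma are illustrative values, not taken from the paper.
    """
    pos_c = centroid(positives)
    neg_c = centroid(negatives)
    terms = set(pos_c) | set(neg_c)
    return {t: beta * pos_c.get(t, 0.0) - gamma * neg_c.get(t, 0.0)
            for t in terms}


def score(prototype, vec):
    """Sparse dot product between a class prototype and a document."""
    return sum(w * prototype.get(t, 0.0) for t, w in vec.items())


def classify(prototypes, text):
    """Return the label whose prototype scores highest for the text."""
    vec = tf_vector(text)
    return max(prototypes, key=lambda label: score(prototypes[label], vec))


# Toy corpus (hypothetical data, for illustration only).
train = {
    "sports": ["the team won the match", "a late goal won the game"],
    "finance": ["the bank raised interest rates",
                "stock markets fell as rates rose"],
}
vectors = {label: [tf_vector(d) for d in docs] for label, docs in train.items()}

prototypes = {}
for label in vectors:
    # Negative examples for a class are all documents from the other classes.
    negatives = [v for other, vs in vectors.items() if other != label for v in vs]
    prototypes[label] = rocchio_prototype(vectors[label], negatives)

print(classify(prototypes, "the goal decided the match"))
print(classify(prototypes, "interest rates rose"))
```

Setting `gamma` to zero reduces the sketch to a plain centroid classifier, which makes the contribution of the negative model easy to isolate when experimenting.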
© (2008) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Patricia H. Carter, "The Rocchio classifier and second generation wavelets," Proc. SPIE 6576, Independent Component Analyses, Wavelets, Unsupervised Nano-Biomimetic Sensors, and Neural Networks V, 65760F (25 November 2008).
