Translator Disclaimer
28 March 2005 Utilizing unsupervised learning to cluster data in the Bayesian data reduction algorithm
Author Affiliations +
In this paper, unsupervised learning is utilized to illustrate the ability of the Bayesian Data Reduction Algorithm (BDRA) to cluster unlabeled training data. The BDRA is based on the assumption that the discrete symbol probabilities of each class are a priori uniformly Dirichlet distributed, and it employs a "greedy" approach (similar to a backward sequential feature search) for reducing irrelevant features from the training data of each class. Notice that reducing irrelevant features is synonymous here with selecting those features that provide best classification performance; the metric for making data reducing decisions is an analytic formula for the probability of error conditioned on the training data. The contribution of this work is to demonstrate how clustering performance varies depending on the method utilized for unsupervised training. To illustrate performance, results are demonstrated using simulated data. In general, the results of this work have implications for finding clusters in data mining applications.
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Robert S. Lynch Jr. and Peter K. Willett "Utilizing unsupervised learning to cluster data in the Bayesian data reduction algorithm", Proc. SPIE 5812, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005, (28 March 2005);


Back to Top