28 March 2005 Parametric model-based clustering
Author Affiliations +
Abstract
Parametric, model-based algorithms learn generative models from the data, with each model corresponding to one particular cluster. Accordingly, the model-based partitional algorithm will select the most suitable model for any data object (Clustering step}, and will recompute parametric models using data specifically from the corresponding clusters {Maximization step). This Clustering-Maximization framework have been widely used and have shown promising results in many applications including complex variable-length data. The paper proposes (Experience-Innovation} (EI) method as a natural extension of the (Clustering-Maximization} framework. This method includes 3 components: (1) keep the best past experience and make empirical likelihood trajectory monotonical as a result; (2) find a new model as a function of existing models so that the corresponding cluster will split existing clusters with bigger number of elements and smaller uniformity; (3) heuristical innovations, for example, several trials with random initial settings. Also, we introduce clustering regularisation based on the balanced complex of two conditions: (1) significance of any particular cluster; (2) difference between any 2 clusters. We illustrate effectiveness of the proposed methods using first-order Markov model in application to the large web-traffic dataset. The aim of the experiment is to explain and understand the way people interact with web sites.
© (2005) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vladimir Nikulin, Vladimir Nikulin, Alex J. Smola, Alex J. Smola, } "Parametric model-based clustering", Proc. SPIE 5812, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005, (28 March 2005); doi: 10.1117/12.603199; https://doi.org/10.1117/12.603199
PROCEEDINGS
12 PAGES


SHARE
Back to Top