18 April 2006 Efficient mining of strongly correlated item pairs
Past attempts to mine transactional databases for strongly correlated item pairs have been beset by difficulties. Some algorithms gain efficiency at the cost of producing false positives and false negatives; others are accurate and comprehensive but sacrifice efficiency. We propose an efficient new algorithm that uses Jaccard's correlation coefficient, which is simply the ratio between the sizes of the intersection and the union of two sets, to generate a set of strongly correlated item pairs that is both accurate and comprehensive. Pruning candidate item pairs based on an upper bound keeps the algorithm efficient, and it introduces no possibility of false positives or false negatives. Tests of our algorithm on datasets of various sizes show its effectiveness in real-world applications.
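The abstract does not spell out the algorithm or the upper bound it uses, so the following Python is only a minimal sketch of the general idea, not the authors' method. It assumes each item is represented by the set of transaction IDs that contain it, and it prunes with one elementary bound: because the intersection of two sets is no larger than the smaller set and the union is no smaller than the larger set, the Jaccard coefficient can never exceed min(|A|, |B|) / max(|A|, |B|). The function names, the threshold parameter, and the toy data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient: |a ∩ b| / |a ∪ b| (0.0 if both sets are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def strongly_correlated_pairs(transactions, threshold):
    """Return (item_x, item_y, coefficient) for pairs with coefficient >= threshold.

    Assumption: the pruning bound below is a simple, well-known one chosen
    for illustration; the paper's actual bound may differ.
    """
    # Map each item to the set of transaction IDs in which it occurs.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    result = []
    for x, y in combinations(sorted(tidsets), 2):
        sx, sy = len(tidsets[x]), len(tidsets[y])
        # Upper bound from the support counts alone: no set operations needed.
        upper = min(sx, sy) / max(sx, sy)
        if upper < threshold:
            continue  # safe to skip: the exact coefficient cannot reach the threshold
        coeff = jaccard(tidsets[x], tidsets[y])
        if coeff >= threshold:
            result.append((x, y, coeff))
    return result


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    demo = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}, {"a"}]
    print(strongly_correlated_pairs(demo, threshold=0.7))
```

Because the bound never underestimates the true coefficient, pruning on it discards only pairs that could not qualify, and every surviving pair is checked exactly, which is consistent with the abstract's claim of no false positives or false negatives.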
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shuxin Li, Robert Lee, Sheau-Dong Lang, "Efficient mining of strongly correlated item pairs", Proc. SPIE 6241, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006, 624102 (18 April 2006); doi: 10.1117/12.664567; https://doi.org/10.1117/12.664567


