17 March 2008 Research on parallel algorithm for sequential pattern mining
Abstract
Sequential pattern mining is the discovery of frequent sequences, ordered by time or by another criterion, in a sequence database. Its original motivation was to discover regularities in customer purchasing over a period of time by finding frequent sequences. In recent years, sequential pattern mining has become an important direction in data mining, and its applications are no longer confined to business databases but extend to new data sources such as the Web and to advanced scientific fields such as DNA analysis. The data used in sequential pattern mining has two characteristics: it is massive in volume and it is stored in a distributed fashion. Most existing sequential pattern mining algorithms do not take both characteristics into account together. Based on these characteristics and on parallel computing theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm follows the principle of pattern reduction and uses a divide-and-conquer strategy for parallelization. The first parallel task constructs the frequent item sets by applying the frequent concept and search space partition theory; the second task builds the frequent sequences using depth-first search at each processor. The algorithm needs to access the database only twice and does not generate candidate sequences, which reduces access time and improves mining efficiency. Based on a random data generation procedure and the different information structures designed, this paper simulated the SPP algorithm in a concrete parallel environment and also implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm achieves excellent speedup and efficiency.
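The two tasks described in the abstract can be illustrated with a minimal sketch. This is not the authors' SPP algorithm, whose details are not given here. It is a generic prefix-projection miner, assuming each sequence is a plain list of single items, with support counted as the number of sequences containing a pattern. The function names (`frequent_items`, `project`, `grow`, `mine_partition`, `mine`) and the thread pool standing in for separate processors are all hypothetical, and the sketch makes no attempt to reproduce the two-scan property.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def frequent_items(db, min_support):
    """Count, per item, how many sequences contain it; keep the frequent ones."""
    counts = Counter()
    for seq in db:
        counts.update(set(seq))  # each sequence contributes at most once per item
    return sorted(item for item, c in counts.items() if c >= min_support)


def project(db, item):
    """For each sequence containing `item`, keep the suffix after its first occurrence."""
    return [seq[seq.index(item) + 1:] for seq in db if item in seq]


def grow(prefix, projected, min_support, out):
    """Depth-first growth: extend `prefix` by each item frequent in `projected`.

    No candidate sequences are generated and tested; patterns are only
    extended by items already known to be frequent in the projection.
    """
    for item in frequent_items(projected, min_support):
        pattern = prefix + (item,)
        suffixes = project(projected, item)
        out[pattern] = len(suffixes)  # support of the extended pattern
        grow(pattern, suffixes, min_support, out)


def mine_partition(db, item, min_support):
    """Mine every frequent sequence that starts with `item` (one partition)."""
    out = {}
    suffixes = project(db, item)
    out[(item,)] = len(suffixes)
    grow((item,), suffixes, min_support, out)
    return out


def mine(db, min_support, workers=2):
    """Partition the search space by frequent first item; mine partitions concurrently."""
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda it: mine_partition(db, it, min_support),
                         frequent_items(db, min_support))
        for part in parts:
            result.update(part)  # partitions are disjoint by first item
    return result


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
    for pattern, support in sorted(mine(db, min_support=2).items()):
        print(pattern, support)
```

Partitioning by first item keeps the partitions independent, so their results can be merged without reconciliation. In a genuinely distributed setting each partition would run on its own processor or node rather than in a thread.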
© (2008) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lijuan Zhou, Bai Qin, Yu Wang, Zhongxiao Hao, "Research on parallel algorithm for sequential pattern mining", Proc. SPIE 6973, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2008, 69730Q (17 March 2008); doi: 10.1117/12.775402; https://doi.org/10.1117/12.775402
PROCEEDINGS
8 PAGES

