Mining frequent closed itemsets for large data

Huaiguo Fu, Engelbert Mephu Nguifo

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

3 Citations (Scopus)

Abstract

Mining frequent closed itemsets is an effective way to analyse frequent patterns and, further, to generate association rules. Several algorithms have been proposed to generate frequent closed itemsets, including CLOSE, A-CLOSE, CLOSET, CHARM and CLOSET+. However, it is still hard for these algorithms to deal with dense and very large data. In this paper, we analyse the search space of frequent closed itemsets and propose a new decomposition algorithm for mining frequent closed itemsets, called PFC. PFC can dynamically generate non-overlapping partitions of the search space and mine the frequent closed itemsets in each partition. Furthermore, each partition is independent and shares only the same source data with the other partitions. It is therefore possible to implement PFC with multiple threads or parallel methods, and to prune the search space of frequent closed itemsets efficiently. In this study, PFC is implemented in Java. We compare PFC with an author's C++ version of CLOSET+ on some large UCI repository datasets and on the worst case. The preliminary experimental results demonstrate good performance of PFC in dealing with dense and very large data.
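
To make the idea of independent, non-overlapping partitions concrete, the following is a minimal illustrative sketch in Python. It is not the PFC algorithm, whose partitioning and pruning are not specified in this abstract. It assumes one simple partitioning rule (every closed itemset is assigned to the partition of its smallest item) and enumerates candidates by brute force, so it is only suitable for toy data. The names `support`, `closure`, `mine_partition` and `mine_closed` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Return the intersection of all transactions containing `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset.intersection(*covering)


def mine_partition(first_item, items, transactions, min_support):
    """Find the frequent closed itemsets whose smallest item is `first_item`.

    Assumed partitioning rule (not taken from the paper): a closed itemset
    belongs to the partition named after its smallest item, so no itemset
    can appear in two partitions.
    """
    larger = [i for i in items if i > first_item]
    found = set()
    # Naive enumeration of every candidate that starts with `first_item`.
    for size in range(len(larger) + 1):
        for rest in combinations(larger, size):
            candidate = frozenset((first_item,) + rest)
            if support(candidate, transactions) < min_support:
                continue
            closed = closure(candidate, transactions)
            # Keep the closure only if it still belongs to this partition.
            if min(closed) == first_item:
                found.add(closed)
    return found


def mine_closed(transactions, min_support):
    """Mine each partition in its own task; tasks share only the input data."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    with ThreadPoolExecutor() as pool:
        parts = pool.map(
            lambda i: mine_partition(i, items, transactions, min_support),
            items,
        )
        return dict(zip(items, parts))
```

For the four transactions `{a, c, d}`, `{b, c, e}`, `{a, b, c, e}` and `{b, e}` with a minimum support of 2, `mine_closed` places `{a, c}` in partition `a`, `{b, e}` and `{b, c, e}` in partition `b`, and `{c}` in partition `c`, leaving partitions `d` and `e` empty. Because each task reads the transactions but never writes shared state, the partitions could equally be handed to separate processes or machines.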

Original language: English
Title of host publication: Proceedings of the 2004 International Conference on Machine Learning and Applications, ICMLA '04
Editors: M. Kantardzic, O. Nasraoui, M. Milanova
Pages: 328-335
Number of pages: 8
Publication status: Published - 2004
Externally published: Yes
Event: 2004 International Conference on Machine Learning and Applications, ICMLA '04 - Louisville, KY, United States
Duration: 16 Dec 2004 to 18 Dec 2004

Publication series

Name: Proceedings of the 2004 International Conference on Machine Learning and Applications, ICMLA '04

Conference

Conference: 2004 International Conference on Machine Learning and Applications, ICMLA '04
Country/Territory: United States
City: Louisville, KY
Period: 16/12/2004 to 18/12/2004
