Skip to main content

Advertisement

Log in

An effective association rule mining scheme using a new generic basis

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

Association rule mining among itemsets is a fundamental task and is of great importance in many data mining applications including attacks in network data, stock market, financial applications, bioinformatics to find genetic disorders, etc. However, association rule extraction from a reasonable-sized database produces a large number of rules. As a result, many of them are redundant to other rules, and they are practically useless. To overcome this issue, methods for mining non-redundant rules are essentially required. To address such problem, we initially propose a definition for redundancy in sense of minimal knowledge and then a compact representation of non-redundant association rules which we call as compact informative generic basis. We also provide an improved version of the existing DCI_CLOSED algorithm (DCI_PLUS) to find out the frequent closed itemsets (FCI) with their minimal representative generators in combination with BitTable which represents a compact database form in a single scan of the original database. We further introduce an algorithm for constructing the compact informative generic basis from the FCI and their generators in an efficient way. We finally present an inference mechanism in which all association rules can be generated without accessing the database. Experiments are performed on the proposed method. The experimental results show that the proposed method outperforms the other existing related methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

Similar content being viewed by others

References

  1. Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data (SIGMOD ’93), pp 207–216

  2. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases (VLDB ’94), pp 487–499

  3. Bache K, Lichman M (2012) UCI machine learning repository (http://archive.ics.uci.edu/ml). University of California, School of Information and Computer Science, Irvine, CA

  4. Bastide Y, Taouil R, Pasquier N, Stumme G, Lakhal L (2000) Mining frequent patterns with counting inference. SIGKDD Explor Newsl 2(2):66–75

    Article  Google Scholar 

  5. Bonchi F, Lucchese C (2006) On condensed representations of constrained frequent patterns. Knowl Inf Syst 9(2):180–201

    Article  Google Scholar 

  6. Boulicaut JF, Bykowski A, Rigotti C (2003) Free-sets: a condensed representation of Boolean data for the approximation of frequency queries. Data Min Knowl Disc 7(1):5–22

    Article  MathSciNet  Google Scholar 

  7. Brin S, Motwani R, Ullman JD, Tsur S (1997) Dynamic itemset counting and implication rules for market basket data. In: Proceedings of the 1997 ACM SIGMOD international conference on management of data (SIGMOD ’97), pp 255–264

  8. Calders T, Goethals B (2007) Non-derivable itemset mining. Data Min Knowl Disc 14(1):171–206

    Article  MathSciNet  Google Scholar 

  9. Casali A, Cicchetti R, Lakhal L (2005) Essential patterns: a perfect cover of frequent patterns. In: Proceedings of the 7th international conference on data warehousing and knowledge, discovery (DaWaK’05), pp 428–437

  10. Cheng J, Ke Y, Ng W (2008) Effective elimination of redundant association rules. Data Min Knowl Disc 16(2):221–249

    Article  MathSciNet  Google Scholar 

  11. Dong J, Han M (2007) BitTableFI: an efficient mining frequent itemsets algorithm. Knowl Based Syst 20(4):329–335

    Article  Google Scholar 

  12. Fournier-Viger P, Gomariz A, Soltani A, Gueniche T (2012) SPMF: open-source data mining platform. http://www.philippe-fournier-viger.com/spmf/

  13. Frequent itemset mining dataset repository. http://fimi.cs.helsinki.fi/data/

  14. Ganter B, Wille R, Wille R (1999) Formal concept analysis. Springer, Berlin

    Book  MATH  Google Scholar 

  15. Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17(10):1347–1362

    Article  Google Scholar 

  16. Hamrouni T, Yahia SB, Nguifo EM (2008) Succinct minimal generators: theoretical foundations and applications. Int J Found Comput Sci 19(2):271–296

    Article  MATH  Google Scholar 

  17. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data (SIGMOD ’00), pp 1–12

  18. Jin R, Xiang Y, Liu L (2009) Cartesian contour: a concise representation for a collection of frequent sets. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’09), pp 417–426

  19. Kryszkiewicz M (1998) Representative association rules. In: Wu X, Kotagiri R, Korb KB (eds) Proceedings of the Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD ’98). Lecture notes in computer science 1394. Springer, Berlin, pp 198–209

  20. Lin DI, Kedem ZM (2002) Pincer-search: an efficient algorithm for discovering the maximum frequent set. IEEE Trans Knowl Data Eng 14(3):553–566

    Article  Google Scholar 

  21. Liu G, Li J, Wong L (2008) A new concise representation of frequent itemsets using generators and a positive border. Knowl Inf Syst 17(1):35–56

    Article  MathSciNet  Google Scholar 

  22. Lucchese C, Orlando S, Perego R (2006) Fast and memory efficient mining of frequent closed itemsets. IEEE Trans Knowl Data Eng 18(1):21–36

    Article  Google Scholar 

  23. Ng RT, Lakshmanan LVS, Han J, Pang A (1998) Exploratory mining and pruning optimizations of constrained associations rules. SIGMOD Rec 27(2):13–24

    Article  Google Scholar 

  24. Palshikar GK, Kale MS, Apte MM (2007) Association rules mining using heavy itemsets. Data Knowl Eng 61(1):93–113

    Article  Google Scholar 

  25. Park JS, Chen M-S, Yu PS (1995) An effective hash-based algorithm for mining association rules. In: Proceedings of the 1995 ACM SIGMOD international conference on management of data (SIGMOD ’95), pp 175–186

  26. Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1):25–46

    Article  Google Scholar 

  27. PYR Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Discovering frequent closed itemsets for association rules. In: Proceedings of the 7th international conference on database theory (ICDT ’99), pp 398–416

  28. Pasquier N, Taouil R, Bastide Y, Stumme G, Lakhal L (2005) Generating a condensed representation for association rules. J Intell Inf Syst 24(1):29–60

    Article  MATH  Google Scholar 

  29. Pei J, Han J, Mao R (2000) CLOSET: an efficient algorithm for mining frequent closed itemsets. In: ACM SIGMOD workshop on research issues in, data mining and knowledge discovery, pp 21–30

  30. Pei J, Han J, Lu H, Nishio S, Tang S, Yang D (2001) H-mine: hyper-structure mining of frequent patterns in large databases. In: Proceedings of the 2001 IEEE international conference on data mining (ICDM ’01), pp 441–448

  31. Pei J, Dong G, Zou W, Han J (2004) Mining condensed frequent-pattern bases. Knowl Inf Syst 6(5):570–594

    Article  Google Scholar 

  32. Singh NG, Singh SR, Mahanta AK (2005) CloseMiner: discovering frequent closed itemsets using frequent closed tidsets. In: Proceedings of the fifth IEEE international conference on data mining (ICDM ’05), pp 633–636

  33. Song W, Yang B, Xu Z (2008) Index-CloseMiner: an improved algorithm for mining frequent closed itemset. Intell Data Anal 12(4):321–338

    MathSciNet  Google Scholar 

  34. Srikant R, Vu Q, Agrawal R (1997) Mining association rules with item constraints. In: Proceedings of the 3rd international conference on knowledge discovery in databases and data mining, pp 67–73

  35. Tsai PSM, Chen CM (2004) Mining interesting association rules from customer databases and transaction databases. Inf Syst 29(8):685–696

    Article  Google Scholar 

  36. Vo B, Hong TP, Le B (2012) DBV-Miner: a dynamic bit-vector approach for fast mining frequent closed itemsets. Expert Syst Appl 39(8):7196–7206

    Article  Google Scholar 

  37. Wang J, Han J, Pei J (2003) CLOSET+: searching for the best strategies for mining frequent closed itemsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’03), pp 236–245

  38. Xu Y, Li Y, Shaw G (2011) Reliable representations for association rules. Data Knowl Eng 70(6):555–575

    Article  Google Scholar 

  39. Yahia SB, Gasmi G, Nguifo EM (2009) A new generic basis of “factual” and “implicative” association rules. Intell Data Anal 13(4):633–656

    Google Scholar 

  40. Yen SJ, Lee YS (2002) Mining interesting association rules: a data mining language. In: Proceedings of the 6th Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD ’02), pp 172–176

  41. Zaki MJ, Parthasarathy S, Ogihara M, Li W (1997) New algorithms for fast discovery of association rules. Technical report, University of Rochester, Rochester, NY

  42. Zaki MJ (April 2002) Hsiao CJ (2002) CHARM: an efficient algorithm for closed itemset mining. In: Grossman R, Han J, Kumar V, Mannila H, Motwani R (eds) Proceedings of the second SIAM international conference on data mining. Arlington, VA, pp 457–473

  43. Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’03), pp 326–335

  44. Zaki MJ (2004) Mining non-redundant association rules. Data Min Knowl Disc 9(3):223–248

    Article  MathSciNet  Google Scholar 

  45. Zaki MJ, Hsiao CJ (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans Knowl Data Eng 17(4):462–478

    Article  Google Scholar 

Download references

Acknowledgments

The authors would like to thank the anonymous reviewers, the Editor and the Editor-in-Chief for providing constructive and generous feedbacks, which have improved significantly the content, quality and the presentation of this paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to A. Goswami.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Sahoo, J., Das, A.K. & Goswami, A. An effective association rule mining scheme using a new generic basis. Knowl Inf Syst 43, 127–156 (2015). https://doi.org/10.1007/s10115-014-0732-4

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-014-0732-4

Keywords

Navigation