ABSTRACT
We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with longest pattern length. Experiments on real data show that when the patterns are long, our algorithm is more efficient by an order of magnitude or more.
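The gap between the two scaling behaviours can be seen on a toy database. The sketch below is a brute-force illustration, not the algorithm presented here: it enumerates every frequent itemset and then keeps only the maximal ones (those with no frequent proper superset). The function names, the toy transactions, and the support threshold are all assumptions made for the example. A single frequent pattern of length 10 already has 2^10 - 1 = 1023 non-empty frequent subsets, each of which a level-wise Apriori-style enumeration would have to count, while the number of maximal patterns stays at two.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of every frequent itemset (illustration only)."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent.append(cset)
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is frequent, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database: one long pattern (10 items) and one short pattern.
long_pattern = frozenset(range(10))
short_pattern = frozenset({10, 11})
transactions = [long_pattern] * 3 + [short_pattern] * 3

frequent = frequent_itemsets(transactions, min_support=3)
maximal = maximal_itemsets(frequent)

print(len(frequent))  # 1026 frequent itemsets (1023 from the long pattern, 3 from the short one)
print(len(maximal))   # 2 maximal patterns
```

Extending the long pattern by one item roughly doubles the first count and leaves the second unchanged, which is the contrast the abstract draws.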