Efficiently mining long patterns from databases

Author:
Roberto J. Bayardo

IBM Almaden Research Center


SIGMOD '98: Proceedings of the 1998 ACM SIGMOD international conference on Management of data, June 1998, Pages 85–93, https://doi.org/10.1145/276304.276313

Published: 1 June 1998


ABSTRACT

We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with longest pattern length. Experiments on real data show that when the patterns are long, our algorithm is more efficient by an order of magnitude or more.
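
To make the scaling claim concrete: every non-empty subset of a frequent pattern is itself frequent, so a single frequent pattern of length k implies 2^k - 1 frequent patterns, while the number of maximal patterns can stay small. The sketch below is a brute-force illustration of that gap on a toy database. It is not the algorithm presented in the paper, and the function names, the toy transactions, and the support threshold are all assumptions made for this example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of every frequent itemset (illustration only)."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions containing the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent.append(cset)
    return frequent

def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]

# Hypothetical toy database: one pattern of length 10 occurring twice,
# plus one unrelated transaction.
long_pattern = frozenset(range(10))
transactions = [long_pattern, long_pattern, frozenset({10, 11})]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)
print(len(frequent))  # 1023, i.e. 2**10 - 1
print(len(maximal))   # 1
```

Here one maximal pattern of length 10 accounts for all 1023 frequent itemsets. An approach that must visit every frequent itemset does work that grows exponentially with pattern length, whereas the number of maximal patterns does not.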


Index Terms

Efficiently mining long patterns from databases
1. Information systems
  1. Information systems applications
    1. Data mining
2. Mathematics of computing
  1. Mathematical software


Published in
SIGMOD '98: Proceedings of the 1998 ACM SIGMOD international conference on Management of data
June 1998
599 pages
ISBN:0897919955
DOI:10.1145/276304
Chairmen: Laura Haas (IBM Almaden Research Center, San Jose, CA); Pamela Drew (Boeing Co.)
Editors: Ashutosh Tiwary (Boeing Co.; and Univ. of Washington, Seattle); Michael Franklin (Univ. of Maryland, College Park)
ACM SIGMOD Record Volume 27, Issue 2
June 1998
595 pages
ISSN:0163-5808
DOI:10.1145/276305
Chairmen: Laura Haas (IBM Almaden Research Center, San Jose, CA); Pamela Drew (Boeing Co.)
Editor: Ashutosh Tiwary (Boeing Co.; and Univ. of Washington, Seattle)
Copyright © 1998 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%
