research-article

Frequent pattern mining with uncertain data

Authors:
Charu C. Aggarwal, IBM T. J. Watson Research Ctr, Hawthorne, NY, USA
Yan Li, Tsinghua University, Beijing, China
Jianyong Wang, Tsinghua University, Beijing, China
Jing Wang, New York University, New York, NY, USA

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
Pages 29–38
https://doi.org/10.1145/1557019.1557030

Published: 28 June 2009


ABSTRACT

This paper studies the problem of frequent pattern mining with uncertain data. We show how broad classes of algorithms can be extended to the uncertain data setting. In particular, we study candidate generate-and-test algorithms, hyper-structure algorithms, and pattern-growth-based algorithms. A key observation is that the experimental behavior of these classes of algorithms differs markedly between the uncertain case and the deterministic case. In particular, the hyper-structure and candidate generate-and-test algorithms perform much better than tree-based algorithms. This counter-intuitive behavior is an important consideration for the design of algorithms for the uncertain variant of the problem. We test the approaches on a number of real and synthetic data sets and show the effectiveness of two of our approaches over competitive techniques.
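
To make the candidate generate-and-test idea concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not state the uncertainty model, so the sketch assumes one that is common in the uncertain-data literature: each transaction maps items to independent existence probabilities, and an itemset is frequent when its expected support (the sum over transactions of the product of its items' probabilities) reaches a threshold. The names `expected_support`, `uncertain_apriori`, and `min_esup` are hypothetical.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, transactions):
    """Expected support under the ASSUMED independence model:
    sum over transactions of the product of the item probabilities.
    An item missing from a transaction has probability 0."""
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in transactions)


def uncertain_apriori(transactions, min_esup):
    """Level-wise generate-and-test over itemsets (illustrative sketch).

    transactions: list of dicts mapping item -> existence probability.
    min_esup: hypothetical expected-support threshold.
    Returns a dict mapping frozenset -> expected support.
    """
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while level:
        # Test step: keep candidates whose expected support meets the threshold.
        survivors = {}
        for candidate in level:
            esup = expected_support(candidate, transactions)
            if esup >= min_esup:
                survivors[candidate] = esup
        frequent.update(survivors)

        # Generate step: join pairs of survivors into (k+1)-itemsets and keep
        # only those whose k-subsets all survived. This pruning is valid
        # because probabilities are at most 1, so expected support cannot
        # grow when an item is added.
        k += 1
        candidates = set()
        for a, b in combinations(list(survivors), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in survivors for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        level = sorted(candidates, key=sorted)
    return frequent


# Toy data, invented for illustration only.
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.7},
    {"a": 1.0, "b": 0.4, "c": 0.6},
]

if __name__ == "__main__":
    for itemset, esup in uncertain_apriori(transactions, min_esup=1.0).items():
        print(sorted(itemset), round(esup, 2))
```

The sketch recomputes each expected support with a full pass over the data, which keeps it short but is not how an efficient miner would do it; the hyper-structure and pattern-growth families named in the abstract organize the data differently and are not shown here.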


Index Terms

Information systems
  Information systems applications


Published in
KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019
General Chairs: John Elder (Elder Research, Inc., USA), Françoise Soulié Fogelman (KXEN, France)
Program Chairs: Peter Flach (University of Bristol, UK), Mohammed Zaki (RPI, USA)
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
uncertain data
