DOI: 10.1145/1557019.1557030
research-article

Frequent pattern mining with uncertain data

Published: 28 June 2009

ABSTRACT

This paper studies the problem of frequent pattern mining with uncertain data. We will show how broad classes of algorithms can be extended to the uncertain data setting. In particular, we will study candidate generate-and-test algorithms, hyper-structure algorithms and pattern growth based algorithms. One of our insightful observations is that the experimental behavior of different classes of algorithms is very different in the uncertain case as compared to the deterministic case. In particular, the hyper-structure and the candidate generate-and-test algorithms perform much better than tree-based algorithms. This counter-intuitive behavior is an important observation from the perspective of algorithm design of the uncertain variation of the problem. We will test the approach on a number of real and synthetic data sets, and show the effectiveness of two of our approaches over competitive techniques.
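The abstract does not spell out how support is measured on uncertain data. As a minimal sketch of what a candidate generate-and-test extension might look like, the code below assumes the expected-support model common in this literature: each transaction maps items to existence probabilities, items are treated as independent, and an itemset's expected support is the sum over transactions of the product of its item probabilities. The names `expected_support`, `frequent_itemsets`, and `min_esup` are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import prod  # Python 3.8+


def expected_support(itemset, transactions):
    """Sum, over all transactions, of the product of the item probabilities.

    Assumes item probabilities within a transaction are independent;
    an item absent from a transaction has probability 0.
    """
    return sum(prod(t.get(i, 0.0) for i in itemset) for t in transactions)


def frequent_itemsets(transactions, min_esup):
    """Level-wise (Apriori-style) search using expected support.

    Expected support never increases when an item is added, because every
    probability is at most 1, so subset-based pruning remains valid.
    """
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    result = {}
    k = 1
    while level:
        # Test step: keep candidates whose expected support meets the threshold.
        kept = {}
        for cand in level:
            esup = expected_support(cand, transactions)
            if esup >= min_esup:
                kept[cand] = esup
        result.update(kept)

        # Generate step: join frequent k-itemsets into (k+1)-item candidates,
        # then prune any candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a in kept for b in kept if len(a | b) == k}
        level = [
            c for c in joined
            if all(frozenset(s) in kept for s in combinations(c, k - 1))
        ]
    return result


# Toy uncertain database (illustrative values only).
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 1.0},
    {"a": 1.0, "b": 0.5, "c": 0.4},
]
result = frequent_itemsets(transactions, min_esup=1.0)
for itemset, esup in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(esup, 3))
```

This sketch recomputes expected support by scanning every transaction for each candidate, which keeps it short but is not how an efficient implementation would be organized.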

Published in

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019

Copyright © 2009 ACM


Publisher

Association for Computing Machinery, New York, NY, United States
