DOI: 10.1145/1645953.1646028
research-article

Frequent subgraph pattern mining on uncertain graph data

Published: 02 November 2009

ABSTRACT

Graph data are subject to uncertainties in many applications due to incompleteness and imprecision of data. Mining uncertain graph data is semantically different from and computationally more challenging than mining exact graph data. This paper investigates the problem of mining frequent subgraph patterns from uncertain graph data. The frequent subgraph pattern mining problem is formalized by designing a new measure called expected support. An approximate mining algorithm is proposed to find an approximate set of frequent subgraph patterns by allowing an error tolerance on the expected supports of the discovered subgraph patterns. The algorithm uses an efficient approximation algorithm to determine whether a subgraph pattern can be output or not. The analytical and experimental results show that the algorithm is very efficient, accurate and scalable for large uncertain graph databases.
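
To make the expected-support measure concrete, here is a minimal Python sketch; it is not the paper's algorithm. It assumes, since the abstract does not spell this out, that each edge of an uncertain graph exists independently with a given probability, that the embeddings of a pattern in each graph are already known as sets of edges, and that expected support is the average over the database of the probability that at least one embedding is fully present. All function and variable names are hypothetical. The exact routine enumerates every possible world, so it is exponential in the number of edges; the sampled routine is plain Monte Carlo, standing in for the more efficient approximation the abstract mentions.

```python
import itertools
import random


def occurrence_probability_exact(edge_probs, embeddings):
    """Exact probability that at least one embedding is fully present.

    Enumerates all 2^|E| possible worlds (assumed independent edges),
    so this is only usable on very small graphs.
    """
    edges = sorted(edge_probs)
    total = 0.0
    for mask in itertools.product([False, True], repeat=len(edges)):
        present = {e for e, keep in zip(edges, mask) if keep}
        weight = 1.0
        for e, keep in zip(edges, mask):
            weight *= edge_probs[e] if keep else 1.0 - edge_probs[e]
        if any(emb <= present for emb in embeddings):
            total += weight
    return total


def occurrence_probability_sampled(edge_probs, embeddings, samples, rng):
    """Naive Monte Carlo estimate of the same probability."""
    hits = 0
    for _ in range(samples):
        present = {e for e, p in edge_probs.items() if rng.random() < p}
        if any(emb <= present for emb in embeddings):
            hits += 1
    return hits / samples


def expected_support(database, prob_fn):
    """Average occurrence probability over all uncertain graphs (assumed definition)."""
    return sum(prob_fn(ep, embs) for ep, embs in database) / len(database)


# Hypothetical toy database: (edge probabilities, embeddings of one pattern).
DATABASE = [
    # The pattern is a single edge and matches either edge a or edge b.
    ({"a": 0.5, "b": 0.5}, [frozenset({"a"}), frozenset({"b"})]),
    # The pattern needs both edges a and b at once.
    ({"a": 0.5, "b": 0.4}, [frozenset({"a", "b"})]),
]

exact = expected_support(DATABASE, occurrence_probability_exact)

rng = random.Random(0)
estimate = expected_support(
    DATABASE,
    lambda ep, embs: occurrence_probability_sampled(ep, embs, 20000, rng),
)
```

In this toy database the first graph contains the pattern with probability 1 - 0.5 × 0.5 = 0.75 and the second with probability 0.5 × 0.4 = 0.2, so `exact` is 0.475 under the assumed definition, and `estimate` should land close to it. A miner with an error tolerance would compare such an estimate against a support threshold rather than compute the exact value.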

    • Published in

      CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
      November 2009
      2162 pages
      ISBN:9781605585123
      DOI:10.1145/1645953

      Copyright © 2009 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
