research-article

ciForager: Incrementally discovering regions of correlated change in evolving graphs

Published: 29 October 2012

Abstract

Data mining techniques for understanding how graphs evolve over time have become increasingly important. Evolving graphs arise naturally in diverse applications such as computer network topologies, multiplayer games, and medical imaging. A natural and interesting problem in evolving graph analysis is the discovery of compact subgraphs that change in a similar manner. Such subgraphs are known as regions of correlated change; they can both summarise change patterns in graphs and help identify the underlying events causing these changes. However, previous techniques for discovering regions of correlated change suffer from limited scalability, making them unsuitable for analysing the evolution of very large graphs. In this paper, we introduce a new algorithm called ciForager that addresses this scalability challenge. The efficiency of ciForager is based on new incremental techniques for detecting change and on Voronoi representations for efficiently determining distance. We show experimentally that ciForager can achieve speedups of up to 1000 times over previous approaches. As a result, it becomes feasible for the first time to discover regions of correlated change in extremely large graphs, such as the entire BGP routing topology of the Internet.
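
To make the idea of a Voronoi representation on a graph more concrete, the following is a minimal sketch of a graph Voronoi partition computed with a multi-source breadth-first search. It is not the ciForager implementation: the function name `graph_voronoi`, the adjacency-dictionary input, the unweighted hop-count distance, and the tie-breaking by seed order are all assumptions made for illustration. Each node is assigned to its nearest seed, so distances to seeds can be looked up rather than recomputed.

```python
from collections import deque


def graph_voronoi(adjacency, seeds):
    """Assign each reachable node to its nearest seed (hop count).

    Hypothetical helper for illustration only. `adjacency` maps a node
    to an iterable of neighbours; `seeds` is a list of seed nodes.
    Ties go to whichever seed reaches the node first in BFS order.
    """
    owner = {s: s for s in seeds}   # node -> seed whose cell contains it
    dist = {s: 0 for s in seeds}    # node -> hop distance to that seed
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nbr in adjacency.get(node, ()):
            if nbr not in dist:     # first visit is the shortest in BFS
                dist[nbr] = dist[node] + 1
                owner[nbr] = owner[node]
                queue.append(nbr)
    return owner, dist


# Path graph a - b - c - d - e with seeds at both ends.
path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
owner, dist = graph_voronoi(path, ["a", "e"])
```

In this example, nodes `a`, `b` and `c` fall in the cell of seed `a`, and nodes `d` and `e` fall in the cell of seed `e`. A partition of this kind runs in time linear in the number of nodes and edges, which is one reason such representations are attractive for large graphs.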

    • Published in

      ACM Transactions on Knowledge Discovery from Data, Volume 6, Issue 3
      October 2012
      126 pages
      ISSN: 1556-4681
      EISSN: 1556-472X
      DOI: 10.1145/2362383

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 29 October 2012
      • Accepted: 1 March 2012
      • Revised: 1 August 2011
      • Received: 1 February 2009

      Qualifiers

      • research-article
      • Research
      • Refereed