DOI: 10.1145/1557019.1557111
research-article

DOULION: counting triangles in massive graphs with a coin

Published: 28 June 2009

ABSTRACT

Counting the number of triangles in a graph is a beautiful algorithmic problem that has gained importance in recent years owing to its significant role in complex network analysis. Frequently computed metrics such as the clustering coefficient and the transitivity ratio require running a triangle counting algorithm. Furthermore, several interesting graph mining applications rely on computing the number of triangles in the graph of interest.
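
To illustrate why such metrics depend on a triangle count, here is a minimal Python sketch that computes the transitivity ratio as three times the number of triangles divided by the number of connected triples (paths of length two). The helper names and the edge-list input format are assumptions made for this example and do not come from the paper.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from an iterable of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; they cannot be part of a triangle
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(adj):
    """Count triangles by intersecting the neighbor sets of each edge's endpoints.

    Assumes node labels are mutually comparable (e.g. all ints or all strings).
    """
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                total += len(adj[u] & adj[v])
    # Every triangle is found once per edge, i.e. three times.
    return total // 3


def transitivity_ratio(adj):
    """Return 3 * (number of triangles) / (number of connected triples)."""
    triples = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())
    if triples == 0:
        return 0.0
    return 3 * count_triangles(adj) / triples


# Usage: one triangle (0, 1, 2) with a pendant edge (2, 3).
adj = build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)])
print(count_triangles(adj), transitivity_ratio(adj))
```

The triangle count is the expensive part of this computation, which is why a faster counting step speeds up the metric as a whole.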

In this paper, we focus on the problem of counting triangles in a graph. We propose a practical method from which all triangle counting algorithms can potentially benefit. Using a straightforward triangle counting algorithm as a black box, we performed 166 experiments on real-world networks and on synthetic datasets. They show that our method achieves high accuracy, typically more than 99%, and gives significant speedups of up to ≈ 130 times.
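
The abstract does not describe the method itself. The following Python sketch assumes the coin-flip edge sparsification that the title alludes to: each edge is kept independently with probability p, an exact counter is run as a black box on the sparsified graph, and the result is scaled by 1/p^3, because a triangle survives only if all three of its edges are kept. The function names, the `seed` parameter, and the simple neighbor-intersection counter are illustrative assumptions, not the authors' implementation.

```python
import random


def count_triangles_exact(edges):
    """Black-box exact counter (assumed): neighbor-set intersection per edge.

    Assumes node labels are mutually comparable.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-loops cannot form triangles
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                total += len(adj[u] & adj[v])
    return total // 3  # each triangle is seen once per edge


def doulion_estimate(edges, p, seed=None):
    """Estimate the triangle count from a randomly sparsified graph.

    edges: list of distinct undirected edges (u, v), assumed free of duplicates.
    p:     probability of keeping each edge, with 0 < p <= 1.
    seed:  optional seed for reproducibility (hypothetical parameter).
    """
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    rng = random.Random(seed)
    # Toss a biased coin for every edge and keep it with probability p.
    kept = [e for e in edges if rng.random() < p]
    # A triangle survives with probability p**3, so rescale by 1 / p**3.
    return count_triangles_exact(kept) / p ** 3


# Usage: the complete graph on 5 nodes has 10 triangles.
k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
print(doulion_estimate(k5, 1.0))          # p = 1 keeps every edge
print(doulion_estimate(k5, 0.5, seed=1))  # a noisy estimate on a tiny graph
```

With p = 1 the sketch reduces to the exact count. Smaller values of p shrink the graph handed to the counter, which is where a speedup would come from, at the cost of higher variance in the estimate.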



Published in

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019

Copyright © 2009 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
