DOI: 10.1145/3178487.3178504
research-article
Artifacts Available

Communication-avoiding parallel minimum cuts and connected components

Published: 10 February 2018

ABSTRACT

We present novel scalable parallel algorithms for finding global minimum cuts and connected components, which are important and fundamental problems in graph processing. To take advantage of future massively parallel architectures, our algorithms are communication-avoiding: they reduce the costs of communication across the network and the cache hierarchy. The fundamental technique underlying our work is the randomized sparsification of a graph: removing a fraction of graph edges, deriving a solution for such a sparsified graph, and using the result to obtain a solution for the original input. We design and implement sparsification with O(1) synchronization steps. Our global minimum cut algorithm decreases communication costs and computation compared to the state-of-the-art, while our connected components algorithm incurs few cache misses and synchronization steps. We validate our approach by evaluating MPI implementations of the algorithms on a petascale supercomputer. We also provide an approximate variant of the minimum cut algorithm and show that it approximates the exact solutions well while using a fraction of cores in a fraction of time.
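
The sample, solve, and lift pattern described in the abstract can be illustrated with a small sequential sketch for connected components. This is not the paper's algorithm and it involves no MPI or communication avoidance; the function names, the sampling probability `p`, and the union-find helper are assumptions chosen only for illustration.

```python
import random


def find(parent, v):
    # Path-halving find for a union-find (disjoint-set) forest.
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def union(parent, a, b):
    # Attach the larger root to the smaller one, so each set's root
    # is always its minimum vertex id.
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def components_via_sampling(n, edges, p=0.5, seed=0):
    """Label the connected components of an n-vertex graph in two phases.

    Hypothetical helper: illustrates sparsification, not the paper's method.
    """
    rng = random.Random(seed)
    parent = list(range(n))

    # Phase 1: solve the problem on a random sample of the edges.
    sampled = [e for e in edges if rng.random() < p]
    for u, v in sampled:
        union(parent, u, v)

    # Phase 2: lift the partial solution to the full graph. Only edges that
    # join two different sample components can still change the answer.
    crossing = [(u, v) for u, v in edges if find(parent, u) != find(parent, v)]
    for u, v in crossing:
        union(parent, u, v)

    # Each vertex is labeled with the smallest vertex id in its component.
    return [find(parent, v) for v in range(n)]
```

For example, `components_via_sampling(6, [(0, 1), (1, 2), (3, 4)])` labels vertices 0 to 2 with 0, vertices 3 and 4 with 3, and the isolated vertex 5 with 5. The result does not depend on which edges were sampled; sampling only changes how much work is left for the second phase.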

    Published in

      PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
      February 2018
      442 pages
      ISBN: 9781450349826
      DOI: 10.1145/3178487

      ACM SIGPLAN Notices, Volume 53, Issue 1
      PPoPP '18
      January 2018
      426 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/3200691

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 230 of 1,014 submissions, 23%
