column

Graph stream algorithms: a survey

Published: 13 May 2014

Abstract

Over the last decade, there has been considerable interest in designing algorithms for processing massive graphs in the data stream model. The original motivation was twofold: (a) in many applications, the dynamic graphs that arise are too large to be stored in the main memory of a single machine, and (b) considering graph problems yields new insights into the complexity of stream computation. However, the techniques developed in this area are now finding applications in other areas, including data structures for dynamic graphs, approximation algorithms, and distributed and parallel computation. We survey the state-of-the-art results, identify general techniques, and highlight some simple algorithms that illustrate basic ideas.

