Research article
DOI: 10.1145/2463676.2467799

Trinity: a distributed graph engine on a memory cloud

Published: 22 June 2013

ABSTRACT

Computations performed by graph algorithms are data driven and require a high degree of random data access. Despite the great progress made in disk technology, disks still cannot provide the level of efficient random access required by graph computation. On the other hand, memory-based approaches usually do not scale due to the capacity limit of single machines. In this paper, we introduce Trinity, a general-purpose graph engine over a distributed memory cloud. Through optimized memory management and network communication, Trinity supports fast graph exploration as well as efficient parallel computing. In particular, Trinity leverages graph access patterns in both online and offline computation to optimize memory and communication for best performance. These optimizations enable Trinity to support efficient online query processing and offline analytics on large graphs with just a few commodity machines. Furthermore, Trinity provides a high-level specification language called TSL for users to declare data schema and communication protocols, which brings great ease-of-use for general-purpose graph management and computing. Our experiments show Trinity's performance on both low-latency graph queries and high-throughput graph analytics on web-scale, billion-node graphs.
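To make the memory-cloud idea from the abstract concrete, the sketch below is a toy, single-process Python model: vertices are hash-partitioned across a set of in-memory stores (standing in for machines), and graph exploration reduces to random adjacency-list lookups against those partitions. This is an illustration only, not Trinity's actual API or its TSL syntax; all names here (MemoryCloudSketch, NUM_PARTITIONS, bfs) are hypothetical.

```
# Minimal sketch, assuming a hash-partitioned in-memory vertex store.
# None of these names come from Trinity itself.

from collections import deque

NUM_PARTITIONS = 4  # stand-in for the number of machines in the memory cloud


class MemoryCloudSketch:
    """Toy hash-partitioned in-memory store of vertex -> adjacency list."""

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition_of(self, vertex_id):
        # Vertices are assigned to a partition (machine) by hashing their IDs.
        return hash(vertex_id) % len(self.partitions)

    def put(self, vertex_id, neighbors):
        self.partitions[self._partition_of(vertex_id)][vertex_id] = list(neighbors)

    def neighbors(self, vertex_id):
        # In a real deployment this lookup may cross the network;
        # here every partition lives in the same process.
        return self.partitions[self._partition_of(vertex_id)].get(vertex_id, [])


def bfs(cloud, source):
    """Breadth-first exploration driven purely by random vertex lookups."""
    seen, frontier, order = {source}, deque([source]), []
    while frontier:
        v = frontier.popleft()
        order.append(v)
        for u in cloud.neighbors(v):
            if u not in seen:
                seen.add(u)
                frontier.append(u)
    return order


if __name__ == "__main__":
    cloud = MemoryCloudSketch()
    for v, nbrs in {1: [2, 3], 2: [4], 3: [4], 4: []}.items():
        cloud.put(v, nbrs)
    print(bfs(cloud, 1))  # -> [1, 2, 3, 4]
```

In a real deployment each partition would reside on a different machine, so every neighbor lookup is a potential network round trip; this is why the abstract emphasizes optimized memory management and network communication as the basis for fast graph exploration.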


Published in

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
June 2013, 1322 pages
ISBN: 9781450320375
DOI: 10.1145/2463676
Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


Acceptance Rates

SIGMOD '13 paper acceptance rate: 76 of 372 submissions (20%). Overall acceptance rate: 785 of 4,003 submissions (20%).
