Abstract
Given the growing importance of large-scale graph analytics, there is a need to improve the performance of graph analysis frameworks without compromising productivity. GraphMat is our solution for bridging the gap between user-friendly graph analytics frameworks and native, hand-optimized code. GraphMat takes vertex programs and maps them to high-performance sparse matrix operations in the backend. We thus get the productivity benefits of a vertex programming framework without sacrificing performance. GraphMat is a single-node multicore graph framework written in C++ that has enabled us to write a diverse set of graph algorithms with effort comparable to that of other vertex programming frameworks. GraphMat performs 1.1-7X faster than high-performance frameworks such as GraphLab, CombBLAS and Galois. It also matches the performance of MapGraph, a GPU-based graph framework, despite running on a CPU platform with significantly lower compute and bandwidth resources. It achieves better multicore scalability (13-15X on 24 cores) than other frameworks and is within 1.2X of native, hand-optimized code on a variety of graph algorithms. Because GraphMat's performance depends mainly on a few scalable and well-understood sparse matrix operations, it can naturally benefit from the trend of increasing parallelism in future hardware.
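To make the mapping from vertex programs to sparse matrix operations concrete, the following is a minimal sketch of single-source shortest paths written as repeated generalized sparse matrix-vector multiplication, where "multiply" becomes "add the edge weight to the sender's distance" and "add" becomes "take the minimum". It is an illustration under stated assumptions, not GraphMat's actual interface: every name here (`Edge`, `CsrTranspose`, `build_transpose`, `generalized_spmv`, `sssp`) is hypothetical, the code is sequential, and it assumes the graph has no negative-weight cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// All names below are hypothetical; this is not GraphMat's API.
struct Edge { int src; int dst; double weight; };

// Transposed adjacency matrix in CSR form: row v lists the in-edges (u -> v).
struct CsrTranspose {
    std::vector<std::size_t> row_ptr;  // n + 1 offsets into col/val
    std::vector<int> col;              // source vertex u of each in-edge
    std::vector<double> val;           // weight of each in-edge
};

constexpr double kInf = std::numeric_limits<double>::infinity();

CsrTranspose build_transpose(int n, const std::vector<Edge>& edges) {
    CsrTranspose g;
    g.row_ptr.assign(static_cast<std::size_t>(n) + 1, 0);
    for (const Edge& e : edges) ++g.row_ptr[e.dst + 1];            // count in-degrees
    for (int v = 0; v < n; ++v) g.row_ptr[v + 1] += g.row_ptr[v];  // prefix sum
    g.col.resize(edges.size());
    g.val.resize(edges.size());
    std::vector<std::size_t> next(g.row_ptr.begin(), g.row_ptr.end() - 1);
    for (const Edge& e : edges) {
        const std::size_t k = next[e.dst]++;
        g.col[k] = e.src;
        g.val[k] = e.weight;
    }
    return g;
}

// Generalized SpMV: y[v] = reduce over in-edges (u, v) of process(x[u], w(u, v)).
// Only active vertices send messages, so x acts as a sparse vector.
template <typename Process, typename Reduce>
std::vector<double> generalized_spmv(const CsrTranspose& g,
                                     const std::vector<double>& x,
                                     const std::vector<bool>& active,
                                     double identity,
                                     Process process, Reduce reduce) {
    const std::size_t n = g.row_ptr.size() - 1;
    std::vector<double> y(n, identity);
    for (std::size_t v = 0; v < n; ++v) {
        for (std::size_t k = g.row_ptr[v]; k < g.row_ptr[v + 1]; ++k) {
            const int u = g.col[k];
            if (active[u]) y[v] = reduce(y[v], process(x[u], g.val[k]));
        }
    }
    return y;
}

// Vertex program for shortest paths: iterate until no distance improves.
std::vector<double> sssp(int n, const std::vector<Edge>& edges, int source) {
    assert(source >= 0 && source < n);
    const CsrTranspose g = build_transpose(n, edges);
    std::vector<double> dist(static_cast<std::size_t>(n), kInf);
    std::vector<bool> active(static_cast<std::size_t>(n), false);
    dist[source] = 0.0;
    active[source] = true;
    bool any_active = true;
    while (any_active) {
        const std::vector<double> msg = generalized_spmv(
            g, dist, active, kInf,
            [](double d, double w) { return d + w; },          // "multiply"
            [](double a, double b) { return a < b ? a : b; }); // "add"
        any_active = false;
        for (int v = 0; v < n; ++v) {
            active[v] = msg[v] < dist[v];  // apply step: keep strict improvements
            if (active[v]) { dist[v] = msg[v]; any_active = true; }
        }
    }
    return dist;
}
```

Calling `sssp(n, edges, source)` returns one distance per vertex, with unreachable vertices left at infinity. The point of the sketch is that the per-vertex logic lives entirely in the two small lambdas and the apply step, while the loop over the matrix is generic and is the part a framework can optimize and parallelize.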
Index Terms
- GraphMat: high performance graph analytics made productive