Tigr: Transforming Irregular Graphs for GPU-Friendly Graph Processing

Authors:
Amir Hossein Nodehi Sabet

University of California, Riverside, Riverside, CA, USA

Junqiao Qiu

University of California, Riverside, Riverside, CA, USA

Zhijia Zhao

University of California, Riverside, Riverside, CA, USA


ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, March 2018, Pages 622–636. https://doi.org/10.1145/3173162.3173180

Published: 19 March 2018

ABSTRACT

Graph analytics delivers deep knowledge by processing large volumes of highly connected data. In real-world graphs, the degree distribution tends to follow a power law: a small portion of nodes own a large number of neighbors. This high irregularity of the degree distribution is a major barrier to efficient processing on GPU architectures, which are primarily designed to accelerate computations on regular data with SIMD execution. Existing solutions to the inefficiency of GPU-based graph analytics either modify the graph programming abstraction or rely on changes to the low-level thread execution model. The former requires more programming effort to design and maintain graph analytics, while the latter is coupled to the underlying architecture, making it difficult to adapt as architectures quickly evolve. Unlike prior efforts, this work proposes to address the problem at its origin: the irregular graph data itself. It raises a critical question in irregular graph processing: Is it possible to transform irregular graphs into more regular ones such that the graphs can be processed more efficiently on GPU-like architectures while still producing the same results? Inspired by this question, this work introduces Tigr, a graph transformation framework that can effectively reduce the irregularity of real-world graphs with correctness guarantees for a wide range of graph analytics. To make the transformations practical, Tigr features a lightweight virtual transformation scheme, which can substantially reduce the cost of graph transformations while preserving the benefits of reduced irregularity. Evaluation of Tigr-based GPU graph processing shows significant and consistent speedups over state-of-the-art GPU graph processing frameworks for a spectrum of irregular graphs.
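
As a rough illustration of what a "virtual" transformation could look like, the sketch below assumes a CSR graph layout and an arbitrary degree bound `k`. It covers each node's edge range with slices of at most `k` edges and leaves the edge array untouched, so a high-degree node becomes several evenly sized work items. The function names (`virtual_split`, `bfs_levels`), the bound, and the toy graph are hypothetical and are not taken from the paper; the abstract does not describe Tigr's actual transformations at this level of detail.

```python
# Illustrative sketch only: the names, the CSR layout, and the bound k are
# assumptions made for this example, not Tigr's actual design or API.

def virtual_split(row_offsets, k):
    """Cover each node's CSR edge range with slices of at most k edges.

    Returns (physical_node, edge_start, edge_end) triples. The edge array is
    not copied or reordered; only this index over it is new.
    """
    virtual_nodes = []
    for node in range(len(row_offsets) - 1):
        start, end = row_offsets[node], row_offsets[node + 1]
        if start == end:
            # Keep zero-degree nodes so every physical node stays represented.
            virtual_nodes.append((node, start, end))
        for lo in range(start, end, k):
            virtual_nodes.append((node, lo, min(lo + k, end)))
    return virtual_nodes


def bfs_levels(row_offsets, col_indices, source, work_items):
    """Level-synchronous BFS in which each work item scans one edge slice."""
    level = [-1] * (len(row_offsets) - 1)
    level[source] = 0
    depth = 0
    changed = True
    while changed:
        changed = False
        for node, lo, hi in work_items:
            if level[node] != depth:
                continue  # only nodes on the current frontier do work
            for e in range(lo, hi):
                dst = col_indices[e]
                if level[dst] == -1:
                    level[dst] = depth + 1
                    changed = True
        depth += 1
    return level


# Toy power-law-like graph: node 0 is a hub with six out-edges, node 1 has one.
row_offsets = [0, 6, 7, 7, 7, 7, 7, 7, 7]
col_indices = [1, 2, 3, 4, 5, 6, 7]
n = len(row_offsets) - 1

# One work item per physical node versus slices of at most two edges.
physical_items = [(v, row_offsets[v], row_offsets[v + 1]) for v in range(n)]
virtual_items = virtual_split(row_offsets, k=2)

levels_physical = bfs_levels(row_offsets, col_indices, 0, physical_items)
levels_virtual = bfs_levels(row_offsets, col_indices, 0, virtual_items)
```

On this toy graph the largest work item shrinks from six edges to two, and both runs produce the same BFS levels, which mirrors the abstract's requirement that the transformed graph still yield the same results. On a GPU, each work item would typically map to one thread, so a bound on slice size limits how unevenly work is spread across the threads of a warp.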

Index Terms

Tigr: Transforming Irregular Graphs for GPU-Friendly Graph Processing
1. Computing methodologies
  1. Parallel computing methodologies

Published in
ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
March 2018
827 pages
ISBN:9781450349116
DOI:10.1145/3173162
General Chairs: Xipeng Shen (North Carolina State University, USA), James Tuck (North Carolina State University, USA)
Program Chairs: Ricardo Bianchini (Microsoft Research, USA), Vivek Sarkar (Georgia Institute of Technology, USA)
ACM SIGPLAN Notices Volume 53, Issue 2
ASPLOS '18
February 2018
809 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3296957
Editor: Matthew Fluet (Rochester Institute of Technology)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
GPU
SIMD
graph transformation
irregularity
power-law graph
vertex-centric graph processing
Qualifiers
- research-article
Acceptance Rates
ASPLOS '18 paper acceptance rate: 56 of 319 submissions, 18%
Overall acceptance rate: 535 of 2,713 submissions, 20%