Abstract
To hide the complexity of the underlying system, graph processing frameworks ask programmers to specify graph computations in user-defined functions (UDFs) of a graph-oriented programming model. Due to the nature of distributed execution, current frameworks cannot precisely enforce the semantics of UDFs, which leads to unnecessary computation and communication. This exemplifies a gap between the programming model and runtime execution. This article proposes novel graph processing frameworks for distributed systems and processing-in-memory (PIM) architectures that precisely enforce loop-carried dependency; i.e., when a condition is satisfied by a neighbor, all following neighbors can be skipped. Our approach instruments the UDFs to express the loop-carried dependency; the distributed execution framework then enforces the precise semantics by performing dependency propagation dynamically. Enforcing loop-carried dependency requires sequential processing of the neighbors of each vertex, which are distributed across different nodes. We propose circulant scheduling in the framework to allow different nodes to process disjoint sets of edges/vertices in parallel while satisfying the sequential requirement. The technique achieves an excellent trade-off between precise semantics and parallelism: the benefits of eliminating unnecessary computation and communication offset the reduced parallelism. We implement a new distributed graph processing framework, SympleGraph, and two variants of runtime systems, GraphS and GraphSR, for PIM-based graph processing architectures, which significantly outperform the state of the art.
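The interaction between loop-carried dependency and circulant scheduling can be illustrated with a small single-process simulation. The sketch below is not the SympleGraph implementation; it assumes a bottom-up BFS step as the UDF, a hypothetical partitioning in which vertex `v` belongs to partition `v % num_nodes` and each edge is stored on the node that owns its source, and a `satisfied` set standing in for the dependency state that would be propagated between nodes. All function and variable names are illustrative.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]  # (source u, destination v)


def circulant_schedule(num_nodes: int) -> List[List[int]]:
    """schedule[step][node] = destination partition that `node` handles at `step`.

    Every row is a permutation, so nodes work on disjoint partitions within a
    step, and each partition is visited by every node exactly once overall.
    """
    return [[(node + step) % num_nodes for node in range(num_nodes)]
            for step in range(num_nodes)]


def bottom_up_step(edges: List[Edge], frontier: Set[int], unvisited: Set[int],
                   num_nodes: int, precise: bool) -> Tuple[Dict[int, int], int]:
    """Simulate one bottom-up BFS step; return (parents found, edges examined)."""
    # local[node][v] = in-neighbors of v stored on `node` (assumed layout).
    local: List[Dict[int, List[int]]] = [{} for _ in range(num_nodes)]
    for u, v in edges:
        local[u % num_nodes].setdefault(v, []).append(u)

    parent: Dict[int, int] = {}
    satisfied: Set[int] = set()  # dependency state carried from node to node
    examined = 0
    for row in circulant_schedule(num_nodes):
        # Nodes in one step touch disjoint partitions, so running them one
        # after another here is equivalent to running them in parallel.
        for node, part in enumerate(row):
            for v in sorted(unvisited):
                if v % num_nodes != part:
                    continue
                if precise and v in satisfied:
                    continue  # an earlier node already satisfied the condition
                for u in local[node].get(v, []):
                    examined += 1
                    if u in frontier:
                        parent.setdefault(v, u)
                        satisfied.add(v)
                        break  # the loop-carried dependency: skip the rest
    return parent, examined
```

With three nodes, a frontier of `{0, 1, 2}`, and unvisited vertices `{3, 4, 5}` that each have all three frontier vertices as in-neighbors, the precise variant examines 3 edges while the variant that only breaks locally examines 9, and both find the same parents. In a real distributed setting the `satisfied` state would have to be communicated between steps, which is the cost traded against the skipped work.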
Index Terms
- Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee