DOI: 10.1145/3587135.3592196
research-article

High-Level Synthesis of Irregular Applications: A Case Study on Influence Maximization

Published: 04 August 2023

ABSTRACT

FPGAs are promising platforms for accelerating irregular applications due to their ability to implement highly specialized hardware designs for each kernel. However, the design and implementation of FPGA-accelerated kernels can take several months using hardware description languages. High-Level Synthesis (HLS) tools provide fast, high-quality results for regular applications, but lack the support to effectively accelerate more irregular, complex workloads. This work analyzes the challenges and benefits of using a commercial state-of-the-art HLS tool and its available optimizations to accelerate graph sampling. We evaluate the resulting designs and their effectiveness when deployed in a state-of-the-art heterogeneous framework that implements the Influence Maximization with Martingales (IMM) algorithm, a complex graph analytics algorithm. We discuss future opportunities for improvement in hardware, HLS tools, and hardware/software co-design methodology to better support complex irregular applications such as IMM.
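To make the notion of an irregular graph sampling kernel concrete, the following is a minimal sketch of one sampling step, assuming the kernel corresponds to the random reverse-reachable set generation that IMM performs and assuming the independent cascade diffusion model. The function name `sample_rr_set`, the `in_edges` adjacency layout, and the per-edge probabilities are hypothetical choices for illustration; this is not the authors' implementation or the HLS code evaluated in the paper.

```python
import random
from collections import deque


def sample_rr_set(in_edges, num_vertices, rng):
    """Sample one reverse-reachable set by a randomized reverse BFS.

    in_edges maps a vertex v to a list of (u, p) pairs, one per edge u -> v
    with activation probability p (hypothetical layout for this sketch).
    """
    root = rng.randrange(num_vertices)  # uniformly random starting vertex
    visited = {root}
    frontier = deque([root])
    while frontier:
        v = frontier.popleft()
        # The neighbor list length and the vertices it touches are
        # data-dependent, which is what makes the kernel irregular.
        for u, p in in_edges.get(v, ()):
            if u not in visited and rng.random() < p:
                visited.add(u)
                frontier.append(u)
    return visited


if __name__ == "__main__":
    # Chain 0 -> 1 -> 2 with every edge kept with probability 0.5.
    chain = {1: [(0, 0.5)], 2: [(1, 0.5)]}
    print(sample_rr_set(chain, 3, random.Random(0)))
```

The loop bounds and memory accesses depend on the random edge tests and on the graph structure, so the amount of work per sample cannot be known ahead of time. This is the kind of behavior that is difficult to pipeline statically.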


Published in

CF '23: Proceedings of the 20th ACM International Conference on Computing Frontiers
May 2023
419 pages
ISBN: 9798400701405
DOI: 10.1145/3587135

Copyright © 2023 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

CF '23 Paper Acceptance Rate: 24 of 66 submissions, 36%
Overall Acceptance Rate: 240 of 680 submissions, 35%