High-Level Synthesis of Irregular Applications: A Case Study on Influence Maximization

Authors:
Reece Neff, North Carolina State University, Raleigh, NC, USA (ORCID: 0000-0002-7137-7599)
Marco Minutoli, Pacific Northwest National Laboratory, Richland, WA, USA (ORCID: 0000-0002-4220-1420)
Antonino Tumeo, Pacific Northwest National Laboratory, Richland, WA, USA (ORCID: 0000-0001-9452-120X)
Michela Becchi, North Carolina State University, Raleigh, NC, USA (ORCID: 0000-0001-8353-2915)
CF '23: Proceedings of the 20th ACM International Conference on Computing Frontiers, May 2023, Pages 12–22
https://doi.org/10.1145/3587135.3592196

Published: 04 August 2023

ABSTRACT

FPGAs are promising platforms for accelerating irregular applications because they can implement highly specialized hardware designs for each kernel. However, designing and implementing FPGA-accelerated kernels with hardware design languages can take several months. High-Level Synthesis (HLS) tools quickly produce high-quality results for regular applications, but they lack the support needed to effectively accelerate more irregular, complex workloads. This work analyzes the challenges and benefits of using a commercial state-of-the-art HLS tool and its available optimizations to accelerate graph sampling. We evaluate the resulting designs and their effectiveness when deployed in a state-of-the-art heterogeneous framework that implements Influence Maximization with Martingales (IMM), a complex graph analytics algorithm. We discuss future opportunities for improvement in hardware, HLS tools, and hardware/software co-design methodology to better support complex irregular applications such as IMM.

Index Terms

1. Hardware
  1. Electronic design automation
    1. High-level and register-transfer level synthesis
2. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms

Published in
CF '23: Proceedings of the 20th ACM International Conference on Computing Frontiers
May 2023
419 pages
ISBN: 9798400701405
DOI: 10.1145/3587135
General Chairs: Andrea Bartolini (Università di Bologna, IT), Kristian Rietveld (Leiden University, NL)
Program Chairs: Catherine Schuman (University of Tennessee, US), Jose Moreira (IBM, US)
Copyright © 2023 ACM
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Graph Algorithms
High-Level Synthesis
Influence Maximization
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
CF '23 Paper Acceptance Rate: 24 of 66 submissions, 36%
Overall Acceptance Rate: 240 of 680 submissions, 35%