DOI: 10.1145/3458817.3476216

HPAC: evaluating approximate computing techniques on HPC OpenMP applications

Published: 13 November 2021

ABSTRACT

As we approach the limits of Moore's law, researchers are exploring new paradigms for future high-performance computing (HPC) systems. Approximate computing has gained traction by promising to deliver substantial computing power. However, due to the stringent accuracy requirements of HPC scientific applications, the broad adoption of approximate computing methods in HPC requires an in-depth understanding of the application's amenability to approximations.

We develop HPAC, a framework with compiler and runtime support for code annotation and transformation and for accuracy vs. performance trade-off analysis of OpenMP HPC applications. We use HPAC to perform an in-depth analysis of the effectiveness of approximate computing techniques when applied to HPC applications. The results reveal the possible performance gains of approximation and its interplay with parallel execution. For instance, in the LULESH proxy application, approximation provides substantial performance gains due to the reduction of memory accesses. However, in the leukocyte benchmark, approximation induces load imbalance in the parallel execution, thus limiting the performance gains.
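
To make the accuracy vs. performance trade-off concrete, the following is a minimal hand-written sketch of one well-known approximation technique, loop perforation, applied to an OpenMP reduction in C. It is an illustration only and does not reproduce HPAC's annotation syntax or implementation; the abstract does not name specific techniques, and the function names `sum_exact` and `sum_perforated` and the `stride` parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Exact baseline: every element is read and accumulated. */
double sum_exact(const double *x, size_t n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += x[i];
    }
    return sum;
}

/*
 * Perforated variant (hypothetical helper): only every `stride`-th
 * element is read, and the partial sum is rescaled by n / visited
 * to estimate the full sum. A larger stride means fewer memory
 * accesses and a potentially larger error.
 */
double sum_perforated(const double *x, size_t n, size_t stride) {
    assert(stride >= 1);
    double sum = 0.0;
    long visited = 0;
    long step = (long)stride;
    #pragma omp parallel for reduction(+:sum, visited)
    for (long i = 0; i < (long)n; i += step) {
        sum += x[i];
        visited++;
    }
    /* Empty input: nothing was visited, so report zero. */
    return visited ? sum * ((double)n / (double)visited) : 0.0;
}
```

For the input `{1, 2, 3, 4}`, `sum_exact` returns 10.0, while `sum_perforated` with a stride of 2 reads only the first and third elements and returns 8.0: half the memory accesses in exchange for a 20% error on this input. The pragmas are ignored when the code is compiled without OpenMP support, so the sketch also builds as plain serial C.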

Supplemental Material

HPAC Evaluating Approximate Computing Techniques on HPC OpenMP Applications.mp4 (mp4, 135.5 MB)

    Published in

      SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
      November 2021
      1493 pages
      ISBN: 9781450384421
      DOI: 10.1145/3458817

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from the publisher.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%
