
Experiences readying applications for Exascale

Published: 11 November 2023
DOI: 10.1145/3581784.3607065

ABSTRACT

The advent of Exascale computing invites an assessment of existing best practices for developing application readiness on the world's largest supercomputers. This work details observations from the last four years of preparing scientific applications to run on the Oak Ridge Leadership Computing Facility's (OLCF) Frontier system. This paper addresses a range of software topics, including programmability, tuning, and portability considerations, that are key to moving applications from existing systems to future installations. A set of representative workloads provides case studies for general system and software testing. We evaluate the use of early access systems for development across several generations of hardware. Finally, we discuss how best practices were identified and disseminated to the community through a wide range of activities, including user guides and training sessions. We conclude with recommendations for ensuring application readiness on future leadership computing systems.


Published in

SC '23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2023, 1428 pages
ISBN: 9798400701092
DOI: 10.1145/3581784

Copyright © 2023 ACM. Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher: Association for Computing Machinery, New York, NY, United States