DOI: 10.1145/3458817.3476169

SV-sim: scalable PGAS-based state vector simulation of quantum circuits

Published: 13 November 2021

ABSTRACT

High-performance simulation of quantum circuits on classical HPC systems remains imperative in the NISQ era. Observing that the major obstacle to scalable state-vector quantum simulation is the massive, fine-grained, irregular data exchange with remote nodes, in this paper we present SV-Sim, which applies PGAS-based communication models (i.e., direct peer access for intra-node CPUs/GPUs and SHMEM for inter-node CPU/GPU clusters) to efficient general-purpose quantum circuit simulation. Through an orchestrated design based on device function pointers, SV-Sim abstracts various quantum gates across multiple heterogeneous backends, including IBM/Intel/AMD CPUs, NVIDIA/AMD GPUs, and Intel Xeon Phi, in a unified framework, while still delivering outstanding performance and a tractable interface to higher-level quantum programming environments such as IBM Qiskit, Microsoft Q#, and Google Cirq. By circumventing the lack of polymorphism in GPUs and leveraging device-initiated one-sided communication, SV-Sim can process circuits that are dynamically generated in Python using a single GPU/CPU kernel, without expensive JIT compilation or runtime parsing, which significantly reduces programming complexity and improves performance for QC simulation. This is especially appealing for variational quantum algorithms, whose circuits are synthesized online in each iteration. Evaluations on the latest NVIDIA DGX-A100, V100-DGX-2, ALCF Theta, OLCF Spock, and OLCF Summit HPC systems show that SV-Sim delivers scalable performance on various state-of-the-art HPC platforms, offering a useful tool for quantum algorithm validation and verification. SV-Sim has been released at http://github.com/pnnl/sv-sim. A version specially tuned for Q#/QDK is also provided.
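
To make the communication pattern described in the abstract concrete, the following is a minimal sketch in plain Python. It is not SV-Sim's code or API. It assumes a block distribution of the state vector across a power-of-two number of partitions, and every name in it (`GATES`, `owner`, `apply_gate`, `run`) is hypothetical. It shows two things: a circuit held as plain data and dispatched through a gate table (loosely analogous to dispatch through function pointers, so no per-circuit code generation is needed), and how a gate on a high-order qubit pairs amplitudes that live on different partitions, which is where remote data exchange would arise.

```python
import math

# Illustrative sketch only; names and layout are assumptions, not SV-Sim's API.
H = 1.0 / math.sqrt(2.0)

# Gate table: gate name -> 2x2 matrix in row-major order.
GATES = {
    "h": ((H, H), (H, -H)),
    "x": ((0.0, 1.0), (1.0, 0.0)),
    "z": ((1.0, 0.0), (0.0, -1.0)),
}


def owner(index, n_qubits, num_pes):
    """Partition owning amplitude `index` under an assumed block distribution."""
    block = (1 << n_qubits) // num_pes
    return index // block


def apply_gate(state, n_qubits, num_pes, name, qubit):
    """Apply a single-qubit gate in place.

    Returns the number of amplitude pairs whose two members belong to
    different partitions (pairs that would need remote access).
    """
    (a, b), (c, d) = GATES[name]
    stride = 1 << qubit
    remote_pairs = 0
    for i in range(1 << n_qubits):
        if i & stride:
            continue  # visit each pair once, from its lower index
        j = i | stride  # partner amplitude differs only in bit `qubit`
        if owner(i, n_qubits, num_pes) != owner(j, n_qubits, num_pes):
            remote_pairs += 1
        s0, s1 = state[i], state[j]
        state[i] = a * s0 + b * s1
        state[j] = c * s0 + d * s1
    return remote_pairs


def run(circuit, n_qubits, num_pes):
    """Simulate a circuit given as a list of (gate_name, qubit) records."""
    state = [0j] * (1 << n_qubits)
    state[0] = 1 + 0j  # start in |0...0>
    remote = [apply_gate(state, n_qubits, num_pes, g, q) for g, q in circuit]
    return state, remote


if __name__ == "__main__":
    state, remote = run([("h", 0), ("h", 2)], n_qubits=3, num_pes=2)
    print(remote)
```

With 3 qubits split over 2 partitions, each partition holds 4 amplitudes, so a gate on qubit 0 or 1 touches only local pairs while a gate on qubit 2 pairs every amplitude with one on the other partition. A real distributed simulator would replace the in-memory list with partitioned memory and the cross-partition reads and writes with one-sided get/put operations.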

Published in

SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2021
1493 pages
ISBN: 9781450384421
DOI: 10.1145/3458817

Copyright © 2021 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

research-article

Acceptance Rates

Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%