DOI: 10.1145/3579371.3589080
research-article

Spy in the GPU-box: Covert and Side Channel Attacks on Multi-GPU Systems

Published: 17 June 2023

ABSTRACT

The deep learning revolution has been enabled in large part by GPUs, and more recently accelerators, which make it possible to carry out computationally demanding training and inference in acceptable times. As the size of machine learning networks and workloads continues to increase, multi-GPU machines have emerged as an important platform offered on High Performance Computing and cloud data centers. Since these machines are shared among multiple users, it becomes increasingly important to protect applications against potential attacks. In this paper, we explore the vulnerability of Nvidia's DGX multi-GPU machines to covert and side channel attacks. These machines consist of a number of discrete GPUs that are interconnected through a combination of custom interconnect (NVLink) and PCIe connections. We reverse engineer the interconnected cache hierarchy and show that it is possible for an attacker on one GPU to cause contention on the L2 cache of another GPU. We use this observation to first develop a covert channel attack across two GPUs, achieving the best bandwidth of around 4 MB/s. We also develop a prime and probe attack on a remote GPU allowing an attacker to recover the cache access pattern of another workload. This access pattern can be used in any number of side channel attacks: we demonstrate a proof of concept attack that fingerprints the application running on the remote GPU, with high accuracy. We also develop a proof of concept attack to extract hyperparameters of a machine learning workload. Our work establishes for the first time the vulnerability of these machines to microarchitectural attacks and can guide future research to improve their security.
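The prime and probe idea behind the covert channel can be illustrated with a toy software model. The sketch below is not the paper's implementation: it simulates a small set-associative cache with LRU replacement in Python, and the cache geometry, the names `SetAssocCache`, `eviction_set`, and `transmit`, and the sender's address region are all hypothetical. A real attack infers hits and misses from access latency on the remote GPU's L2 cache; here the model simply reports them.

```python
from collections import OrderedDict


class SetAssocCache:
    """Toy set-associative cache with LRU replacement (hypothetical geometry)."""

    def __init__(self, num_sets=8, ways=4, line_size=128):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        """Touch addr; return True on a hit, False on a miss."""
        line = addr // self.line_size
        cache_set = self.sets[line % self.num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)      # mark as most recently used
            return True
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)    # evict the least recently used line
        cache_set[line] = None
        return False


def eviction_set(cache, set_index, base=0):
    """Addresses of `ways` distinct lines that all map to set_index."""
    stride = cache.num_sets * cache.line_size
    return [base + set_index * cache.line_size + i * stride
            for i in range(cache.ways)]


def prime(cache, addrs):
    """Fill the target set with the receiver's own lines."""
    for addr in addrs:
        cache.access(addr)


def probe(cache, addrs):
    """Re-access the lines in reverse order and count misses."""
    return sum(0 if cache.access(addr) else 1 for addr in reversed(addrs))


def transmit(bits, set_index=3):
    """Sender touches one line in the target set to signal 1, idles for 0."""
    cache = SetAssocCache()
    receiver_lines = eviction_set(cache, set_index, base=0)
    sender_addr = (1 << 20) + set_index * cache.line_size  # assumed separate region
    received = []
    for bit in bits:
        prime(cache, receiver_lines)
        if bit:
            cache.access(sender_addr)        # evicts one of the receiver's lines
        received.append(1 if probe(cache, receiver_lines) else 0)
    return received


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1]
    print(transmit(message) == message)
```

Probing in reverse order keeps a single eviction from cascading into a miss on every line under LRU, which is why the receiver in this model sees a clean difference between a 1 and a 0.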

Published in

ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture
June 2023
1225 pages
ISBN: 9798400700958
DOI: 10.1145/3579371

Copyright © 2023 ACM

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 543 of 3,203 submissions, 17%