Abstract
General-purpose Graphics Processing Units (GPUs) are widely used to accelerate computationally intensive workloads. They have also served as cryptographic engines, where the GPU's Single Instruction Multiple Thread (SIMT) model improves encryption and decryption throughput. RSA, a widely used public-key cipher, has been ported to GPUs for signing and decrypting large files. Although these ports significantly improve performance, RSA on GPUs is vulnerable to side-channel timing attacks, an exposure that previous studies have overlooked.
GPUs tend to be naturally resilient to side-channel attacks because they execute a large number of concurrent threads, performing many RSA operations on different data in parallel. With thousands of threads running at once, this degree of parallelism introduces a significant amount of noise into the timing channel.
In this work, we build a timing model that captures the parallel characteristics of an RSA public-key cipher implemented on a GPU. We consider implementations optimized with Montgomery multiplication and sliding-window exponentiation. Our timing model accounts for the challenges of parallel execution, complications that do not arise on single-threaded computing platforms. Based on this model, we launch successful timing attacks on RSA running on a GPU and extract the RSA private key. We also present an effective error detection and correction mechanism. Our results demonstrate that GPU acceleration of RSA is vulnerable to side-channel timing attacks. We propose several countermeasures to defend against this class of attacks.
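The abstract does not say where the timing variation comes from. A well-known source in Montgomery-based RSA is the conditional final subtraction inside each Montgomery multiplication, which runs only for some operand values. The sketch below is a minimal single-threaded illustration of that effect, not the GPU implementation or timing model from the paper. For brevity it uses plain binary square-and-multiply in place of sliding-window exponentiation, and all function names (`montgomery_setup`, `mont_mul`, `mont_pow`) are hypothetical.

```python
def montgomery_setup(n, bits):
    """Precompute Montgomery constants for an odd modulus n with R = 2**bits > n."""
    r = 1 << bits
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime == -1 (mod R); needs Python 3.8+
    return r, n_prime


def mont_mul(a, b, n, r, n_prime, bits):
    """Return (a * b * R^-1 mod n, extra), where extra is True if the
    data-dependent final subtraction was executed."""
    t = a * b
    m = ((t & (r - 1)) * n_prime) & (r - 1)
    u = (t + m * n) >> bits  # exact division by R; u lies in [0, 2n)
    if u >= n:
        return u - n, True   # extra reduction: the input-dependent step
    return u, False


def mont_pow(base, exp, n, bits):
    """Left-to-right binary exponentiation in Montgomery form.
    Returns (base**exp mod n, number of extra reductions performed)."""
    r, n_prime = montgomery_setup(n, bits)
    x = (base * r) % n   # base converted to Montgomery form
    acc = r % n          # Montgomery form of 1
    extras = 0
    for bit in bin(exp)[2:]:
        acc, e = mont_mul(acc, acc, n, r, n_prime, bits)    # square
        extras += e
        if bit == "1":
            acc, e = mont_mul(acc, x, n, r, n_prime, bits)  # multiply
            extras += e
    result, e = mont_mul(acc, 1, n, r, n_prime, bits)       # leave Montgomery form
    extras += e
    return result, extras


if __name__ == "__main__":
    n = 1000003 * 999983  # toy odd modulus, far too small for real RSA
    bits = n.bit_length()
    for base in (2, 12345, 987654321):
        value, extras = mont_pow(base, 65537, n, bits)
        print(base, value, extras)
```

Running the script prints, for each base, the result and the number of extra reductions. That count depends on both the base and the exponent bits, which is the kind of input-dependent variation a timing model has to separate from the noise of concurrent threads.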
Index Terms
- Side-channel Timing Attack of RSA on a GPU