ABSTRACT
This paper uses Reinforcement Learning (RL) to automate the Hardware Trojan (HT) insertion process and thereby eliminate the human biases that limit the development of robust HT detection methods. An RL agent explores the design space and identifies the circuit locations best suited to keeping inserted HTs hidden. To this end, a digital circuit is modeled as an environment in which the RL agent inserts HTs so as to maximize its cumulative reward. Our toolset inserts combinational HTs into the ISCAS-85 benchmark suite with varying HT sizes and triggering conditions. Experimental results show that the toolset achieves high input coverage rates (100% on two benchmark circuits), confirming its effectiveness. The inserted HTs also exhibit a minimal footprint and rare activation probabilities.
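The core idea above — treating a netlist as an RL environment whose actions choose trigger nets and whose reward favors rarely activated triggers — can be illustrated with a minimal sketch. Everything below (class names, the independence-based rarity model, the greedy baseline) is an illustrative assumption, not the paper's actual implementation:

```python
class TrojanInsertionEnv:
    """Toy environment: the agent builds an HT trigger by picking nets
    whose joint activation probability is as low as possible."""

    def __init__(self, signal_probs, trigger_width=2):
        # signal_probs: net name -> probability the net evaluates to 1
        self.signal_probs = signal_probs
        self.trigger_width = trigger_width
        self.reset()

    def reset(self):
        self.chosen = []
        return tuple(self.chosen)

    def step(self, net):
        """Action = add one net to the trigger.
        Reward grows as the trigger's activation probability shrinks."""
        self.chosen.append(net)
        p_activate = 1.0
        for n in self.chosen:
            p_activate *= self.signal_probs[n]  # assumes independent nets
        done = len(self.chosen) == self.trigger_width
        reward = 1.0 - p_activate  # rarer trigger -> higher reward
        return tuple(self.chosen), reward, done


def greedy_trigger(env):
    """Greedy baseline an RL policy should match or beat:
    pick the rarest nets first."""
    env.reset()
    state, reward = (), 0.0
    for net in sorted(env.signal_probs,
                      key=env.signal_probs.get)[:env.trigger_width]:
        state, reward, done = env.step(net)
    return state, reward


env = TrojanInsertionEnv({"n1": 0.5, "n2": 0.01, "n3": 0.02})
state, reward = greedy_trigger(env)
# Greedy selects the two rarest nets; final reward = 1 - 0.01 * 0.02
```

In the paper's full setting, an episodic policy-gradient learner such as PPO would replace the greedy loop, and the reward would additionally account for the HT's footprint; this sketch only captures the state/action/reward framing.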