Neurocomputing

Volume 417, 5 December 2020, Pages 357-370

Acoustic-decoy: Detection of adversarial examples through audio modification on speech recognition system

https://doi.org/10.1016/j.neucom.2020.07.101
Open access under a Creative Commons license

Abstract

Deep neural networks (DNNs) perform well on recognition and prediction tasks such as image recognition, speech recognition, video recognition, and pattern analysis. However, adversarial examples, created by inserting a small amount of noise into original samples, pose a serious threat because they can cause the DNN to misclassify. Adversarial examples have been studied primarily in the image domain, but their effect in the audio domain is now drawing considerable interest as well. For example, adding a small distortion, imperceptible to humans, to an original audio sample can create an audio adversarial example that humans hear as error-free but that a machine misrecognizes. A method of defense against audio adversarial examples is therefore necessary. In this paper, we propose an acoustic-decoy method for detecting audio adversarial examples. Its key feature is that it adds well-formalized distortions, through audio modification, that are sufficient to change the classification result of an adversarial example but do not affect the classification result of an original sample. Experimental results show that the proposed scheme can detect adversarial examples by reducing the similarity rate of an adversarial example to 6.21%, 1.27%, and 0.66% using low-pass filtering (with a 12 dB roll-off), 8-bit reduction, and audio silence removal, respectively. By comparing a modified sample against the initial audio sample, the scheme detects audio adversarial examples with a success rate of 97%.
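The three audio modifications named above can be sketched as simple signal transforms. The following is a minimal illustration, not the authors' implementation: the bit depth, silence threshold, and moving-average window are assumptions, and the moving-average filter merely stands in for the paper's low-pass filter with a 12 dB roll-off. In the actual scheme, both the incoming sample and its modified copy would be transcribed by the speech recognition model, and a large drop in transcription similarity would flag the sample as adversarial.

```python
import numpy as np

def bit_reduce(x, bits=8):
    """Quantize samples in [-1, 1] to the given bit depth (8-bit reduction)."""
    levels = 2 ** bits
    return np.round((x + 1.0) / 2.0 * (levels - 1)) / (levels - 1) * 2.0 - 1.0

def remove_silence(x, threshold=0.01):
    """Drop samples whose amplitude falls below a silence threshold."""
    return x[np.abs(x) > threshold]

def low_pass(x, window=5):
    """Crude moving-average low-pass filter (stand-in for a 12 dB roll-off filter)."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

# A synthetic one-second "audio" signal at 16 kHz in place of a real recording.
sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
audio = 0.5 * np.sin(2 * np.pi * 440 * t)

# Apply the three modifications in sequence to produce the decoy copy.
modified = low_pass(bit_reduce(remove_silence(audio)))
```

In this sketch the decision step is omitted: one would compare the model's transcription of `audio` with its transcription of `modified` and threshold the similarity rate.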

Keywords

Machine learning
Audio modification
Audio adversarial example
Defense technology
Deep neural network (DNN)


Hyun Kwon received the B.S. degree in mathematics from the Korea Military Academy, South Korea, in 2010, the M.S. degree from the School of Computing, Korea Advanced Institute of Science and Technology (KAIST), in 2015, and the Ph.D. degree from the School of Computing, KAIST, in 2020. He is currently an assistant professor at the Korea Military Academy. His research interests include information security, machine learning, computer security, and intrusion-tolerant systems.

Hyunsoo Yoon received the B.E. degree in electronics engineering from Seoul National University, South Korea, in 1979, the M.S. degree in computer science from the Korea Advanced Institute of Science and Technology (KAIST) in 1981, and the Ph.D. degree in computer and information science from The Ohio State University, Columbus, Ohio, in 1988. From 1988 to 1989, he was a member of technical staff at AT&T Bell Labs. Since 1989, he has been a faculty member of the School of Computing at KAIST. His main research interests include wireless sensor networks, 4G networks, and network security.

Ki-Woong Park received the B.S. degree in computer science from Yonsei University, South Korea, in 2005, and the M.S. and Ph.D. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST) in 2007 and 2012, respectively. He received a 2009–2010 Microsoft Graduate Research Fellowship. He worked for the National Security Research Institute as a senior researcher. He is currently a professor in the Department of Computer and Information Security at Sejong University. His research interests include security issues for cloud and mobile computing systems, as well as actual system implementation and subsequent evaluation in real computing systems.

A preliminary version of this paper was presented at ACM CCS 2019.