Towards Query-Efficient Black-Box Attacks: A Universal Dual Transferability-Based Framework

Abstract
Adversarial attacks threaten the application of deep neural networks in security-sensitive scenarios. Most existing black-box attacks fool the target model by interacting with it many times and producing global perturbations. However, not all pixels are equally crucial to the target model; treating them indiscriminately therefore inevitably inflates the query overhead. In addition, existing black-box attacks take clean samples as starting points, which further limits query efficiency. In this article, we propose a novel black-box attack framework, built on a dual transferability (DT) strategy, that perturbs only the discriminative areas of clean examples within a limited query budget. The first kind of transferability is the transferability of model interpretations: based on this property, we identify the discriminative areas of clean samples for generating local perturbations. The second is the transferability of adversarial examples, which helps us produce local pre-perturbations that further improve query efficiency. We achieve both kinds of transferability through an independent auxiliary model, incurring no extra query overhead. After identifying the discriminative areas and generating the pre-perturbations, we use the pre-perturbed samples as better starting points and further perturb them locally in a black-box manner to search for the corresponding adversarial examples. Because the DT strategy is general, the proposed framework can be applied to different types of black-box attacks. Extensive experiments show that, under various system settings, our framework significantly improves both the query efficiency and the attack success rates of existing black-box attacks.
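To make the three-stage pipeline described above concrete, the following is a minimal PyTorch sketch, not the paper's exact algorithm. The function names (`grad_cam_mask`, `pre_perturb`, `local_black_box_attack`) are hypothetical; the interpretation stage is assumed to be Grad-CAM-style, the pre-perturbation stage a masked iterative FGSM, and the query-based stage a simplified SimBA-style random coordinate search restricted to the mask.

```python
# Hypothetical sketch of a dual-transferability (DT) attack pipeline.
import torch
import torch.nn.functional as F

def grad_cam_mask(surrogate, x, target_layer, top_frac=0.1):
    """Stage 1 -- transferability of interpretations: a Grad-CAM map on an
    independent auxiliary (surrogate) model marks the discriminative area;
    only the top `top_frac` of pixels are kept as the local mask."""
    acts = {}
    hook = target_layer.register_forward_hook(
        lambda mod, inp, out: acts.update(a=out))
    logits = surrogate(x)
    hook.remove()
    a = acts["a"]
    score = logits.max(dim=1).values.sum()          # predicted-class score
    grads = torch.autograd.grad(score, a)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)  # channel importance
    cam = F.relu((weights * a).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear",
                        align_corners=False)
    thresh = torch.quantile(cam.flatten(1), 1.0 - top_frac, dim=1)
    return (cam >= thresh.view(-1, 1, 1, 1)).float()

def pre_perturb(surrogate, x, y, mask, eps=8 / 255, steps=10):
    """Stage 2 -- transferability of adversarial examples: a masked I-FGSM
    on the surrogate yields a local pre-perturbation, giving a better
    starting point without spending any queries on the target model."""
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(surrogate(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + (eps / steps) * grad.sign() * mask
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0.0, 1.0)
    return x_adv

def local_black_box_attack(query_fn, x_start, y, mask, eps=8 / 255,
                           budget=1000):
    """Stage 3 -- query-based search: SimBA-style random coordinate descent
    restricted to the masked area; `query_fn` returns the target model's
    class probabilities and is the only source of query cost."""
    x_adv, shape = x_start.clone(), x_start.shape
    coords = mask.expand_as(x_start).flatten().nonzero().squeeze(1)
    best = query_fn(x_adv)[0, y]
    for _ in range(budget):
        i = coords[torch.randint(len(coords), (1,))]
        for sign in (1.0, -1.0):
            cand = x_adv.flatten().clone()
            cand[i] = (cand[i] + sign * eps).clamp(0.0, 1.0)
            probs = query_fn(cand.view(shape))
            if probs[0, y] < best:                  # keep the improvement
                best, x_adv = probs[0, y], cand.view(shape)
                if probs.argmax(1).item() != y.item():
                    return x_adv                    # misclassified: done
                break
    return x_adv
```

Note the division of labor this sketch illustrates: stages 1 and 2 touch only the auxiliary model, so the entire query budget is spent in stage 3, and that search operates over the reduced, masked coordinate set rather than all pixels.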