Abstract
The state-of-the-art performance of deep learning models comes at a high cost for companies and institutions, due to tedious data collection and heavy processing requirements. Recently, Nagai et al. (Int J Multimed Inf Retr 7(1):3–16, 2018) and Uchida et al. (Embedding watermarks into deep neural networks, ICMR, 2017) proposed to watermark convolutional neural networks for image classification by embedding information into their weights. While this is clear progress toward model protection, the technique only allows extracting the watermark from a network that one accesses locally and in its entirety. Instead, we aim at allowing the extraction of the watermark from a neural network (or any other machine learning model) that is operated remotely and available through a service API. To this end, we propose to mark the model's action itself, slightly tweaking its decision frontiers so that a set of specific queries conveys the desired information. In the present paper, we formally introduce the problem and propose a novel zero-bit watermarking algorithm that makes use of adversarial model examples. While limiting the loss of performance of the protected model, this algorithm allows subsequent extraction of the watermark using only a few queries. We experimented with the approach on three neural networks designed for image classification, in the context of the MNIST digit recognition task.
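The extraction step sketched in the abstract can be illustrated with a minimal Python snippet. This is a hedged sketch, not the paper's exact algorithm: it assumes the remote model is exposed as a `query(x) -> label` function, that the owner holds a key set of marked inputs with their expected labels, and that the watermark is deemed present when the number of mismatching answers stays below a threshold `theta` (all names are illustrative).

```python
def extract_watermark(query, key_inputs, key_labels, theta):
    """Zero-bit watermark extraction against a remote model.

    Queries the service API with each key input and counts how many
    answers differ from the expected labels; the watermark is deemed
    present when the mismatch count stays below the threshold theta.
    """
    mismatches = sum(
        1 for x, y in zip(key_inputs, key_labels) if query(x) != y
    )
    return mismatches < theta


# Toy usage with stand-in "remote" models implemented as lookup tables.
if __name__ == "__main__":
    key_inputs = [0, 1, 2, 3]
    key_labels = [7, 7, 3, 3]
    marked_model = {0: 7, 1: 7, 2: 3, 3: 3}.get    # answers every key correctly
    unmarked_model = {0: 1, 1: 2, 2: 0, 3: 1}.get  # answers no key correctly
    print(extract_watermark(marked_model, key_inputs, key_labels, theta=1))    # True
    print(extract_watermark(unmarked_model, key_inputs, key_labels, theta=1))  # False
```

The point of the threshold is robustness: a stolen model whose frontiers were only slightly perturbed will still answer most key queries as expected, while an unrelated model will not.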
Notes
“\({\hat{k}}_w+\varepsilon\)” stands for a small modification of the parameters of \({\hat{k}}_w\) that preserves the value of the model, i.e., that does not significantly deteriorate its performance.
Code will be open-sourced on GitHub upon article acceptance.
This accuracy drop of about \(3.5\%\) is also the one tolerated by a recent work on trojaning neural networks [18].
References
Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mané D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Viégas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, Zheng X (2015) TensorFlow: large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/. Software available from tensorflow.org
Adi Y, Baum C, Cisse M, Pinkas B, Keshet J (2018) Turning your weakness into a strength: watermarking deep neural networks by backdooring. In: 27th USENIX security symposium (USENIX Security 18), pp 1615–1631
Braudaway GW, Magerlein KA, Mintzer CF (1996) Color correct digital watermarking of images. United States Patent 5530759
Carlini N, Wagner DA (2018) Audio adversarial examples: targeted attacks on speech-to-text. CoRR arXiv:1801.01944
Chang CY, Su SJ (2005) A neural-network-based robust watermarking scheme. SMC, Santa Monica
Chollet F et al. (2015) Keras. https://keras.io
Davchev T, Korres T, Fotiadis S, Antonopoulos N, Ramamoorthy S (2019) An empirical evaluation of adversarial robustness under transfer learning. In: ICML workshop on understanding and improving generalization in deep learning
Duddu V, Samanta D, Rao DV, Balas VE (2018) Stealing neural networks via timing side channels. CoRR arXiv:1812.11720
Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: ICLR
Grosse K, Manoharan P, Papernot N, Backes M, McDaniel PD (2017) On the (statistical) detection of adversarial examples. CoRR arXiv:1702.06280
Guo J, Potkonjak M (2018) Watermarking deep neural networks for embedded systems. In: 2018 IEEE/ACM international conference on computer-aided design (ICCAD), pp 1–8. https://doi.org/10.1145/3240765.3240862
Hartung F, Kutter M (1999) Multimedia watermarking techniques. Proc IEEE 87(7):1079–1107. https://doi.org/10.1109/5.771066
Le QV, Jaitly N, Hinton GE (2015) A simple way to initialize recurrent networks of rectified linear units. CoRR arXiv:1504.00941
Le Merrer E, Perez P, Trédan G (2017) Adversarial frontier stitching for remote neural network watermarking. CoRR arXiv:1711.01894
Le Merrer E, Trédan G (2019) Tampernn: efficient tampering detection of deployed neural nets. CoRR arXiv:1903.00317
LeCun Y, Cortes C, Burges CJ (1998) The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist
Li S, Neupane A, Paul S, Song C, Krishnamurthy SV, Roy-Chowdhury AK, Swami A (2018) Adversarial perturbations against real-time video classification systems. CoRR arXiv:1807.00458
Liu Y, Ma S, Aafer Y, Lee WC, Zhai J, Wang W, Zhang X (2017) Trojaning attack on neural networks. NDSS, New York
Moosavi-Dezfooli S, Fawzi A, Fawzi O, Frossard P (2017) Universal adversarial perturbations. In: CVPR
Nagai Y, Uchida Y, Sakazawa S, Satoh S (2018) Digital watermarking for deep neural networks. Int J Multimed Inf Retr 7(1):3–16
Oh SJ, Augustin M, Fritz M, Schiele B (2018) Towards reverse-engineering black-box neural networks. In: International conference on learning representations. https://openreview.net/forum?id=BydjJte0-
Papernot N, Carlini N, Goodfellow I, Feinman R, Faghri F, Matyasko A, Hambardzumyan K, Juang YL, Kurakin A, Sheatsley R, Garg A, Lin YC (2017) cleverhans v2.0.0: an adversarial machine learning library. arXiv preprint arXiv:1610.00768
Papernot N, McDaniel P, Goodfellow I, Jha S, Celik ZB, Swami A (2017) Practical black-box attacks against machine learning. In: ASIA CCS
Papernot N, McDaniel P, Jha S, Fredrikson M, Berkay Celik Z, Swami A (2015) The limitations of deep learning in adversarial settings. arXiv preprint arXiv:1511.07528
Rouhani BD, Chen H, Koushanfar F (2018) Deepsigns: A generic watermarking framework for IP protection of deep learning models. CoRR arXiv:1804.00750
Rozsa A, Günther M, Boult TE (2016) Are accuracy and robustness correlated? In: ICMLA
Sethi TS, Kantardzic M (2018) Data driven exploratory attacks on black box classifiers in adversarial domains. Neurocomputing 289:129–143. https://doi.org/10.1016/j.neucom.2018.02.007
Shafahi A, Huang WR, Studer C, Feizi S, Goldstein T (2018) Are adversarial examples inevitable? CoRR arXiv:1809.02104
Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, Summers RM (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35(5):1285–1298. https://doi.org/10.1109/TMI.2016.2528162
Tramèr F, Zhang F, Juels A, Reiter MK, Ristenpart T (2016) Stealing machine learning models via prediction apis. In: USENIX security symposium
Tramèr F, Kurakin A, Papernot N, Boneh D, McDaniel P (2017) Ensemble adversarial training: attacks and defenses. arXiv preprint arXiv:1705.07204
Uchida Y, Nagai Y, Sakazawa S, Satoh S (2017) Embedding watermarks into deep neural networks. ICMR
van den Berg E (2016) Some insights into the geometry and training of neural networks. arXiv preprint arXiv:1605.00329
Van Schyndel RG, Tirkel AZ, Osborne CF (1994) A digital watermark. In: Proceedings of 1st international conference on image processing, vol 2. IEEE, pp 86–90
Wang B, Gong NZ (2018) Stealing hyperparameters in machine learning. CoRR arXiv:1802.05351
Yuan X, He P, Zhu Q, Li X (2019) Adversarial examples: attacks and defenses for deep learning. IEEE Trans Neural Netw Learn Syst, pp 1–20. https://doi.org/10.1109/TNNLS.2018.2886017
Zhang J, Gu Z, Jang J, Wu H, Stoecklin MP, Huang H, Molloy I (2018) Protecting intellectual property of deep neural networks with watermarking. In: Proceedings of the 2018 on Asia conference on computer and communications security. ACM, pp 159–172
Zhao X, Liu Q, Zheng H, Zhao BY (2015) Towards graph watermarks. In: COSN
Acknowledgements
The authors would like to thank the reviewers for their constructive comments.
Cite this article
Le Merrer, E., Pérez, P. & Trédan, G. Adversarial frontier stitching for remote neural network watermarking. Neural Comput & Applic 32, 9233–9244 (2020). https://doi.org/10.1007/s00521-019-04434-z