ABSTRACT
We design a novel, communication-efficient, failure-robust protocol for secure aggregation of high-dimensional data. Our protocol allows a server to compute the sum of large, user-held data vectors from mobile devices in a secure manner (i.e. without learning each user's individual contribution), and can be used, for example, in a federated learning setting, to aggregate user-provided model updates for a deep neural network. We prove the security of our protocol in the honest-but-curious and active adversary settings, and show that security is maintained even if an arbitrarily chosen subset of users drop out at any time. We evaluate the efficiency of our protocol and show, by complexity analysis and a concrete implementation, that its runtime and communication overhead remain low even on large data sets and client pools. For 16-bit input values, our protocol offers $1.73 x communication expansion for 210 users and 220-dimensional vectors, and 1.98 x expansion for 214 users and 224-dimensional vectors over sending data in the clear.
Supplemental Material
- Martín Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. 2016. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 308--318. Google ScholarDigital Library
- Michel Abdalla, Mihir Bellare, and Phillip Rogaway. 2001. The oracle Diffie-Hellman assumptions and an analysis of DHIES. In Cryptographers' Track at the RSA Conference. Springer, 143--158. Google ScholarCross Ref
- Gergely Ács and Claude Castelluccia. 2011. I have a DREAM! (DiffeRentially privatE smArt Metering). In International Workshop on Information Hiding. Springer, 118--132. Google ScholarCross Ref
- Stephen Advokat. 1987. Publication Of Bork's Video Rentals Raises Privacy Issue. Chicago Tribune (1987).Google Scholar
- Toshinori Araki, Jun Furukawa, Yehuda Lindell, Ariel Nof, and Kazuma Ohara. 2016. High-Throughput Semi-Honest Secure Three-Party Computation with an Honest Majority. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS '16). ACM, New York, NY, USA, 805--817. https://doi.org/10.1145/2976749.2978331 Google ScholarDigital Library
- Michael Barbaro, Tom Zeller, and Saul Hansell. 2006. A face is exposed for AOL searcher no. 4417749. New York Times 9, 2008 (2006).Google Scholar
- Mihir Bellare and Chanathip Namprempre. 2000. Authenticated encryption: Relations among notions and analysis of the generic composition paradigm. In International Conference on the Theory and Application of Cryptology and Information Security. Springer, 531--545. Google ScholarCross Ref
- Michael Ben-Or, Shafi Goldwasser, and Avi Wigderson. 1988. Completeness theorems for non-cryptographic fault-tolerant distributed computation. In Proceedings of the twentieth annual ACM symposium on Theory of computing. ACM, 1--10. Google ScholarDigital Library
- Manuel Blum and Silvio Micali. 1984. How to generate cryptographically strong sequences of pseudorandom bits. SIAM journal on Computing 13, 4 (1984), 850--864. Google ScholarDigital Library
- Peter Bogetoft, Dan Lund Christensen, Ivan Damgård, Martin Geisler, Thomas Jakobsen, Mikkel Krøigaard, Janus Dam Nielsen, Jesper Buus Nielsen, Kurt Nielsen, Jakob Pagter, et al. 2009. Secure multiparty computation goes live. In International Conference on Financial Cryptography and Data Security. Springer, 325--343. Google ScholarDigital Library
- Elette Boyle, Kai-Min Chung, and Rafael Pass. 2015. Large-Scale Secure Computation: Multi-party Computation for (Parallel) RAM Programs. Springer Berlin Heidelberg, Berlin, Heidelberg, 742--762. https://doi.org/10.1007/978-3-662-48000-7_36Google Scholar
- Martin Burkhart, Mario Strasser, Dilip Many, and Xenofontas Dimitropoulos. 2010. SEPIA: Privacy-preserving aggregation of multi-domain network events and statistics. Network 1 (2010), 101101.Google Scholar
- T-H Hubert Chan, Elaine Shi, and Dawn Song. 2012. Privacy-preserving stream aggregation with fault tolerance. In International Conference on Financial Cryptography and Data Security. Springer, 200--214.Google ScholarCross Ref
- David Chaum. 1988. The dining cryptographers problem: unconditional sender and recipient untraceability. Journal of Cryptology 1, 1 (1988), 65--75. Google ScholarDigital Library
- Jianmin Chen, Rajat Monga, Samy Bengio, and Rafal Jozefowicz. 2016. Revisiting Distributed Synchronous SGD. In ICLR Workshop Track. https://arxiv.org/abs/1604.00981Google Scholar
- Henry Corrigan-Gibbs and Dan Boneh. 2017. Prio: Private, Robust, and Scalable Computation of Aggregate Statistics. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17). USENIX Association, Boston, MA, 259--282. https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/corrigan-gibbsGoogle Scholar
- Henry Corrigan-Gibbs, David Isaac Wolinsky, and Bryan Ford. 2013. Proactively Accountable Anonymous Messaging in Verdict.. In USENIX Security. 147--162.Google Scholar
- Ivan Damgård, Valerio Pastro, Nigel Smart, and Sarah Zakarias. 2012. Multi-party computation from somewhat homomorphic encryption. In Advances in Cryptology--CRYPTO 2012. Springer, 643--662. Google ScholarDigital Library
- Whitfield Diffie and Martin Hellman. 1976. New directions in cryptography. IEEE transactions on Information Theory 22, 6 (1976), 644--654. Google ScholarDigital Library
- John C Duchi, Michael I Jordan, and Martin J Wainwright. 2013. Local privacy and statistical minimax rates. In Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on. IEEE, 429--438.Google ScholarDigital Library
- Cynthia Dwork. 2006. Differential Privacy, In 33rd International Colloquium on Automata, Languages and Programming, part II (ICALP 2006). 4052, 1--12. https://www.microsoft.com/en-us/research/publication/differential-privacy/Google ScholarDigital Library
- Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. 2006. Our Data, Ourselves: Privacy Via Distributed Noise Generation.. In Eurocrypt, Vol. 4004. Springer, 486--503.Google Scholar
- Cynthia Dwork and Aaron Roth. 2014. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science 9, 3--4 (2014), 211--407.Google Scholar
- Tariq Elahi, George Danezis, and Ian Goldberg. 2014. Privex: Private collection of traffic statistics for anonymous communication networks. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM, 1068--1079. Google ScholarDigital Library
- Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. 2014. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. ACM, 1054--1067. Google ScholarDigital Library
- Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 1322--1333. Google ScholarDigital Library
- Oded Goldreich, Silvio Micali, and Avi Wigderson. 1987. How to play any mental game. In Proceedings of the nineteenth annual ACM symposium on Theory of computing. ACM, 218--229. Google ScholarDigital Library
- Philippe Golle and Ari Juels. 2004. Dining cryptographer revisited. In International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 456--473. Google ScholarCross Ref
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning.(2016). Book in preparation for MIT Press.Google Scholar
- Joshua Goodman, Gina Venolia, Keith Steury, and Chauncey Parker. 2002. Language modeling for soft keyboards. In Proceedings of the 7th international conference on Intelligent user interfaces. ACM, 194--195. Google ScholarDigital Library
- Slawomir Goryczka and Li Xiong. 2015. A comprehensive comparison of multiparty secure additions with differential privacy. IEEE Transactions on Dependable and Secure Computing (2015).Google Scholar
- Shai Halevi, Yehuda Lindell, and Benny Pinkas. 2011. Secure computation on the web: Computing without simultaneous interaction. In Annual Cryptology Conference. Springer, 132--150. Google ScholarCross Ref
- Rob Jansen and Aaron Johnson. 2016. Safely Measuring Tor. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 1553--1567. Google ScholarDigital Library
- Marek Jawurek, Florian Kerschbaum, and Claudio Orlandi. 2013. Zero-knowledge using garbled circuits: how to prove non-algebraic statements efficiently. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 955--966. Google ScholarDigital Library
- Jakub KonečnỴ, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. 2016. Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492 (2016).Google Scholar
- Young Hyun Kwon. 2015. Riffle: An efficient communication system with strong anonymity. Ph.D. Dissertation. Massachusetts Institute of Technology.Google Scholar
- Vasileios Lampos, Andrew C Miller, Steve Crossan, and Christian Stefansen. 2015. Advances in nowcasting influenza-like illness rates using search query logs. Scientific reports 5 (2015), 12760. Google ScholarCross Ref
- Iraklis Leontiadis, Kaoutar Elkhiyaoui, and Refik Molva. 2014. Private and Dynamic Time-Series Data Aggregation with Trust Relaxation. Springer International Publishing, Cham, 305--320. https://doi.org/10.1007/978-3-319-12280-9_20 Google ScholarDigital Library
- Iraklis Leontiadis, Kaoutar Elkhiyaoui, Melek Önen, and Refik Molva. 2015. PUDA -- Privacy and Unforgeability for Data Aggregation. Springer International Publishing, Cham, 3--18. https://doi.org/10.1007/978-3-319-26823-1_1 Google ScholarCross Ref
- Yehuda Lindell, Eli Oxman, and Benny Pinkas. 2011. The IPS Compiler: Optimizations, Variants and Concrete Efficiency. Advances in Cryptology--CRYPTO 2011 (2011), 259--276.Google Scholar
- Yehuda Lindell, Benny Pinkas, Nigel P Smart, and Avishay Yanai. 2015. Efficient constant round multi-party computation combining BMR and SPDZ. In Annual Cryptology Conference. Springer, 319--338.Google ScholarCross Ref
- Kathryn Elizabeth McCabe. 2012. Just You and Me and Netflix Makes Three: Implications for Allowing Frictionless Sharing of Personally Identifiable Information under the Video Privacy Protection Act. J. Intell. Prop. L. 20 (2012), 413.Google Scholar
- H Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, et al. 2016. Communication-Efficient Learning of Deep Networks from Decentralized Data. arXiv preprint arXiv:1602.05629 (2016).Google Scholar
- Ilya Mironov, Omkant Pandey, Omer Reingold, and Salil Vadhan. 2009. Computational differential privacy. In Advances in Cryptology-CRYPTO 2009. Springer, 126--142. Google ScholarDigital Library
- Arvind Narayanan and Vitaly Shmatikov. 2008. Robust de-anonymization of large sparse datasets. In 2008 IEEE Symposium on Security and Privacy (sp 2008). IEEE, 111--125.Google ScholarDigital Library
- John Paparrizos, Ryen W White, and Eric Horvitz. 2016. Screening for pancreatic adenocarcinoma using signals from web search logs: Feasibility study and results. Journal of Oncology Practice 12, 8 (2016), 737--744. Google ScholarCross Ref
- Vibhor Rastogi and Suman Nath. 2010. Differentially private aggregation of distributed time-series with transformation and encryption. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data. ACM, 735--746.Google ScholarDigital Library
- Adi Shamir. 1979. How to share a secret. Commun. ACM 22, 11 (1979), 612--613. Google ScholarDigital Library
- Elaine Shi, HTH Chan, Eleanor Rieffel, Richard Chow, and Dawn Song. 2011. Privacy-preserving aggregation of time-series data. In Annual Network & Distributed System Security Symposium (NDSS). Internet Society.Google Scholar
- Reza Shokri and Vitaly Shmatikov. 2015. Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 1310--1321.Google Scholar
- Reza Shokri, Marco Stronati, and Vitaly Shmatikov. 2016. Membership Inference Attacks against Machine Learning Models. arXiv preprint arXiv:1610.05820 (2016).Google Scholar
- Latanya Sweeney and Ji Su Yoo. 2015. De-anonymizing South Korean Resident Registration Numbers Shared in Prescription Data. Technology Science (2015).Google Scholar
- Martin J Wainwright, Michael I Jordan, and John C Duchi. 2012. Privacy aware learning. In Advances in Neural Information Processing Systems. 1430--1438.Google Scholar
- Andrew C Yao. 1982. Theory and application of trapdoor functions. In Foundations of Computer Science, 1982. SFCS'08. 23rd Annual Symposium on. IEEE, 80--91.Google ScholarCross Ref
Index Terms
- Practical Secure Aggregation for Privacy-Preserving Machine Learning
Recommendations
Verifiable Secure Aggregation Protocol Under Federated Learning
Artificial Intelligence Security and PrivacyAbstractFederated learning is a new machine learning paradigm used for collaborative training models among multiple devices. In federated learning, multiple clients participate in model training locally and use decentralized learning methods to ensure the ...
Post-quantum Privacy-Preserving Aggregation in Federated Learning Based on Lattice
Cyberspace Safety and SecurityAbstractWith the fast growth of quantum computers in recent years, the use of post-quantum cryptography has steadily become vital. In the privacy-preserving aggregation of federated learning, different cryptographic primitives, such as encryption, key ...
Quid-Pro-Quo-tocols: Strengthening Semi-honest Protocols with Dual Execution
SP '12: Proceedings of the 2012 IEEE Symposium on Security and PrivacyKnown protocols for secure two-party computation that are designed to provide full security against malicious behavior are significantly less efficient than protocols intended only to thwart semi-honest adversaries. We present a concrete design and ...
Comments