DOI: 10.1145/2488608.2488652
research-article

The geometry of differential privacy: the sparse and approximate cases

Published: 01 June 2013

ABSTRACT

We study trade-offs between accuracy and privacy in the context of linear queries over histograms. This is a rich class of queries that includes contingency tables and range queries and has been the focus of a long line of work. For a given set of d linear queries over a database x ∈ R^N, we seek to find the differentially private mechanism that has the minimum mean squared error. For pure differential privacy, [5, 32] give an O(log^2 d) approximation to the optimal mechanism. Our first contribution is to give an efficient O(log^2 d) approximation guarantee for the case of (ε,δ)-differential privacy. Our mechanism adds carefully chosen correlated Gaussian noise to the answers. We prove its approximation guarantee relative to the hereditary discrepancy lower bound of [44], using tools from convex geometry. We next consider the sparse case when the number of queries exceeds the number of individuals in the database, i.e., when d > n, where n ≜ ‖x‖_1. The lower bounds used in the previous approximation algorithm no longer apply; in fact, better mechanisms are known in this setting [7, 27, 28, 31, 49]. Our second main contribution is to give an efficient (ε,δ)-differentially private mechanism that, for any given query set A and an upper bound n on ‖x‖_1, has mean squared error within polylog(d,N) of the optimal for A and n. This approximation is achieved by coupling the Gaussian noise addition approach with linear regression over the ℓ_1 ball. Additionally, we show a similar polylogarithmic approximation guarantee for the optimal ε-differentially private mechanism in this sparse setting. Our work also shows that for arbitrary counting queries, i.e., A with entries in {0,1}, there is an ε-differentially private mechanism with expected error Õ(√n) per query, improving on the Õ(n^{2/3}) bound of [7] and matching the lower bound implied by [15] up to logarithmic factors.
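As a rough illustration of the sparse-case recipe described above (Gaussian noise addition followed by least-squares regression over the ℓ_1 ball), the following Python sketch may help. It is not the authors' mechanism: it assumes independent rather than correlated Gaussian noise, the textbook noise calibration σ = Δ_2 · sqrt(2 ln(1.25/δ)) / ε (usually stated for ε < 1), and a plain Frank-Wolfe iteration as the regression solver. All function names are hypothetical.

```python
import math
import random


def gaussian_sigma(A, eps, delta):
    """Noise scale for the textbook Gaussian mechanism (assumption: eps < 1).

    Changing one individual moves the histogram x by at most 1 in L1 norm,
    so the L2 sensitivity of x -> Ax is the largest column norm of A.
    """
    d, N = len(A), len(A[0])
    sens = max(math.sqrt(sum(A[i][j] ** 2 for i in range(d))) for j in range(N))
    return sens * math.sqrt(2.0 * math.log(1.25 / delta)) / eps


def project_onto_l1_image(A, y, n, steps=200):
    """Frank-Wolfe for min ||A z - y||_2^2 subject to ||z||_1 <= n.

    Returns the fitted answers A z, not z itself.
    """
    d, N = len(A), len(A[0])
    z = [0.0] * N
    for t in range(steps):
        # Residual A z - y and (half of) the gradient A^T (A z - y).
        r = [sum(A[i][j] * z[j] for j in range(N)) - y[i] for i in range(d)]
        g = [sum(A[i][j] * r[i] for i in range(d)) for j in range(N)]
        # The L1-ball vertex minimizing <g, v> is -n * sign(g_j) * e_j
        # for the coordinate j with the largest |g_j|.
        j = max(range(N), key=lambda k: abs(g[k]))
        s = -n if g[j] > 0 else n
        gamma = 2.0 / (t + 2.0)
        z = [(1.0 - gamma) * zk for zk in z]
        z[j] += gamma * s
    return [sum(A[i][j] * z[j] for j in range(N)) for i in range(d)]


def private_answers(A, x, n, eps, delta, rng):
    """Add independent Gaussian noise to Ax, then regress onto n * A * B_1."""
    sigma = gaussian_sigma(A, eps, delta)
    noisy = [sum(a * xj for a, xj in zip(row, x)) + rng.gauss(0.0, sigma)
             for row in A]
    # Post-processing of a private output does not weaken the guarantee.
    return project_onto_l1_image(A, noisy, n)


if __name__ == "__main__":
    A = [[1, 0, 1], [0, 1, 1]]  # two counting queries over N = 3 bins
    x = [2, 1, 0]               # histogram with ||x||_1 = 3
    print(private_answers(A, x, n=3, eps=0.5, delta=1e-5, rng=random.Random(0)))
```

The regression step is where the bound n on ‖x‖_1 is used: noisy answers are pulled back to values that some database of size at most n could have produced, which is the source of the improvement when d > n.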

The connection between the hereditary discrepancy and the privacy mechanism enables us to derive the first polylogarithmic approximation to the hereditary discrepancy of a matrix A.
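For readers unfamiliar with the quantity being approximated, the sketch below computes hereditary discrepancy by brute force under the standard ℓ_∞ definition: the discrepancy of A is the minimum over ±1 colorings χ of ‖Aχ‖_∞, and the hereditary discrepancy is the maximum discrepancy over all nonempty column subsets. This is an assumption about the definition (the paper may work with a different norm), the function names are hypothetical, and the running time is exponential, so it is only usable on very small matrices.

```python
from itertools import combinations, product


def discrepancy(A, cols):
    """min over colorings in {-1,+1}^cols of max_i |sum_j A[i][j] * chi_j|."""
    best = float("inf")
    for signs in product((-1, 1), repeat=len(cols)):
        worst = max(abs(sum(row[j] * s for j, s in zip(cols, signs)))
                    for row in A)
        best = min(best, worst)
    return best


def hereditary_discrepancy(A):
    """Largest discrepancy of A restricted to any nonempty subset of columns."""
    N = len(A[0])
    return max(discrepancy(A, cols)
               for k in range(1, N + 1)
               for cols in combinations(range(N), k))
```

For example, the single-row matrix [[1, 1]] has discrepancy 0 (color the two columns with opposite signs) but hereditary discrepancy 1, since restricting to either column alone leaves nothing to cancel against.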

References

  1. N. Bansal. Constructive algorithms for discrepancy minimization. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on, pages 3-10. IEEE, 2010.
  2. B. Barak, K. Chaudhuri, C. Dwork, S. Kale, F. McSherry, and K. Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In L. Libkin, editor, Proceedings of ACM PODS, pages 273-282. ACM, 2007.
  3. I. Bárány and Z. Füredi. Approximation of the sphere by polytopes having few vertices. Proceedings of the American Mathematical Society, 102(3):651-659, 1988.
  4. J. Beck and V. T. Sós. Handbook of combinatorics (vol. 2). chapter Discrepancy theory, pages 1405-1446. MIT Press, Cambridge, MA, USA, 1995.
  5. A. Bhaskara, D. Dadush, R. Krishnaswamy, and K. Talwar. Unconditional differentially private mechanisms for linear queries. In Proceedings of the 44th symposium on Theory of Computing, STOC '12, pages 1269-1284, New York, NY, USA, 2012. ACM.
  6. A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 128-138. ACM, 2005.
  7. A. Blum, K. Ligett, and A. Roth. A learning theory approach to non-interactive database privacy. In STOC '08: Proceedings of the 40th annual ACM symposium on Theory of computing, pages 609-618, New York, NY, USA, 2008. ACM.
  8. J. Bourgain and L. Tzafriri. Invertibility of large submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel Journal of Mathematics, 57(2):137-224, 1987.
  9. H. Brenner and K. Nissim. Impossibility of differentially private universally optimal mechanisms. In Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS '10, pages 71-80, Washington, DC, USA, 2010. IEEE Computer Society.
  10. T.-H. H. Chan, E. Shi, and D. Song. Private and continual release of statistics. In ICALP, 2010.
  11. K. Chandrasekaran and S. Vempala. A discrepancy based approach to integer programming. CoRR, abs/1111.4649, 2011.
  12. B. Chazelle. The Discrepancy Method: Randomness and Complexity. Cambridge University Press, 2000.
  13. A. De. Lower bounds in differential privacy. Theory of Cryptography, pages 321-338, 2012.
  14. B. Ding, M. Winslett, J. Han, and Z. Li. Differentially private data cubes: optimizing noise sources and consistency. In SIGMOD Conference, pages 217-228, 2011.
  15. I. Dinur and K. Nissim. Revealing information while preserving privacy. In Proc. 22nd PODS, pages 202-210. ACM, 2003.
  16. C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation. In Proc. 25th EUROCRYPT, pages 486-503. Springer, 2006.
  17. C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via distributed noise generation, 2006.
  18. C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, 2006.
  19. C. Dwork, F. McSherry, and K. Talwar. The price of privacy and the limits of LP decoding. In Proc. 39th STOC, pages 85-94. ACM, 2007.
  20. C. Dwork, M. Naor, O. Reingold, G. N. Rothblum, and S. Vadhan. On the complexity of differentially private data release: efficient algorithms and hardness results. In Proceedings of the 41st annual ACM symposium on Theory of computing, STOC '09, pages 381-390, New York, NY, USA, 2009. ACM.
  21. C. Dwork, G. N. Rothblum, and S. Vadhan. Boosting and differential privacy. In Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS '10, pages 51-60, Washington, DC, USA, 2010. IEEE Computer Society.
  22. C. Dwork and S. Yekhanin. New efficient attacks on statistical disclosure control mechanisms. In Proc. 28th CRYPTO, pages 469-480. Springer, 2008.
  23. N. Fawaz, S. Muthukrishnan, and A. Nikolov. Nearly optimal private convolutions. Unpublished manuscript.
  24. M. Frank and P. Wolfe. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95-110, 1956.
  25. A. Ghosh, T. Roughgarden, and M. Sundararajan. Universally utility-maximizing privacy mechanisms. In STOC, pages 351-360, 2009.
  26. E. Gluskin. Extremal properties of orthogonal parallelepipeds and their applications to the geometry of Banach spaces. Mathematics of the USSR-Sbornik, 64(1):85, 2007.
  27. A. Gupta, M. Hardt, A. Roth, and J. Ullman. Privately releasing conjunctions and the statistical query barrier. In STOC, pages 803-812, 2011.
  28. A. Gupta, A. Roth, and J. Ullman. Iterative constructions and private data release. In TCC, pages 339-356, 2012.
  29. M. Gupte and M. Sundararajan. Universally optimal privacy mechanisms for minimax agents. In Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, PODS '10, pages 135-146, New York, NY, USA, 2010. ACM.
  30. M. Hardt, K. Ligett, and F. McSherry. A simple and practical algorithm for differentially private data release. In NIPS, 2012. To appear.
  31. M. Hardt and G. N. Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. In Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS '10, pages 61-70, Washington, DC, USA, 2010. IEEE Computer Society.
  32. M. Hardt and K. Talwar. On the geometry of differential privacy. In Proceedings of the 42nd ACM symposium on Theory of computing, STOC '10, pages 705-714, New York, NY, USA, 2010. ACM.
  33. M. Hay, V. Rastogi, G. Miklau, and D. Suciu. Boosting the accuracy of differentially private histograms through consistency. PVLDB, 3(1):1021-1032, 2010.
  34. F. John. Extremum problems with inequalities as subsidiary conditions. In Studies and Essays presented to R. Courant on his 60th Birthday, pages 187-204, 1948.
  35. S. Kasiviswanathan, M. Rudelson, and A. Smith. The power of linear reconstruction attacks. In SODA, 2013. To appear.
  36. S. P. Kasiviswanathan, M. Rudelson, A. Smith, and J. Ullman. The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In Proceedings of the 42nd ACM symposium on Theory of computing, STOC '10, pages 775-784, New York, NY, USA, 2010. ACM.
  37. K. G. Larsen. On range searching in the group model and combinatorial discrepancy. In FOCS, pages 542-549, 2011.
  38. C. Li, M. Hay, V. Rastogi, G. Miklau, and A. McGregor. Optimizing linear counting queries under differential privacy. In Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, PODS '10, pages 123-134, New York, NY, USA, 2010. ACM.
  39. C. Li and G. Miklau. An adaptive mechanism for accurate query answering under differential privacy. PVLDB, 5(6):514-525, 2012.
  40. C. Li and G. Miklau. Measuring the achievable error of query sets under differential privacy. CoRR, abs/1202.3399, 2012.
  41. L. Lovász, J. Spencer, and K. Vesztergombi. Discrepancy of set-systems and matrices. European Journal of Combinatorics, 7(2):151-160, 1986.
  42. J. Matoušek. Geometric Discrepancy (An Illustrated Guide). Springer, 1999.
  43. J. Matoušek. The determinant bound for discrepancy is almost tight. http://arxiv.org/abs/1101.0767, 2011.
  44. S. Muthukrishnan and A. Nikolov. Optimal private halfspace counting via discrepancy. In Proceedings of the 44th symposium on Theory of Computing, STOC '12, pages 1285-1292, New York, NY, USA, 2012. ACM.
  45. A. Nikolov, K. Talwar, and L. Zhang. The geometry of differential privacy: the sparse and approximate cases. CoRR, abs/1212.0297, 2012.
  46. K. Nissim, S. Raskhodnikova, and A. Smith. Smooth sensitivity and sampling in private data analysis. In STOC '07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing, pages 75-84, New York, NY, USA, 2007. ACM.
  47. G. Raskutti, M. Wainwright, and B. Yu. Minimax rates of estimation for high-dimensional linear regression over ℓ_q-balls. Information Theory, IEEE Transactions on, 57(10):6976-6994, 2011.
  48. V. Rastogi, S. Hong, and D. Suciu. The boundary between privacy and utility in data publishing. In VLDB, pages 531-542, 2007.
  49. A. Roth and T. Roughgarden. Interactive privacy via the median mechanism. In Proceedings of the 42nd ACM symposium on Theory of computing, STOC '10, pages 765-774, New York, NY, USA, 2010. ACM.
  50. R. Vershynin. John's decompositions: Selecting a large part. Israel Journal of Mathematics, 122(1):253-277, 2001.
  51. X. Xiao, G. Wang, and J. Gehrke. Differential privacy via wavelet transforms. In ICDE, pages 225-236, 2010.
  52. Y. Xiao, L. Xiong, and C. Yuan. Differentially private data release through multidimensional partitioning. In Secure Data Management, pages 150-168, 2010.
  53. G. Yuan, Z. Zhang, M. Winslett, X. Xiao, Y. Yang, and Z. Hao. Low-rank mechanism: Optimizing batch queries under differential privacy. PVLDB, 5(11):1352-1363, 2012.

Published in

STOC '13: Proceedings of the forty-fifth annual ACM symposium on Theory of Computing
June 2013
998 pages
ISBN: 9781450320290
DOI: 10.1145/2488608

Copyright © 2013 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates

STOC '13 Paper Acceptance Rate: 100 of 360 submissions, 28%
Overall Acceptance Rate: 1,469 of 4,586 submissions, 32%