
Distribution Testing Lower Bounds via Reductions from Communication Complexity

Published: 11 February 2019

Abstract

We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [15], we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds.

Our main result is concerned with testing identity to a specific distribution, p, given as a parameter. In a recent and influential work, Valiant and Valiant [55] showed that the sample complexity of the aforementioned problem is closely related to the ℓ_{2/3}-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre’s K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.
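The two measures of p mentioned above can be made concrete with a minimal sketch. It is an illustration only, not code from the article: the function names `quasinorm` and `effective_support_size`, the parameter `eps`, and the reading of "vast majority of the mass" as "at least 1 - eps of the mass" are all assumptions made here.

```python
def quasinorm(p, r=2.0 / 3.0):
    """Return (sum_i |p_i|^r)^(1/r), which is a quasinorm when 0 < r < 1."""
    return sum(abs(x) ** r for x in p) ** (1.0 / r)


def effective_support_size(p, eps):
    """Smallest number of elements whose total mass is at least 1 - eps.

    Assumption: 'effective support' is read here as the heaviest elements
    that together cover all but an eps fraction of the probability mass.
    """
    mass = 0.0
    # Take elements from heaviest to lightest until enough mass is covered.
    for count, x in enumerate(sorted(p, reverse=True), start=1):
        mass += x
        if mass >= 1.0 - eps:
            return count
    return len(p)


if __name__ == "__main__":
    uniform = [1.0 / 16] * 16
    skewed = [0.5, 0.25, 0.125, 0.125]
    print(quasinorm(uniform), effective_support_size(uniform, 0.3))
    print(quasinorm(skewed), effective_support_size(skewed, 0.2))
```

For the uniform distribution over n elements the ℓ_{2/3}-quasinorm works out to √n, so both quantities grow with the support size in that case; on skewed distributions they can differ, which is the regime where the choice of measure matters.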

References

  1. Jayadev Acharya, Clément L. Canonne, and Gautam Kamath. 2015. A chasm between identity and equivalence testing with conditional queries. In Proceedings of the APPROX-RANDOM (LIPIcs), Vol. 40. 449--466.
  2. Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan. 2011. Competitive closeness testing. In Proceedings of the COLT. 47--68.
  3. Jayadev Acharya and Constantinos Daskalakis. 2015. Testing Poisson binomial distributions. In Proceedings of the SODA. 1829--1840.
  4. Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath. 2015. Optimal testing for properties of distributions. In Proceedings of the NIPS. 3577--3598.
  5. Maryam Aliakbarpour, Eric Blais, and Ronitt Rubinfeld. 2016. Learning and testing junta distributions. In Proceedings of the COLT (JMLR Workshop and Conference Proceedings), Vol. 49. JMLR.org, 19--46.
  6. Sergey V. Astashkin. 2010. Rademacher functions in symmetric spaces. J. Math. Sci. 169, 6 (Sep. 2010), 725--886.
  7. Tuğkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld. 2005. The complexity of approximating the entropy. SIAM J. Comput. 35, 1 (2005), 132--150.
  8. Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. 2001. Testing random variables for independence and identity. In Proceedings of the FOCS. 442--451.
  9. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2000. Testing that distributions are close. In Proceedings of the FOCS. 189--197.
  10. Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2010. Testing closeness of discrete distributions. arXiv abs/1009.5397 (2010). This is a long version of Reference [9].
  11. Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld. 2004. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of the STOC. ACM, New York, NY, 381--390.
  12. Colin Bennett and Robert C. Sharpley. 1988. Interpolation of Operators. Elsevier Science. Retrieved from https://books.google.com/books?id=HpqF9zjZWMMC.
  13. Bhaswar B. Bhattacharya and Gregory Valiant. 2015. Testing closeness with unequal sized samples. In Proceedings of the NIPS. 2611--2619.
  14. Arnab Bhattacharyya, Eldar Fischer, Ronitt Rubinfeld, and Paul Valiant. 2011. Testing monotonicity of distributions over general partial orders. In Proceedings of the ITCS. 239--252.
  15. Eric Blais, Joshua Brody, and Kevin Matulef. 2012. Property testing lower bounds via communication complexity. Comput. Complex. 21, 2 (2012), 311--358.
  16. Joshua Brody, Kevin Matulef, and Chenggang Wu. 2011. Lower bounds for testing computability by small width OBDDs. In Proceedings of the TAMC (Lecture Notes in Computer Science), Vol. 6648. Springer, 320--331.
  17. Clément L. Canonne. 2015. Big data on the rise? Testing monotonicity of distributions. In Proceedings of the ICALP. Springer, 294--305.
  18. Clément L. Canonne. 2015. A survey on distribution testing: Your data is big. But is it blue? Electr. Colloq. Comput. Complex. 22 (Apr. 2015), 63.
  19. Clément L. Canonne. 2016. Are few bins enough: Testing histogram distributions. In Proceedings of the PODS. Association for Computing Machinery (ACM).
  20. Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld. 2016. Testing shape restrictions of discrete distributions. In Proceedings of the STACS.
  21. Clément L. Canonne, Dana Ron, and Rocco A. Servedio. 2015. Testing probability distributions using conditional samples. SIAM J. Comput. 44, 3 (2015), 540--616. Also available on arXiv at abs/1211.2664.
  22. Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah. 2013. On the power of conditional samples in distribution testing. In Proceedings of the ITCS. ACM, New York, NY, 561--580.
  23. Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. 2014. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of the SODA. SIAM, 1193--1203.
  24. Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, and Paul Valiant. 2013. Testing k-modal distributions: Optimal algorithms via reductions. In Proceedings of the SODA. SIAM, 1833--1852. http://dl.acm.org/citation.cfm?id=2627817.2627948
  25. Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. 2018. Sample-optimal identity testing with high probability. In Proceedings of the ICALP. 41:1--41:14.
  26. Ilias Diakonikolas and Daniel M. Kane. 2016. A new approach for testing properties of discrete distributions. In Proceedings of the FOCS. IEEE Computer Society.
  27. Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. 2015. Optimal algorithms and lower bounds for testing closeness of structured distributions. In Proceedings of the FOCS. IEEE.
  28. Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. 2015. Testing identity of structured distributions. In Proceedings of the SODA. SIAM, 1841--1854.
  29. Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, and Ananda Theertha Suresh. 2015. Faster algorithms for testing under conditional sampling (JMLR Workshop and Conference Proceedings), Vol. 40. JMLR.org, 607--636.
  30. Eldar Fischer, Oded Lachish, and Yadu Vasudev. 2017. Improving and extending the testing of distributions for shape-restricted properties. In Proceedings of the STACS (LIPIcs), Vol. 66. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 31:1--31:14.
  31. Oded Goldreich. 2016. The uniform distribution is complete with respect to testing identity to a fixed distribution. Electr. Colloq. Comput. Complex. 23 (2016), 15.
  32. Oded Goldreich, Shafi Goldwasser, and Dana Ron. 1998. Property testing and its connection to learning and approximation. J. ACM 45, 4 (July 1998), 653--750.
  33. Oded Goldreich and Dana Ron. 2000. On testing expansion in bounded-degree graphs. Electr. Colloq. Comput. Complex. 7 (2000), 20.
  34. Paweł Hitczenko and Stanisław Kwapień. 1994. On the Rademacher series. In Probability in Banach Spaces, 9 (Sandbjerg, 1993). Progr. Probab., Vol. 35. Birkhäuser, Boston, MA, 31--36.
  35. Tord Holmstedt. 1970. Interpolation of quasi-normed spaces. Math. Scand. 26, 0 (1970), 177--199. http://www.mscand.dk/article/view/10976
  36. Piotr Indyk, Reut Levi, and Ronitt Rubinfeld. 2012. Approximating and testing k-histogram distributions in sub-linear time. In Proceedings of the PODS. 15--22.
  37. Jiantao Jiao, Kartik Venkat, and Tsachy Weissman. 2014. Order-optimal estimation of functionals of discrete distributions. arXiv abs/1406.6956. Retrieved from http://arxiv.org/abs/1406.6956
  38. Norman Lloyd Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. 1997. Discrete Multivariate Distributions. Vol. 165. Wiley, New York.
  39. Reut Levi, Dana Ron, and Ronitt Rubinfeld. 2013. Testing properties of collections of distributions. Theory Comput. 9 (2013), 295--347.
  40. Stephen J. Montgomery-Smith. 1990. The distribution of Rademacher sums. Proc. Amer. Math. Soc. 109, 2 (1990), 517--522.
  41. Ilan Newman. 2010. Property testing of massively parametrized problems: A survey. In Property Testing (Lecture Notes in Computer Science), Vol. 6390. Springer, 142--157.
  42. Ilan Newman and Mario Szegedy. 1996. Public vs. private coin flips in one round communication games. In Proceedings of the STOC. ACM, 561--570.
  43. Liam Paninski. 2004. Estimating entropy on m bins given fewer than m samples. IEEE Trans. Info. Theory 50, 9 (2004), 2200--2203.
  44. Liam Paninski. 2008. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Trans. Info. Theory 54, 10 (2008), 4750--4755.
  45. Jaak Peetre. 1968. A Theory of Interpolation of Normed Spaces. Instituto de Matemática Pura e Aplicada, Conselho Nacional de Pesquisas, Rio de Janeiro.
  46. David Pollard. 2003. Asymptopia. Retrieved from http://www.stat.yale.edu/pollard/Books/Asymptopia.
  47. Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. 2009. Strong lower bounds for approximating distributions support size and the distinct elements problem. SIAM J. Comput. 39, 3 (2009), 813--842.
  48. Ronitt Rubinfeld. 2012. Taming big probability distributions. XRDS: Crossroads ACM Mag. Students 19, 1 (Sep. 2012), 24.
  49. Ronitt Rubinfeld and Rocco A. Servedio. 2009. Testing monotone high-dimensional distributions. Random Struct. Algor. 34, 1 (Jan. 2009), 24--44.
  50. Ronitt Rubinfeld and Madhu Sudan. 1996. Robust characterization of polynomials with applications to program testing. SIAM J. Comput. 25, 2 (1996), 252--271.
  51. Gregory Valiant and Paul Valiant. 2010. A CLT and tight lower bounds for estimating entropy. Electr. Colloq. Comput. Complex. 17 (2010), 179.
  52. Gregory Valiant and Paul Valiant. 2010. Estimating the unseen: A sublinear-sample canonical estimator of distributions. Electr. Colloq. Comput. Complex. 17 (2010), 180.
  53. Gregory Valiant and Paul Valiant. 2011. Estimating the unseen: An n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs. In Proceedings of the STOC. ACM, 685--694.
  54. Gregory Valiant and Paul Valiant. 2011. The power of linear estimators. In Proceedings of the FOCS. 403--412. See also References [51] and [52].
  55. Gregory Valiant and Paul Valiant. 2017. An automatic inequality prover and instance optimal identity testing. SIAM J. Comput. 46, 1 (2017), 429--455.
  56. Paul Valiant. 2011. Testing symmetric properties of distributions. SIAM J. Comput. 40, 6 (2011), 1927--1968.
  57. Yihong Wu and Pengkun Yang. 2016. Minimax rates of entropy estimation on large alphabets via best polynomial approximation. IEEE Trans. Info. Theory 62, 6 (2016), 3702--3720.
  58. Bin Yu. 1997. Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam. Springer, 423--435.

Published in

ACM Transactions on Computation Theory, Volume 11, Issue 2
June 2019
169 pages
ISSN: 1942-3454
EISSN: 1942-3462
DOI: 10.1145/3312746

Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 11 February 2019
• Revised: 1 October 2018
• Accepted: 1 October 2018
• Received: 1 July 2017


Qualifiers

• research-article
• Research
• Refereed
