
Distribution Testing Lower Bounds via Reductions from Communication Complexity

Authors:
Eric Blais, University of Waterloo, Waterloo, Canada
Clément L. Canonne, Stanford University, Stanford, CA
Tom Gur, University of Warwick, United Kingdom

ACM Transactions on Computation Theory, Volume 11, Issue 2, Article No. 6, pp. 1–37
https://doi.org/10.1145/3305270

Published: 11 February 2019

Abstract

We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [15], we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds.

Our main result is concerned with testing identity to a specific distribution, p, given as a parameter. In a recent and influential work, Valiant and Valiant [55] showed that the sample complexity of the aforementioned problem is closely related to the ℓ_{2/3}-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre’s K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.
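
As a rough numerical illustration of the quantities named in the abstract, the following Python sketch computes the ℓ_{2/3}-quasinorm of a distribution, a simple notion of effective support size, and an estimate of a K-functional. The function names, the parameter eps, and the choice of the pair (ℓ_1, ℓ_2) for the K-functional are assumptions made here for illustration; the abstract does not specify them. The K-functional estimate uses a Holmstedt-type formula (the sum of the ⌊t²⌋ largest entries plus t times the ℓ_2 norm of the remaining entries), which agrees with the true K-functional only up to constant factors.

```python
import math


def quasinorm(p, r=2.0 / 3.0):
    """l_r (quasi)norm of p: (sum_i |p_i|^r)^(1/r); r = 2/3 by default."""
    return sum(abs(x) ** r for x in p) ** (1.0 / r)


def effective_support_size(p, eps):
    """Smallest number of elements whose total mass is at least 1 - eps.

    Hypothetical helper: one plausible reading of "effective support".
    """
    mass, count = 0.0, 0
    for x in sorted(p, reverse=True):  # heaviest elements first
        if mass >= 1.0 - eps:
            break
        mass += x
        count += 1
    return count


def k_functional_estimate(p, t):
    """Holmstedt-type estimate of the K-functional between l_1 and l_2.

    Assumption: matches the true K-functional only up to constant factors.
    """
    a = sorted((abs(x) for x in p), reverse=True)
    m = min(len(a), math.floor(t * t))  # number of "head" coordinates
    head = sum(a[:m])  # l_1 norm of the m largest entries
    tail = math.sqrt(sum(x * x for x in a[m:]))  # l_2 norm of the rest
    return head + t * tail


if __name__ == "__main__":
    p = [0.5, 0.3, 0.1, 0.1]
    print(quasinorm(p))
    print(effective_support_size(p, 0.25))
    print(k_functional_estimate(p, 1.0))
```

For the uniform distribution on n elements, the ℓ_{2/3}-quasinorm equals √n, which gives a quick sanity check for the first function.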

References

[1] Jayadev Acharya, Clément L. Canonne, and Gautam Kamath. 2015. A chasm between identity and equivalence testing with conditional queries. In Proceedings of the APPROX-RANDOM (LIPIcs), Vol. 40. 449--466.
[2] Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan. 2011. Competitive closeness testing. In Proceedings of the COLT. 47--68.
[3] Jayadev Acharya and Constantinos Daskalakis. 2015. Testing Poisson binomial distributions. In Proceedings of the SODA. 1829--1840.
[4] Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath. 2015. Optimal testing for properties of distributions. In Proceedings of the NIPS. 3577--3598.
[5] Maryam Aliakbarpour, Eric Blais, and Ronitt Rubinfeld. 2016. Learning and testing junta distributions. In Proceedings of the COLT (JMLR Workshop and Conference Proceedings), Vol. 49. JMLR.org, 19--46.
[6] Sergey V. Astashkin. 2010. Rademacher functions in symmetric spaces. J. Math. Sci. 169, 6 (Sep. 2010), 725--886.
[7] Tuğkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld. 2005. The complexity of approximating the entropy. SIAM J. Comput. 35, 1 (2005), 132--150.
[8] Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. 2001. Testing random variables for independence and identity. In Proceedings of the FOCS. 442--451.
[9] Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2000. Testing that distributions are close. In Proceedings of the FOCS. 189--197.
[10] Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2010. Testing closeness of discrete distributions. arXiv abs/1009.5397 (2010). This is a long version of Reference [9].
[11] Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld. 2004. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of the STOC. ACM, New York, NY, 381--390.
[12] Colin Bennett and Robert C. Sharpley. 1988. Interpolation of Operators. Elsevier Science. Retrieved from https://books.google.com/books?id=HpqF9zjZWMMC.
[13] Bhaswar B. Bhattacharya and Gregory Valiant. 2015. Testing closeness with unequal sized samples. In Proceedings of the NIPS. 2611--2619.
[14] Arnab Bhattacharyya, Eldar Fischer, Ronitt Rubinfeld, and Paul Valiant. 2011. Testing monotonicity of distributions over general partial orders. In Proceedings of the ITCS. 239--252.
[15] Eric Blais, Joshua Brody, and Kevin Matulef. 2012. Property testing lower bounds via communication complexity. Comput. Complex. 21, 2 (2012), 311--358.
[16] Joshua Brody, Kevin Matulef, and Chenggang Wu. 2011. Lower bounds for testing computability by small width OBDDs. In Proceedings of the TAMC (Lecture Notes in Computer Science), Vol. 6648. Springer, 320--331.
[17] Clément L. Canonne. 2015. Big data on the rise? Testing monotonicity of distributions. In Proceedings of the ICALP. Springer, 294--305.
[18] Clément L. Canonne. 2015. A survey on distribution testing: Your data is big. But is it blue? Electr. Colloq. Computat. Complex. 22 (Apr. 2015), 63.
[19] Clément L. Canonne. 2016. Are few bins enough: Testing histogram distributions. In Proceedings of the PODS. Association for Computing Machinery (ACM).
[20] Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld. 2016. Testing shape restrictions of discrete distributions. In Proceedings of the STACS.
[21] Clément L. Canonne, Dana Ron, and Rocco A. Servedio. 2015. Testing probability distributions using conditional samples. SIAM J. Comput. 44, 3 (2015), 540--616. Also available on arXiv at abs/1211.2664.
[22] Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah. 2013. On the power of conditional samples in distribution testing. In Proceedings of the ITCS. ACM, New York, NY, 561--580.
[23] Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. 2014. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of the SODA. SIAM, 1193--1203.
[24] Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, and Paul Valiant. 2013. Testing k-modal distributions: Optimal algorithms via reductions. In Proceedings of the SODA. SIAM, 1833--1852. http://dl.acm.org/citation.cfm?id=2627817.2627948
[25] Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. 2018. Sample-optimal identity testing with high probability. In Proceedings of the ICALP. 41:1--41:14.
[26] Ilias Diakonikolas and Daniel M. Kane. 2016. A new approach for testing properties of discrete distributions. In Proceedings of the FOCS. IEEE Computer Society.
[27] Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. 2015. Optimal algorithms and lower bounds for testing closeness of structured distributions. In Proceedings of the FOCS. IEEE.
[28] Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. 2015. Testing identity of structured distributions. In Proceedings of the SODA. SIAM, 1841--1854.
[29] Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, and Ananda Theertha Suresh. 2015. Faster algorithms for testing under conditional sampling (JMLR Workshop and Conference Proceedings), Vol. 40. JMLR.org, 607--636.
[30] Eldar Fischer, Oded Lachish, and Yadu Vasudev. 2017. Improving and extending the testing of distributions for shape-restricted properties. In Proceedings of the STACS (LIPIcs), Vol. 66. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 31:1--31:14.
[31] Oded Goldreich. 2016. The uniform distribution is complete with respect to testing identity to a fixed distribution. Electr. Colloq. Comput. Complex. 23 (2016), 15.
[32] Oded Goldreich, Shafi Goldwasser, and Dana Ron. 1998. Property testing and its connection to learning and approximation. J. ACM 45, 4 (July 1998), 653--750.
[33] Oded Goldreich and Dana Ron. 2000. On testing expansion in bounded-degree graphs. Electr. Colloq. Comput. Complex. 7 (2000), 20.
[34] Paweł Hitczenko and Stanisław Kwapień. 1994. On the Rademacher series. In Probability in Banach Spaces, 9 (Sandbjerg, 1993). Progr. Probab., Vol. 35. Birkhäuser, Boston, MA, 31--36.
[35] Tord Holmstedt. 1970. Interpolation of quasi-normed spaces. Math. Scand. 26, 0 (1970), 177--199. http://www.mscand.dk/article/view/10976
[36] Piotr Indyk, Reut Levi, and Ronitt Rubinfeld. 2012. Approximating and testing k-histogram distributions in sub-linear time. In Proceedings of the PODS. 15--22.
[37] Jiantao Jiao, Kartik Venkat, and Tsachy Weissman. 2014. Order-optimal estimation of functionals of discrete distributions. arXiv abs/1406.6956. Retrieved from http://arxiv.org/abs/1406.6956
[38] Norman Lloyd Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. 1997. Discrete Multivariate Distributions. Vol. 165. Wiley, New York.
[39] Reut Levi, Dana Ron, and Ronitt Rubinfeld. 2013. Testing properties of collections of distributions. Theory Comput. 9 (2013), 295--347.
[40] Stephen J. Montgomery-Smith. 1990. The distribution of Rademacher sums. Proc. Amer. Math. Soc. 109, 2 (1990), 517--522.
[41] Ilan Newman. 2010. Property testing of massively parametrized problems—A survey. In Property Testing (Lecture Notes in Computer Science), Vol. 6390. Springer, 142--157.
[42] Ilan Newman and Mario Szegedy. 1996. Public vs. private coin flips in one round communication games. In Proceedings of the STOC. ACM, 561--570.
[43] Liam Paninski. 2004. Estimating entropy on m bins given fewer than m samples. IEEE Trans. Info. Theory 50, 9 (2004), 2200--2203.
[44] Liam Paninski. 2008. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Trans. Info. Theory 54, 10 (2008), 4750--4755.
[45] Jaak Peetre. 1968. A Theory of Interpolation of Normed Spaces. Instituto de Matemática Pura e Aplicada, Conselho Nacional de Pesquisas, Rio de Janeiro.
[46] David Pollard. 2003. Asymptopia. Retrieved from http://www.stat.yale.edu/pollard/Books/Asymptopia.
[47] Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. 2009. Strong lower bounds for approximating distributions support size and the distinct elements problem. SIAM J. Comput. 39, 3 (2009), 813--842.
[48] Ronitt Rubinfeld. 2012. Taming big probability distributions. XRDS: Crossroads ACM Mag. Students 19, 1 (Sep. 2012), 24.
[49] Ronitt Rubinfeld and Rocco A. Servedio. 2009. Testing monotone high-dimensional distributions. Random Struct. Algor. 34, 1 (Jan. 2009), 24--44.
[50] Ronitt Rubinfeld and Madhu Sudan. 1996. Robust characterization of polynomials with applications to program testing. SIAM J. Comput. 25, 2 (1996), 252--271.
[51] Gregory Valiant and Paul Valiant. 2010. A CLT and tight lower bounds for estimating entropy. Electr. Colloq. Comput. Complex. 17 (2010), 179.
[52] Gregory Valiant and Paul Valiant. 2010. Estimating the unseen: A sublinear-sample canonical estimator of distributions. Electr. Colloq. Comput. Complex. 17 (2010), 180.
[53] Gregory Valiant and Paul Valiant. 2011. Estimating the unseen: An n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs. In Proceedings of the STOC. ACM, 685--694.
[54] Gregory Valiant and Paul Valiant. 2011. The power of linear estimators. In Proceedings of the FOCS. 403--412. See also References [51] and [52].
[55] Gregory Valiant and Paul Valiant. 2017. An automatic inequality prover and instance optimal identity testing. SIAM J. Comput. 46, 1 (2017), 429--455.
[56] Paul Valiant. 2011. Testing symmetric properties of distributions. SIAM J. Comput. 40, 6 (2011), 1927--1968.
[57] Yihong Wu and Pengkun Yang. 2016. Minimax rates of entropy estimation on large alphabets via best polynomial approximation. IEEE Trans. Info. Theory 62, 6 (2016), 3702--3720.
[58] Bin Yu. 1997. Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam. Springer, 423--435.

Index Terms

1. Theory of computation
  1. Computational complexity and cryptography


Published in
ACM Transactions on Computation Theory, Volume 11, Issue 2 (June 2019), 169 pages
ISSN: 1942-3454
EISSN: 1942-3462
DOI: 10.1145/3312746
Editor: Venkatesan Guruswami, Carnegie Mellon University, USA
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 11 February 2019
- Revised: 1 October 2018
- Accepted: 1 October 2018
- Received: 1 July 2017
Author Tags
Communication complexity
K-functional
distribution testing
lower bounds
property testing
Qualifiers
- research-article
- Research
- Refereed