Abstract
We present a new methodology for proving distribution testing lower bounds, establishing a connection between distribution testing and the simultaneous message passing (SMP) communication model. Extending the framework of Blais, Brody, and Matulef [15], we show a simple way to reduce (private-coin) SMP problems to distribution testing problems. This method allows us to prove new distribution testing lower bounds, as well as to provide simple proofs of known lower bounds.
Our main result is concerned with testing identity to a specific distribution, p, given as a parameter. In a recent and influential work, Valiant and Valiant [55] showed that the sample complexity of the aforementioned problem is closely related to the ℓ_{2/3}-quasinorm of p. We obtain alternative bounds on the complexity of this problem in terms of an arguably more intuitive measure and using simpler proofs. More specifically, we prove that the sample complexity is essentially determined by a fundamental operator in the theory of interpolation of Banach spaces, known as Peetre’s K-functional. We show that this quantity is closely related to the size of the effective support of p (loosely speaking, the number of supported elements that constitute the vast majority of the mass of p). This result, in turn, stems from an unexpected connection to functional analysis and refined concentration of measure inequalities, which arise naturally in our reduction.
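To make the two quantities named above concrete, here is a minimal Python sketch that computes the ℓ_{2/3}-quasinorm of a distribution and one plausible notion of effective support size. The function names, the `eps` parameter, and the exact definition of "effective support" (the smallest number of heaviest elements carrying at least 1 - eps of the mass) are illustrative assumptions; the abstract describes the effective support only loosely, and the paper's formal definition may differ.

```python
from typing import Sequence


def quasinorm_two_thirds(p: Sequence[float]) -> float:
    """Return the l_{2/3}-quasinorm of p, i.e. (sum_i p_i^(2/3))^(3/2)."""
    return sum(x ** (2.0 / 3.0) for x in p) ** 1.5


def effective_support_size(p: Sequence[float], eps: float) -> int:
    """Smallest k such that the k heaviest elements carry mass >= 1 - eps.

    Assumed, illustrative definition; not taken from the paper.
    """
    mass = 0.0
    for k, x in enumerate(sorted(p, reverse=True), start=1):
        mass += x
        if mass >= 1.0 - eps:
            return k
    return len(p)


if __name__ == "__main__":
    uniform = [0.25] * 4
    skewed = [0.7, 0.2, 0.05, 0.05]
    for name, dist in (("uniform", uniform), ("skewed", skewed)):
        print(name, quasinorm_two_thirds(dist), effective_support_size(dist, 0.15))
```

For the uniform distribution over n elements the quasinorm equals the square root of n, so both quantities grow with the number of elements that carry non-negligible mass; a skewed distribution scores lower on both.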
- Jayadev Acharya, Clément L. Canonne, and Gautam Kamath. 2015. A chasm between identity and equivalence testing with conditional queries. In Proceedings of the APPROX-RANDOM (LIPIcs), Vol. 40. 449--466.
- Jayadev Acharya, Hirakendu Das, Ashkan Jafarpour, Alon Orlitsky, and Shengjun Pan. 2011. Competitive closeness testing. In Proceedings of the COLT. 47--68.
- Jayadev Acharya and Constantinos Daskalakis. 2015. Testing Poisson binomial distributions. In Proceedings of the SODA. 1829--1840.
- Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath. 2015. Optimal testing for properties of distributions. In Proceedings of the NIPS. 3577--3598.
- Maryam Aliakbarpour, Eric Blais, and Ronitt Rubinfeld. 2016. Learning and testing junta distributions. In Proceedings of the COLT (JMLR Workshop and Conference Proceedings), Vol. 49. JMLR.org, 19--46.
- Sergey V. Astashkin. 2010. Rademacher functions in symmetric spaces. J. Math. Sci. 169, 6 (Sep. 2010), 725--886.
- Tuğkan Batu, Sanjoy Dasgupta, Ravi Kumar, and Ronitt Rubinfeld. 2005. The complexity of approximating the entropy. SIAM J. Comput. 35, 1 (2005), 132--150.
- Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. 2001. Testing random variables for independence and identity. In Proceedings of the FOCS. 442--451.
- Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2000. Testing that distributions are close. In Proceedings of the FOCS. 189--197.
- Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2010. Testing closeness of discrete distributions. arXiv abs/1009.5397 (2010). This is a long version of Reference [9].
- Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld. 2004. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of the STOC. ACM, New York, NY, 381--390.
- Colin Bennett and Robert C. Sharpley. 1988. Interpolation of Operators. Elsevier Science. Retrieved from https://books.google.com/books?id=HpqF9zjZWMMC.
- Bhaswar B. Bhattacharya and Gregory Valiant. 2015. Testing closeness with unequal sized samples. In Proceedings of the NIPS. 2611--2619.
- Arnab Bhattacharyya, Eldar Fischer, Ronitt Rubinfeld, and Paul Valiant. 2011. Testing monotonicity of distributions over general partial orders. In Proceedings of the ITCS. 239--252.
- Eric Blais, Joshua Brody, and Kevin Matulef. 2012. Property testing lower bounds via communication complexity. Comput. Complex. 21, 2 (2012), 311--358.
- Joshua Brody, Kevin Matulef, and Chenggang Wu. 2011. Lower bounds for testing computability by small width OBDDs. In Proceedings of the TAMC (Lecture Notes in Computer Science), Vol. 6648. Springer, 320--331.
- Clément L. Canonne. 2015. Big data on the rise? Testing monotonicity of distributions. In Proceedings of the ICALP. Springer, 294--305.
- Clément L. Canonne. 2015. A survey on distribution testing: Your data is big. But is it blue? Electr. Colloq. Comput. Complex. 22 (Apr. 2015), 63.
- Clément L. Canonne. 2016. Are few bins enough: Testing histogram distributions. In Proceedings of the PODS. Association for Computing Machinery (ACM).
- Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld. 2016. Testing shape restrictions of discrete distributions. In Proceedings of the STACS.
- Clément L. Canonne, Dana Ron, and Rocco A. Servedio. 2015. Testing probability distributions using conditional samples. SIAM J. Comput. 44, 3 (2015), 540--616. Also available on arXiv at abs/1211.2664.
- Sourav Chakraborty, Eldar Fischer, Yonatan Goldhirsh, and Arie Matsliah. 2013. On the power of conditional samples in distribution testing. In Proceedings of the ITCS. ACM, New York, NY, 561--580.
- Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. 2014. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of the SODA. SIAM, 1193--1203.
- Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio, Gregory Valiant, and Paul Valiant. 2013. Testing k-modal distributions: Optimal algorithms via reductions. In Proceedings of the SODA. SIAM, 1833--1852. http://dl.acm.org/citation.cfm?id=2627817.2627948
- Ilias Diakonikolas, Themis Gouleakis, John Peebles, and Eric Price. 2018. Sample-optimal identity testing with high probability. In Proceedings of the ICALP. 41:1--41:14.
- Ilias Diakonikolas and Daniel M. Kane. 2016. A new approach for testing properties of discrete distributions. In Proceedings of the FOCS. IEEE Computer Society.
- Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. 2015. Optimal algorithms and lower bounds for testing closeness of structured distributions. In Proceedings of the FOCS. IEEE.
- Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. 2015. Testing identity of structured distributions. In Proceedings of the SODA. SIAM, 1841--1854.
- Moein Falahatgar, Ashkan Jafarpour, Alon Orlitsky, Venkatadheeraj Pichapathi, and Ananda Theertha Suresh. 2015. Faster algorithms for testing under conditional sampling. In Proceedings of the COLT (JMLR Workshop and Conference Proceedings), Vol. 40. JMLR.org, 607--636.
- Eldar Fischer, Oded Lachish, and Yadu Vasudev. 2017. Improving and extending the testing of distributions for shape-restricted properties. In Proceedings of the STACS (LIPIcs), Vol. 66. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 31:1--31:14.
- Oded Goldreich. 2016. The uniform distribution is complete with respect to testing identity to a fixed distribution. Electr. Colloq. Comput. Complex. 23 (2016), 15.
- Oded Goldreich, Shafi Goldwasser, and Dana Ron. 1998. Property testing and its connection to learning and approximation. J. ACM 45, 4 (July 1998), 653--750.
- Oded Goldreich and Dana Ron. 2000. On testing expansion in bounded-degree graphs. Electr. Colloq. Comput. Complex. 7 (2000), 20.
- Paweł Hitczenko and Stanisław Kwapień. 1994. On the Rademacher series. In Probability in Banach Spaces, 9 (Sandbjerg, 1993). Progr. Probab., Vol. 35. Birkhäuser, Boston, MA, 31--36.
- Tord Holmstedt. 1970. Interpolation of quasi-normed spaces. Math. Scand. 26, 0 (1970), 177--199. http://www.mscand.dk/article/view/10976
- Piotr Indyk, Reut Levi, and Ronitt Rubinfeld. 2012. Approximating and testing k-histogram distributions in sub-linear time. In Proceedings of the PODS. 15--22.
- Jiantao Jiao, Kartik Venkat, and Tsachy Weissman. 2014. Order-optimal estimation of functionals of discrete distributions. arXiv abs/1406.6956. Retrieved from http://arxiv.org/abs/1406.6956
- Norman Lloyd Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. 1997. Discrete Multivariate Distributions. Vol. 165. Wiley, New York.
- Reut Levi, Dana Ron, and Ronitt Rubinfeld. 2013. Testing properties of collections of distributions. Theory Comput. 9 (2013), 295--347.
- Stephen J. Montgomery-Smith. 1990. The distribution of Rademacher sums. Proc. Amer. Math. Soc. 109, 2 (1990), 517--522.
- Ilan Newman. 2010. Property testing of massively parametrized problems—A survey. In Property Testing (Lecture Notes in Computer Science), Vol. 6390. Springer, 142--157.
- Ilan Newman and Mario Szegedy. 1996. Public vs. private coin flips in one round communication games. In Proceedings of the STOC. ACM, 561--570.
- Liam Paninski. 2004. Estimating entropy on m bins given fewer than m samples. IEEE Trans. Info. Theory 50, 9 (2004), 2200--2203.
- Liam Paninski. 2008. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Trans. Info. Theory 54, 10 (2008), 4750--4755.
- Jaak Peetre. 1968. A Theory of Interpolation of Normed Spaces. Instituto de Matemática Pura e Aplicada, Conselho Nacional de Pesquisas, Rio de Janeiro.
- David Pollard. 2003. Asymptopia. Retrieved from http://www.stat.yale.edu/pollard/Books/Asymptopia.
- Sofya Raskhodnikova, Dana Ron, Amir Shpilka, and Adam Smith. 2009. Strong lower bounds for approximating distribution support size and the distinct elements problem. SIAM J. Comput. 39, 3 (2009), 813--842.
- Ronitt Rubinfeld. 2012. Taming big probability distributions. XRDS: Crossroads ACM Mag. Students 19, 1 (Sep. 2012), 24.
- Ronitt Rubinfeld and Rocco A. Servedio. 2009. Testing monotone high-dimensional distributions. Random Struct. Algor. 34, 1 (Jan. 2009), 24--44.
- Ronitt Rubinfeld and Madhu Sudan. 1996. Robust characterization of polynomials with applications to program testing. SIAM J. Comput. 25, 2 (1996), 252--271.
- Gregory Valiant and Paul Valiant. 2010. A CLT and tight lower bounds for estimating entropy. Electr. Colloq. Comput. Complex. 17 (2010), 179.
- Gregory Valiant and Paul Valiant. 2010. Estimating the unseen: A sublinear-sample canonical estimator of distributions. Electr. Colloq. Comput. Complex. 17 (2010), 180.
- Gregory Valiant and Paul Valiant. 2011. Estimating the unseen: An n/log(n)-sample estimator for entropy and support size, shown optimal via new CLTs. In Proceedings of the STOC. ACM, 685--694.
- Gregory Valiant and Paul Valiant. 2011. The power of linear estimators. In Proceedings of the FOCS. 403--412. See also References [51] and [52].
- Gregory Valiant and Paul Valiant. 2017. An automatic inequality prover and instance optimal identity testing. SIAM J. Comput. 46, 1 (2017), 429--455.
- Paul Valiant. 2011. Testing symmetric properties of distributions. SIAM J. Comput. 40, 6 (2011), 1927--1968.
- Yihong Wu and Pengkun Yang. 2016. Minimax rates of entropy estimation on large alphabets via best polynomial approximation. IEEE Trans. Info. Theory 62, 6 (2016), 3702--3720.
- Bin Yu. 1997. Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam. Springer, 423--435.
Index Terms
- Distribution Testing Lower Bounds via Reductions from Communication Complexity
Recommendations
Distribution testing lower bounds via reductions from communication complexity
CCC '17: Proceedings of the 32nd Computational Complexity Conference