DOI: 10.1145/509907.509963
Article

Space lower bounds for distance approximation in the data stream model

Published: 19 May 2002

ABSTRACT

We consider the problem of approximating the distance between two d-dimensional vectors x and y in the data stream model. In this model, the 2d coordinates are presented as a "stream" of data in some arbitrary order, where each data item includes the index and value of some coordinate and a bit that identifies the vector (x or y) to which it belongs. The goal is to minimize the amount of memory needed to approximate the distance. For the case of $L_p$ distance with $p \in [1,2]$, there are good approximation algorithms that run in space polylogarithmic in d (here we assume that each coordinate is an integer with $O(\log d)$ bits). Here we prove that no such algorithms exist for $p > 2$. In particular, we prove an optimal approximation-space tradeoff for approximating the $L_\infty$ distance of two vectors. We show that any randomized algorithm that approximates the $L_\infty$ distance of two length-d vectors within a factor of $d^{\delta}$ requires $\Omega(d^{1-4\delta})$ space. As a consequence, we show that for $p > 2/(1-4\delta)$, any randomized algorithm that approximates the $L_p$ distance of two length-d vectors within a factor of $d^{\delta}$ requires $\Omega(d^{1-\frac{2}{p}-4\delta})$ space. The lower bound follows from a lower bound on the two-party one-round communication complexity of this problem. This lower bound is proved using a combination of information theory and Fourier analysis.
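To make the stream format concrete, here is a minimal Python sketch of the trivial baseline: it keeps the whole difference vector in memory (linear space in d) and returns the exact distance. It is not an algorithm from the paper, only an illustration of the model the lower bounds are measured against. The function names, the tuple layout `(index, value, which)`, and the use of the strings "x" and "y" for the identifying bit are assumptions made for this example. The second helper evaluates the exponent in the stated space bounds, reading them as 1 - 4δ for the L-infinity case and 1 - 2/p - 4δ for finite p.

```python
from fractions import Fraction


def stream_distance(stream, d, p):
    """Exact L_p distance of x and y from a coordinate stream.

    Each item is (index, value, which), with which in {"x", "y"}
    (a hypothetical encoding of the identifying bit). Items may
    arrive in any order. Memory use is O(d): this is the naive
    baseline, not a small-space streaming algorithm.
    """
    diff = [0] * d
    for index, value, which in stream:
        # Accumulate x[index] - y[index] one coordinate at a time.
        diff[index] += value if which == "x" else -value
    abs_diff = [abs(v) for v in diff]
    if p == float("inf"):
        return max(abs_diff)
    return sum(v ** p for v in abs_diff) ** (1.0 / p)


def lower_bound_exponent(delta, p=None):
    """Exponent e such that the stated bound reads Omega(d**e).

    p=None selects the L_infinity bound (1 - 4*delta); otherwise
    the L_p bound (1 - 2/p - 4*delta) is used.
    """
    delta = Fraction(delta)
    if p is None:
        return 1 - 4 * delta
    return 1 - Fraction(2) / Fraction(p) - 4 * delta


# x = (3, 5, 0) and y = (1, 1, 2), interleaved in arbitrary order.
stream = [(0, 3, "x"), (1, 1, "y"), (0, 1, "y"), (1, 5, "x"), (2, 2, "y")]
print(stream_distance(stream, d=3, p=float("inf")))
print(lower_bound_exponent(Fraction(1, 8)))
print(lower_bound_exponent(Fraction(1, 8), p=8))
```

With δ = 1/8, the L-infinity exponent is 1/2, and the finite-p exponent is positive exactly when p > 2/(1 - 4δ) = 4, which matches the threshold stated in the abstract.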

Published in

STOC '02: Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
May 2002
840 pages
ISBN: 1581134959
DOI: 10.1145/509907

Copyright © 2002 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

STOC '02 paper acceptance rate: 91 of 287 submissions, 32%
Overall acceptance rate: 1,469 of 4,586 submissions, 32%
