Abstract
We design a new distribution over m × n matrices S so that, for any fixed n × d matrix A of rank r, with probability at least 9/10, ∥SAx∥₂ = (1 ± ε)∥Ax∥₂ simultaneously for all x ∈ ℝᵈ. Here, m is bounded by a polynomial in rε⁻¹, and the parameter ε ∈ (0, 1]. Such a matrix S is called a subspace embedding. Furthermore, SA can be computed in O(nnz(A)) time, where nnz(A) is the number of nonzero entries of A. This improves over all previous subspace embeddings, for which computing SA required at least Ω(nd log d) time. We call these S sparse embedding matrices.
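For intuition, such an S can be realized with one uniformly random row index and one random sign per column, as in the CountSketch construction. The following NumPy code is a minimal illustrative sketch of that construction (dense arrays for simplicity, and a hypothetical `sparse_embedding` helper name), not the paper's implementation:

```python
import numpy as np

def sparse_embedding(A, m, rng=None):
    """Apply an m x n sparse embedding S to A (n x d).

    Each column of S has exactly one nonzero entry: a random sign placed
    in a uniformly random row. Forming S @ A is then one pass over the
    rows of A, i.e., O(nnz(A)) time with a sparse representation of A.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    rows = rng.integers(0, m, size=n)        # hash coordinate i to row h(i) of S
    signs = rng.choice([-1.0, 1.0], size=n)  # independent random sign sigma_i
    SA = np.zeros((m, d))
    for i in range(n):
        SA[rows[i]] += signs[i] * A[i]       # (SA)_{h(i)} += sigma_i * A_i
    return SA
```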
Using our sparse embedding matrices, we obtain the fastest known algorithms for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and ℓₚ regression.
More specifically, let b be an n × 1 vector, let ε > 0 be a small enough value, and let k, p ⩾ 1 be integers. Our results include the following; illustrative code sketches for each appear after the list.
—Regression: The regression problem is to find a d × 1 vector x′ for which ∥Ax′ − b∥ₚ ⩽ (1 + ε) minₓ ∥Ax − b∥ₚ. For the Euclidean case p = 2, we obtain an algorithm running in O(nnz(A)) + Õ(d³ε⁻²) time, and another in O(nnz(A) log(1/ε)) + Õ(d³ log(1/ε)) time. (Here, Õ(f) = f · log^{O(1)}(f).) More generally, for p ∈ [1, ∞), we obtain an algorithm running in O(nnz(A) log n) + O(rε⁻¹)^C time, for a fixed constant C.
—Low-rank approximation: We give an algorithm to obtain a rank-k matrix Âₖ such that ∥A − Âₖ∥F ⩽ (1 + ε)∥A − Aₖ∥F, where Aₖ is the best rank-k approximation to A. (That is, Aₖ is the output of principal components analysis, produced by a truncated singular value decomposition, useful for latent semantic indexing and many other statistical problems.) Our algorithm runs in O(nnz(A)) + Õ(nk²ε⁻⁴ + k³ε⁻⁵) time.
—Leverage scores: We give an algorithm to estimate the leverage scores of A, up to a constant factor, in O(nnz(A) log n) + Õ(r³) time.
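For the p = 2 regression result, a natural reading of the first algorithm is the sketch-and-solve paradigm: sketch A and b with the same S, then solve the small m × d problem exactly. A minimal sketch follows, assuming the `sparse_embedding` helper above and a caller-chosen m (a suitable poly(d/ε) suffices for this construction):

```python
def sketched_least_squares(A, b, m, rng=None):
    """Sketch-and-solve for min_x ||Ax - b||_2 (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    SAb = sparse_embedding(np.column_stack([A, b]), m, rng)  # sketch [A b] jointly
    SA, Sb = SAb[:, :-1], SAb[:, -1]
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)              # solve the small m x d problem
    return x
```

Since S embeds the column span of [A b], the minimizer of the sketched problem has residual within a (1 + O(ε)) factor of optimal for the original problem.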
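For the low-rank result, one simplified variant (not the paper's exact algorithm) sketches A, projects A onto the row space of SA, and takes the best rank-k matrix inside that subspace:

```python
def sketched_low_rank(A, k, m, rng=None):
    """Rank-k approximation via a sparse embedding (simplified variant)."""
    rng = np.random.default_rng() if rng is None else rng
    SA = sparse_embedding(A, m, rng)           # m x d sketch of A
    _, _, Vt = np.linalg.svd(SA, full_matrices=False)
    AV = A @ Vt.T                              # coordinates of A in span(rows of SA)
    U, s, Wt = np.linalg.svd(AV, full_matrices=False)
    return (U[:, :k] * s[:k]) @ (Wt[:k] @ Vt)  # best rank-k within that subspace
```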
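For the leverage-score result, a standard sketch-based recipe (again an illustrative sketch, assuming A has full column rank): compute R from a QR factorization of SA; then A R⁻¹ has nearly orthonormal columns whenever S is a subspace embedding, so its squared row norms approximate the leverage scores up to a constant factor. The stated running time additionally uses a Johnson-Lindenstrauss projection to estimate the row norms, which is omitted here.

```python
def approx_leverage_scores(A, m, rng=None):
    """Constant-factor leverage-score estimates (illustrative; needs m >= d)."""
    rng = np.random.default_rng() if rng is None else rng
    SA = sparse_embedding(A, m, rng)
    _, R = np.linalg.qr(SA)             # R is d x d, invertible for full-rank A
    AR = np.linalg.solve(R.T, A.T).T    # A @ inv(R) without forming the inverse
    return (AR ** 2).sum(axis=1)        # squared row norms ~ leverage scores
```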