Toward a unified theory of sparse dimensionality reduction in Euclidean space

Abstract

Let $\Phi\in\mathbb{R}^{m\times n}$ be a sparse Johnson–Lindenstrauss transform (Kane and Nelson, J. ACM 61(1):4, 2014) with $s$ non-zeros per column. For a subset $T$ of the unit sphere and a given $\varepsilon\in(0,1/2)$, we study the settings of $m$ and $s$ required to ensure

$$\mathop{\mathbb{E}}_\Phi \sup_{x\in T}\left|\|\Phi x\|_2^2 - 1\right| < \varepsilon,$$

i.e., so that $\Phi$ preserves the norm of every $x\in T$ simultaneously and multiplicatively up to $1+\varepsilon$. We introduce a new complexity parameter, which depends on the geometry of $T$, and show that it suffices to choose $s$ and $m$ so that this parameter is small. Our result is a sparse analog of Gordon's theorem, which concerned a dense $\Phi$ with i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson–Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results on using the sparse Johnson–Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.
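
As a concrete illustration, the following is a minimal NumPy sketch of one standard sparse Johnson–Lindenstrauss construction in the spirit of Kane and Nelson [64]: each column of $\Phi$ receives exactly $s$ non-zero entries, each equal to $\pm 1/\sqrt{s}$, placed in $s$ distinct uniformly random rows. The parameters $n$, $m$, $s$, and the finite test set $T$ below are illustrative choices, not values prescribed by the paper's theorems; the script only estimates the distortion empirically for a single draw of $\Phi$, whereas the quantity above is an expectation over $\Phi$.

```python
import numpy as np

def sparse_jl(m, n, s, rng):
    """Sparse JL transform in the style of Kane-Nelson [64]: each column
    holds exactly s non-zeros, each +-1/sqrt(s), in s distinct random rows."""
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)  # s distinct rows for column j
        signs = rng.choice([-1.0, 1.0], size=s)      # independent random signs
        Phi[rows, j] = signs / np.sqrt(s)
    return Phi

rng = np.random.default_rng(0)
n, m, s = 1000, 250, 8                               # illustrative parameters
T = rng.standard_normal((100, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)        # a finite subset of the unit sphere

Phi = sparse_jl(m, n, s, rng)
sq_norms = np.sum((T @ Phi.T) ** 2, axis=1)          # ||Phi x||_2^2 for each x in T
print("sup over T of | ||Phi x||_2^2 - 1 | =", np.max(np.abs(sq_norms - 1.0)))
```

Shrinking $m$ or $s$ and watching this empirical supremum grow gives a rough feel for the trade-off that the paper quantifies in terms of the geometry of $T$.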

References

  1. H. Avron, C. Boutsidis, S. Toledo, and A. Zouzias. Efficient dimensionality reduction for canonical correlation analysis. In Proceedings of the 30th International Conference on Machine Learning (ICML) (2013).

  2. N. Ailon and B. Chazelle. The fast Johnson–Lindenstrauss transform and approximate nearest neighbors. SIAM Journal on Computing, 39(1) (2009), 302–322.

  3. D. Achlioptas. Database-friendly random projections: Johnson–Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4) (2003), 671–687.

  4. U. Ayaz, S. Dirksen, and H. Rauhut. Uniform recovery of fusion frame structured sparse signals. CoRR (2014). arXiv:1407.7680.

  5. S. Arora, E. Hazan, and S. Kale. A fast random sampling algorithm for sparsifying matrices. In APPROX-RANDOM (2006), pp. 272–279.

  6. N. Ailon and E. Liberty. Fast dimension reduction using Rademacher series on dual BCH codes. Discrete & Computational Geometry, 42(4) (2009), 615–630.

  7. N. Ailon and E. Liberty. An almost optimal unrestricted fast Johnson–Lindenstrauss transform. ACM Transactions on Algorithms, 9(3) (2013), 21.

  8. N. Alon. Problems and results in extremal combinatorics—I. Discrete Mathematics, 273(1–3) (2003), 31–53.

  9. H. Avron, P. Maymounkov, and S. Toledo. Blendenpik: supercharging LAPACK's least-squares solver. SIAM Journal on Scientific Computing, 32(3) (2010), 1217–1236.

  10. A. Andoni and H.L. Nguyễn. Eigenvalues of a matrix in the streaming model. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 1729–1737.

  11. M.-F. Balcan and A. Blum. A PAC-style model for learning from labeled and unlabeled data. In Proceedings of the 18th Annual Conference on Learning Theory (COLT) (2005), pp. 111–126.

  12. J. Blocki, A. Blum, A. Datta, and O. Sheffet. The Johnson–Lindenstrauss transform itself preserves differential privacy. In Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2012), pp. 410–419.

  13. M.-F. Balcan, A. Blum, and S. Vempala. Kernels as features: on kernels, margins, and low-dimensional mappings. Machine Learning, 65(1) (2006), 79–94.

  14. R.G. Baraniuk, V. Cevher, M.F. Duarte, and C. Hegde. Model-based compressive sensing. IEEE Transactions on Information Theory, 56 (2010), 1982–2001.

  15. M. Badoiu, J. Chuzhoy, P. Indyk, and A. Sidiropoulos. Low-distortion embeddings of general metrics into the line. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC) (2005), pp. 225–233.

  16. T. Blumensath and M.E. Davies. Iterative hard thresholding for compressed sensing. Journal of Fourier Analysis and Applications, 14 (2008), 629–654.

  17. J. Bourgain, S. Dilworth, K. Ford, S. Konyagin, and D. Kutzarova. Explicit constructions of RIP matrices and related problems. Duke Mathematical Journal, 159(1) (2011), 145–185.

  18. B. Berger. The fourth moment method. SIAM Journal on Computing, 26(4) (1997), 1188–1207.

  19. J. Bourgain, J. Lindenstrauss, and V.D. Milman. Approximation of zonoids by zonotopes. Acta Mathematica, 162 (1989), 73–141.

  20. V. Braverman, R. Ostrovsky, and Y. Rabani. Rademacher chaos, random Eulerian graphs and the sparse Johnson-Lindenstrauss transform. CoRR (2010). arXiv:1011.2590.

  21. J. Bourgain, A. Pajor, S.J. Szarek, and N. Tomczak-Jaegermann. On the duality problem for entropy numbers of operators. Geometric Aspects of Functional Analysis, Lecture Notes in Mathematics, 1376 (1989), 50–163.

  22. J. Buhler and M. Tompa. Finding motifs using random projections. Journal of Computational Biology, 9(2) (2002), 225–242.

  23. P. Bühlmann and S. van de Geer. Statistics for High-Dimensional Data. Springer, Heidelberg (2011).

  24. R.G. Baraniuk and M.B. Wakin. Random projections of smooth manifolds. Foundations of Computational Mathematics, 9(1) (2009), 51–77.

  25. C. Boutsidis, A. Zouzias, M.W. Mahoney, and P. Drineas. Stochastic dimensionality reduction for k-means clustering. CoRR (2011). arXiv:1110.2897.

  26. E. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10) (2008), 589–592.

  27. B. Carl. Inequalities of Bernstein-Jackson-type and the degree of compactness of operators in Banach spaces. Annales de l'Institut Fourier (Grenoble), 35(3) (1985), 79–118.

  28. M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. Theoretical Computer Science, 312(1) (2004), 3–15.

  29. K.L. Clarkson, P. Drineas, M. Magdon-Ismail, M.W. Mahoney, X. Meng, and D.P. Woodruff. The Fast Cauchy Transform and faster robust linear regression. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 466–477.

  30. M. Cohen, S. Elder, C. Musco, C. Musco, and M. Persu. Dimensionality reduction for k-means clustering and low rank approximation. CoRR (2014). arXiv:1410.6801.

  31. M. Cheraghchi, V. Guruswami, and A. Velingker. Restricted isometry of Fourier matrices and list decodability of random linear codes. SIAM Journal on Computing, 42(5) (2013), 1888–1914.

  32. K.L. Clarkson. Tighter bounds for random projections of manifolds. In Proceedings of the 24th ACM Symposium on Computational Geometry (SoCG) (2008), pp. 39–48.

  33. P. Contreras and F. Murtagh. Fast, linear time hierarchical clustering using the Baire metric. Journal of Classification, 29(2) (2012), 118–143.

  34. M.B. Cohen. Personal communication (2014).

  35. J.W. Cooley and J.M. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19 (1965), 297–301.

  36. E. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12) (2005), 4203–4215.

  37. E.J. Candès and T. Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Transactions on Information Theory, 52 (2006), 5406–5425.

  38. K.L. Clarkson and D.P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC) (2009), pp. 205–214.

  39. K.L. Clarkson and D.P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 81–90.

  40. D.L. Donoho and C. Grimes. Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10) (2003), 5591–5596.

  41. S. Dirksen. Tail bounds via generic chaining. Electronic Journal of Probability, 20(53) (2015), 1–29.

  42. S. Dirksen. Dimensionality reduction with subgaussian matrices: a unified theory. CoRR (2014). arXiv:1402.3973.

  43. A. Dasgupta, R. Kumar, and T. Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC) (2010), pp. 341–350.

  44. P. Drineas, M. Magdon-Ismail, M. Mahoney, and D. Woodruff. Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13 (2012), 3475–3506.

  45. D. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4) (2006), 1289–1306.

  46. R.M. Dudley. The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. Journal of Functional Analysis, 1 (1967), 290–330.

  47. A. Eftekhari and M.B. Wakin. New analysis of manifold embeddings and signal recovery from compressive measurements. CoRR (2013). arXiv:1306.4748.

  48. X. Fernique. Régularité des trajectoires des fonctions aléatoires gaussiennes. Lecture Notes in Mathematics, 480 (1975), 1–96.

  49. T. Figiel. On the moduli of convexity and smoothness. Studia Mathematica, 56(2) (1976), 121–155.

  50. S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser, Boston (2013).

  51. O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Subspaces and orthogonal decompositions generated by bounded orthogonal systems. Positivity, 11(2) (2007), 269–283.

  52. O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Majorizing measures and proportional subsets of bounded orthonormal systems. Revista Matemática Iberoamericana, 24(3) (2008), 1075–1095.

  53. Y. Gordon. On Milman's inequality and random subspaces which escape through a mesh in $\mathbb{R}^n$. Geometric Aspects of Functional Analysis (1988), 84–106.

  54. S. Har-Peled, P. Indyk, and R. Motwani. Approximate nearest neighbor: towards removing the curse of dimensionality. Theory of Computing, 8(1) (2012), 321–350.

  55. J.-B. Hiriart-Urruty and C. Lemaréchal. Fundamentals of Convex Analysis. Springer-Verlag, Berlin (2001).

  56. C. Hegde, M. Wakin, and R. Baraniuk. Random projections for manifold learning. In Advances in Neural Information Processing Systems (NIPS) (2007), pp. 641–648.

  57. P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science (FOCS) (2001), pp. 10–33.

  58. P. Indyk and I. Razenshteyn. On model-based RIP-1 matrices. In Proceedings of the 40th International Colloquium on Automata, Languages and Programming (ICALP) (2013), pp. 564–575.

  59. W.B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26 (1984), 189–206.

  60. J.-P. Kahane. Some Random Series of Functions. Heath Math. Monographs. Cambridge University Press, Cambridge (1968).

  61. B. Klartag and S. Mendelson. Empirical processes and random projections. Journal of Functional Analysis, 225(1) (2005), 229–245.

  62. F. Krahmer, S. Mendelson, and H. Rauhut. Suprema of chaos processes and the restricted isometry property. Communications on Pure and Applied Mathematics, 67(11) (2014), 1877–1904.

  63. D.M. Kane and J. Nelson. A derandomized sparse Johnson–Lindenstrauss transform. CoRR (2010). arXiv:1006.3585.

  64. D.M. Kane and J. Nelson. Sparser Johnson–Lindenstrauss transforms. Journal of the ACM, 61(1) (2014), 4.

  65. F. Krahmer and R. Ward. New and improved Johnson–Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis, 43(3) (2011), 1269–1281.

  66. G. Kovács, S. Zucker, and T. Mazeh. A box-fitting algorithm in the search for periodic transits. Astronomy and Astrophysics, 391 (2002), 369–377.

  67. Y. Lu, P. Dhillon, D. Foster, and L. Ungar. Faster ridge regression via the subsampled randomized Hadamard transform. In Proceedings of the 26th Annual Conference on Advances in Neural Information Processing Systems (NIPS) (2013).

  68. K.G. Larsen and J. Nelson. The Johnson–Lindenstrauss lemma is optimal for linear dimensionality reduction. Manuscript (2014).

  69. F. Lust-Piquard. Inégalités de Khintchine dans $C_p$ $(1 < p < \infty)$. Comptes Rendus de l'Académie des Sciences Paris, 303(7) (1986), 289–292.

  70. F. Lust-Piquard and G. Pisier. Noncommutative Khintchine and Paley inequalities. Arkiv för Matematik, 29(2) (1991), 241–260.

  71. Y.T. Lee and A. Sidford. Matching the universal barrier without paying the costs: solving linear programs with $\tilde{O}(\sqrt{\mathrm{rank}})$ linear system solves. CoRR (2013). arXiv:1312.6677.

  72. J. Matoušek. On variants of the Johnson–Lindenstrauss lemma. Random Structures & Algorithms, 33(2) (2008), 142–156.

  73. X. Meng and M.W. Mahoney. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 91–100.

  74. S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Reconstruction and subgaussian operators in asymptotic geometric analysis. Geometric and Functional Analysis, 17(4) (2007), 1248–1282.

  75. R. Meise and D. Vogt. Introduction to Functional Analysis. Oxford Graduate Texts in Mathematics (Book 2). Oxford University Press, Oxford (1997).

  76. J. Nelson and H.L. Nguyễn. OSNAP: faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2013), pp. 117–126.

  77. J. Nelson and H.L. Nguyễn. Sparsity lower bounds for dimensionality-reducing maps. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 101–110.

  78. J. Nelson, E. Price, and M. Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2014).

  79. D. Needell and J.A. Tropp. CoSaMP: iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26 (2009), 301–332.

  80. S. Paul, C. Boutsidis, M. Magdon-Ismail, and P. Drineas. Random projections for support vector machines. In Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) (2013), pp. 498–506.

  81. A. Pietsch. Theorie der Operatorenideale (Zusammenfassung). Friedrich-Schiller-Universität Jena (1972).

  82. G. Pisier. Probabilistic methods in the geometry of Banach spaces. Probability and Analysis, Lecture Notes in Mathematics, 1206 (1986), 167–241.

  83. G. Pisier. Remarques sur un résultat non public de B. Maurey. Séminaire d’analyse fonctionnelle, Exp. V., (1980/81).

  84. A. Pajor and N. Tomczak-Jaegermann. Subspaces of small codimension of finite dimensional Banach spaces. Proceedings of the American Mathematical Society, 97 (1986), 637–642.

  85. M. Pilanci and M. Wainwright. Randomized sketches of convex programs with sharp guarantees. CoRR (2014). arXiv:1404.7203.

  86. R. Rockafellar. Convex Analysis. Princeton University Press, Princeton (1970).

  87. S.T. Roweis and L.K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290 (2000), 2323–2326.

  88. M. Rudelson and R. Vershynin. Sampling from large matrices: an approach through geometric functional analysis. Journal of the ACM, 54(4) (2007).

  89. M. Rudelson and R. Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Communications on Pure and Applied Mathematics, 61(8) (2008), 1025–1045.

  90. T. Sarlós. Improved approximation algorithms for large matrices via random projections. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2006), pp. 143–152.

  91. D.A. Spielman and N. Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6) (2011), 1913–1926.

  92. M. Talagrand. The Generic Chaining: Upper and Lower Bounds of Stochastic Processes. Springer-Verlag, Berlin (2005).

  93. J.B. Tenenbaum, V. de Silva, and J.C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500) (2000), 2319–2323.

  94. R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1) (1996), 267–288.

  95. J.A. Tropp. The random paving property for uniformly bounded matrices. Studia Mathematica, 185 (2008), 67–82.

  96. J.A. Tropp. Improved analysis of the subsampled randomized Hadamard transform. Advances in Adaptive Data Analysis, 3(1–2) (2011), 115–126.

  97. A. Vanderburg and J.A. Johnson. A technique for extracting highly precise photometry for the two-wheeled Kepler mission. CoRR (2014). arXiv:1408.3853.

  98. K.Q. Weinberger, A. Dasgupta, J. Langford, A.J. Smola, and J. Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML) (2009), pp. 1113–1120.

  99. D.P. Woodruff. Personal communication (2014).

  100. D.P. Woodruff and Q. Zhang. Subspace embeddings and $\ell_p$ regression using exponential random variables. In Proceedings of the 26th Conference on Learning Theory (COLT) (2013).

Author information

Corresponding author

Correspondence to Jean Bourgain.

Additional information

J.B. partially supported by NSF Grant DMS-1301619. S.D. partially supported by SFB Grant 1060 of the Deutsche Forschungsgemeinschaft (DFG).

J.N. supported by NSF Grant IIS-1447471 and CAREER Award CCF-1350670, ONR Grant N00014-14-1-0632, and a Google Faculty Research Award. Part of this work was done while supported by NSF Grants CCF-0832797 and DMS-1128155.

About this article

Cite this article

Bourgain, J., Dirksen, S. & Nelson, J. Toward a unified theory of sparse dimensionality reduction in Euclidean space. Geom. Funct. Anal. 25, 1009–1088 (2015). https://doi.org/10.1007/s00039-015-0332-9
