Toward a unified theory of sparse dimensionality reduction in Euclidean space

Abstract

Let $\Phi\in\mathbb{R}^{m\times n}$ be a sparse Johnson–Lindenstrauss transform (Kane and Nelson, J. ACM 61(1):4, 2014) with $s$ non-zeros per column. For a subset $T$ of the unit sphere and a given $\varepsilon\in(0,1/2)$, we study the settings of $m$ and $s$ required to ensure

$$\mathop{\mathbb{E}}_\Phi \sup_{x\in T}\left|\|\Phi x\|_2^2 - 1\right| < \varepsilon,$$

i.e., so that $\Phi$ preserves the norm of every $x\in T$ simultaneously and multiplicatively up to $1+\varepsilon$. We introduce a new complexity parameter, which depends on the geometry of $T$, and show that it suffices to choose $s$ and $m$ so that this parameter is small. Our result is a sparse analog of Gordon's theorem, which concerned a dense $\Phi$ with i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson–Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results on using the sparse Johnson–Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.
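
As a concrete illustration, the following is a minimal NumPy sketch of one standard sparse Johnson–Lindenstrauss construction in the spirit of Kane and Nelson [64]: each column of $\Phi$ receives exactly $s$ non-zero entries, each equal to $\pm 1/\sqrt{s}$, placed in $s$ distinct uniformly random rows. The parameters $n$, $m$, $s$, and the finite test set $T$ below are illustrative choices, not values prescribed by the paper's theorems; the script only estimates the distortion empirically for a single draw of $\Phi$, whereas the quantity above is an expectation over $\Phi$.

```python
import numpy as np

def sparse_jl(m, n, s, rng):
    """Sparse JL transform in the style of Kane-Nelson [64]: each column
    holds exactly s non-zeros, each +-1/sqrt(s), in s distinct random rows."""
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)  # s distinct rows for column j
        signs = rng.choice([-1.0, 1.0], size=s)      # independent random signs
        Phi[rows, j] = signs / np.sqrt(s)
    return Phi

rng = np.random.default_rng(0)
n, m, s = 1000, 250, 8                               # illustrative parameters
T = rng.standard_normal((100, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)        # a finite subset of the unit sphere

Phi = sparse_jl(m, n, s, rng)
sq_norms = np.sum((T @ Phi.T) ** 2, axis=1)          # ||Phi x||_2^2 for each x in T
print("sup over T of | ||Phi x||_2^2 - 1 | =", np.max(np.abs(sq_norms - 1.0)))
```

Shrinking $m$ or $s$ and watching this empirical supremum grow gives a rough feel for the trade-off that the paper quantifies in terms of the geometry of $T$.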

References

  1. H. Avron, C. Boutsidis, S. Toledo, and A. Zouzias. Efficient dimensionality reduction for canonical correlation analysis. In Proceedings of the 30th International Conference on Machine Learning (ICML) (2013).

  2. N. Ailon and B. Chazelle. The fast Johnson–Lindenstrauss transform and approximate nearest neighbors. SIAM Journal on Computing, 39(1) (2009), 302–322.

  3. D. Achlioptas. Database-friendly random projections: Johnson–Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4) (2003), 671–687.

  4. U. Ayaz, S. Dirksen, and H. Rauhut. Uniform recovery of fusion frame structured sparse signals. CoRR (2014). arXiv:1407.7680.

  5. S. Arora, E. Hazan, and S. Kale. A fast random sampling algorithm for sparsifying matrices. In APPROX-RANDOM (2006), pp. 272–279.

  6. N. Ailon and E. Liberty. Fast dimension reduction using Rademacher series on dual BCH codes. Discrete & Computational Geometry, 42(4) (2009), 615–630.

  7. N. Ailon and E. Liberty. An almost optimal unrestricted fast Johnson–Lindenstrauss transform. ACM Transactions on Algorithms, 9(3) (2013), 21.

  8. N. Alon. Problems and results in extremal combinatorics—I. Discrete Mathematics, 273(1–3) (2003), 31–53.

  9. H. Avron, P. Maymounkov, and S. Toledo. Blendenpik: supercharging LAPACK's least-squares solver. SIAM Journal on Scientific Computing, 32(3) (2010), 1217–1236.

  10. A. Andoni and H.L. Nguyễn. Eigenvalues of a matrix in the streaming model. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 1729–1737.

  11. M.-F. Balcan and A. Blum. A PAC-style model for learning from labeled and unlabeled data. In Proceedings of the 18th Annual Conference on Learning Theory (COLT) (2005), pp. 111–126.

  12. J. Blocki, A. Blum, A. Datta, and O. Sheffet. The Johnson–Lindenstrauss transform itself preserves differential privacy. In Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2012), pp. 410–419.

  13. M.-F. Balcan, A. Blum, and S. Vempala. Kernels as features: on kernels, margins, and low-dimensional mappings. Machine Learning, 65(1) (2006), 79–94.

  14. R.G. Baraniuk, V. Cevher, M.F. Duarte, and C. Hegde. Model-based compressive sensing. IEEE Transactions on Information Theory, 56 (2010), 1982–2001.

  15. M. Badoiu, J. Chuzhoy, P. Indyk, and A. Sidiropoulos. Low-distortion embeddings of general metrics into the line. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing (STOC) (2005), pp. 225–233.

  16. T. Blumensath and M.E. Davies. Iterative hard thresholding for compressed sensing. Journal of Fourier Analysis and Applications, 14 (2008), 629–654.

  17. J. Bourgain, S. Dilworth, K. Ford, S. Konyagin, and D. Kutzarova. Explicit constructions of RIP matrices and related problems. Duke Mathematical Journal, 159(1) (2011), 145–185.

  18. B. Berger. The fourth moment method. SIAM Journal on Computing, 26(4) (1997), 1188–1207.

  19. J. Bourgain, J. Lindenstrauss, and V.D. Milman. Approximation of zonoids by zonotopes. Acta Mathematica, 162 (1989), 73–141.

  20. V. Braverman, R. Ostrovsky, and Y. Rabani. Rademacher chaos, random Eulerian graphs and the sparse Johnson-Lindenstrauss transform. CoRR (2010). arXiv:1011.2590.

  21. J. Bourgain, A. Pajor, S.J. Szarek, and N. Tomczak-Jaegermann. On the duality problem for entropy numbers of operators. Geometric Aspects of Functional Analysis, Lecture Notes in Mathematics, 1376 (1989), 50–163.

  22. J. Buhler and M. Tompa. Finding motifs using random projections. Journal of Computational Biology, 9(2) (2002), 225–242.

  23. P. Bühlmann and S. van de Geer. Statistics for High-Dimensional Data. Springer, Heidelberg (2011).

  24. R.G. Baraniuk and M.B. Wakin. Random projections of smooth manifolds. Foundations of Computational Mathematics, 9(1) (2009), 51–77.

  25. C. Boutsidis, A. Zouzias, M.W. Mahoney, and P. Drineas. Stochastic dimensionality reduction for k-means clustering. CoRR (2011). arXiv:1110.2897.

  26. E. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10) (2008), 589–592.

  27. B. Carl. Inequalities of Bernstein-Jackson-type and the degree of compactness of operators in Banach spaces. Annales de l'Institut Fourier (Grenoble), 35(3) (1985), 79–118.

  28. M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. Theoretical Computer Science, 312(1) (2004), 3–15.

  29. K.L. Clarkson, P. Drineas, M. Magdon-Ismail, M.W. Mahoney, X. Meng, and D.P. Woodruff. The Fast Cauchy Transform and faster robust linear regression. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 466–477.

  30. M. Cohen, S. Elder, C. Musco, C. Musco, and M. Persu. Dimensionality reduction for k-means clustering and low rank approximation. CoRR (2014). arXiv:1410.6801.

  31. M. Cheraghchi, V. Guruswami, and A. Velingker. Restricted isometry of Fourier matrices and list decodability of random linear codes. SIAM Journal on Computing, 42(5) (2013), 1888–1914.

  32. K.L. Clarkson. Tighter bounds for random projections of manifolds. In Proceedings of the 24th ACM Symposium on Computational Geometry (SoCG) (2008), pp. 39–48.

  33. P. Contreras and F. Murtagh. Fast, linear time hierarchical clustering using the Baire metric. Journal of Classification, 29(2) (2012), 118–143.

  34. M.B. Cohen. Personal communication (2014).

  35. J.W. Cooley and J.M. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19 (1965), 297–301.

  36. E. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12) (2005), 4203–4215.

  37. E.J. Candès and T. Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Transactions on Information Theory, 52 (2006), 5406–5425.

  38. K.L. Clarkson and D.P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC) (2009), pp. 205–214.

  39. K.L. Clarkson and D.P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 81–90.

  40. D.L. Donoho and C. Grimes. Hessian eigenmaps: locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10) (2003), 5591–5596.

  41. S. Dirksen. Tail bounds via generic chaining. Electronic Journal of Probability, 20(53) (2015), 1–29.

  42. S. Dirksen. Dimensionality reduction with subgaussian matrices: a unified theory. CoRR (2014). arXiv:1402.3973.

  43. A. Dasgupta, R. Kumar, and T. Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC) (2010), pp. 341–350.

  44. P. Drineas, M. Magdon-Ismail, M. Mahoney, and D. Woodruff. Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13 (2012), 3475–3506.

  45. D. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4) (2006), 1289–1306.

  46. R.M. Dudley. The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. Journal of Functional Analysis, 1 (1967), 290–330.

  47. A. Eftekhari and M.B. Wakin. New analysis of manifold embeddings and signal recovery from compressive measurements. CoRR (2013). arXiv:1306.4748.

  48. X. Fernique. Régularité des trajectoires des fonctions aléatoires gaussiennes. Lecture Notes in Mathematics, 480 (1975), 1–96.

  49. T. Figiel. On the moduli of convexity and smoothness. Studia Mathematica, 56(2) (1976), 121–155.

  50. S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser, Boston (2013).

  51. O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Subspaces and orthogonal decompositions generated by bounded orthogonal systems. Positivity, 11(2) (2007), 269–283.

  52. O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Majorizing measures and proportional subsets of bounded orthonormal systems. Revista Matemática Iberoamericana, 24(3) (2008), 1075–1095.

  53. Y. Gordon. On Milman's inequality and random subspaces which escape through a mesh in $\mathbb{R}^n$. Geometric Aspects of Functional Analysis (1988), 84–106.

  54. S. Har-Peled, P. Indyk, and R. Motwani. Approximate nearest neighbor: towards removing the curse of dimensionality. Theory of Computing, 8(1) (2012), 321–350.

  55. J.-B. Hiriart-Urruty and C. Lemaréchal. Fundamentals of Convex Analysis. Springer-Verlag, Berlin (2001).

  56. C. Hegde, M. Wakin, and R. Baraniuk. Random projections for manifold learning. In Advances in Neural Information Processing Systems (NIPS) (2007), pp. 641–648.

  57. P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science (FOCS) (2001), pp. 10–33.

  58. P. Indyk and I. Razenshteyn. On model-based RIP-1 matrices. In Proceedings of the 40th International Colloquium on Automata, Languages and Programming (ICALP) (2013), pp. 564–575.

  59. W.B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26 (1984), 189–206.

  60. J.-P. Kahane. Some Random Series of Functions. Heath Math. Monographs. Cambridge University Press, Cambridge (1968).

  61. B. Klartag and S. Mendelson. Empirical processes and random projections. Journal of Functional Analysis, 225(1) (2005), 229–245.

  62. F. Krahmer, S. Mendelson, and H. Rauhut. Suprema of chaos processes and the restricted isometry property. Communications on Pure and Applied Mathematics, 67(11) (2014), 1877–1904.

  63. D.M. Kane and J. Nelson. A derandomized sparse Johnson–Lindenstrauss transform. CoRR (2010). arXiv:1006.3585.

  64. D.M. Kane and J. Nelson. Sparser Johnson–Lindenstrauss transforms. Journal of the ACM, 61(1) (2014), 4.

  65. F. Krahmer and R. Ward. New and improved Johnson–Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis, 43(3) (2011), 1269–1281.

  66. G. Kovács, S. Zucker, and T. Mazeh. A box-fitting algorithm in the search for periodic transits. Astronomy and Astrophysics, 391 (2002), 369–377.

  67. Y. Lu, P. Dhillon, D. Foster, and L. Ungar. Faster ridge regression via the subsampled randomized Hadamard transform. In Proceedings of the 26th Annual Conference on Advances in Neural Information Processing Systems (NIPS) (2013).

  68. K.G. Larsen and J. Nelson. The Johnson–Lindenstrauss lemma is optimal for linear dimensionality reduction. Manuscript (2014).

  69. F. Lust-Piquard. Inégalités de Khintchine dans $C_p$ $(1 < p < \infty)$. Comptes Rendus de l'Académie des Sciences Paris, 303(7) (1986), 289–292.

  70. F. Lust-Piquard and G. Pisier. Noncommutative Khintchine and Paley inequalities. Arkiv för Matematik, 29(2) (1991), 241–260.

  71. Y.T. Lee and A. Sidford. Matching the universal barrier without paying the costs: solving linear programs with $\tilde{O}(\sqrt{\mathrm{rank}})$ linear system solves. CoRR (2013). arXiv:1312.6677.

  72. J. Matoušek. On variants of the Johnson–Lindenstrauss lemma. Random Structures & Algorithms, 33(2) (2008), 142–156.

  73. X. Meng and M.W. Mahoney. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 91–100.

  74. S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Reconstruction and subgaussian operators in asymptotic geometric analysis. Geometric and Functional Analysis, 17(4) (2007), 1248–1282.

  75. R. Meise and D. Vogt. Introduction to Functional Analysis. Oxford Graduate Texts in Mathematics (Book 2). Oxford University Press, Oxford (1997).

  76. J. Nelson and H.L. Nguyễn. OSNAP: faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2013), pp. 117–126.

  77. J. Nelson and H.L. Nguyễn. Sparsity lower bounds for dimensionality-reducing maps. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 101–110.

  78. J. Nelson, E. Price, and M. Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2014).

  79. D. Needell and J.A. Tropp. CoSaMP: iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26 (2009), 301–332.

  80. S. Paul, C. Boutsidis, M. Magdon-Ismail, and P. Drineas. Random projections for support vector machines. In Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) (2013), pp. 498–506.

  81. A. Pietsch. Theorie der Operatorenideale (Zusammenfassung). Friedrich-Schiller-Universität Jena (1972).

  82. G. Pisier. Probabilistic methods in the geometry of Banach spaces. Probability and Analysis, Lecture Notes in Mathematics, 1206 (1986), 167–241.

  83. G. Pisier. Remarques sur un résultat non public de B. Maurey. Séminaire d’analyse fonctionnelle, Exp. V., (1980/81).

  84. A. Pajor and N. Tomczak-Jaegermann. Subspaces of small codimension of finite dimensional Banach spaces. Proceedings of the American Mathematical Society, 97 (1986), 637–642.

  85. M. Pilanci and M. Wainwright. Randomized sketches of convex programs with sharp guarantees. CoRR (2014). arXiv:1404.7203.

  86. R. Rockafellar. Convex Analysis. Princeton University Press, Princeton (1970).

  87. S.T. Roweis and L.K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290 (2000), 2323–2326.

  88. M. Rudelson and R. Vershynin. Sampling from large matrices: an approach through geometric functional analysis. Journal of the ACM, 54(4) (2007).

  89. M. Rudelson and R. Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Communications on Pure and Applied Mathematics, 61(8) (2008), 1025–1045.

  90. T. Sarlós. Improved approximation algorithms for large matrices via random projections. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2006), pp. 143–152.

  91. D.A. Spielman and N. Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6) (2011), 1913–1926.

  92. M. Talagrand. The Generic Chaining: Upper and Lower Bounds of Stochastic Processes. Springer-Verlag, Berlin (2005).

  93. J.B. Tenenbaum, V. de Silva, and J.C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500) (2000), 2319–2323.

  94. R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1) (1996), 267–288.

  95. J.A. Tropp. The random paving property for uniformly bounded matrices. Studia Mathematica, 185 (2008), 67–82.

  96. J.A. Tropp. Improved analysis of the subsampled randomized Hadamard transform. Advances in Adaptive Data Analysis, 3(1–2) (2011), 115–126.

  97. A. Vanderburg and J.A. Johnson. A technique for extracting highly precise photometry for the two-wheeled Kepler mission. CoRR (2014). arXiv:1408.3853.

  98. K.Q. Weinberger, A. Dasgupta, J. Langford, A.J. Smola, and J. Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML) (2009), pp. 1113–1120.

  99. D.P. Woodruff. Personal communication (2014).

  100. D.P. Woodruff and Q. Zhang. Subspace embeddings and $\ell_p$ regression using exponential random variables. In Proceedings of the 26th Conference on Learning Theory (COLT) (2013).

Author information

Corresponding author

Correspondence to Jean Bourgain.

Additional information

J.B. partially supported by NSF Grant DMS-1301619. S.D. partially supported by SFB Grant 1060 of the Deutsche Forschungsgemeinschaft (DFG).

J.N. supported by NSF Grant IIS-1447471 and CAREER Award CCF-1350670, ONR Grant N00014-14-1-0632, and a Google Faculty Research Award. Part of this work was done while supported by NSF Grants CCF-0832797 and DMS-1128155.

About this article

Cite this article

Bourgain, J., Dirksen, S. & Nelson, J. Toward a unified theory of sparse dimensionality reduction in Euclidean space. Geom. Funct. Anal. 25, 1009–1088 (2015). https://doi.org/10.1007/s00039-015-0332-9
