Abstract
Let \({\Phi\in\mathbb{R}^{m\times n}}\) be a sparse Johnson–Lindenstrauss transform (Kane and Nelson in J ACM 61(1):4, 2014) with s non-zeroes per column. For a subset T of the unit sphere and a given \({\varepsilon\in(0,1/2)}\), we study the settings of m and s required to ensure
\({\mathbb{E}\,\sup_{x\in T}\,\bigl|\,\|\Phi x\|_2^2 - 1\,\bigr| < \varepsilon,}\)
i.e. so that \({\Phi}\) preserves the norm of every \({x\in T}\) simultaneously and multiplicatively up to \({1+\varepsilon}\). We introduce a new complexity parameter, which depends on the geometry of T, and show that it suffices to choose s and m such that this parameter is small. Our result is a sparse analog of Gordon’s theorem, which was concerned with a dense \({\Phi}\) having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson–Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results on using the sparse Johnson–Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.
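To make the controlled quantity concrete, here is a small NumPy sketch (illustrative only, not the paper’s analysis): it builds \({\Phi}\) in one common Kane–Nelson style, with each column holding s entries \({\pm 1/\sqrt{s}}\) in distinct uniformly random rows, and measures the empirical distortion \({\sup_{x\in T}|\,\|\Phi x\|_2^2-1\,|}\) over a small random test set T. All variable names and parameter values below are hypothetical.

```python
import numpy as np

def sparse_jl(n, m, s, rng):
    """Sparse JL matrix: each column has s nonzero entries +-1/sqrt(s),
    placed in s distinct rows chosen uniformly at random (one common
    variant of the Kane--Nelson construction)."""
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)
        signs = rng.choice([-1.0, 1.0], size=s)
        Phi[rows, j] = signs / np.sqrt(s)
    return Phi

rng = np.random.default_rng(0)
n, m, s = 1000, 200, 8          # ambient dim, target dim, column sparsity
Phi = sparse_jl(n, m, s, rng)

# T: a finite stand-in for a subset of the unit sphere
T = rng.standard_normal((50, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)

# empirical distortion: sup over x in T of | ||Phi x||_2^2 - 1 |
distortion = np.max(np.abs(np.linalg.norm(T @ Phi.T, axis=1) ** 2 - 1))
print(distortion)
```

For a random finite T of this size, the printed distortion is typically well below 1, consistent with the multiplicative \({1+\varepsilon}\) guarantee one aims for.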
References
H. Avron, C. Boutsidis, S. Toledo, and A. Zouzias. Efficient dimensionality reduction for canonical correlation analysis. In Proceedings of the 30th International Conference on Machine Learning (ICML), (2013).
Ailon N., Chazelle B.: The Fast Johnson–Lindenstrauss Transform and approximate nearest neighbors. SIAM Journal on Computing, 39(1), 302–322 (2009)
Achlioptas D.: Database-friendly random projections: Johnson–Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4), 671–687 (2003)
U. Ayaz, S. Dirksen, and H. Rauhut. Uniform recovery of fusion frame structured sparse signals. CoRR (2014). arXiv:1407.7680.
S. Arora, E. Hazan, and S. Kale. A fast random sampling algorithm for sparsifying matrices. In APPROX-RANDOM (2006), pp. 272–279.
Ailon N., Liberty E.: Fast dimension reduction using Rademacher series on dual BCH codes. Discrete & Computational Geometry, 42(4), 615–630 (2009)
N. Ailon and E. Liberty. An almost optimal unrestricted fast Johnson–Lindenstrauss transform. ACM Transactions on Algorithms, 9(3) (2013), 21.
N. Alon. Problems and results in extremal combinatorics, I. Discrete Mathematics, 273(1–3) (2003), 31–53.
Avron H., Maymounkov P., Toledo S.: Blendenpik: Supercharging LAPACK’s least-squares solver. SIAM Journal on Scientific Computing, 32(3), 1217–1236 (2010)
A. Andoni and H.L. Nguyễn. Eigenvalues of a matrix in the streaming model. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 1729–1737.
M.-F. Balcan and A. Blum. A PAC-style model for learning from labeled and unlabeled data. In Learning Theory, 18th Annual Conference on Learning Theory, COLT 2005, Bertinoro, Italy, June 27–30, 2005, Proceedings (2005), pp. 111–126.
J. Blocki, A. Blum, A. Datta, and O. Sheffet. The Johnson–Lindenstrauss transform itself preserves differential privacy. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2012, New Brunswick, NJ, USA, October 20–23, 2012 (2012), pp. 410–419.
Balcan M.-F., Blum A., Vempala S.: Kernels as features: On kernels, margins, and low-dimensional mappings. Machine Learning, 65(1), 79–94 (2006)
Baraniuk R.G., Cevher V., Duarte M.F., Hegde C.: Model-based compressive sensing. IEEE Transactions on Information Theory, 56, 1982–2001 (2010)
M. Badoiu, J. Chuzhoy, P. Indyk, and A. Sidiropoulos. Low-distortion embeddings of general metrics into the line. In Proceedings of the 37th Annual ACM Symposium on Theory of Computing, Baltimore, MD, USA, May 22–24, 2005 (2005), pp. 225–233.
Blumensath T., Davies M.E.: Iterative hard thresholding for compressed sensing. Journal of Fourier Analysis and Applications, 14, 629–654 (2008)
Bourgain J., Dilworth S., Ford K., Konyagin S., Kutzarova D.: Explicit constructions of RIP matrices and related problems. Duke Mathematical Journal, 159(1), 145–185 (2011)
Berger B.: The fourth moment method. SIAM Journal on Computing, 26(4), 1188–1207 (1997)
Bourgain J., Lindenstrauss J., Milman V.D.: Approximation of zonoids by zonotopes. Acta Mathematica, 162, 73–141 (1989)
V. Braverman, R. Ostrovsky, and Y. Rabani. Rademacher chaos, random Eulerian graphs and the sparse Johnson-Lindenstrauss transform. CoRR (2010). arXiv:1011.2590.
Bourgain J., Pajor A., Szarek S.J., Tomczak-Jaegermann N.: On the duality problem for entropy numbers of operators. Geometric Aspects of Functional Analysis, 1376, 50–163 (1989)
Buhler J., Tompa M.: Finding motifs using random projections. Journal of Computational Biology, 9(2), 225–242 (2002)
P. Bühlmann and S. van de Geer. Statistics for high-dimensional data. Springer, Heidelberg (2011).
Baraniuk R.G., Wakin M.B.: Random projections of smooth manifolds. Foundations of Computational Mathematics, 9(1), 51–77 (2009)
C. Boutsidis, A. Zouzias, M.W. Mahoney, and P. Drineas. Stochastic dimensionality reduction for k-means clustering. CoRR (2011). arXiv:1110.2897.
E. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9–10) (2008), 589–592.
B. Carl. Inequalities of Bernstein–Jackson-type and the degree of compactness of operators in Banach spaces. Annales de l’Institut Fourier (Grenoble), 35(3) (1985), 79–118.
Charikar M., Chen K., Farach-Colton M.: Finding frequent items in data streams. Theoretical Computer Science, 312(1), 3–15 (2004)
K.L. Clarkson, P. Drineas, M. Magdon-Ismail, M.W. Mahoney, X. Meng, and D.P. Woodruff. The Fast Cauchy Transform and faster robust linear regression. In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013), pp. 466–477.
M. Cohen, S. Elder, C. Musco, C. Musco, and M. Persu. Dimensionality reduction for k-means clustering and low rank approximation. CoRR (2014). arXiv:1410.6801.
Cheraghchi M., Guruswami V., Velingker A.: Restricted isometry of Fourier matrices and list decodability of random linear codes. SIAM Journal on Computing, 42(5), 1888–1914 (2013)
K.L. Clarkson. Tighter bounds for random projections of manifolds. In Proceedings of the 24th ACM Symposium on Computational Geometry, College Park, MD, USA, June 9–11, 2008, pp. 39–48.
Contreras P., Murtagh F.: Fast, linear time hierarchical clustering using the Baire metric. Journal of Classification, 29(2), 118–143 (2012)
M.B. Cohen. Personal communication, (2014).
J.W. Cooley and J.M. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19 (1965), 297–301.
Candès E., Tao T.: Decoding by linear programming. IEEE Transactions on Information Theory, 51(12), 4203–4215 (2005)
E.J. Candès and T. Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Transactions on Information Theory, 52 (2006), 5406–5425.
K.L. Clarkson and D.P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31–June 2, 2009 (2009), pp. 205–214.
K.L. Clarkson and D.P. Woodruff. Low rank approximation and regression in input sparsity time. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 81–90.
Donoho D.L., Grimes C.: Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10), 5591–5596 (2003)
Dirksen S.: Tail bounds via generic chaining. Electronic Journal of Probability, 20(53), 1–29 (2015)
S. Dirksen. Dimensionality reduction with subgaussian matrices: a unified theory. CoRR (2014), arXiv:1402.3973.
A. Dasgupta, R. Kumar, and T. Sarlós. A sparse Johnson-Lindenstrauss transform. In Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC) (2010), pp. 341–350.
Drineas P., Magdon-Ismail M., Mahoney M., Woodruff D.: Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13, 3475–3506 (2012)
Donoho D.: Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306 (2006)
Dudley R.M.: The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. Journal of Functional Analysis, 1, 290–330 (1967)
A. Eftekhari and M.B. Wakin. New analysis of manifold embeddings and signal recovery from compressive measurements. CoRR (2013). arXiv:1306.4748.
Fernique X.: Régularité des trajectoires des fonctions aléatoires gaussiennes. Lecture Notes in Mathematics, 480, 1–96 (1975)
Figiel T.: On the moduli of convexity and smoothness. Studia Mathematica, 56(2), 121–155 (1976)
S. Foucart and H. Rauhut. A Mathematical Introduction to Compressive Sensing. Birkhäuser, Boston (2013).
O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Subspaces and orthogonal decompositions generated by bounded orthogonal systems. Positivity, 11(2), 269–283 (2007)
O. Guédon, S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Majorizing measures and proportional subsets of bounded orthonormal systems. Revista Matemática Iberoamericana, 24(3) (2008), 1075–1095.
Y. Gordon. On Milman’s inequality and random subspaces which escape through a mesh in \({\mathbb{R}^n}\). Geometric Aspects of Functional Analysis, (1988), 84–106.
S. Har-Peled, P. Indyk, and R. Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of Computing, 8(1) (2012), 321–350.
J.-B. Hiriart-Urruty and C. Lemaréchal. Fundamentals of convex analysis. Springer-Verlag, Berlin, (2001).
C. Hegde, M. Wakin, and R. Baraniuk. Random projections for manifold learning. In Advances in neural information processing systems (2007), pp. 641–648.
P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proceedings of the 42nd Annual Symposium on Foundations of Computer Science (FOCS) (2001), pp. 10–33.
P. Indyk and I. Razenshteyn. On model-based RIP-1 matrices. In Proceedings of the 40th International Colloquium on Automata, Languages and Programming (ICALP) (2013), pp. 564–575.
W.B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26 (1984), 189–206.
J.-P. Kahane. Some Random Series of Functions. Heath Math. Monographs. Cambridge University Press, Cambridge (1968).
B. Klartag and S. Mendelson. Empirical processes and random projections. Journal of Functional Analysis, 225(1) (2005), 229–245.
F. Krahmer, S. Mendelson, and H. Rauhut. Suprema of chaos processes and the restricted isometry property. Communications on Pure and Applied Mathematics, 67(11) (2014), 1877–1904.
D.M. Kane and J. Nelson. A derandomized sparse Johnson–Lindenstrauss transform. CoRR (2010). arXiv:1006.3585.
D.M. Kane and J. Nelson. Sparser Johnson–Lindenstrauss transforms. Journal of the ACM, 61(1) (2014), 4.
F. Krahmer and R. Ward. New and improved Johnson–Lindenstrauss embeddings via the Restricted Isometry Property. SIAM Journal on Mathematical Analysis, 43(3) (2011), 1269–1281.
G. Kovács, S. Zucker, and T. Mazeh. A box-fitting algorithm in the search for periodic transits. Astronomy and Astrophysics, 391 (2002), 369–377.
Y. Lu, P. Dhillon, D. Foster, and L. Ungar. Faster ridge regression via the subsampled randomized Hadamard transform. In Proceedings of the 26th Annual Conference on Advances in Neural Information Processing Systems (NIPS), (2013).
K.G. Larsen and J. Nelson. The Johnson–Lindenstrauss lemma is optimal for linear dimensionality reduction. Manuscript (2014).
F. Lust-Piquard. Inégalités de Khintchine dans \({C_p\ (1 < p < \infty)}\). Comptes Rendus de l’Académie des Sciences Paris, 303(7) (1986), 289–292.
F. Lust-Piquard and G. Pisier. Noncommutative Khintchine and Paley inequalities. Arkiv för Matematik, 29(2) (1991), 241–260.
Y.T. Lee and A. Sidford. Matching the universal barrier without paying the costs: solving linear programs with \({\tilde{O}(\sqrt{\mathrm{rank}})}\) linear system solves. CoRR (2013). arXiv:1312.6677.
J. Matousek. On variants of the Johnson–Lindenstrauss lemma. Random Structures & Algorithms, 33(2) (2008), 142–156.
X. Meng and M.W. Mahoney. Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 91–100.
S. Mendelson, A. Pajor, and N. Tomczak-Jaegermann. Reconstruction and subgaussian operators in asymptotic geometric analysis. Geometric and Functional Analysis, 17(4) (2007), 1248–1282.
R. Meise and D. Vogt. Introduction to Functional Analysis. Oxford Graduate Texts in Mathematics (Book 2). Oxford University Press, Oxford (1997).
J. Nelson and H.L. Nguyễn. OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2013), pp. 117–126.
J. Nelson and H.L. Nguyễn. Sparsity lower bounds for dimensionality-reducing maps. In Proceedings of the 45th ACM Symposium on Theory of Computing (STOC) (2013), pp. 101–110.
J. Nelson, E. Price, and M. Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2014).
D. Needell and J.A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26 (2009), 301–332.
S. Paul, C. Boutsidis, M. Magdon-Ismail, and P. Drineas. Random projections for support vector machines. In Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS) (2013), pp. 498–506.
A. Pietsch. Theorie der Operatorenideale (Zusammenfassung). Friedrich-Schiller-Universität Jena (1972).
G. Pisier. Probabilistic methods in the geometry of Banach spaces. Probability and Analysis, Lecture Notes in Mathematics, 1206 (1986), 167–241.
G. Pisier. Remarques sur un résultat non publié de B. Maurey. Séminaire d’analyse fonctionnelle, Exp. V (1980/81).
A. Pajor and N. Tomczak-Jaegermann. Subspaces of small codimension of finite dimensional Banach spaces. Proceedings of the American Mathematical Society, 97 (1986), 637–642.
M. Pilanci and M. Wainwright. Randomized sketches of convex programs with sharp guarantees. arXiv (2014). arXiv:1404.7203.
R. Rockafellar. Convex Analysis. Princeton University Press, Princeton (1970).
S.T. Roweis and L.K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290 (2000), 2323–2326.
M. Rudelson and R. Vershynin. Sampling from large matrices: An approach through geometric functional analysis. Journal of the ACM, 54(4) (2007).
M. Rudelson and R. Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Communications on Pure and Applied Mathematics, 61(8) (2008), 1025–1045.
T. Sarlós. Improved approximation algorithms for large matrices via random projections. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS) (2006), pp. 143–152.
D.A. Spielman and N. Srivastava. Graph sparsification by effective resistances. SIAM Journal on Computing, 40(6) (2011), 1913–1926.
M. Talagrand. The Generic Chaining: Upper and Lower Bounds of Stochastic Processes. Springer-Verlag, Berlin (2005).
J.B. Tenenbaum, V. de Silva, and J.C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500) (2000), 2319–2323.
R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1) (1996), 267–288.
J.A. Tropp. The random paving property for uniformly bounded matrices. Studia Mathematica, 185 (2008), 67–82.
J.A. Tropp. Improved analysis of the subsampled randomized Hadamard transform. Advances in Adaptive Data Analysis, 3(1–2) (2011), 115–126.
A. Vanderburg and J.A. Johnson. A technique for extracting highly precise photometry for the two-wheeled Kepler mission. CoRR (2014). arXiv:1408.3853.
K.Q. Weinberger, A. Dasgupta, J. Langford, A.J. Smola, and J. Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML) (2009), pp. 1113–1120.
D.P. Woodruff. Personal communication (2014).
D.P. Woodruff and Q. Zhang. Subspace embeddings and \({\ell_p}\) regression using exponential random variables. In Proceedings of the 26th Conference on Learning Theory (COLT) (2013).
Additional information
J. B. partially supported by NSF Grant DMS-1301619. S. D. partially supported by SFB Grant 1060 of the Deutsche Forschungsgemeinschaft (DFG).
J.N. supported by NSF Grant IIS-1447471 and CAREER Award CCF-1350670, ONR Grant N00014-14-1-0632, and a Google Faculty Research Award. Part of this work done while supported by NSF Grants CCF-0832797 and DMS-1128155.
Cite this article
Bourgain, J., Dirksen, S. & Nelson, J. Toward a unified theory of sparse dimensionality reduction in Euclidean space. Geom. Funct. Anal. 25, 1009–1088 (2015). https://doi.org/10.1007/s00039-015-0332-9