Approximation bounds for sparse principal component analysis

  • Full Length Paper
  • Series B
Mathematical Programming

Abstract

We produce approximation bounds on a semidefinite programming relaxation for sparse principal component analysis. The sparse maximum eigenvalue problem cannot be efficiently approximated up to a constant approximation ratio, so our bounds depend on the optimum value of the semidefinite relaxation: the higher this value, the better the approximation. In particular, these bounds allow us to control approximation ratios for tractable statistics in hypothesis testing problems where data points are sampled from Gaussian models with a single sparse leading component.

Acknowledgments

AA would like to acknowledge partial support from NSF Grants SES-0835550 (CDI), CMMI-0844795 (CAREER), CMMI-0968842, a starting grant from the European Research Council (project SIPA), a Peek junior faculty fellowship, a Howard B. Wentz Jr. award and a gift from Google. FB would like to acknowledge support from a starting grant from the European Research Council (project SIERRA) and the INRIA associated-team StatWeb. The authors would like to thank Philippe Rigollet and Quentin Berthet for very constructive discussions and introducing them to the detection problem in Sect. 3.

Author information

Corresponding author

Correspondence to Alexandre d’Aspremont.

About this article

Cite this article

d’Aspremont, A., Bach, F. & El Ghaoui, L. Approximation bounds for sparse principal component analysis. Math. Program. 148, 89–110 (2014). https://doi.org/10.1007/s10107-014-0751-7

  • DOI: https://doi.org/10.1007/s10107-014-0751-7
