Abstract
In this paper, we generalize Nesterov’s well-known accelerated gradient (AG) method, originally designed for convex smooth optimization, to solve nonconvex and possibly stochastic optimization problems. We demonstrate that by properly specifying the stepsize policy, the AG method exhibits the best known rate of convergence for solving general nonconvex smooth optimization problems by using first-order information, similarly to the gradient descent method. We then consider an important class of composite optimization problems and show that the AG method can solve them uniformly, i.e., by using the same aggressive stepsize policy as in the convex case, even if the problem turns out to be nonconvex. We demonstrate that the AG method exhibits an optimal rate of convergence if the composite problem is convex, and improves the best known rate of convergence if the problem is nonconvex. Based on the AG method, we also present new nonconvex stochastic approximation methods and show that they can improve a few existing rates of convergence for nonconvex stochastic optimization. To the best of our knowledge, this is the first time that the convergence of the AG method has been established for solving nonconvex nonlinear programming in the literature.
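The AG method referred to in the abstract is commonly written as a three-sequence scheme (an aggregate point, a "middle" point at which the gradient is taken, and an accelerated point). The sketch below is an illustrative implementation under that standard form; the particular stepsize choices shown (α_k = 2/(k+1), β_k = 1/(2L), λ_k = kβ_k/2) are one policy of the convex-case type discussed in the abstract, not necessarily the exact settings analyzed in the paper.

```python
import numpy as np

def ag_method(grad, x0, L, num_iters=100):
    """Sketch of a three-sequence accelerated gradient scheme.

    grad      : callable returning the gradient of the smooth objective
    x0        : starting point (numpy array)
    L         : Lipschitz constant of the gradient
    num_iters : number of iterations
    Returns the accelerated iterate x_ag after num_iters steps.
    """
    x = x0.astype(float).copy()      # aggregate sequence {x_k}
    x_ag = x0.astype(float).copy()   # accelerated sequence {x_k^ag}
    beta = 1.0 / (2.0 * L)           # illustrative choice beta_k = 1/(2L)
    for k in range(1, num_iters + 1):
        alpha = 2.0 / (k + 1)        # alpha_k = 2/(k+1)
        lam = k * beta / 2.0         # illustrative choice lambda_k = k*beta_k/2
        x_md = (1.0 - alpha) * x_ag + alpha * x   # middle point x_k^md
        g = grad(x_md)               # single gradient evaluation per iteration
        x = x - lam * g              # aggregate (long) gradient step
        x_ag = x_md - beta * g       # accelerated (short) gradient step
    return x_ag
```

On a smooth convex test problem such as a quadratic, these choices satisfy the usual compatibility conditions (α_k λ_k ≤ β_k and Lβ_k ≤ 1), so the accelerated iterate converges; only one gradient evaluation per iteration is needed, which is the feature that makes the method attractive in the stochastic setting as well.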
Notes
A function \(\varPsi (\cdot )\) satisfying (1.2) is denoted by \(\varPsi \in {\mathcal {C}}_{L_\varPsi }^{1,1}(\mathbb {R}^n)\).
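Condition (1.2) is not reproduced in this excerpt; in the standard notation for the class \({\mathcal {C}}_{L_\varPsi }^{1,1}(\mathbb {R}^n)\), it states that \(\varPsi\) is continuously differentiable with Lipschitz continuous gradient, i.e.,

\[
\|\nabla \varPsi (x) - \nabla \varPsi (y)\| \le L_\varPsi \|x - y\|, \quad \forall \, x, y \in \mathbb {R}^n.
\]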
Additional information
This research was partially supported by NSF Grants CMMI-1000347, CMMI-1254446, DMS-1319050, and ONR Grant N00014-13-1-0036.
Cite this article
Ghadimi, S., Lan, G. Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Program. 156, 59–99 (2016). https://doi.org/10.1007/s10107-015-0871-8