An accelerated gradient method for trace norm minimization

ABSTRACT
We consider the minimization of a smooth loss function regularized by the trace norm of the matrix variable. Such a formulation finds applications in many machine learning tasks, including multi-task learning, matrix classification, and matrix completion. The standard semidefinite programming formulation for this problem is computationally expensive. In addition, due to the non-smooth nature of the trace norm, the optimal first-order black-box method for solving this class of problems converges as O(1/√k), where k is the iteration counter. In this paper, we exploit the special structure of the trace norm, based on which we propose an extended gradient algorithm that converges as O(1/k). We further propose an accelerated gradient algorithm, which achieves the optimal convergence rate of O(1/k²) for smooth problems. Experiments on multi-task learning problems demonstrate the efficiency of the proposed algorithms.
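The special structure being exploited is that, unlike a generic non-smooth term, the trace norm has a closed-form proximal operator given by singular value thresholding; pairing that proximal step with Nesterov-style extrapolation yields the O(1/k²) rate. The sketch below illustrates such an accelerated proximal-gradient scheme, assuming a squared loss f(W) = ½‖XW − Y‖²_F, step size 1/L with L the gradient's Lipschitz constant, and illustrative names (svt, accelerated_trace_norm, X, Y, lam) that are not the paper's notation; it is a minimal sketch, not the authors' exact algorithm.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s = np.maximum(s - tau, 0.0)            # shrink each singular value by tau
    return (U * s) @ Vt                     # rebuild with thresholded spectrum

def accelerated_trace_norm(X, Y, lam, n_iter=200):
    """Accelerated proximal gradient for 0.5*||XW - Y||_F^2 + lam*||W||_*."""
    W = Z = np.zeros((X.shape[1], Y.shape[1]))
    L = np.linalg.norm(X, 2) ** 2           # Lipschitz constant of the gradient
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ Z - Y)            # gradient of the smooth loss at Z
        W_new = svt(Z - grad / L, lam / L)  # proximal (gradient) step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
        Z = W_new + ((t - 1.0) / t_new) * (W_new - W)  # Nesterov extrapolation
        W, t = W_new, t_new
    return W
```

Each iteration costs one SVD of a d×T matrix plus matrix products, which is the step that replaces the far more expensive semidefinite programming solve mentioned in the abstract.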