Abstract
The problem of tolerant data fitting by a nonlinear surface, induced by a kernel-based support vector machine, is formulated as a linear program with fewer variables than other linear programming formulations. A generalization of the linear programming chunking algorithm for arbitrary kernels is implemented for solving problems with very large datasets, wherein chunking is performed on both data points and problem variables. The proposed approach tolerates a small, parametrically adjustable error while fitting the given data. This leads to improved fitting of noisy data over ordinary least-error solutions, as demonstrated computationally. Comparative numerical results indicate an average time reduction as high as 26.0% over other formulations, with a maximal time reduction of 79.7%. Additionally, linear programs with as many as 16,000 data points and more than a billion nonzero matrix elements are solved.
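The core idea of tolerant fitting can be illustrated with a small self-contained sketch: an epsilon-insensitive kernel regression posed as a linear program, where residuals within a tolerance eps incur no cost and larger deviations are penalized linearly. This is a minimal illustration under stated assumptions, not the paper's actual formulation or its chunking algorithm; the function name `fit_tolerant_lp`, the Gaussian kernel choice, the use of `scipy.optimize.linprog`, and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def gaussian_kernel(A, B, gamma):
    # Pairwise squared distances, then Gaussian (RBF) kernel values.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

def fit_tolerant_lp(X, y, eps=0.1, C=10.0, gamma=5.0):
    """Sketch of an epsilon-insensitive kernel regression LP:
         min  sum|alpha| + C * sum(xi)
         s.t. |K alpha + b - y| <= eps + xi,  xi >= 0,
       with each signed variable split into nonnegative parts,
       giving the LP variable vector [a+, a-, b+, b-, xi]."""
    m = X.shape[0]
    K = gaussian_kernel(X, X, gamma)
    I, e = np.eye(m), np.ones((m, 1))
    # Two one-sided inequality constraints per data point.
    A_ub = np.vstack([
        np.hstack([ K, -K,  e, -e, -I]),   #  (K a + b - y) <= eps + xi
        np.hstack([-K,  K, -e,  e, -I]),   # -(K a + b - y) <= eps + xi
    ])
    b_ub = np.concatenate([y + eps, -y + eps])
    # Objective: 1-norm of alpha plus C times the slack total; b is free.
    c = np.concatenate([np.ones(2 * m), [0.0, 0.0], C * np.ones(m)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (3 * m + 2), method="highs")
    alpha = res.x[:m] - res.x[m:2 * m]
    b = res.x[2 * m] - res.x[2 * m + 1]
    return alpha, b, K

# Usage: fit a noisy-free sine within tolerance eps.
X = np.linspace(-1, 1, 15).reshape(-1, 1)
y = np.sin(np.pi * X).ravel()
alpha, b, K = fit_tolerant_lp(X, y)
pred = K @ alpha + b
```

Because deviations up to eps are free, the LP does not chase every data point exactly; with noisy data this is precisely the tolerance that the abstract credits with improved fits over ordinary least-error solutions. The paper's contribution is a formulation with fewer variables than LPs of this general shape, plus row-and-column chunking to handle very large kernel matrices.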
Mangasarian, O., Musicant, D.R. Large Scale Kernel Regression via Linear Programming. Machine Learning 46, 255–269 (2002). https://doi.org/10.1023/A:1012422931930