Statistical Learning and Kernel Methods

Chapter in: Data Fusion and Perception

Part of the book series: International Centre for Mechanical Sciences (CISM, volume 431)

Abstract

We briefly describe the main ideas of statistical learning theory, support vector machines, and kernel feature spaces.

The present article is based on Microsoft TR-2000-23, Redmond, WA, and on Schölkopf and Smola (2001).
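To make the kernel idea summarized above concrete, the following is a minimal sketch (not taken from the chapter) of a dual-form perceptron with a Gaussian (RBF) kernel in plain NumPy: the decision function is a kernel expansion f(x) = Σ_i α_i y_i k(x_i, x), the same form used by support vector machines. The toy XOR data, kernel width, and epoch count are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - z_j||^2)."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def kernel_perceptron(X, y, gamma=1.0, epochs=10):
    """Dual-form perceptron: on a mistake at point i, increment alpha[i].
    The classifier is f(x) = sum_i alpha[i] * y[i] * k(x[i], x)."""
    K = rbf_kernel(X, X, gamma)
    alpha = np.zeros(len(X))
    for _ in range(epochs):
        mistakes = 0
        for i in range(len(X)):
            # y_i * f(x_i) <= 0 means x_i is misclassified (or undecided)
            if y[i] * ((alpha * y) @ K[:, i]) <= 0:
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:  # converged: all training points separated
            break
    return alpha

# XOR: not linearly separable in input space,
# but separable in the RBF feature space.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1., 1., 1., -1.])
alpha = kernel_perceptron(X, y)
pred = np.sign((alpha * y) @ rbf_kernel(X, X))
print(pred)  # -> [-1.  1.  1. -1.], matching y
```

The point of the sketch is that the algorithm never touches the feature space explicitly: only kernel evaluations k(x_i, x_j) appear, which is exactly the "kernel trick" the chapter develops for SVMs.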


References

  • Aizerman, M. A., Braverman, É. M., and Rozonoér, L. I. (1964). Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control 25: 821–837.

  • Alon, N., Ben-David, S., Cesa-Bianchi, N., and Haussler, D. (1997). Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44(4): 615–631.

  • Aronszajn, N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society 68: 337–404.

  • Bartlett, P. L., and Shawe-Taylor, J. (1999). Generalization performance of support vector machines and other pattern classifiers. In Schölkopf, B., Burges, C. J. C., and Smola, A. J., eds., Advances in Kernel Methods — Support Vector Learning, 43–54. Cambridge, MA: MIT Press.

  • Berg, C., Christensen, J. P. R., and Ressel, P. (1984). Harmonic Analysis on Semigroups. New York: Springer-Verlag.

  • Bertsekas, D. P. (1995). Nonlinear Programming. Belmont, MA: Athena Scientific.

  • Blanz, V., Schölkopf, B., Bülthoff, H., Burges, C., Vapnik, V., and Vetter, T. (1996). Comparison of view-based object recognition algorithms using realistic 3D models. In von der Malsburg, C., von Seelen, W., Vorbrüggen, J. C., and Sendhoff, B., eds., Artificial Neural Networks — ICANN'96, 251–256. Berlin: Springer Lecture Notes in Computer Science, Vol. 1112.

  • Boser, B. E., Guyon, I. M., and Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In Haussler, D., ed., Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, 144–152. Pittsburgh, PA: ACM Press.

  • Burges, C. J. C., and Schölkopf, B. (1997). Improving the accuracy and speed of support vector learning machines. In Mozer, M., Jordan, M., and Petsche, T., eds., Advances in Neural Information Processing Systems 9, 375–381. Cambridge, MA: MIT Press.

  • Cortes, C., and Vapnik, V. (1995). Support-vector networks. Machine Learning 20: 273–297.

  • DeCoste, D., and Schölkopf, B. (2001). Training invariant support vector machines. Machine Learning. Accepted for publication. Also: Technical Report JPL-MLTR-00-1, Jet Propulsion Laboratory, Pasadena, CA, 2000.

  • Girosi, F., Jones, M., and Poggio, T. (1995). Regularization theory and neural networks architectures. Neural Computation 7(2): 219–269.

  • Haussler, D. (1999). Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, Computer Science Department, University of California at Santa Cruz.

  • Mercer, J. (1909). Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society, London A 209: 415–446.

  • Osuna, E., Freund, R., and Girosi, F. (1997). An improved training algorithm for support vector machines. In Principe, J., Giles, L., Morgan, N., and Wilson, E., eds., Neural Networks for Signal Processing VII — Proceedings of the 1997 IEEE Workshop, 276–285. New York: IEEE.

  • Platt, J. (1999). Fast training of support vector machines using sequential minimal optimization. In Schölkopf, B., Burges, C. J. C., and Smola, A. J., eds., Advances in Kernel Methods — Support Vector Learning, 185–208. Cambridge, MA: MIT Press.

  • Poggio, T. (1975). On optimal nonlinear associative recall. Biological Cybernetics 19: 201–209.

  • Schölkopf, B., and Smola, A. J. (2001). Learning with Kernels. Cambridge, MA: MIT Press. Forthcoming.

  • Schölkopf, B., Burges, C., and Vapnik, V. (1995). Extracting support data for a given task. In Fayyad, U. M., and Uthurusamy, R., eds., Proceedings, First International Conference on Knowledge Discovery and Data Mining. Menlo Park: AAAI Press.

  • Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 10: 1299–1319.

  • Schölkopf, B., Burges, C. J. C., and Smola, A. J. (1999). Advances in Kernel Methods — Support Vector Learning. Cambridge, MA: MIT Press.

  • Schölkopf, B., Smola, A., Williamson, R. C., and Bartlett, P. L. (2000). New support vector algorithms. Neural Computation 12: 1207–1245.

  • Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A. J., and Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Neural Computation. To appear.

  • Schölkopf, B. (1997). Support Vector Learning. München: R. Oldenbourg Verlag. Doktorarbeit, TU Berlin. Download: http://www.kernel-machines.org.

  • Schölkopf, B. (2000). The kernel trick for distances. TR MSR 2000-51, Microsoft Research, Redmond, WA. Published in: T. K. Leen, T. G. Dietterich and V. Tresp (eds.), Advances in Neural Information Processing Systems 13, MIT Press, 2001.

  • Smola, A. J., and Schölkopf, B. (1998). On a kernel-based method for pattern recognition, regression, approximation and operator inversion. Algorithmica 22: 211–231.

  • Smola, A., and Schölkopf, B. (2001). A tutorial on support vector regression. Statistics and Computing. Forthcoming.

  • Smola, A., Schölkopf, B., and Müller, K.-R. (1998). The connection between regularization operators and support vector kernels. Neural Networks 11: 637–649.

  • Smola, A. J., Bartlett, P. L., Schölkopf, B., and Schuurmans, D. (2000). Advances in Large Margin Classifiers. Cambridge, MA: MIT Press.

  • Vapnik, V., and Chervonenkis, A. (1974). Theory of Pattern Recognition [in Russian]. Moscow: Nauka. (German translation: W. Wapnik and A. Tscherwonenkis, Theorie der Zeichenerkennung, Akademie-Verlag, Berlin, 1979).

  • Vapnik, V., and Lerner, A. (1963). Pattern recognition using generalized portrait method. Automation and Remote Control 24.

  • Vapnik, V. (1979). Estimation of Dependences Based on Empirical Data [in Russian]. Moscow: Nauka. (English translation: Springer-Verlag, New York, 1982).

  • Vapnik, V. (1995). The Nature of Statistical Learning Theory. New York: Springer.

  • Vapnik, V. (1998). Statistical Learning Theory. New York: Wiley.

  • Wahba, G. (1990). Spline Models for Observational Data, volume 59 of CBMS-NSF Regional Conference Series in Applied Mathematics. Philadelphia: SIAM.

  • Watkins, C. (2000). Dynamic alignment kernels. In Smola, A. J., Bartlett, P. L., Schölkopf, B., and Schuurmans, D., eds., Advances in Large Margin Classifiers, 39–50. Cambridge, MA: MIT Press.

  • Williamson, R. C., Smola, A. J., and Schölkopf, B. (1998). Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators. Technical Report 19, NeuroCOLT, http://www.neurocolt.com. Accepted for publication in IEEE Transactions on Information Theory.


Copyright information

© 2001 Springer-Verlag Wien

About this chapter

Cite this chapter

Schölkopf, B. (2001). Statistical Learning and Kernel Methods. In: Della Riccia, G., Lenz, H.-J., Kruse, R. (eds) Data Fusion and Perception. International Centre for Mechanical Sciences, vol 431. Springer, Vienna. https://doi.org/10.1007/978-3-7091-2580-9_1

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-211-83683-5

  • Online ISBN: 978-3-7091-2580-9
