Abstract
We briefly describe the main ideas of statistical learning theory, support vector machines, and kernel feature spaces.
The present article is based on Microsoft TR-2000-23, Redmond, WA, and on Schölkopf and Smola (2001).
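The central idea behind kernel feature spaces — that a kernel evaluates an inner product in a feature space without ever computing the feature map explicitly — can be illustrated with a minimal, self-contained sketch. This is not code from the chapter; it is an illustration in plain Python, using the homogeneous polynomial kernel of degree 2 in two dimensions, one of the few cases where the feature map is known in closed form:

```python
import math

# The "kernel trick": a kernel k(x, y) computes an inner product
# <phi(x), phi(y)> in a feature space without forming phi explicitly.
# For the degree-2 polynomial kernel on 2-D inputs, the feature map is
#     phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2).

def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, evaluated directly in input space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit degree-2 feature map for 2-D inputs."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 0.5)
# Both routes give the same value, but the kernel never builds phi.
assert math.isclose(poly2_kernel(x, y), dot(phi(x), phi(y)))
```

For kernels such as the Gaussian, the corresponding feature space is infinite-dimensional, so the kernel evaluation is not merely cheaper but the only practical route.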
References
Aizerman, M. A., Braverman, É. M., and Rozonoér, L. I. (1964). Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control 25: 821–837.
Alon, N., Ben-David, S., Cesa-Bianchi, N., and Haussler, D. (1997). Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44 (4): 615–631.
Aronszajn, N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society 68: 337–404.
Bartlett, P. L., and Shawe-Taylor, J. (1999). Generalization performance of support vector machines and other pattern classifiers. In Schölkopf, B., Burges, C. J. C., and Smola, A. J., eds., Advances in Kernel Methods — Support Vector Learning, 43–54. Cambridge, MA: MIT Press.
Berg, C., Christensen, J. P. R., and Ressel, P. (1984). Harmonic Analysis on Semigroups. New York: Springer-Verlag.
Bertsekas, D. P. (1995). Nonlinear Programming. Belmont, MA: Athena Scientific.
Blanz, V., Schölkopf, B., Bülthoff, H., Burges, C., Vapnik, V., and Vetter, T. (1996). Comparison of view-based object recognition algorithms using realistic 3D models. In von der Malsburg, C., von Seelen, W., Vorbrüggen, J. C., and Sendhoff, B., eds., Artificial Neural Networks — ICANN’96, 251–256. Berlin: Springer Lecture Notes in Computer Science, Vol. 1112.
Boser, B. E., Guyon, I. M., and Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. In Haussler, D., ed., Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, 144–152. Pittsburgh, PA: ACM Press.
Burges, C. J. C., and Schölkopf, B. (1997). Improving the accuracy and speed of support vector learning machines. In Mozer, M., Jordan, M., and Petsche, T., eds., Advances in Neural Information Processing Systems 9, 375–381. Cambridge, MA: MIT Press.
Cortes, C., and Vapnik, V. (1995). Support vector networks. Machine Learning 20: 273–297.
DeCoste, D., and Schölkopf, B. (2001). Training invariant support vector machines. Machine Learning. Accepted for publication. Also: Technical Report JPL-MLTR-00–1, Jet Propulsion Laboratory, Pasadena, CA, 2000.
Girosi, F., Jones, M., and Poggio, T. (1995). Regularization theory and neural networks architectures. Neural Computation 7 (2): 219–269.
Haussler, D. (1999). Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, Computer Science Department, University of California at Santa Cruz.
Mercer, J. (1909). Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society, London A 209: 415–446.
Osuna, E., Freund, R., and Girosi, F. (1997). An improved training algorithm for support vector machines. In Principe, J., Giles, L., Morgan, N., and Wilson, E., eds., Neural Networks for Signal Processing VII — Proceedings of the 1997 IEEE Workshop, 276–285. New York: IEEE.
Platt, J. (1999). Fast training of support vector machines using sequential minimal optimization. In Schölkopf, B., Burges, C. J. C., and Smola, A. J., eds., Advances in Kernel Methods — Support Vector Learning, 185–208. Cambridge, MA: MIT Press.
Poggio, T. (1975). On optimal nonlinear associative recall. Biological Cybernetics 19: 201–209.
Schölkopf, B., and Smola, A. J. (2001). Learning with Kernels. Cambridge, MA: MIT Press. Forthcoming.
Schölkopf, B., Burges, C., and Vapnik, V. (1995). Extracting support data for a given task. In Fayyad, U. M., and Uthurusamy, R., eds., Proceedings, First International Conference on Knowledge Discovery Data Mining. Menlo Park: AAAI Press.
Schölkopf, B., Smola, A., and Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 10: 1299–1319.
Schölkopf, B., Burges, C. J. C., and Smola, A. J. (1999). Advances in Kernel Methods — Support Vector Learning. Cambridge, MA: MIT Press.
Schölkopf, B., Smola, A., Williamson, R. C., and Bartlett, P. L. (2000). New support vector algorithms. Neural Computation 12: 1207–1245.
Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A. J., and Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Neural Computation. To appear.
Schölkopf, B. (1997). Support Vector Learning. München: R. Oldenbourg Verlag. Doktorarbeit, TU Berlin. Download: http://www.kernel-machines.org.
Schölkopf, B. (2000). The kernel trick for distances. TR MSR 2000–51, Microsoft Research, Redmond, WA. Published in: T. K. Leen, T. G. Dietterich and V. Tresp (eds.), Advances in Neural Information Processing Systems 13, MIT Press, 2001.
Smola, A. J., and Schölkopf, B. (1998). On a kernel-based method for pattern recognition, regression, approximation and operator inversion. Algorithmica 22: 211–231.
Smola, A., and Schölkopf, B. (2001). A tutorial on support vector regression. Statistics and Computing. Forthcoming.
Smola, A., Schölkopf, B., and Müller, K.-R. (1998). The connection between regularization operators and support vector kernels. Neural Networks 11: 637–649.
Smola, A. J., Bartlett, P. L., Schölkopf, B., and Schuurmans, D. (2000). Advances in Large Margin Classifiers. Cambridge, MA: MIT Press.
Vapnik, V., and Chervonenkis, A. (1974). Theory of Pattern Recognition [in Russian]. Moscow: Nauka. (German translation: W. Wapnik and A. Tscherwonenkis, Theorie der Zeichenerkennung, Akademie-Verlag, Berlin, 1979).
Vapnik, V., and Lerner, A. (1963). Pattern recognition using generalized portrait method. Automation and Remote Control 24.
Vapnik, V. (1979). Estimation of Dependences Based on Empirical Data [in Russian]. Moscow: Nauka. (English translation: Springer-Verlag, New York, 1982).
Vapnik, V. (1995). The Nature of Statistical Learning Theory. New York: Springer-Verlag.
Vapnik, V. (1998). Statistical Learning Theory. New York: Wiley.
Wahba, G. (1990). Spline Models for Observational Data, volume 59 of CBMS-NSF Regional Conference Series in Applied Mathematics. Philadelphia: SIAM.
Watkins, C. (2000). Dynamic alignment kernels. In Smola, A. J., Bartlett, P. L., Schölkopf, B., and Schuurmans, D., eds., Advances in Large Margin Classifiers, 39–50. Cambridge, MA: MIT Press.
Williamson, R. C., Smola, A. J., and Schölkopf, B. (1998). Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators. Technical Report 19, NeuroCOLT, http://www.neurocolt.com. Accepted for publication in IEEE Transactions on Information Theory.
© 2001 Springer-Verlag Wien
Cite this chapter
Schölkopf, B. (2001). Statistical Learning and Kernel Methods. In: Della Riccia, G., Lenz, HJ., Kruse, R. (eds) Data Fusion and Perception. International Centre for Mechanical Sciences, vol 431. Springer, Vienna. https://doi.org/10.1007/978-3-7091-2580-9_1
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-83683-5
Online ISBN: 978-3-7091-2580-9