Abstract
We present new tools from probability theory that can be applied to the analysis of learning algorithms. These tools make it possible to derive new bounds on the generalization performance of learning algorithms and to propose alternative measures of the complexity of the learning task, which in turn can be used to derive new learning algorithms.
Cite this article
Bousquet, O. New approaches to statistical learning theory. Ann Inst Stat Math 55, 371–389 (2003). https://doi.org/10.1007/BF02530506