
Exact combinatorial bounds on the probability of overfitting for empirical risk minimization

  • Mathematical Methods in Pattern Recognition
  • Published in: Pattern Recognition and Image Analysis

Abstract

Three general methods for obtaining exact bounds on the probability of overfitting are proposed within statistical learning theory: a method of generating and destroying sets, a recurrent method, and a blockwise method. Six particular cases are considered to illustrate the application of these methods. These are the following model sets of predictors: a pair of predictors, a layer of a Boolean cube, an interval of a Boolean cube, a monotonic chain, a unimodal chain, and a unit neighborhood of the best predictor. For the interval and the unimodal chain, the results of numerical experiments are presented that demonstrate the effects of splitting and similarity on the probability of overfitting.
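The simplest of the model sets above, a pair of predictors, already illustrates the combinatorial setting: the probability of overfitting is taken over all equiprobable splits of a finite sample into training and test parts, with empirical risk minimization choosing the predictor with the fewest training errors. The sketch below computes this probability exactly by enumerating splits; the sample size, error vectors, and the threshold `eps` are illustrative assumptions, not the paper's notation.

```python
import itertools
from fractions import Fraction

# Illustrative sketch (assumed values, not from the paper):
# each predictor is identified with its binary error vector on a fixed
# sample of L objects; v[i] == 1 means the predictor errs on object i.
L = 10      # full sample size
ell = 5     # training subsample size; k = L - ell objects are held out
eps = 0.2   # overfitting threshold

a = (0, 0, 0, 1, 1, 0, 0, 0, 1, 0)
b = (0, 1, 0, 0, 0, 1, 0, 1, 0, 0)

def overfit_probability(vectors, L, ell, eps):
    """Exact probability, over all C(L, ell) equiprobable train/test
    splits, that the empirical-risk-minimizing predictor's test error
    exceeds its training error by more than eps."""
    total = bad = 0
    for train in itertools.combinations(range(L), ell):
        train_set = set(train)
        # ERM: pick the predictor with the fewest errors on the training part
        # (ties broken by list order in this sketch).
        best = min(vectors, key=lambda v: sum(v[i] for i in train_set))
        nu_train = Fraction(sum(best[i] for i in train_set), ell)
        nu_test = Fraction(sum(best[i] for i in range(L)
                               if i not in train_set), L - ell)
        total += 1
        bad += (nu_test - nu_train > eps)
    return Fraction(bad, total)

print(overfit_probability([a, b], L, ell, eps))
```

The same enumeration extends to the other model sets (a layer or interval of the Boolean cube, monotonic and unimodal chains) by changing the list of error vectors, which is how the splitting and similarity effects mentioned above can be observed numerically.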



Author information


Correspondence to K. V. Vorontsov.

Additional information

Konstantin Vorontsov. Born 1971. Graduated from the Faculty of Applied Mathematics and Control, Moscow Institute of Physics and Technology, in 1994. Received his candidate's degree in 1999 and his doctoral degree in 2010. He is currently with the Dorodnicyn Computing Centre, Russian Academy of Sciences. Scientific interests: statistical learning theory, machine learning, data mining, probability theory, and combinatorics. Author of 75 papers.


About this article

Cite this article

Vorontsov, K.V. Exact combinatorial bounds on the probability of overfitting for empirical risk minimization. Pattern Recognit. Image Anal. 20, 269–285 (2010). https://doi.org/10.1134/S105466181003003X

