Skip to main content
Log in

Correlations of random classifiers on large data sets

  • Focus
  • Published:
Soft Computing Aims and scope Submit manuscript

This article has been updated

Abstract

Classification of large data sets by feedforward neural networks is investigated. To deal with unmanageably large sets of classification tasks, a probabilistic model of their relevance is considered. Optimization of networks computing randomly chosen classifiers is studied in terms of correlations of classifiers with network input–output functions. Effects of increasing sizes of sets of data to be classified are analyzed using geometrical properties of high-dimensional spaces. Their consequences on concentrations of values of sufficiently smooth functions of random variables around their mean values are applied. It is shown that the critical factor for suitability of a class of networks for computing randomly chosen classifiers is the maximum of sizes of the mean values of their correlations with network input–output functions. To include cases in which function values are not independent, the method of bounded differences is exploited.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Similar content being viewed by others

Change history

  • 19 July 2021

    Communicated by removal

References

  • Azuma K (1967) Weighted sums of certain dependent random variables. Tohoku Math J 19:357–367

    Article  MathSciNet  Google Scholar 

  • Barron AR (1993) Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans Inf Theory 39:930–945

    Article  MathSciNet  Google Scholar 

  • Bellman R (1957) Dynamic Programming. Princeton University Press, Princeton

    MATH  Google Scholar 

  • Bengio Y, Delalleau O, Roux NL (2006) The curse of highly variable functions for local kernel machines. Adv Neural Inf Process Syst 18:107–114

    Google Scholar 

  • Chung F, Lui L (2005) Concentration inequalities and martingale inequalities: a survey. Internet Math 3:79–127

    Article  MathSciNet  Google Scholar 

  • Dubhashi D, Panconesi A (2009) Concentration of measure for the analysis of randomized algorithms. Cambridge University Press, Cambridge

    Book  Google Scholar 

  • Gallicchio C, Scardapane S (2020) Deep randomized neural networks. In: Oneto L, Navarin N, Sperduti A, Anguita D (eds) Recent trends in learning from data. Studies in computational intelligence, vol 896. Springer, Switzeland, pp 44–68

  • Gnecco G, Kůrková V, Sanguineti M (2011) Can dictionary-based computational models outperform the best linear ones? Neural Netw 24:881–887

    Article  Google Scholar 

  • Gnecco G, Kůrková V, Sanguineti M (2011) Some comparisons of complexity in dictionary-based and linear computational models. Neural Netw 24:171–182

    Article  Google Scholar 

  • Gnecco G, Sanguineti M (2008) Approximation error bounds via Rademacher’s complexity. Appl Math Sci 2(4):153–176

    MathSciNet  MATH  Google Scholar 

  • Gnecco G, Sanguineti M (2011) On a variational norm tailored to variable-basis approximation schemes. IEEE Trans Inf Theory 57:549–558

    Article  MathSciNet  Google Scholar 

  • Gorban A, Tyukin I, Prokhorov D, Sofeikov K (2016) Approximation with random bases: Pro et contra. Inf Sci 364–365:129–145

    Article  Google Scholar 

  • Gorban AN, Makarov VA, Tyukin IY (2019) The unreasonable effectiveness of small neural ensembles in high-dimensional brain. Phys Life Rev 29:55–88

    Article  Google Scholar 

  • Ito Y (1992) Finite mapping by neural networks and truth functions. Math Sci 17:69–77

    MathSciNet  MATH  Google Scholar 

  • Kainen P, Kůrková V, Sanguineti M (2003) Minimization of error functionals over variable-basis functions. SIAM J Optim 14:732–742

    Article  MathSciNet  Google Scholar 

  • Kainen PC, Kůrková V (1993) Quasiorthogonal dimension of Euclidean spaces. Appl Math Lett 6(3):7–10

    Article  MathSciNet  Google Scholar 

  • Kainen PC, Kůrková V (2020) Quasiorthogonal dimension. In: Beyond traditional probabilistic data processing techniques: interval, fuzzy, etc. Methods and their applications. Studies in computational intelligence, vol 835, pp 615–629. Springer

  • Kainen PC, Kůrková V, Sanguineti M (2012) Dependence of computational models on input dimension: tractability of approximation and optimization tasks. IEEE Trans Inf Theory 58:1203–1214

    Article  MathSciNet  Google Scholar 

  • Kůrková V (2012) Complexity estimates based on integral transforms induced by computational units. Neural Netw 33:160–167

    Article  Google Scholar 

  • Kůrková V, Sanguineti M (2016) Model complexities of shallow networks representing highly varying functions. Neurocomputing 171:598–604

    Article  Google Scholar 

  • Kůrková V, Sanguineti M (2017) Probabilistic lower bounds for approximation by shallow perceptron networks. Neural Netw 91:34–41

    Article  Google Scholar 

  • Kůrková V, Sanguineti M (2019) Classification by sparse neural networks. IEEE Trans Neural Netw Learning Syst 30(9):2746–2754

    Article  MathSciNet  Google Scholar 

  • Matoušek J (2002) Lectures on discrete geometry. Springer, New York

    Book  Google Scholar 

  • Mhaskar HN (2004) On the tractability of multivariate integration and approximation by neural networks. J Complex 20:561–590

    Article  MathSciNet  Google Scholar 

  • Tropp A (2004) Greed is good: algorithmic results for sparse approximation. IEEE Trans Inf Theory 50:2231–2242

    Article  MathSciNet  Google Scholar 

Download references

Funding

This study was funded by the Czech Science Foundation grant GA 19-05704S and by institutional support of the Institute of Computer Science RVO 67985807 (V. Kůrková) and by the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 824160 (M. Sanguineti). M. Sanguineti is Research Associate at the Institute for Marine Engineering of National Research Council of Italy under project PDGP 2018/20 DIT.AD016.001, Affiliated Researcher at IIT—Italian Institute of Technology (Advanced Robotics Research Line), Genova, and Visiting Professor at IMT—School for Advances Studies (AXES Research Unit), Lucca.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Věra Kůrková.

Ethics declarations

Conflict of Interest

Author Věra Kůrková declares that she has no conflict of interest. Author Marcello Sanguineti declares that he has no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

V.K. was partially supported by the Czech Science Foundation grant GA 19-05704S and by institutional support of the Institute of Computer Science RVO 67985807. M.S. was partially supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 824160 and by the Project PDGP 2018/20 DIT.AD016.001 “Technologies for Smart Communities” of INM (Institute for Marine Engineering) of CNR (National Research Council of Italy), where he is Research Associate. M.S. is a member of GNAMPA-INdAM (Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni—Instituto Nazionale di Alta Matematica), Affiliated Researcher at IIT—Italian Institute of Technology (Advanced Robotics Research Line), Genova, and Visiting Professor at IMT—School for Advances Studies (AXES Research Unit), Lucca.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kůrková, V., Sanguineti, M. Correlations of random classifiers on large data sets. Soft Comput 25, 12641–12648 (2021). https://doi.org/10.1007/s00500-021-05938-4

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-021-05938-4

Keywords

Navigation