Abstract
Neural networks are statistical models, and learning rules are estimators. In this paper, a theory for measuring generalisation is developed by combining Bayesian decision theory with information geometry. The performance of an estimator is measured by the information divergence between the true distribution and the estimate, averaged over the Bayesian posterior. This unifies the majority of error measures currently in use. The optimal estimators also reveal some intricate interrelationships among information geometry, Banach spaces and sufficient statistics.
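As an illustrative sketch (not taken from the chapter), the posterior-averaged information divergence described above can be estimated by Monte Carlo for a discrete distribution with a Dirichlet posterior. The category counts, the flat prior, and the two candidate estimators below are all hypothetical choices for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def kl_divergence(p, q):
    # Information (KL) divergence D(p || q) between discrete distributions.
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def average_divergence(posterior_samples, estimate):
    # Divergence from the "true" distribution to the estimate,
    # averaged over draws from the Bayesian posterior.
    return float(np.mean([kl_divergence(p, estimate) for p in posterior_samples]))

# Hypothetical data: three categories observed with these counts.
counts = np.array([3.0, 5.0, 2.0])

# Posterior over the unknown distribution under a flat Dirichlet prior.
posterior_samples = rng.dirichlet(counts + 1.0, size=5000)

mle = counts / counts.sum()                        # maximum-likelihood estimate
post_mean = (counts + 1.0) / (counts.sum() + 3.0)  # posterior-mean estimate

print(average_divergence(posterior_samples, mle))
print(average_divergence(posterior_samples, post_mean))
```

Under this particular loss the posterior mean minimises the averaged divergence, so its score comes out below the MLE's: the choice of error measure picks out the optimal estimator, which is the kind of relationship the chapter studies in general.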
References
D. H. Ackley, G. E. Hinton, and T. J. Sejnowski, A learning algorithm for Boltzmann machines, Cog. Sci., Vol. 9 (1985), pp147–169.
S. Amari, Differential-Geometrical Methods in Statistics, Vol. 28 of Springer Lecture Notes in Statistics. Springer-Verlag, New York 1985.
S. Amari, Differential geometrical theory of statistics, In Amari et al. [4], Ch. 2, pp19–94.
S. Amari, O. E. Barndorff-Nielsen, R. E. Kass, S. L. Lauritzen, and C. R. Rao, eds., Differential Geometry in Statistical Inference, Vol. 10 of IMS Lecture Notes Monograph. IMS, Hayward, CA (1987).
S. J. Hanson, J. D. Cowan, and C. L. Giles, eds., Advances in Neural Information Processing Systems, Vol. 5 (1993), San Mateo, CA. Morgan Kaufmann.
R. E. Kass, Canonical parameterization and zero parameter effects curvature, J. Roy. Stat. Soc., B, Vol. 46 (1984), pp86–92.
S. L. Lauritzen, Statistical manifolds, in: Amari et al. [4], Ch. 4, pp163–216.
D. J. C. MacKay, Bayesian Methods for Adaptive Models. PhD thesis, California Institute of Technology, Pasadena, CA (1992)
D. J. C. MacKay, Hyperparameters: Optimise, or integrate out?, Technical report, Cambridge (1993).
R. M. Neal, Bayesian learning via stochastic dynamics, in: Hanson et al. [5], pp475–482.
H. White, Learning in artificial neural networks: A statistical perspective, Neural Computation, Vol. 1(4) (1989), pp425–464.
D. H. Wolpert, On the use of evidence in neural networks, in: Hanson et al. [5], pp539–546.
H. Zhu, Neural Networks and Adaptive Computers: Theory and Methods of Stochastic Adaptive Computations. PhD thesis, Dept. of Stat. & Comp. Math., Liverpool University (1993), ftp://archive.cis.ohio-state.edu/pub/neuroprose/Thesis/zhu.thesis.ps.Z
H. Zhu and R. Rohwer, Bayesian invariant measurements of generalisation for continuous distributions, Technical Report NCRG/4352, Dept. Comp. Sci. & Appl. Math., Aston University (August 1995), ftp://cs.aston.ac.uk/neural/zhuh/continuous.ps.Z.
H. Zhu and R. Rohwer, Bayesian invariant measurements of generalisation for discrete distributions, Technical Report NCRG/4351, Dept. Comp. Sci. & Appl. Math., Aston University (August 1995), ftp://cs.aston.ac.uk/neural/zhuh/discrete.ps.Z.
H. Zhu and R. Rohwer, Information geometric measurements of generalisation, Technical Report NCRG/4350, Dept. Comp. Sci. & Appl. Math., Aston University (August 1995), ftp://cs.aston.ac.uk/neural/zhuh/generalisation.ps.Z.
© 1997 Springer Science+Business Media New York
Zhu, H., Rohwer, R. (1997). Measurements of Generalisation Based on Information Geometry. In: Ellacott, S.W., Mason, J.C., Anderson, I.J. (eds) Mathematics of Neural Networks. Operations Research/Computer Science Interfaces Series, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-6099-9_69
Print ISBN: 978-1-4613-7794-8
Online ISBN: 978-1-4615-6099-9