Abstract
The set of all neural networks of a fixed architecture forms a geometrical manifold in which the modifiable connection weights play the role of coordinates. To understand the information-processing capability of neural networks, it is important to study all such networks as a whole rather than the behavior of each individual network. What is the natural geometry to introduce on the manifold of neural networks? Information geometry gives an answer, providing a Riemannian metric and a dual pair of affine connections. This chapter gives an overview of the information geometry of neural networks.
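As a rough illustration of weights acting as coordinates with a Riemannian metric, the sketch below computes the Fisher information matrix for a single stochastic binary neuron. The model is an assumption chosen for illustration and is not taken from the chapter: the neuron outputs y = 1 with probability sigmoid(w · x), and inputs are drawn uniformly from a small finite set. The names `fisher_metric` and `squared_length` are hypothetical.

```python
import math


def sigmoid(u):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-u))


def fisher_metric(w, inputs):
    """Fisher information matrix at weight vector w.

    Assumed model (illustrative only): p(y=1 | x; w) = sigmoid(w . x),
    with x drawn uniformly from `inputs`. The score with respect to w is
    (y - s) * x, where s = sigmoid(w . x), so the metric is the average
    of s * (1 - s) * x x^T over the inputs.
    """
    n = len(w)
    g = [[0.0] * n for _ in range(n)]
    for x in inputs:
        s = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        var = s * (1.0 - s)  # variance of the Bernoulli output
        for i in range(n):
            for j in range(n):
                g[i][j] += var * x[i] * x[j] / len(inputs)
    return g


def squared_length(g, dw):
    """Squared length of a small weight change dw under the metric g."""
    n = len(dw)
    return sum(dw[i] * g[i][j] * dw[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    inputs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.0, 0.0)]
    g = fisher_metric([0.0, 0.0], inputs)
    print(g)
    print(squared_length(g, [0.1, -0.1]))
```

Under this metric, the length of a weight change depends on how much it alters the network's output distribution, not on its Euclidean size in weight space. The dual affine connections mentioned in the abstract are not covered by this sketch.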
Copyright information
© 1997 Springer Science+Business Media New York
About this chapter
Cite this chapter
Amari, Si. (1997). Information Geometry of Neural Networks — An Overview —. In: Ellacott, S.W., Mason, J.C., Anderson, I.J. (eds) Mathematics of Neural Networks. Operations Research/Computer Science Interfaces Series, vol 8. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-6099-9_2
DOI: https://doi.org/10.1007/978-1-4615-6099-9_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7794-8
Online ISBN: 978-1-4615-6099-9
eBook Packages: Springer Book Archive