Abstract
This paper considers conditions on the activation function of hidden units under which backpropagation can be used for three-layer network learning. A necessary condition for a backpropagation procedure to converge to a global minimum of the cost function is that the set of hidden-layer states be linearly separable. A sufficient condition for this separability is that the vectors formed from the states together with a constant be linearly independent. This paper discusses the conditions under which these vectors are linearly independent.
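The role of the constant can be made concrete: appending a 1 to each hidden-state vector absorbs the output unit's bias, and if the augmented vectors are linearly independent then every dichotomy of the training patterns is realizable by a linear threshold on the hidden layer. The following minimal sketch (ours, not the paper's; it assumes NumPy, and the state values are illustrative) checks this independence with a rank computation.

```python
# Minimal sketch (not from the paper): check whether the hidden-layer
# states, each augmented with a constant 1 (absorbing the output
# unit's bias), are linearly independent.  If they are, any assignment
# of target classes to the I patterns is linearly separable on the
# hidden layer.  The data below is illustrative.
import numpy as np

def augmented_states_independent(H):
    """H: (I, m) array; row p is the hidden-layer state for pattern p.
    Returns True if the rows of [H | 1] are linearly independent."""
    I = H.shape[0]
    A = np.hstack([H, np.ones((I, 1))])  # append the constant component
    return np.linalg.matrix_rank(A) == I

# I = 3 patterns, I - 1 = 2 hidden units
H = np.array([[0.1, 0.9],
              [0.5, 0.5],
              [0.9, 0.3]])
print(augmented_states_independent(H))   # True for these states
```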
It is proved that, given I training patterns, if the (I-1)-th derivative of the hidden units' activation function exists and is nonzero at some point, then there is a linearly separable set of hidden-layer states when the network has I-1 hidden units.
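As a hedged illustration of the theorem's setting (our own example, with arbitrarily chosen inputs and weights), the sketch below takes the activation to be tanh, whose derivatives of every order exist and are nonzero at suitable points, and uses I-1 hidden units with distinct input weights; the augmented state matrix then attains full rank numerically.

```python
# Illustrative check (our example, not the paper's proof): with
# I = 5 one-dimensional training inputs, I - 1 = 4 hidden units using
# tanh (all derivatives exist; the 4th is nonzero at some point), and
# distinct input weights, the augmented state matrix has full rank,
# i.e. the hidden states plus a constant are linearly independent.
import numpy as np

I = 5
x = np.array([0.3, 0.9, 1.7, 2.2, 3.0])   # distinct training inputs
w = np.array([0.25, 0.5, 0.75, 1.0])      # I - 1 distinct weights

H = np.tanh(np.outer(x, w))               # H[p, j] = tanh(w[j] * x[p])
A = np.hstack([H, np.ones((I, 1))])       # append the constant
print(np.linalg.matrix_rank(A))           # 5: full rank for this choice
```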
Two examples of networks with one input unit are considered in order to estimate connection weights for which the sets of hidden-layer states are linearly separable: a network whose hidden-unit activation function is a polynomial of degree I-1, and the case where the connection weights between the hidden units and the input unit are sufficiently small. For both networks with I-1 hidden units it is shown that if all connection weights between the hidden units and the input unit take different values, then separable sets of hidden-layer states exist.
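For the polynomial case the mechanism can be seen directly: writing the activation as g(u) = a_0 + a_1 u + ... + a_{I-1} u^{I-1}, the augmented state matrix factors into a Vandermonde matrix in the inputs times a matrix that is nonsingular whenever the weights are distinct and nonzero and the coefficients a_1, ..., a_{I-1} are nonzero. A short sketch under our own illustrative choices of inputs, weights, and coefficients:

```python
# Sketch of the degree-(I-1) polynomial case (our illustration; the
# coefficients, inputs, and weights are arbitrary choices).  Since
# H[p, j] = g(w[j] * x[p]) = sum_k a[k] * w[j]**k * x[p]**k, the
# augmented matrix factors as A = X @ M with X the Vandermonde matrix
# of the inputs; distinct inputs and distinct nonzero weights (with
# nonzero a[1], ..., a[I-1]) make both factors nonsingular.
import numpy as np

I = 4
x = np.array([-1.0, 0.2, 0.8, 1.5])   # distinct training inputs
w = np.array([0.7, 1.3, 2.1])         # I - 1 distinct weights
a = np.array([0.5, 1.0, -0.3, 0.2])   # g of degree I - 1 = 3

def g(u):
    return sum(ak * u**k for k, ak in enumerate(a))

H = g(np.outer(x, w))                 # hidden states, shape (I, I-1)
A = np.hstack([H, np.ones((I, 1))])   # append the constant
print(np.linalg.matrix_rank(A))       # 4: full rank, hence separable
```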
© 1990 Springer Science+Business Media Dordrecht
Arai, M. (1990). Conditions on Activation Functions of Hidden Units for Learning by Backpropagation. In: International Neural Network Conference. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-0643-3_119