Dynamic self-organising map
Introduction
Vector quantisation (VQ) refers to modelling a probability density function with a discrete set of prototype vectors (sometimes called the codebook) such that any point drawn from the associated distribution can be associated with a prototype vector. Most VQ algorithms try to match the density of the distribution through the density of their codebook: high-density regions of the distribution tend to receive more prototypes than low-density regions. This generally minimises the loss of information (or distortion), as measured by the mean quadratic error. For completeness, it should be noted that there are also cases where only a partition of the space occupied by the data (regardless of their density) is needed. In such cases, one wants to achieve an a priori regular quantification of the probability density function. For example, in some classification problems one only needs to discriminate data in terms of classes, and thus only to draw frontiers between data regions regardless of their respective densities.
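As a concrete illustration of the notions above, a minimal sketch of nearest-prototype assignment and of the resulting distortion (mean quadratic error) might look as follows; the function names and the use of NumPy are our own choices, not part of the original work:

```python
import numpy as np

def quantise(points, codebook):
    """Assign each point to the index of its nearest prototype (code word)."""
    # distances[j, i] = ||points[j] - codebook[i]||
    distances = np.linalg.norm(points[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(distances, axis=1)

def distortion(points, codebook):
    """Mean quadratic error between points and their nearest prototypes."""
    nearest = codebook[quantise(points, codebook)]
    return np.mean(np.sum((points - nearest) ** 2, axis=1))
```

A VQ algorithm then amounts to choosing the codebook so that `distortion` is small over samples drawn from the distribution.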
Vector quantisation can be achieved using several methods, such as variations of the k-means method [1], the Linde–Buzo–Gray (LBG) algorithm [2], or neural network models such as the self-organising map (SOM) [3], neural gas (NG) [4] and growing neural gas (GNG) [5]. Among these methods, the SOM algorithm is certainly the most famous in the field of computational neurosciences, since it gives a biologically plausible account of the organisation of receptive fields in sensory areas, where adjacent neurons share similar representations. The stability and quality of such self-organisation depend heavily on a decreasing learning rate as well as on a decreasing neighbourhood function. This is quite congruent with the idea of a critical period in the early years of development, during which most sensory or motor properties are acquired and stabilised [6], [7], [8]. However, it fails to explain cortical plasticity, since we know that the cortex is able to re-organise itself in the face of lesions or deficits [9], [10], [11]. The question, then, is to what extent it is possible to have representations that are both stable and dynamic.
Quite obviously, this cannot be achieved with SOM-like algorithms that depend on a time-decreasing learning rate and/or neighbourhood function (SOM, NG, GNG) and, despite the huge literature [12], [13] around self-organising maps and Kohonen-type networks (more than 7000 works listed in [14]), there is surprisingly and comparatively little work dealing with online learning (also referred to as incremental or lifelong learning). Furthermore, most of these works are based on incremental models, that is, networks that create and/or delete nodes as necessary. For example, the modified GNG model [15] is able to follow non-stationary distributions by creating nodes as in a regular GNG and deleting them when their utility parameter becomes too small. Similarly, the evolving self-organising map (ESOM) [16], [17] is based on an incremental network quite similar to GNG that creates nodes dynamically based on the distance of the winner to the data (but the new node is created at the exact data point instead of at the mid-point as in GNG). The self-organising incremental neural network (SOINN) [18] and its enhanced version (ESOINN) [19] are also based on an incremental structure, the first version using a two-layer network and the enhanced version a single-layer network. One noticeable exception is the model proposed in [20], which does not rely on an incremental structure but on the Butterworth decay scheme, which does not decay parameters to zero. The model works in two phases: an initial phase (approximately 10 epochs) establishes a rough global topology thanks to a very large neighbourhood, and a second phase trains the network with a small neighbourhood. Unfortunately, the size of the neighbourhood in the second phase has to be adapted to the expected density of the data.
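For illustration only, the distance-based node-creation scheme used by ESOM-like incremental models could be sketched as below; the threshold, step size and function name are hypothetical choices of ours, not taken from any of the cited papers:

```python
import numpy as np

def esom_style_step(codebook, v, threshold=0.5, epsilon=0.1):
    """One incremental VQ step in the spirit of the ESOM/GNG variants
    discussed above: if the winner is farther than `threshold` from the
    data point v, a new node is created at the exact data point (as in
    ESOM); otherwise the winner simply moves a step toward the data."""
    dists = np.linalg.norm(codebook - v, axis=1)
    s = np.argmin(dists)                      # winner index
    if dists[s] > threshold:
        return np.vstack([codebook, v])       # grow: new node at the data point
    codebook[s] += epsilon * (v - codebook[s])
    return codebook
```

The key point, as the text notes, is that such models obtain plasticity by structural growth and deletion rather than by keeping the learning rule itself time-invariant.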
Without judging the performance of these models, we do not think they give a satisfactory answer to our initial question, and we propose instead to answer it by considering a tight coupling between the environment and representations. If the environment is stable, representations should remain stable; if the environment suddenly changes, representations must dynamically adapt themselves and stabilise again onto the new environment. We thus modified the original SOM algorithm in order to make its learning rule and neighbourhood function independent of time. This results in a tight coupling between the environment and the model that ensures both stability and plasticity. In the next section, we formally describe the dynamic self-organising map in the context of vector quantisation; both neural gas and the self-organising map are also formally described in order to underline the differences between the three algorithms. The following section re-introduces the model from a more behavioural point of view; the main experimental results are presented using either low- or high-dimensional data, with side-by-side comparisons with other algorithms. Results on dynamic distributions are also presented for the dynamic self-organising map in order to illustrate the coupling between the distribution and the model. Finally, we discuss the relevance of such a model in the context of computational neurosciences and embodied cognition.
Definitions
Let us consider a probability density function $f(x)$ on a compact manifold $\Omega$. A vector quantisation (VQ) is a function $\Phi$ from $\Omega$ to a finite subset of $n$ code words $\{w_i \in \mathbb{R}^d\}_{1 \le i \le n}$ that forms the codebook. A cluster is defined as $C_i = \{x \in \Omega \mid \Phi(x) = w_i\}$, and the clusters form a partition of $\Omega$. The distortion of the VQ is measured by the mean quadratic error
$$E = \sum_{i=1}^{n} \int_{C_i} \lVert x - w_i \rVert^2 f(x)\,dx.$$
If the function $f$ is unknown and a finite set $\{x_j\}_{1 \le j \le p}$ of non-biased observations is available, the distortion error may be empirically estimated as
$$\hat{E} = \frac{1}{p} \sum_{j=1}^{p} \lVert x_j - \Phi(x_j) \rVert^2.$$
Model
As we explained in the Introduction, the DSOM algorithm is essentially a variation of the SOM algorithm in which the time dependency has been removed. The regular learning function (2.4) and neighbourhood function (2.5) have been replaced by Eqs. (2.10) and (2.11), respectively, which reflect two main ideas:
- If a neuron is close enough to the data, there is no need for the others to learn anything: the winner can represent the data.
- If no neuron is close enough to the data, any neuron learns the data.
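These two ideas suggest a time-invariant update in which the neighbourhood width scales with the winner's error. The sketch below is our reading of such a rule (parameter names such as `elasticity` are our own; the exact form is the one given by Eqs. (2.10), (2.11) in the text):

```python
import numpy as np

def dsom_update(codebook, grid, v, epsilon=0.1, elasticity=1.0):
    """One time-invariant DSOM-style update for a single sample v.

    codebook : (n, d) prototype vectors w_i
    grid     : (n, k) fixed positions p_i of the units in the map space
    """
    dists = np.linalg.norm(codebook - v, axis=1)   # ||v - w_i||
    s = np.argmin(dists)                           # winner index
    if dists[s] == 0.0:
        return codebook            # winner matches exactly: no one learns
    # Neighbourhood width scales with the winner's error: a close winner
    # shrinks the neighbourhood, a distant winner widens it so that any
    # neuron can learn the data.
    h = np.exp(-np.sum((grid - grid[s]) ** 2, axis=1)
               / (elasticity ** 2 * dists[s] ** 2))
    return codebook + epsilon * dists[:, None] * h[:, None] * (v - codebook)
```

Note that no quantity in this update decays with time: plasticity is governed only by the current mismatch between the winner and the data.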
Experimental results
We report in this section some experimental results obtained on different types of distributions, which aim at illustrating DSOM principles. We do not yet have formal results about the convergence and/or the quality of the codebook. As a consequence, these results do not claim to prove anything and are introduced mainly to illustrate the qualitative behaviour of the algorithm.
Unless stated otherwise, the learning procedure in the following examples is:
1. A distribution is chosen (normal, uniform, etc.).
2.
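The remainder of the procedure (truncated in this snippet) amounts to initialising a codebook and feeding it samples online. As a stand-in, here is a deliberately simplified online competitive-learning loop of our own, which moves only the winner; the full DSOM rule also updates neighbours:

```python
import numpy as np

def train(codebook, samples, epsilon=0.05):
    """Online competitive learning: for each sample, move the nearest
    prototype a step epsilon toward it (simplest VQ update; DSOM adds
    an error-scaled neighbourhood on top of this)."""
    for v in samples:
        s = np.argmin(np.linalg.norm(codebook - v, axis=1))
        codebook[s] += epsilon * (v - codebook[s])
    return codebook

rng = np.random.default_rng(0)
# 1. choose a distribution (here: uniform on the unit square)
samples = rng.random((10000, 2))
# 2. initialise the codebook randomly on the same support, then train
codebook = train(rng.random((16, 2)), samples)
```

Because the updates are convex combinations of prototypes and samples, the codebook remains inside the support of the distribution throughout learning.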
Conclusion
One of the major problems of most neural map algorithms is the need for a finite set of observations in order to perform adaptive learning, starting from a set of initial parameters (learning rate, neighbourhood or temperature) at time ti down to a set of final parameters at time tf. In the framework of signal processing or data analysis, this may be acceptable as long as we can generate a finite set of samples in order to learn it off-line. However, from a more behavioural point of view, this is
Acknowledgements
This work has received useful corrections and comments from Thierry Viéville and support from the MAPS ANR grant.
References (29)
- et al., On-line pattern analysis by evolving self-organizing maps, Neurocomputing (2003)
- et al., An incremental network for on-line unsupervised classification and topology learning, Neural Networks (2006)
- et al., An enhanced self-organizing incremental neural network for online unsupervised learning, Neural Networks (2007)
- Competitive learning: from interactive activation to adaptive resonance, Cognitive Science (1987)
- et al., Theoretical aspects of the SOM algorithm, Neurocomputing (1998)
- Energy functions for self-organizing maps
- Some methods of classification and analysis of multivariate observations
- et al., An algorithm for vector quantization design, IEEE Transactions on Communications COM-28 (1980)
- Self-organized formation of topologically correct feature maps, Biological Cybernetics (1982)
- et al., Neural-gas network for vector quantization and its application to time-series prediction, IEEE Transactions on Neural Networks (1993)
- A growing neural gas network learns topologies
- Receptive fields and functional architecture in two non-striate visual areas (18 and 19) of the cat, Journal of Neurophysiology
- The period of susceptibility to the physiological effects of unilateral eye closure in kittens, Journal of Physiology
- Mechanisms of plasticity in the visual cortex, Investigative Ophthalmology
Nicolas Rougier graduated in 1995 with an engineering diploma from ESIAL (Ecole Supérieure d’Informatique et Applications de Lorraine). He also received a Ph.D. degree in Computer Science in 2000 from the University Henri Poincaré, Nancy. He is currently an experienced researcher at INRIA, the French National Institute for Research in Computer Science and Control, working within the CORTEX team/project at the LORIA laboratory. He is interested in studying the properties and capacities of neural computation, seen as distributed, numerical and adaptive information processing in light of embodied cognition. His main research topics include dynamic neural fields, self-organisation, visual attention and active perception.
Yann Boniface received a Ph.D. degree in Computer Science in 2000 from the University Henri Poincaré, Nancy. He is currently an associate professor at Nancy University 2, where he teaches computer science and cognitive science. He is part of the CORTEX team/project at the LORIA laboratory. His research interests are dynamic neural field theory and the modelling of spiking neurons in order to study spatial and temporal coding.