Neurocomputing

Volume 74, Issue 11, May 2011, Pages 1840-1847
Dynamic self-organising map

https://doi.org/10.1016/j.neucom.2010.06.034

Abstract

We present in this paper a variation of the self-organising map algorithm in which the original time-dependent (learning rate and neighbourhood) learning function is replaced by a time-invariant one. This allows for online and continuous learning on both static and dynamic data distributions. One of the properties of the newly proposed algorithm is that it does not fit the magnification law: the achieved codebook density is not directly proportional to the density of the distribution, as it is in most vector quantisation algorithms. From a biological point of view, this algorithm sheds light on cortical plasticity seen as a dynamic and tight coupling between the environment and the model.

Introduction

Vector quantisation (VQ) refers to the modelling of a probability density function by a discrete set of prototype vectors (sometimes called the codebook) such that any point drawn from the associated distribution can be associated with a prototype vector. Most VQ algorithms try to match the density of the distribution through the density of their codebook: high-density regions of the distribution tend to have more associated prototypes than low-density regions. This generally makes it possible to minimise the loss of information (or distortion) as measured by the mean quadratic error. For a complete picture, note that there are also cases where only a partition of the space occupied by the data (regardless of their density) is necessary. In such cases, one wants to achieve a quantisation that is regular a priori, independently of the probability density function. For example, in some classification problems, one wants to discriminate data in terms of classes and thus only needs to draw frontiers between data regardless of their respective densities.
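To make this density-matching behaviour concrete, the following sketch (our illustration, not taken from the paper) runs a plain k-means quantisation on samples drawn from a two-component mixture and counts how many code words settle in each region; the denser component attracts more prototypes:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1-D mixture: 80% of the mass near 0, 20% near 5.
samples = np.concatenate([rng.normal(0.0, 0.5, 8000),
                          rng.normal(5.0, 0.5, 2000)])[:, None]

# Plain k-means (Lloyd iterations) with n = 10 code words.
codebook = samples[rng.choice(len(samples), 10, replace=False)]
for _ in range(50):
    # Assign each sample to its nearest code word.
    d = np.linalg.norm(samples[:, None, :] - codebook[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # Move each code word to the centroid of its cluster.
    for i in range(len(codebook)):
        if np.any(labels == i):
            codebook[i] = samples[labels == i].mean(axis=0)

# The dense region (around 0) ends up with more prototypes than the sparse one.
print("code words near 0:", int(np.sum(codebook[:, 0] < 2.5)))
print("code words near 5:", int(np.sum(codebook[:, 0] >= 2.5)))
```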

Vector quantisation can be achieved using several methods, such as variations of the k-means method [1], the Linde–Buzo–Gray (LBG) algorithm [2], or neural network models such as the self-organising map (SOM) [3], neural gas (NG) [4] and growing neural gas (GNG) [5]. Among all these methods, the SOM algorithm is certainly the most famous in the field of computational neurosciences, since it gives a biologically plausible account of the organisation of receptive fields in sensory areas, where adjacent neurons share similar representations. The stability and quality of such self-organisation depend heavily on a decreasing learning rate as well as on a decreasing neighbourhood function. This is quite congruent with the idea of a critical period in the early years of development, during which most sensory or motor properties are acquired and stabilised [6], [7], [8]. However, it fails to explain cortical plasticity, since we know that the cortex has the capacity to re-organise itself in the face of lesions or deficits [9], [10], [11]. The question, then, is to what extent it is possible to have representations that are both stable and dynamic.

Quite obviously, this cannot be achieved using SOM-like algorithms that depend on a time-decreasing learning rate and/or neighbourhood function (SOM, NG, GNG), and, despite the huge literature [12], [13] on self-organising maps and Kohonen-type networks (more than 7000 works listed in [14]), there is surprisingly and comparatively very little work dealing with online learning (also referred to as incremental or lifelong learning). Furthermore, most of these works are based on incremental models, that is, networks that create and/or delete nodes as necessary. For example, the modified GNG model [15] is able to follow non-stationary distributions by creating nodes as in a regular GNG and deleting them when their utility parameter becomes too small. Similarly, the evolving self-organising map (ESOM) [16], [17] is based on an incremental network quite similar to GNG that creates nodes dynamically based on the distance of the winner to the data (but the new node is created at the exact data point instead of at the mid-point, as in GNG). The self-organising incremental neural network (SOINN) [18] and its enhanced version (ESOINN) [19] are also based on an incremental structure, where the first version uses a two-layer network while the enhanced version uses a single-layer network. One noticeable exception is the model proposed in [20], which does not rely on an incremental structure but on a Butterworth decay scheme that does not decay parameters to zero. The model works in two phases: an initial phase (approximately 10 epochs) establishes a rough global topology using a very large neighbourhood, and a second phase trains the network using a small neighbourhood. Unfortunately, the size of the neighbourhood in the second phase has to be adapted to the expected density of the data.

Without judging the performance of these models, we do not think they give a satisfactory answer to our initial question; instead, we propose an answer based on a tight coupling between the environment and representations. If the environment is stable, representations should remain stable; if the environment suddenly changes, representations must dynamically adapt themselves and stabilise again onto the new environment. We thus modified the original SOM algorithm in order to make its learning rule and neighbourhood function independent of time. This results in a tight coupling between the environment and the model that ensures both stability and plasticity. In the next section, we formally describe the dynamic self-organising map in the context of vector quantisation; both neural gas and the self-organising map are also formally described in order to underline the differences between the three algorithms. The following section re-introduces the model from a more behavioural point of view, and the main experimental results are presented using either low- or high-dimensional data, with side-by-side comparisons with other algorithms. Results concerning dynamic distributions are also presented for the dynamic self-organising map, in order to illustrate the coupling between the distribution and the model. Finally, we discuss the relevance of such a model in the context of computational neurosciences and embodied cognition.

Definitions

Let us consider a probability density function $f(x)$ on a compact manifold $\Omega \subset \mathbb{R}^d$. A vector quantisation (VQ) is a function $\Phi$ from $\Omega$ to a finite subset of $n$ code words $\{w_i \in \mathbb{R}^d\}_{1 \le i \le n}$ that form the codebook. A cluster is defined as $C_i \stackrel{\mathrm{def}}{=} \{x \in \Omega \mid \Phi(x) = w_i\}$; the clusters form a partition of $\Omega$, and the distortion of the VQ is measured by the mean quadratic error
$$\xi = \sum_{i=1}^{n} \int_{C_i} \|x - w_i\|^2 f(x)\,dx.$$
If the function $f$ is unknown and a finite set $\{x_j\}$ of $p$ unbiased observations is available, the distortion error may be empirically estimated as
$$\hat{\xi} = \frac{1}{p} \sum_{j=1}^{p} \|x_j - \Phi(x_j)\|^2.$$
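As a quick illustration of these definitions (ours, not the paper's), the following NumPy sketch implements the nearest-code-word map $\Phi$ and the empirical distortion over a finite set of observations:

```python
import numpy as np

def quantise(samples, codebook):
    """Phi: index of the nearest code word for each sample."""
    d = np.linalg.norm(samples[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def distortion(samples, codebook):
    """Empirical mean quadratic error over p observations."""
    labels = quantise(samples, codebook)
    return np.mean(np.sum((samples - codebook[labels]) ** 2, axis=1))

rng = np.random.default_rng(1)
samples = rng.uniform(0.0, 1.0, (1000, 2))   # observations drawn from f
codebook = rng.uniform(0.0, 1.0, (16, 2))    # n = 16 code words
print("empirical distortion:", distortion(samples, codebook))
```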

Model

As we explained in the Introduction, the DSOM algorithm is essentially a variation of the SOM algorithm in which the time dependency has been removed. The regular learning function (2.4) and neighbourhood function (2.5) have been replaced by Eqs. (2.10) and (2.11), respectively, which reflect two main ideas (a minimal sketch of the resulting update rule follows this list):

  • If a neuron is close enough to the data, there is no need for others to learn anything: the winner can represent the data.

  • If there is no neuron close enough to the data, any neuron learns the data.
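The sketch below is a minimal NumPy rendering of these two ideas in the spirit of Eqs. (2.10), (2.11): a constant (time-invariant) learning rate modulated by each neuron's distance to the data, and a Gaussian neighbourhood whose width scales with the winner's distance to the data, so that a close winner learns almost alone while a distant winner recruits the whole map. The parameter names eps (learning rate) and eta (elasticity), as well as the usage example, are ours:

```python
import numpy as np

def dsom_update(codebook, grid, x, eps=0.1, eta=1.0):
    """One time-invariant DSOM-style step for a single data point x.

    codebook : (n, d) prototype vectors w_i (modified in place)
    grid     : (n, k) fixed positions p_i of the neurons on the map
    eps      : constant learning rate (no time decay)
    eta      : elasticity, controlling how tightly the map follows the data
    """
    # Winner: the neuron whose prototype is closest to the data.
    dists = np.linalg.norm(codebook - x, axis=1)
    s = dists.argmin()
    if dists[s] == 0.0:
        return  # the winner already represents the data exactly: nobody learns

    # Neighbourhood centred on the winner and scaled by the winner's distance
    # to the data: close winner -> narrow neighbourhood (first idea),
    # distant winner -> wide neighbourhood, so any neuron learns (second idea).
    g = np.exp(-np.sum((grid - grid[s]) ** 2, axis=1)
               / (eta ** 2 * dists[s] ** 2))

    # Each neuron's step is also modulated by its own distance to the data.
    codebook += eps * dists[:, None] * g[:, None] * (x - codebook)

# Usage: an 8x8 map learning online from a uniform 2-D stream.
rng = np.random.default_rng(2)
side = 8
gx, gy = np.meshgrid(np.linspace(0, 1, side), np.linspace(0, 1, side))
grid = np.column_stack([gx.ravel(), gy.ravel()])
codebook = rng.uniform(0, 1, (side * side, 2))
for _ in range(10000):
    dsom_update(codebook, grid, rng.uniform(0, 1, 2))
```

Because neither eps nor the neighbourhood depends on time, the same loop keeps working if the data stream changes distribution mid-run: the map simply re-adapts.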

Experimental results

In this section we report some experimental results obtained on different types of distributions, aiming to illustrate the DSOM principles. We do not yet have formal results about the convergence and/or the quality of the codebook. Consequently, these results do not claim to prove anything; they are introduced mainly to illustrate the qualitative behaviour of the algorithm.

Unless stated otherwise, the learning procedure in the following examples is:

  1. A distribution is chosen (normal, uniform, etc.).

  2.

Conclusion

One of the major problems of most neural map algorithms is the necessity of having a finite set of observations in order to perform adaptive learning, starting from a set of initial parameters (learning rate, neighbourhood or temperature) at time $t_i$ down to a set of final parameters at time $t_f$. In the framework of signal processing or data analysis, this may be acceptable as long as we can generate a finite set of samples in order to learn it off-line. However, from a more behavioural point of view, this is

Acknowledgements

This work has benefited from useful corrections and comments by Thierry Viéville and from the support of the MAPS ANR grant.

References (29)

  • B. Fritzke, A growing neural gas network learns topologies.

  • D. Hubel et al., Receptive fields and functional architecture in two non-striate visual areas (18 and 19) of the cat, Journal of Neurophysiology (1965).

  • D. Hubel et al., The period of susceptibility to the physiological effects of unilateral eye closure in kittens, Journal of Physiology (1970).

  • N. Daw, Mechanisms of plasticity in the visual cortex, Investigative Ophthalmology (1994).
Nicolas Rougier graduated in 1995 with an engineering diploma from ESIAL (École Supérieure d'Informatique et Applications de Lorraine). He received a Ph.D. degree in Computer Science in 2000 from the University Henri Poincaré, Nancy. He is currently an experienced researcher at INRIA, the French National Institute for Research in Computer Science and Control, working within the CORTEX team/project at the LORIA laboratory. He is interested in studying the properties and capacities of neural computation, seen as distributed, numerical and adaptive information processing in light of embodied cognition. His main research topics include dynamic neural fields, self-organisation, visual attention and active perception.

Yann Boniface received a Ph.D. degree in Computer Science in 2000 from the University Henri Poincaré, Nancy. He is currently an associate professor at Nancy 2 University, where he teaches computer science and cognitive science. He is part of the CORTEX team/project at the LORIA laboratory. His research interests are dynamic neural field theory and the modelling of spiking neurons in order to study spatial and temporal coding.
