Abstract
The self-organizing map (SOM) is related to classical vector quantization (VQ). As in VQ, the SOM represents a distribution of input data vectors using a finite set of models. In both methods, the quantization error (QE) of an input vector can be expressed, e.g., as the Euclidean norm of the difference between the input vector and the best-matching model. Since the models in VQ are usually optimized so that the sum of the squared QEs is minimized for the given set of training vectors, a common notion is that no other set of models can produce a smaller rms QE. It has therefore come as a surprise that in some cases the rms QE of a SOM can be smaller than that of a VQ with the same number of models and the same input data. This effect may manifest itself if the number of training vectors per model is a small integer and the testing is done with an independent set of test vectors. An explanation seems to follow from statistics. Each model vector in the VQ is determined as the average of those training vectors that are mapped into the same Voronoi domain as the model vector. In contrast, each model vector of the SOM is determined as a weighted average of all those training vectors that are mapped into the “topological” neighborhood around the corresponding model. The number of training vectors mapped into the neighborhood of a SOM model is generally much larger than the number mapped into a Voronoi domain around a model in the VQ. Since the SOM model vectors are then determined with significantly higher statistical accuracy, the Voronoi domains of the SOM are significantly more regular, and the resulting rms QE may then be smaller than in the VQ. However, this also requires the effective dimensionality of the vectors to be sufficiently high.
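The two update rules contrasted in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' experimental code: the function names, the Gaussian neighborhood function, the 4 x 4 grid, the shrinking neighborhood width, and the synthetic Gaussian data are all hypothetical choices made here. The sketch computes the rms QE on an independent test set for a VQ trained with Voronoi-set averages and for a SOM trained with neighborhood-weighted averages.

```python
import numpy as np

def quantization_errors(X, M):
    """Return (QE, index of best-matching model) for every row of X."""
    d = np.linalg.norm(X[:, None, :] - M[None, :, :], axis=2)  # shape (N, K)
    return d.min(axis=1), d.argmin(axis=1)

def rms_qe(X, M):
    """Root-mean-square quantization error of data X for models M."""
    qe, _ = quantization_errors(X, M)
    return float(np.sqrt(np.mean(qe ** 2)))

def vq_batch_step(X, M):
    """VQ update: each model becomes the mean of its own Voronoi set."""
    _, bmu = quantization_errors(X, M)
    new_M = M.copy()
    for k in range(len(M)):
        members = X[bmu == k]
        if len(members) > 0:          # an empty Voronoi set keeps its old model
            new_M[k] = members.mean(axis=0)
    return new_M

def som_batch_step(X, M, grid, sigma):
    """Batch-SOM update: each model becomes a weighted mean of all training
    vectors, weighted by the grid distance between the model and each
    vector's best-matching model (Gaussian neighborhood: an assumption)."""
    _, bmu = quantization_errors(X, M)
    g2 = np.sum((grid[:, None, :] - grid[None, :, :]) ** 2, axis=2)  # (K, K)
    H = np.exp(-g2 / (2.0 * sigma ** 2))
    W = H[:, bmu]                     # (K, N): weight of sample n for model k
    return (W @ X) / W.sum(axis=1, keepdims=True)

# Hypothetical setup: few training vectors per model, fairly high dimension.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(48, 10))   # 3 training vectors per model
X_test = rng.normal(size=(500, 10))   # independent test set
grid = np.array([(i, j) for i in range(4) for j in range(4)], dtype=float)
M0 = X_train[rng.choice(len(X_train), size=len(grid), replace=False)].copy()

M_vq, M_som = M0.copy(), M0.copy()
for sigma in np.linspace(2.0, 0.5, 20):   # shrinking neighborhood (assumed)
    M_vq = vq_batch_step(X_train, M_vq)
    M_som = som_batch_step(X_train, M_som, grid, sigma)

print("test rms QE, VQ :", rms_qe(X_test, M_vq))
print("test rms QE, SOM:", rms_qe(X_test, M_som))
```

Which method ends up with the smaller test rms QE depends on the data, the dimensionality, and the final neighborhood width, so the printed numbers should be read as one data point rather than as confirmation of the effect described above.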
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kohonen, T., Nieminen, I.T., Honkela, T. (2009). On the Quantization Error in SOM vs. VQ: A Critical and Systematic Study. In: Príncipe, J.C., Miikkulainen, R. (eds) Advances in Self-Organizing Maps. WSOM 2009. Lecture Notes in Computer Science, vol 5629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02397-2_16
DOI: https://doi.org/10.1007/978-3-642-02397-2_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02396-5
Online ISBN: 978-3-642-02397-2
eBook Packages: Computer Science, Computer Science (R0)