K-means clustering in a low-dimensional Euclidean space

  • Conference paper
New Approaches in Classification and Data Analysis

Summary

A procedure is developed for clustering objects in a low-dimensional subspace of the column space of an objects-by-variables data matrix. The method is based on the K-means criterion and seeks the subspace that is maximally informative about the clustering structure in the data. In this low-dimensional representation, the objects, the variables, and the cluster centroids are displayed jointly. The advantages of the new method are discussed, an efficient alternating least-squares algorithm is described, and the procedure is illustrated on artificial data.
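The summary gives no formulas, so the following is only a minimal sketch of how an alternating least-squares scheme of this kind might look. It assumes the commonly used formulation in which a centered data matrix X is approximated by U F A', with U a binary cluster-membership matrix, F the centroids in the subspace, and A a column-orthonormal loading matrix. The function name `reduced_kmeans`, its parameters, the random initial partition, and the empty-cluster guard are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def reduced_kmeans(X, n_clusters, n_dims, n_iter=100, seed=0):
    """Hypothetical ALS sketch: minimise ||X - U F A'||^2 over U, F and A."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                      # column-center the data
    n = X.shape[0]
    labels = rng.integers(n_clusters, size=n)   # assumed: random initial partition

    for _ in range(n_iter):
        # Step 1: with the partition fixed, update the subspace and centroids.
        U = np.eye(n_clusters)[labels]          # n x K indicator matrix
        counts = np.maximum(U.sum(axis=0), 1)   # guard against empty clusters
        means = (U.T @ X) / counts[:, None]     # K x p cluster means
        # Between-cluster cross-product matrix X' U (U'U)^-1 U' X.
        B = (means * counts[:, None]).T @ means
        _, eigvecs = np.linalg.eigh(B)          # eigenvalues in ascending order
        A = eigvecs[:, -n_dims:]                # p x Q orthonormal basis
        F = means @ A                           # K x Q centroids in the subspace

        # Step 2: with A and F fixed, reassign each object to the nearest
        # centroid, measured in the low-dimensional projection X A.
        Y = X @ A
        d2 = ((Y[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                               # partition is stable
        labels = new_labels

    return labels, A, F


# Usage on artificial data: two groups separated along the first variable,
# with four additional pure-noise variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 5))
X[:40, 0] += 5.0
X[40:, 0] -= 5.0
labels, A, F = reduced_kmeans(X, n_clusters=2, n_dims=1)
```

Because the projected coordinates `X @ A`, the loadings `A`, and the centroids `F` share one coordinate system, they could be drawn in a single plot, which matches the joint display the summary mentions. Like ordinary K-means, a scheme of this type can stop in a local optimum, so in practice several random starts would be compared.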

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Soete, G., Carroll, J.D. (1994). K-means clustering in a low-dimensional Euclidean space. In: Diday, E., Lechevallier, Y., Schader, M., Bertrand, P., Burtschy, B. (eds) New Approaches in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-51175-2_24

  • DOI: https://doi.org/10.1007/978-3-642-51175-2_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58425-4

  • Online ISBN: 978-3-642-51175-2

  • eBook Packages: Springer Book Archive
