Abstract
A practical approach to nonparametric cluster analysis of large data sets is presented. The number of clusters and the cluster centers are derived by applying the mean shift procedure to a reduced set of points randomly selected from the data. The cluster boundaries are delineated using a k-nearest neighbor technique. The resulting algorithm is stable and efficient, allowing the cluster decomposition of a 10,000-point data set in only a few seconds. Complex clustering examples and applications are discussed.
This research was supported by the NSF under the grant IRI-9530546.
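The pipeline outlined in the abstract (random subsampling, mean shift on the reduced set, k-nearest neighbor delineation of the boundaries) can be illustrated with a minimal sketch. This is not the authors' implementation: the flat (uniform) window, the greedy merging of converged points, the majority vote, and every name and parameter (`decompose`, `n_sample`, `bandwidth`, `k`, `seed`) are assumptions made for illustration only.

```python
import numpy as np


def mean_shift(sample, bandwidth, max_iter=200, tol=1e-6):
    """Shift every sample point to the local mean until it stops moving.

    Assumption: a flat window of radius `bandwidth` is used.
    """
    points = sample.astype(float).copy()
    for _ in range(max_iter):
        # Distances from each moving point to every (fixed) sample point.
        dist = np.linalg.norm(points[:, None, :] - sample[None, :, :], axis=2)
        inside = (dist <= bandwidth).astype(float)
        counts = inside.sum(axis=1, keepdims=True)
        # New position: mean of the sample points inside the window.
        shifted = inside @ sample / np.maximum(counts, 1.0)
        shifted = np.where(counts > 0, shifted, points)
        moved = np.linalg.norm(shifted - points, axis=1).max()
        points = shifted
        if moved < tol:
            break
    return points


def merge_centers(points, radius):
    """Greedily keep one representative per group of nearby converged points."""
    centers = []
    for p in points:
        if all(np.linalg.norm(p - c) > radius for c in centers):
            centers.append(p)
    return np.array(centers)


def knn_vote(data, sample, sample_labels, k):
    """Label each data point by majority vote among its k nearest sample points."""
    dist = np.linalg.norm(data[:, None, :] - sample[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = sample_labels[nearest]
    n_clusters = int(sample_labels.max()) + 1
    tally = np.stack([(votes == c).sum(axis=1) for c in range(n_clusters)], axis=1)
    return tally.argmax(axis=1)


def decompose(data, n_sample=200, bandwidth=1.0, k=5, seed=0):
    """Hypothetical driver: subsample, find modes, then label all points."""
    rng = np.random.default_rng(seed)
    size = min(n_sample, len(data))
    sample = data[rng.choice(len(data), size=size, replace=False)]
    converged = mean_shift(sample, bandwidth)
    centers = merge_centers(converged, bandwidth / 2.0)
    # Each sample point takes the label of the center it converged to.
    to_center = np.linalg.norm(converged[:, None, :] - centers[None, :, :], axis=2)
    sample_labels = to_center.argmin(axis=1)
    labels = knn_vote(data, sample, sample_labels, k)
    return centers, labels


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    blobs = np.vstack([rng.normal(0.0, 0.5, size=(500, 2)),
                       rng.normal(10.0, 0.5, size=(500, 2))])
    centers, labels = decompose(blobs, bandwidth=2.0)
    print(len(centers), "cluster centers found")
```

In this sketch the mean shift iterations touch only the reduced set, so their cost depends on `n_sample` rather than on the full data size; the full data set is visited once, in the k-nearest neighbor step. The brute-force distance matrices are a simplification chosen for brevity.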
Keywords
- Cluster Center
- Kernel Density Estimate
- Mean Integrated Square Error
- Density Constraint
- Epanechnikov Kernel
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Comaniciu, D., Meer, P. (1998). Distribution free decomposition of multivariate data. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (eds) Advances in Pattern Recognition. SSPR/SPR 1998. Lecture Notes in Computer Science, vol 1451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033284
DOI: https://doi.org/10.1007/BFb0033284
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64858-1
Online ISBN: 978-3-540-68526-5
eBook Packages: Springer Book Archive