Abstract
Learning a Gaussian mixture with a local algorithm such as EM can be difficult because (i) the true number of mixing components is usually unknown, (ii) there is no generally accepted method for parameter initialization, and (iii) the algorithm can get trapped in one of the many local maxima of the likelihood function. In this paper we propose a greedy algorithm for learning a Gaussian mixture that aims to overcome these limitations. Starting with a single component and adding components sequentially, up to a maximum number k, the algorithm can achieve solutions superior to those of EM with k components in terms of the likelihood of a held-out test set. The algorithm is based on recent theoretical results on incremental mixture density estimation, and uses a combination of global and local search each time a new component is added to the mixture.
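The abstract only outlines the procedure, so the following is a minimal NumPy sketch of the general scheme it describes: start with one component, alternate full EM updates (local search) with the insertion of a new component (global search), and stop at k components. It assumes spherical covariances and scores candidate insertion points drawn from random data samples; the function names (greedy_em, em_step) and the candidate-generation heuristic are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def log_gauss(X, mu, var):
    """Log-density of a spherical Gaussian N(mu, var * I) at each row of X."""
    d = X.shape[1]
    diff2 = np.sum((X - mu) ** 2, axis=1)
    return -0.5 * (d * np.log(2.0 * np.pi * var) + diff2 / var)

def mixture_loglik(X, weights, mus, vars_):
    """Total log-likelihood of X under the mixture (log-sum-exp for stability)."""
    comp = np.stack([np.log(w) + log_gauss(X, m, v)
                     for w, m, v in zip(weights, mus, vars_)], axis=1)
    mx = comp.max(axis=1, keepdims=True)
    return float(np.sum(mx[:, 0] + np.log(np.exp(comp - mx).sum(axis=1))))

def em_step(X, weights, mus, vars_):
    """One full EM update of all components (the local search)."""
    comp = np.stack([np.log(w) + log_gauss(X, m, v)
                     for w, m, v in zip(weights, mus, vars_)], axis=1)
    comp -= comp.max(axis=1, keepdims=True)
    resp = np.exp(comp)
    resp /= resp.sum(axis=1, keepdims=True)          # responsibilities (E-step)
    Nk = resp.sum(axis=0)                            # effective counts
    weights = Nk / len(X)                            # M-step updates
    mus = [resp[:, j] @ X / Nk[j] for j in range(len(Nk))]
    vars_ = [max(1e-6, resp[:, j] @ np.sum((X - mus[j]) ** 2, axis=1)
                 / (Nk[j] * X.shape[1])) for j in range(len(Nk))]
    return weights, mus, vars_

def greedy_em(X, k_max, em_iters=30, n_cand=10, seed=0):
    rng = np.random.default_rng(seed)
    # Start from a single component fitted to all of the data.
    weights, mus, vars_ = np.array([1.0]), [X.mean(axis=0)], [float(X.var())]
    while True:
        for _ in range(em_iters):                    # local search on current mixture
            weights, mus, vars_ = em_step(X, weights, mus, vars_)
        if len(mus) == k_max:
            return weights, mus, vars_
        # Global search: try new components centred on random data points.
        alpha = 1.0 / (len(mus) + 1)                 # mixing weight for the newcomer
        best_ll, best = -np.inf, None
        for i in rng.choice(len(X), size=min(n_cand, len(X)), replace=False):
            w2 = np.append(weights * (1.0 - alpha), alpha)
            m2, v2 = mus + [X[i]], vars_ + [float(np.mean(vars_))]
            ll = mixture_loglik(X, w2, m2, v2)
            if ll > best_ll:
                best_ll, best = ll, (w2, m2, v2)
        weights, mus, vars_ = best

# Example: two well-separated clusters in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
w, mus, vs = greedy_em(X, k_max=2)
print(w, mus)
```

Giving the new component weight alpha = 1/(K+1) and rescaling the existing weights is one common heuristic for incremental insertion; the quality of the global search (here, a handful of random data points) largely determines how well such a greedy scheme escapes poor local maxima.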
Cite this article
Vlassis, N., Likas, A. A Greedy EM Algorithm for Gaussian Mixture Learning. Neural Processing Letters 15, 77–87 (2002). https://doi.org/10.1023/A:1013844811137