ABSTRACT
The (batch) EM algorithm plays an important role in unsupervised induction, but it sometimes suffers from slow convergence. In this paper, we show that online variants (1) provide significant speedups and (2) can even find better solutions than those found by batch EM. We support these findings on four unsupervised tasks: part-of-speech tagging, document classification, word segmentation, and word alignment.
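The "online variants" here update EM's expected sufficient statistics after each example rather than after a full pass over the data. Below is a minimal sketch of one widely used variant, stepwise EM, applied to a toy one-dimensional, unit-variance Gaussian mixture; the toy model, the stepsize schedule eta = (k + 2)^(-alpha), and every name and default in the code are illustrative assumptions, not the paper's exact algorithm, tasks, or hyperparameters.

```python
# A minimal sketch of stepwise (online) EM for a K-component, unit-variance
# Gaussian mixture. Illustration only: the toy model, the stepsize schedule
# eta = (k + 2) ** -alpha, and all names/defaults are assumptions, not the
# paper's exact algorithm or settings.
import numpy as np

def stepwise_em(x, K=2, alpha=0.7, n_passes=5, seed=0):
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)                                  # mixing weights
    mu = rng.choice(x, size=K, replace=False).astype(float)   # component means
    # Running expected sufficient statistics.
    s_count = np.full(K, 1.0 / K)   # E[1{z = k}]
    s_sum = s_count * mu            # E[x * 1{z = k}]
    k = 0
    for _ in range(n_passes):
        for xi in rng.permutation(x):
            # E-step on a single example: posterior over components.
            log_p = np.log(pi) - 0.5 * (xi - mu) ** 2
            q = np.exp(log_p - log_p.max())
            q /= q.sum()
            # Stepwise update: interpolate the running statistics toward
            # this example's statistics with a decaying stepsize.
            eta = (k + 2) ** -alpha
            s_count = (1 - eta) * s_count + eta * q
            s_sum = (1 - eta) * s_sum + eta * q * xi
            # M-step: re-estimate parameters from the running statistics.
            pi = s_count / s_count.sum()
            mu = s_sum / s_count
            k += 1
    return pi, mu

# Usage: recover two clusters centered near -2 and +3.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])
print(stepwise_em(data))
```

Because the running statistics are nudged toward each example's posterior counts as soon as it is processed, the parameters begin improving within a fraction of one pass; this per-example updating is the mechanism behind the speedups the abstract claims over batch EM, which re-estimates parameters only once per full pass over the data.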