A hierarchical phrase-based model for statistical machine translation

Author:
David Chiang

University of Maryland, College Park, MD

ACL '05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, June 2005, pages 263–270. https://doi.org/10.3115/1219840.1219873

Published: 25 June 2005

ABSTRACT

We present a statistical phrase-based translation model that uses hierarchical phrases, that is, phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax-based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrase-based model achieves a relative improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system.
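
To make the idea of a phrase that contains subphrases more concrete, the following is a minimal sketch of applying a single synchronous context-free grammar rule. It is an illustration only: the rule, the placeholder tokens (`s1`, `t1`, `a_src`, and so on), and the names `Rule` and `apply_rule` are hypothetical and not taken from the paper. The model described in the abstract learns such rules from a bitext and scores them, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# A symbol is either a terminal word (str) or a gap index (int).
# Gaps with the same index on both sides are linked to each other.
Symbol = Union[str, int]


@dataclass(frozen=True)
class Rule:
    """One synchronous rule X -> <source, target> with co-indexed gaps."""
    source: Tuple[Symbol, ...]
    target: Tuple[Symbol, ...]


def apply_rule(
    rule: Rule,
    fillers: Dict[int, Tuple[List[str], List[str]]],
) -> Tuple[List[str], List[str]]:
    """Fill each gap with a (source words, target words) subphrase pair."""
    def expand(side: Tuple[Symbol, ...], pick: int) -> List[str]:
        out: List[str] = []
        for sym in side:
            if isinstance(sym, int):
                # Gap: splice in the chosen side of the linked subphrase.
                out.extend(fillers[sym][pick])
            else:
                # Terminal: copy the word through unchanged.
                out.append(sym)
        return out

    return expand(rule.source, 0), expand(rule.target, 1)


# Hypothetical rule whose two gaps appear in swapped order on the target side.
rule = Rule(source=("s1", 1, "s2", 2), target=("t1", 2, "t2", 1))
fillers = {
    1: (["a_src"], ["a_tgt"]),
    2: (["b_src", "c_src"], ["b_tgt", "c_tgt"]),
}
src, tgt = apply_rule(rule, fillers)
print(" ".join(src))  # s1 a_src s2 b_src c_src
print(" ".join(tgt))  # t1 b_tgt c_tgt t2 a_tgt
```

Because the gaps are linked by index rather than by position, one rule can capture a reordering of whole subphrases, which a flat phrase pair cannot express.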

References

A. V. Aho and J. D. Ullman. 1969. Syntax directed translations and the pushdown assembler. Journal of Computer and System Sciences, 3:37–56.
Daniel M. Bikel and David Chiang. 2000. Two statistical parsing models applied to the Chinese Treebank. In Proceedings of the Second Chinese Language Processing Workshop, pages 1–6.
Hans Ulrich Block. 2000. Example-based incremental synchronous interpretation. In Wolfgang Wahlster, editor, Verbmobil: Foundations of Speech-to-Speech Translation, pages 411–417. Springer-Verlag, Berlin.
Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19:263–311.
Stanley F. Chen and Joshua Goodman. 1998. An empirical study of smoothing techniques for language modeling. Technical Report TR-10-98, Harvard University Center for Research in Computing Technology.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of HLT-NAACL 2003, pages 127–133.
Philipp Koehn. 2003. Noun Phrase Translation. Ph.D. thesis, University of Southern California.
Philipp Koehn. 2004a. Pharaoh: a beam search decoder for phrase-based statistical machine translation models. In Proceedings of the Sixth Conference of the Association for Machine Translation in the Americas, pages 115–124.
Philipp Koehn. 2004b. Statistical significance tests for machine translation evaluation. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 388–395.
Shankar Kumar, Yonggang Deng, and William Byrne. 2005. A weighted finite state transducer translation template model for statistical machine translation. Natural Language Engineering. To appear.
Daniel Marcu and William Wong. 2002. A phrase-based, joint probability model for statistical machine translation. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 133–139.
Franz Josef Och and Hermann Ney. 2000. Improved statistical alignment models. In Proceedings of the 38th Annual Meeting of the ACL, pages 440–447.
Franz Josef Och and Hermann Ney. 2002. Discriminative training and maximum entropy models for statistical machine translation. In Proceedings of the 40th Annual Meeting of the ACL, pages 295–302.
Franz Josef Och and Hermann Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics, 30:417–449.
Franz Josef Och, Ignacio Thayer, Daniel Marcu, Kevin Knight, Dragos Stefan Munteanu, Quamrul Tipu, Michel Galley, and Mark Hopkins. 2004. Arabic and Chinese MT at USC/ISI. Presentation given at NIST Machine Translation Evaluation Workshop.
Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting of the ACL, pages 160–167.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the ACL, pages 311–318.
Andreas Stolcke. 2002. SRILM – an extensible language modeling toolkit. In Proceedings of the International Conference on Spoken Language Processing, volume 2, pages 901–904.
Dekai Wu. 1997. Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Computational Linguistics, 23:377–404.
Kenji Yamada and Kevin Knight. 2001. A syntax-based statistical translation model. In Proceedings of the 39th Annual Meeting of the ACL, pages 523–530.
Richard Zens and Hermann Ney. 2004. Improvements in phrase-based statistical machine translation. In Proceedings of HLT-NAACL 2004, pages 257–264.
Ying Zhang, Stephan Vogel, and Alex Waibel. 2004. Interpreting BLEU/NIST scores: How much improvement do we need to have a better system? In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC), pages 2051–2054.

Published in
ACL '05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
June 2005, 657 pages
General Chair: Kevin Knight, University of Southern California
Publisher: Association for Computational Linguistics, United States
Qualifiers: Article
Acceptance rates: ACL '05 paper acceptance rate 77 of 423 submissions, 18%; overall acceptance rate 85 of 443 submissions, 19%