DOI: 10.3115/1219840.1219873

A hierarchical phrase-based model for statistical machine translation

Published: 25 June 2005

ABSTRACT

We present a statistical phrase-based translation model that uses hierarchical phrases (phrases that contain subphrases). The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax-based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrase-based model achieves a relative improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system.
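
To make the idea of a hierarchical phrase concrete, the following minimal sketch shows one synchronous rule whose source and target sides share numbered gaps, and how filling those gaps with subphrase pairs reorders the output. The `Rule` class, the `apply_rule` helper, the example rule, and the filler phrase pairs are all illustrative assumptions; they are not taken from this page and do not represent the paper's implementation or training procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A toy synchronous rule: strings are terminals, ints are linked gaps."""
    source: tuple
    target: tuple


def apply_rule(rule, fillers):
    """Expand both sides of a rule.

    `fillers` maps a gap index to a (source_tokens, target_tokens) pair,
    so the same subphrase pair fills the linked gap on both sides.
    """
    def expand(side, pick):
        out = []
        for sym in side:
            if isinstance(sym, int):
                out.extend(fillers[sym][pick])  # substitute the subphrase
            else:
                out.append(sym)                 # copy the terminal
        return out

    return expand(rule.source, 0), expand(rule.target, 1)


# Hypothetical rule: the two gaps swap order between source and target.
rule = Rule(source=("yu", 1, "you", 2), target=("have", 2, "with", 1))

# Hypothetical subphrase pairs used to fill the gaps.
fillers = {
    1: (["Beihan"], ["North", "Korea"]),
    2: (["bangjiao"], ["diplomatic", "relations"]),
}

src, tgt = apply_rule(rule, fillers)
print(" ".join(src))
print(" ".join(tgt))
```

Because the gaps are linked by index rather than by position, a single rule captures a reordering that a flat phrase pair would have to memorize word for word.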

Published in

ACL '05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
June 2005
657 pages
General Chair: Kevin Knight

Publisher

Association for Computational Linguistics, United States

Acceptance Rates

ACL '05 paper acceptance rate: 77 of 423 submissions, 18%. Overall acceptance rate: 85 of 443 submissions, 19%.
