Example-based machine translation using efficient sentence retrieval based on edit-distance

Published: 01 December 2005
Abstract

An Example-Based Machine Translation (EBMT) system whose translation example unit is a sentence can produce an accurate and natural translation if translation examples similar enough to the input sentence are retrieved. Such a system, however, suffers from narrow coverage. Mitigating this problem requires a large-scale parallel corpus and, therefore, an efficient method for retrieving translation examples from it. The authors propose an efficient retrieval method for sentence-wise EBMT based on edit-distance. The proposed method retrieves the sentences most similar to the input, as measured by edit-distance, without omissions. It employs search-space division, word graphs, and an A* search algorithm. The performance of the EBMT system was evaluated through Japanese-to-English translation experiments using a bilingual corpus comprising hundreds of thousands of sentences from a travel conversation domain. The EBMT system achieved high-quality translation by using a large corpus and efficient processing by using the proposed retrieval method.
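To make the retrieval criterion concrete, the following is a minimal sketch of word-level edit-distance retrieval over a list of translation examples. It is a brute-force baseline and not the authors' method: it compares the input against every example, whereas the proposed method relies on search-space division, word graphs, and A* search to avoid that exhaustive scan. The function names, the whitespace tokenization, the unit edit costs, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def word_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance over words (unit cost for insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, word_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, word_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if word_a == word_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def retrieve_nearest(
    query: str, examples: List[Tuple[str, str]]
) -> List[Tuple[int, str, str]]:
    """Return every (distance, source, target) example at the minimum distance.

    Exhaustive scan: assumed here only as a reference baseline.
    """
    if not examples:
        return []
    query_words = query.split()  # assumption: whitespace-tokenized input
    scored = [
        (word_edit_distance(query_words, source.split()), source, target)
        for source, target in examples
    ]
    best = min(distance for distance, _, _ in scored)
    return [item for item in scored if item[0] == best]


# Toy data: the targets are placeholders, not real translations.
toy_examples = [
    ("where is the station", "T1"),
    ("where is the hotel", "T2"),
    ("i would like some water", "T3"),
]
nearest = retrieve_nearest("where is the bank", toy_examples)
```

Returning all examples tied at the minimum distance mirrors the abstract's "without omissions" requirement. The cost of this baseline grows linearly with the corpus size, which is the inefficiency an indexed search is meant to remove.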

