ABSTRACT
Previous comparisons of document translation and query translation were hampered by the differing quality of machine translation in the two opposite directions. We avoid this difficulty by training identical statistical translation models for both translation directions on the same training data. We investigate information retrieval between English and French, incorporating both translation directions into document translation-based and query translation-based retrieval, as well as into hybrid systems. We find that hybrids of document translation-based and query translation-based systems outperform query translation systems, even those using human-quality query translation.
Should we translate the documents or the queries in cross-language information retrieval?