DOI: 10.1145/996350.996378

Translating unknown cross-lingual queries in digital libraries using a web-based approach

Published: 07 June 2004

ABSTRACT

Users' cross-lingual queries to a digital library system may be short and may contain terms that are not included in a common translation dictionary (unknown terms). In this paper, we investigate the feasibility of exploiting the Web as the corpus source to translate unknown query terms for cross-language information retrieval (CLIR) in digital libraries. We propose a Web-based term translation approach that determines effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. This approach can enhance the construction of a domain-specific bilingual lexicon and benefit CLIR services in a digital library that has only monolingual document collections. Very promising results have been obtained in generating effective translation equivalents for many unknown terms, including proper nouns, technical terms, and Web query terms.
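The abstract does not specify how translations are extracted or scored, so the following is only a minimal sketch of the general idea of mining mixed-language search-result snippets. It assumes the snippets have already been fetched, treats runs of Chinese characters as the target language, and ranks character n-grams by the number of snippets in which they co-occur with the query term. The function name `candidate_translations`, the n-gram candidate generation, the snippet-count score, and the sample snippets are all illustrative assumptions, not the method evaluated in the paper.

```python
import re
from collections import Counter

# Runs of CJK Unified Ideographs. This is an assumed stand-in for a real
# target-language term extractor.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")


def candidate_translations(query, snippets, max_len=4, top_k=3):
    """Rank Chinese n-grams by the number of query-bearing snippets containing them."""
    counts = Counter()
    query_lower = query.lower()
    for snippet in snippets:
        if query_lower not in snippet.lower():
            continue  # only snippets that mention the source term are counted
        seen = set()
        for run in CJK_RUN.findall(snippet):
            for n in range(2, max_len + 1):
                for i in range(len(run) - n + 1):
                    seen.add(run[i:i + n])
        counts.update(seen)  # one vote per snippet, not per occurrence
    return counts.most_common(top_k)


# Hypothetical snippets, invented for illustration only.
sample_snippets = [
    "Yahoo 雅虎 搜尋引擎",
    "雅虎 (Yahoo) 公司新聞",
    "Google 谷歌 搜尋",
]

print(candidate_translations("Yahoo", sample_snippets, top_k=1))
# prints [('雅虎', 2)]
```

Counting each candidate once per snippet keeps a single repetitive page from dominating the ranking; a fuller system would likely need stronger term extraction and an association measure rather than raw counts.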

        • Published in

          JCDL '04: Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
          June 2004
          440 pages
          ISBN: 1581138326
          DOI: 10.1145/996350

          Copyright © 2004 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • Article

          Acceptance Rates

          JCDL '04 Paper Acceptance Rate: 61 of 249 submissions, 24%
          Overall Acceptance Rate: 415 of 1,482 submissions, 28%
