Cross-Language Information Retrieval Using EuroWordNet and Word Sense Disambiguation

  • Conference paper
Advances in Information Retrieval (ECIR 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2997)

Abstract

One of the aims of EuroWordNet (EWN) was to provide a resource for Cross-Language Information Retrieval (CLIR). In this paper we present experiments which test the usefulness of EWN for this purpose via a formal evaluation using the Spanish queries from the TREC6 CLIR test set. All CLIR systems using bilingual dictionaries must find a way of dealing with multiple translations, and we employ a Word Sense Disambiguation (WSD) algorithm for this purpose. This algorithm achieved only around 50% correct disambiguation when compared with manual judgement; however, retrieval performance using the senses it returned was 90% of that recorded using manually disambiguated queries.
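
The multiple-translation problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the WSD algorithm evaluated in the paper and does not use EuroWordNet data: the `LEXICON` entries, the function names, and the overlap heuristic are all assumptions made up for illustration. The sketch contrasts keeping every translation of a query word with keeping only the translations of the sense that best matches the other query words.

```python
from typing import Dict, List, Set

# Hypothetical toy sense inventory (NOT EuroWordNet data):
# Spanish word -> list of senses, each with English translations
# and a few related English words.
Sense = Dict[str, Set[str]]
LEXICON: Dict[str, List[Sense]] = {
    "banco": [
        {"translations": {"bank"}, "related": {"money", "loan", "finance"}},
        {"translations": {"bench"}, "related": {"seat", "park", "wood"}},
    ],
    "dinero": [
        {"translations": {"money"}, "related": {"bank", "cash", "loan"}},
    ],
}


def translate_all(query: List[str]) -> Set[str]:
    """Keep every translation of every sense (no disambiguation)."""
    return {
        t
        for word in query
        for sense in LEXICON.get(word, [])
        for t in sense["translations"]
    }


def translate_disambiguated(query: List[str]) -> Set[str]:
    """Keep only the sense whose words overlap most with the query context.

    The overlap heuristic is an assumption chosen for brevity.
    """
    result: Set[str] = set()
    for word in query:
        senses = LEXICON.get(word, [])
        if not senses:
            continue  # unknown words are simply skipped in this sketch
        # Context: translations and related words of all other query words.
        context: Set[str] = set()
        for other in query:
            if other != word:
                for sense in LEXICON.get(other, []):
                    context |= sense["translations"] | sense["related"]
        # Ties are resolved in favour of the first listed sense.
        best = max(
            senses,
            key=lambda s: len((s["translations"] | s["related"]) & context),
        )
        result |= best["translations"]
    return result


query = ["banco", "dinero"]
print(sorted(translate_all(query)))
print(sorted(translate_disambiguated(query)))
```

With this toy data the undisambiguated query carries the spurious translation "bench", while the disambiguated query keeps only "bank" and "money". A wrongly chosen sense can still share useful terms with the correct one, which is one possible reason retrieval can degrade less than disambiguation accuracy alone would suggest.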

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Clough, P., Stevenson, M. (2004). Cross-Language Information Retrieval Using EuroWordNet and Word Sense Disambiguation. In: McDonald, S., Tait, J. (eds) Advances in Information Retrieval. ECIR 2004. Lecture Notes in Computer Science, vol 2997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24752-4_24

  • DOI: https://doi.org/10.1007/978-3-540-24752-4_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21382-6

  • Online ISBN: 978-3-540-24752-4

  • eBook Packages: Springer Book Archive
