DOI: 10.3115/1117794.1117816

An empirical study of the domain dependence of supervised word sense disambiguation systems

Published: 7 October 2000

ABSTRACT

This paper describes a set of experiments carried out to explore the domain dependence of alternative supervised Word Sense Disambiguation (WSD) algorithms. The aim of the work is threefold: to study the performance of these algorithms when tested on a corpus different from the one they were trained on; to explore their ability to tune to new domains; and to demonstrate empirically that the Lazy-Boosting algorithm outperforms state-of-the-art supervised WSD algorithms in both of these situations.

Published in

EMNLP '00: Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13
October 2000, 233 pages

Publisher: Association for Computational Linguistics, United States

Qualifiers: Article

Overall acceptance rate: 73 of 234 submissions (31%)
