An empirical study of the domain dependence of supervised word sense disambiguation systems

Authors:
Gerard Escudero, Universitat Politècnica de Catalunya (UPC), Barcelona, Catalonia
Lluís Màrquez, Universitat Politècnica de Catalunya (UPC), Barcelona, Catalonia
German Rigau, Universitat Politècnica de Catalunya (UPC), Barcelona, Catalonia

EMNLP '00: Proceedings of the 2000 Joint SIGDAT conference on Empirical methods in natural language processing and very large corpora: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 13, October 2000, Pages 172–180, https://doi.org/10.3115/1117794.1117816

Published: 7 October 2000

ABSTRACT

This paper describes a set of experiments carried out to explore the domain dependence of alternative supervised Word Sense Disambiguation (WSD) algorithms. The aim of the work is threefold: to study the performance of these algorithms when tested on a corpus different from the one they were trained on; to explore their ability to tune to new domains; and to demonstrate empirically that the Lazy-Boosting algorithm outperforms state-of-the-art supervised WSD algorithms in both of these situations.
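
The first two aims can be illustrated with a minimal sketch of a cross-corpus evaluation: train on one corpus and test on another, then retrain with a few target-domain examples added. This is not the authors' setup. The classifier is a plain Naive Bayes with add-one smoothing, swapped in for simplicity (it is not Lazy-Boosting), and the two toy corpora, the sense labels, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Train a bag-of-words Naive Bayes model from (context_words, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return {"senses": sense_counts, "words": word_counts, "vocab": vocab}


def predict(model, words):
    """Return the sense with the highest add-one-smoothed log probability."""
    total = sum(model["senses"].values())
    vocab_size = len(model["vocab"]) or 1
    best, best_score = None, -math.inf
    for sense in sorted(model["senses"]):
        score = math.log(model["senses"][sense] / total)
        denom = sum(model["words"][sense].values()) + vocab_size
        for w in words:
            score += math.log((model["words"][sense][w] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best


def accuracy(model, examples):
    """Fraction of examples whose predicted sense matches the gold sense."""
    hits = sum(predict(model, words) == sense for words, sense in examples)
    return hits / len(examples)


# Hypothetical toy data for the ambiguous word "bank" in two domains.
corpus_a = [
    (["money", "loan", "deposit"], "finance"),
    (["account", "money", "interest"], "finance"),
    (["loan", "interest", "credit"], "finance"),
    (["river", "water"], "river"),
]
corpus_b = [
    (["river", "fishing", "shore"], "river"),
    (["shore", "mud", "water"], "river"),
    (["fishing", "mud", "boat"], "river"),
    (["money", "account"], "finance"),
]

# Cross-corpus setting: train on A only, test on B.
acc_cross = accuracy(train_nb(corpus_a), corpus_b)

# Tuning setting: add a small part of B to the training data and
# test on the remaining, held-out part of B.
tuning_part, held_out = corpus_b[:2], corpus_b[2:]
acc_untuned = accuracy(train_nb(corpus_a), held_out)
acc_tuned = accuracy(train_nb(corpus_a + tuning_part), held_out)

print(f"A -> B: {acc_cross:.2f}")
print(f"A -> held-out B: {acc_untuned:.2f}, A + tuning -> held-out B: {acc_tuned:.2f}")
```

Comparing the untuned and tuned models on the same held-out portion keeps the two numbers directly comparable; with real corpora the tuning portion would be sampled rather than taken from the front of the list.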

Published in
EMNLP '00 (Volume 13), October 2000, 233 pages
Conference Chairs: Hinrich Schütze (GroupFire Inc), Keh-Yih Su (Behavior Design Corporation)
Publisher: Association for Computational Linguistics, United States
Author Tags
cross-corpus evaluation of NLP systems
supervised machine learning
word sense disambiguation

Acceptance Rates
Overall Acceptance Rate: 73 of 234 submissions, 31%