ABSTRACT
In this paper we present a supervised Word Sense Disambiguation methodology that exploits kernel methods to model sense distinctions. In particular, a combination of kernel functions is adopted to estimate syntagmatic and domain similarity independently. We defined a kernel function, the Domain Kernel, that allowed us to plug "external knowledge" into the supervised learning process. This external knowledge is acquired from unlabeled data in a fully unsupervised way and is represented by means of Domain Models. We evaluated our methodology on several lexical sample tasks in different languages, significantly outperforming the state of the art on each of them while reducing the amount of labeled training data required for learning.
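As a rough illustration of the idea of combining a domain-based kernel with a syntagmatic one, the following minimal sketch may help. It rests on several assumptions that are not taken from the abstract: the Domain Model is a tiny hand-written term-by-domain matrix `D` (not one acquired from unlabeled data), the syntagmatic side is replaced by a simple local-context overlap kernel used purely as a stand-in, the two kernels are combined by summing their normalized values, and all function and variable names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(kernel):
    """Wrap a kernel so that k(x, x) == 1 for any x with nonzero norm."""
    def k(x, y):
        denom = math.sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom > 0 else 0.0
    return k

def make_domain_kernel(domain_matrix):
    """domain_matrix[i][j]: assumed relevance of vocabulary term i to domain j."""
    def project(bow):
        # Map a bag-of-words vector into the (smaller) domain space.
        n_domains = len(domain_matrix[0])
        return [sum(bow[i] * domain_matrix[i][j] for i in range(len(bow)))
                for j in range(n_domains)]
    def k(x, y):
        return dot(project(x["bow"]), project(y["bow"]))
    return normalize(k)

def local_overlap_kernel(x, y):
    """Stand-in for a syntagmatic kernel: counts shared local-context tokens."""
    return float(len(x["local"] & y["local"]))

def combine(*kernels):
    """Sum of kernels; a sum of valid kernels is itself a valid kernel."""
    def k(x, y):
        return sum(kern(x, y) for kern in kernels)
    return k

# Toy vocabulary (rows): virus, software, infection, doctor
# Toy domains (columns): COMPUTER, MEDICINE
D = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

# Three hypothetical contexts of the ambiguous word "virus".
a = {"bow": [1, 0, 1, 0], "local": {"the", "virus", "spread"}}
b = {"bow": [1, 0, 0, 1], "local": {"a", "virus", "spread"}}
c = {"bow": [1, 1, 0, 0], "local": {"the", "virus", "deleted"}}

domain_k = make_domain_kernel(D)
combined_k = combine(domain_k, normalize(local_overlap_kernel))
```

In this toy setting, contexts `a` and `b` share only the word "virus" with each other, exactly as `a` and `c` do, so a plain bag-of-words dot product cannot tell the pairs apart. The domain projection places `a` and `b` in the same region of domain space, so `domain_k(a, b)` is larger than `domain_k(a, c)`. A function such as `combined_k` could then be used to build a precomputed Gram matrix for a kernel-based classifier.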
- Domain kernels for word sense disambiguation