DOI: 10.3115/1220835.1220851
An empirical study of the behavior of active learning for word sense disambiguation

Published: 04 June 2006

ABSTRACT

This paper shows that two uncertainty-based active learning methods, combined with a maximum entropy model, work well for learning English verb senses. Data analysis of the learning process, at both the instance and feature levels, suggests that careful treatment of feature extraction is important for active learning to be useful for word sense disambiguation (WSD). Based on this analysis, the overfitting phenomena that occurred during the active learning process are identified as classic overfitting in machine learning.
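As a rough illustration of what "uncertainty-based" selection means, the sketch below scores unlabeled instances by how unsure a probabilistic classifier is about their sense. The abstract does not name the two methods studied, so the entropy and margin criteria here are assumptions chosen as common examples, and the probability values, `pool`, and function names are hypothetical. In practice the distributions would come from a trained model such as a maximum entropy classifier.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted sense distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def margin(probs):
    """Gap between the two most probable senses; a small gap means high uncertainty."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def select_most_uncertain(pool_probs, method="entropy"):
    """Return the index of the unlabeled instance the model is least sure about."""
    if method == "entropy":
        return max(range(len(pool_probs)), key=lambda i: entropy(pool_probs[i]))
    if method == "margin":
        return min(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))
    raise ValueError(f"unknown method: {method}")

# Hypothetical predicted sense distributions for three unlabeled verb instances
# (illustrative numbers only, not taken from the paper).
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.50, 0.49, 0.01],
]

entropy_pick = select_most_uncertain(pool, "entropy")
margin_pick = select_most_uncertain(pool, "margin")
```

With these illustrative numbers the two criteria pick different instances: entropy favors the distribution spread across all three senses, while margin favors the one where the top two senses are nearly tied. The selected instance would then be sent to an annotator, added to the training set, and the model retrained before the next round.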

      • Published in

        HLT-NAACL '06: Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
        June 2006
        522 pages

        Publisher

        Association for Computational Linguistics

        United States

        Publication History

        • Published: 4 June 2006

        Qualifiers

        • Article

        Acceptance Rates

        HLT-NAACL '06 Paper Acceptance Rate: 62 of 257 submissions, 24%
        Overall Acceptance Rate: 240 of 768 submissions, 31%
