ABSTRACT
Supertagging is the process of assigning the correct elementary tree of a Lexicalized Tree-Adjoining Grammar (LTAG), or the correct supertag, to each word of an input sentence. In this paper we propose to use supertags to expose syntactic dependencies that are unavailable with POS tags. We first propose a novel method of applying the Sparse Network of Winnow (SNoW) to sequential models. We then use it to construct a supertagger that exploits long-distance syntactic dependencies and achieves an accuracy of 92.41%. We apply the supertagger to NP chunking, where the use of supertags yields an absolute increase of almost 1% (from 92.03% to 92.95%) in F-score under the Transformation-Based Learning (TBL) framework. The supertagger described here provides an effective and efficient way to exploit syntactic information.
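The size of the reported NP chunking improvement can be checked with a few lines of arithmetic. The sketch below uses only the two F-scores quoted in the abstract; the function names are illustrative, and the relative error reduction is a derived figure that the abstract itself does not report.

```python
def absolute_gain(baseline: float, improved: float) -> float:
    """Difference in percentage points between two scores given in percent."""
    return improved - baseline


def relative_error_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error (100 - baseline) that the gain removes."""
    return (improved - baseline) / (100.0 - baseline)


baseline_f = 92.03  # F-score without supertags, as quoted in the abstract
supertag_f = 92.95  # F-score with supertags, as quoted in the abstract

gain = absolute_gain(baseline_f, supertag_f)
reduction = relative_error_reduction(baseline_f, supertag_f)

print(f"absolute gain: {gain:.2f} points")            # absolute gain: 0.92 points
print(f"relative error reduction: {reduction:.1%}")   # relative error reduction: 11.5%
```

Read this way, the "almost 1%" gain is 0.92 percentage points, which corresponds to removing roughly a ninth of the errors left by the baseline chunker.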
- A SNoW based supertagger with application to NP chunking