DOI: 10.5555/1273073.1273171

Exact decoding for jointly labeling and chunking sequences

Published: 17 July 2006

ABSTRACT

Two decoding algorithms are essential to natural language processing. One is the Viterbi algorithm for linear-chain models, such as HMMs or CRFs. The other is the CKY algorithm for probabilistic context-free grammars. However, tasks such as noun phrase chunking and relation extraction seem to fall between the two, and neither algorithm is the best fit. Ideally, we would like to model entities and relations with two layers of labels. We present a tractable algorithm for exact inference over two layers of labels and chunks with time complexity O(n^2), and provide empirical results comparing our model with linear-chain models.
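
For orientation, the linear-chain baseline mentioned in the abstract can be sketched as a plain Viterbi decoder. This is a minimal illustration of standard Viterbi decoding only, not the two-layer algorithm the paper presents. The function name `viterbi`, the dictionary-based score tables, the B/I/O chunk labels, and all score values are assumptions made up for this example.

```python
from typing import Dict, List, Sequence, Tuple

NEG_INF = float("-inf")


def viterbi(
    observations: Sequence[str],
    labels: Sequence[str],
    start_score: Dict[str, float],
    transition_score: Dict[Tuple[str, str], float],
    emission_score: Dict[Tuple[str, str], float],
) -> Tuple[float, List[str]]:
    """Return the best additive score and label path for a linear-chain model.

    Scores are additive (for example, log-probabilities). Any entry missing
    from a score table is treated as -inf, i.e. disallowed.
    """
    if not observations:
        return 0.0, []

    # best[i][y]: best score of any labeling of observations[:i + 1] ending in y.
    best = [{
        y: start_score.get(y, NEG_INF)
        + emission_score.get((y, observations[0]), NEG_INF)
        for y in labels
    }]
    # back[i][y]: the previous label on that best path.
    back: List[Dict[str, str]] = [{}]

    for i in range(1, len(observations)):
        best.append({})
        back.append({})
        for y in labels:
            # Pick the predecessor label that maximizes the path score into y.
            prev = max(
                labels,
                key=lambda p: best[i - 1][p] + transition_score.get((p, y), NEG_INF),
            )
            best[i][y] = (
                best[i - 1][prev]
                + transition_score.get((prev, y), NEG_INF)
                + emission_score.get((y, observations[i]), NEG_INF)
            )
            back[i][y] = prev

    # Follow back-pointers from the best final label.
    last = max(labels, key=lambda y: best[-1][y])
    path = [last]
    for i in range(len(observations) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical toy scores for B/I/O chunk labels (illustrative values only).
labels = ["B", "I", "O"]
start = {"B": 0.0, "O": 0.0}  # a chunk cannot start with "I"
trans = {
    ("B", "B"): -1.0, ("B", "I"): 0.0, ("B", "O"): 0.0,
    ("I", "B"): -1.0, ("I", "I"): -1.0, ("I", "O"): 0.0,
    ("O", "B"): 0.0, ("O", "O"): 0.0,  # ("O", "I") is absent, so disallowed
}
emit = {
    ("B", "the"): 0.0, ("O", "the"): -2.0, ("I", "the"): -3.0,
    ("B", "dog"): -1.0, ("I", "dog"): 0.0, ("O", "dog"): -3.0,
    ("B", "barks"): -2.0, ("I", "barks"): -2.0, ("O", "barks"): 0.0,
}

score, path = viterbi(["the", "dog", "barks"], labels, start, trans, emit)
print(score, path)
```

For n tokens and a label set Y, this decoder runs in O(n·|Y|^2) time. It assigns exactly one label per token, so chunk boundaries have to be encoded in the labels themselves (here with B/I/O tags) rather than modeled as a separate layer.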

Published in

COLING-ACL '06: Proceedings of the COLING/ACL on Main conference poster sessions
July 2006
992 pages

Publisher

Association for Computational Linguistics, United States


Acceptance Rates

COLING-ACL '06 Paper Acceptance Rate: 126 of 126 submissions, 100%
Overall Acceptance Rate: 1,537 of 1,537 submissions, 100%
