Abstract
Conditional Random Fields (CRFs) are widely known to scale poorly, particularly for tasks with large numbers of states or with richly connected graphical structures. This is a consequence of inference having a time complexity which is at best quadratic in the number of states. This paper describes a novel parameterisation of the CRF which ties the majority of clique potentials, while allowing individual potentials for a subset of the labellings. This has two beneficial effects: the parameter space of the model (and thus the propensity to over-fit) is reduced, and the time complexity of training and decoding becomes sub-quadratic. On a standard natural language task, we reduce CRF training time four-fold, with no loss in accuracy. We also show how inference can be performed efficiently in richly connected graphs, in which current methods are intractable.
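The core idea — tying most clique potentials and keeping individual potentials only for a subset of labellings — can be illustrated for a linear-chain model. If every transition potential equals one shared value except for a sparse set of exceptions, the forward recursion decomposes into a shared sum over all previous states plus per-exception corrections, giving a per-position cost of O(S + E) rather than O(S²). The sketch below is illustrative only, assuming non-log potentials and a single tied transition score; the function name and interface are hypothetical, not the paper's implementation.

```python
import numpy as np

def forward_tied(emissions, tied_score, exceptions):
    """Forward pass for a linear chain in which all transition
    potentials share one tied value, except a sparse dict of
    (prev_state, cur_state) -> potential exceptions.

    emissions: (T, S) array of per-position state potentials (non-log).
    Returns the partition-like sum over all labellings.
    Cost per position: O(S + len(exceptions)) instead of O(S^2).
    """
    T, S = emissions.shape
    alpha = emissions[0].copy()
    for t in range(1, T):
        total = alpha.sum()                    # shared mass over all prev states, O(S)
        new = np.full(S, tied_score * total)   # contribution of the tied potential
        for (i, j), psi in exceptions.items():  # sparse corrections for untied labellings
            new[j] += alpha[i] * (psi - tied_score)
        alpha = new * emissions[t]
    return alpha.sum()
```

Since `tied_score * total + alpha[i] * (psi - tied_score)` summed over exceptions reproduces exactly the dense matrix-vector product, the result matches the standard O(S²) forward algorithm while touching each untied transition only implicitly.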
© 2006 Springer-Verlag Berlin Heidelberg
Cohn, T. (2006). Efficient Inference in Large Conditional Random Fields. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Machine Learning: ECML 2006. ECML 2006. Lecture Notes in Computer Science(), vol 4212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871842_58
Print ISBN: 978-3-540-45375-8
Online ISBN: 978-3-540-46056-5