ABSTRACT
This paper presents a new, exemplar-based model of thematic fit. In contrast to previous models, it does not approximate thematic fit as argument plausibility or 'fit with verb selectional preferences'. Instead, it models thematic fit directly as semantic role plausibility for a verb-argument pair, through similarity-based generalization from previously seen verb-argument pairs. This makes the model very robust to data sparsity. We argue that the model is easily extensible to a model of semantic role ambiguity resolution during online sentence comprehension.
The model is evaluated on human semantic role plausibility judgments. Its predictions correlate significantly with the human judgments. It rivals two state-of-the-art models of thematic fit and exceeds their performance on previously unseen or low-frequency items.
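The idea of similarity-based generalization from previously seen verb-argument pairs can be illustrated with a minimal sketch. It is not the model evaluated in this paper: the function names, the toy vectors, the use of cosine similarity, the product of verb and argument similarity as pair similarity, and the similarity-weighted vote over the k nearest exemplars are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[key] * v.get(key, 0.0) for key in u)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def role_plausibility(verb, arg, exemplars, vectors, k=3):
    """Score each role for (verb, arg) from the k most similar seen pairs.

    Assumption: pair similarity is the product of verb similarity and
    argument similarity; votes are weighted by that similarity.
    """
    if verb not in vectors or arg not in vectors:
        return {}
    scored = []
    for ex_verb, ex_arg, role in exemplars:
        if ex_verb not in vectors or ex_arg not in vectors:
            continue
        sim = (cosine(vectors[verb], vectors[ex_verb])
               * cosine(vectors[arg], vectors[ex_arg]))
        scored.append((sim, role))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for sim, role in scored[:k]:
        votes[role] += sim
    total = sum(votes.values())
    # Normalize so the scores over roles sum to 1.
    return {r: s / total for r, s in votes.items()} if total > 0 else {}

# Hypothetical toy data: hand-made context vectors and two seen exemplars.
vectors = {
    "arrest": {"crime": 1.0, "law": 1.0},
    "police": {"law": 1.0, "human": 1.0},
    "cop": {"law": 1.0, "human": 1.0},
    "thief": {"crime": 1.0, "human": 1.0},
    "crook": {"crime": 1.0, "human": 1.0},
}
exemplars = [
    ("arrest", "police", "agent"),
    ("arrest", "thief", "patient"),
]

cop_scores = role_plausibility("arrest", "cop", exemplars, vectors)
crook_scores = role_plausibility("arrest", "crook", exemplars, vectors)
```

In this toy setting, neither "cop" nor "crook" occurs with "arrest" in the exemplar memory, yet each receives a graded score for both roles because it resembles an argument that was seen. That is the sense in which generalizing by similarity can help with unseen or low-frequency items.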
Index Terms
- A robust and extensible exemplar-based model of thematic fit