Estimating satisfactoriness of selectional restriction from corpus without a thesaurus

Authors:
Yoichi Tomiura, Kyushu University, Fukuoka, Japan
Shosaku Tanaka, Ritsumeikan University, Kyoto, Japan
Toru Hitaka, Kyushu University (retired March 2003)

ACM Transactions on Asian Language Information Processing, Volume 4, Issue 4, pp. 400–416
https://doi.org/10.1145/1113308.1113311

Published: 01 December 2005

Abstract

A selectional restriction specifies which combinations of words are semantically valid in a particular syntactic construction. It is a basic and important piece of knowledge in natural language processing and has been used for syntactic disambiguation and word sense disambiguation. When selectional restrictions for many word combinations are acquired from a corpus, it is necessary to estimate whether a word combination that is not observed in the corpus satisfies the selectional restriction. This paper proposes a new method, based on a multiple regression model, for estimating from a tagged corpus the degree to which a word combination satisfies the selectional restriction. The independent variables of this model correspond to modifiers. Unlike in conventional multiple regression analysis, the independent variables are themselves parameters to be learned. We experiment on estimating the degree of satisfaction of the selectional restriction for Japanese word combinations 〈noun, postpositional-particle, verb〉. The experimental results indicate that our method can estimate the degree of satisfaction for word combinations that are not well observed in the corpus, and that the accuracy of syntactic disambiguation using the co-occurrences estimated by our method is higher than that obtained using co-occurrence probabilities smoothed by previous methods.
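The abstract does not give the model's equations, so the following is only a minimal sketch of one possible reading of "a multiple regression whose independent variables are also learned". It assumes that each modifier (noun) has a small vector of learned independent variables, that each 〈postpositional-particle, verb〉 context has a vector of regression coefficients, and that the estimated degree of satisfaction is their dot product, fitted by stochastic gradient descent on observed combinations only. The names `fit` and `predict`, the squared-error objective, the optimizer, and the toy scores are all assumptions made for illustration, not the authors' formulation.

```python
import random

def predict(x, b, modifier, context):
    """Estimated degree of satisfaction: dot product of the modifier's
    learned independent variables and the context's coefficients."""
    return sum(xi * bi for xi, bi in zip(x[modifier], b[context]))

def fit(observed, n_modifiers, n_contexts, k=1, lr=0.05, epochs=2000, seed=0):
    """Fit both the independent variables x and the coefficients b.

    observed maps (modifier_id, context_id) -> target score.
    Only observed pairs contribute to the squared-error loss.
    """
    rng = random.Random(seed)
    # Small positive starting values: an arbitrary choice for this toy example.
    x = [[rng.uniform(0.05, 0.15) for _ in range(k)] for _ in range(n_modifiers)]
    b = [[rng.uniform(0.05, 0.15) for _ in range(k)] for _ in range(n_contexts)]
    for _ in range(epochs):
        for (m, c), target in observed.items():
            err = predict(x, b, m, c) - target
            for j in range(k):
                xm, bc = x[m][j], b[c][j]
                x[m][j] -= lr * err * bc  # update the learned variable
                b[c][j] -= lr * err * xm  # update the regression coefficient
    return x, b

# Hypothetical toy scores: modifier 2 behaves like modifier 0 in every
# context where both are observed, but (modifier 2, context 1) is unseen.
true_u = [1.0, 0.5, 1.0, 0.2]
true_w = [1.0, 0.5, 0.8]
observed = {
    (m, c): true_u[m] * true_w[c]
    for m in range(4)
    for c in range(3)
    if (m, c) != (2, 1)
}

x, b = fit(observed, n_modifiers=4, n_contexts=3)
estimate = predict(x, b, 2, 1)  # score for the unobserved combination
```

In this toy setting the unseen pair receives an estimate close to the score of the similar modifier 0 in the same context, which illustrates how a model of this shape can generalize to unobserved combinations without a thesaurus.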

Index Terms

Estimating satisfactoriness of selectional restriction from corpus without a thesaurus
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning settings


Published in

ACM Transactions on Asian Language Information Processing Volume 4, Issue 4
December 2005
129 pages
ISSN:1530-0226
EISSN:1558-3430
DOI:10.1145/1113308

Copyright © 2005 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Co-occurrence of word combination
multiple regression model
similarity between words with respect to co-occurrence
syntactic disambiguation
Qualifiers
- article