Noun phrase chunking in Hebrew: influence of lexical and morphological features

Authors:
Yoav Goldberg, Ben Gurion University of the Negev, Israel
Meni Adler, Ben Gurion University of the Negev, Israel
Michael Elhadad, Ben Gurion University of the Negev, Israel
ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics, July 2006, Pages 689–696. https://doi.org/10.3115/1220175.1220262

Published: 17 July 2006


ABSTRACT

We present a method for Noun Phrase chunking in Hebrew. We show that the traditional definition of base-NPs as non-recursive noun phrases does not apply in Hebrew, and propose an alternative definition of Simple NPs. We review syntactic properties of Hebrew related to noun phrases, which indicate that the task of Hebrew SimpleNP chunking is harder than base-NP chunking in English. As a confirmation, we apply methods known to work well for English to Hebrew data. These methods give low results (F from 76 to 86) in Hebrew. We then discuss our method, which applies SVM induction over lexical and morphological features. Morphological features improve the average precision by ~0.5%, recall by ~1%, and F-measure by ~0.75, resulting in a system with average performance of 93% precision, 93.4% recall and 93.2 F-measure.
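The precision, recall and F-measure figures quoted in the abstract are chunk-level scores. As a minimal sketch of how such scores are typically computed, the Python below extracts NP spans from a tag sequence and compares predicted spans with gold spans. The B/I/O tag names, the exact-span matching rule, and the function names `extract_chunks` and `precision_recall_f` are assumptions made for illustration; the abstract does not state the tagging scheme or the evaluation script used.

```python
from typing import List, Set, Tuple

Chunk = Tuple[int, int]  # (start, end) token offsets, end exclusive


def extract_chunks(tags: List[str]) -> Set[Chunk]:
    """Collect NP spans from a B/I/O tag sequence (assumed scheme)."""
    chunks: Set[Chunk] = set()
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        # An open chunk ends when the current tag is not a continuation (I).
        if start is not None and tag != "I":
            chunks.add((start, i))
            start = None
        # B opens a chunk; a stray I with no open chunk is treated as opening one.
        if tag == "B" or (tag == "I" and start is None):
            start = i
    return chunks


def precision_recall_f(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Chunk-level scores: a predicted chunk counts only if both boundaries match."""
    gold_chunks = extract_chunks(gold)
    pred_chunks = extract_chunks(pred)
    correct = len(gold_chunks & pred_chunks)
    precision = correct / len(pred_chunks) if pred_chunks else 0.0
    recall = correct / len(gold_chunks) if gold_chunks else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy example (invented tags, not data from the paper).
gold = ["B", "I", "O", "B", "O", "B", "I"]
pred = ["B", "I", "O", "B", "I", "B", "I"]
p, r, f = precision_recall_f(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

In the toy example, two of the three predicted chunks match a gold chunk exactly, so precision, recall and F all equal 2/3. As a consistency check on the abstract, the harmonic mean of 93% precision and 93.4% recall is about 93.2, in line with the reported F-measure, although the reported values are averages and may have been aggregated differently.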


Published in

ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
July 2006, 1214 pages
Publisher: Association for Computational Linguistics, United States
Overall acceptance rate: 85 of 443 submissions (19%)