Abstract
Multiword expressions pose a challenge to the development of large-scale, semantically rich Natural Language Processing (NLP) systems. We use a bilingual parallel corpus to automatically extract Light Verb Constructions (LVCs), a very common type of multiword expression in many languages, including Persian. Using two classifiers, we investigate the usefulness of seven linguistically informed features for automatically identifying Persian LVCs. To our knowledge, this is the first attempt at the automatic detection of a broad class of Persian LVCs. Results of our experiments show that the proposed features are reasonably successful at the task.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Salehi, B., Askarian, N., Fazly, A. (2012). Automatic Identification of Persian Light Verb Constructions. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_17
DOI: https://doi.org/10.1007/978-3-642-28604-9_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)