Automatic Identification of Persian Light Verb Constructions

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7181)

Abstract

Multiword expressions pose a challenge to the development of large-scale, semantically rich Natural Language Processing (NLP) systems. We use a bilingual parallel corpus to automatically extract Light Verb Constructions (LVCs), a very common type of multiword expression in many languages, including Persian. Using two classifiers, we investigate the usefulness of seven linguistically informed features for automatically identifying Persian LVCs. To our knowledge, this is the first attempt at the automatic detection of a broad class of Persian LVCs. Results of our experiments show that the proposed features are reasonably successful at the task.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Salehi, B., Askarian, N., Fazly, A. (2012). Automatic Identification of Persian Light Verb Constructions. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_17

  • DOI: https://doi.org/10.1007/978-3-642-28604-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28603-2

  • Online ISBN: 978-3-642-28604-9

  • eBook Packages: Computer Science, Computer Science (R0)
