
English to Persian Transliteration

  • Conference paper
String Processing and Information Retrieval (SPIRE 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4209)

Abstract

Persian is an Indo-European language written using Arabic script, and is an official language of Iran, Afghanistan, and Tajikistan. Transliteration of Persian to English (that is, the character-by-character mapping of a Persian word that is not readily available in a bilingual dictionary) is an unstudied problem. In this paper we make three novel contributions. First, we present performance comparisons of existing grapheme-based transliteration methods on English to Persian. Second, we discuss the difficulties in establishing a corpus for studying transliteration. Finally, we introduce a new model of Persian that takes into account the habit of shortening, or even omitting, runs of English vowels. This trait makes transliteration of Persian particularly difficult for phonetic-based methods. The new model outperforms the existing grapheme-based methods on Persian, exhibiting a 24% relative increase in transliteration accuracy measured using the top-5 criterion.
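
As a reading aid for the evaluation terms in the abstract, the following minimal sketch shows one common way to compute top-k accuracy and a relative increase between two systems. It assumes that a source word counts as correct when any accepted transliteration appears among its first k ranked candidates; the paper's exact definition may differ. The function names, the placeholder words and candidates, and the figures 0.50 and 0.62 are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def top_k_accuracy(ranked: Dict[str, List[str]],
                   gold: Dict[str, Set[str]],
                   k: int = 5) -> float:
    """Fraction of source words whose first k candidates contain
    at least one accepted transliteration (assumed definition)."""
    if not ranked:
        return 0.0
    hits = sum(
        1 for word, candidates in ranked.items()
        if gold.get(word, set()) & set(candidates[:k])
    )
    return hits / len(ranked)


def relative_increase(baseline: float, improved: float) -> float:
    """Relative (not absolute) change from baseline to improved."""
    return (improved - baseline) / baseline


# Hypothetical toy data: placeholder strings, not real transliterations.
ranked = {
    "a": ["x1", "x2", "x3"],
    "b": ["y1", "y2"],
    "c": ["z1"],
    "d": ["w1", "w2"],
}
gold = {"a": {"x3"}, "b": {"y9"}, "c": {"z1"}, "d": {"w2"}}

top1 = top_k_accuracy(ranked, gold, k=1)  # only "c" is correct at rank 1
top5 = top_k_accuracy(ranked, gold, k=5)  # "a", "c" and "d" are correct

# Hypothetical accuracies chosen so the relative increase is about 24%.
gain = relative_increase(0.50, 0.62)
```

Note that a relative increase is measured against the baseline: in the made-up figures above, moving from 0.50 to 0.62 is a gain of 12 percentage points but a relative increase of roughly 24%.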

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karimi, S., Turpin, A., Scholer, F. (2006). English to Persian Transliteration. In: Crestani, F., Ferragina, P., Sanderson, M. (eds) String Processing and Information Retrieval. SPIRE 2006. Lecture Notes in Computer Science, vol 4209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880561_21

  • DOI: https://doi.org/10.1007/11880561_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45774-9

  • Online ISBN: 978-3-540-45775-6

  • eBook Packages: Computer Science, Computer Science (R0)
