A New Large-Scale Multi-purpose Handwritten Farsi Database

  • Conference paper
Image Analysis and Recognition (ICIAR 2009)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5627)

Abstract

This paper introduces the Center for Pattern Recognition and Machine Intelligence (CENPARMI) Farsi dataset, which can be used to measure the performance of handwriting recognition and word spotting systems. The dataset is unique in its large number of grayscale and binary images (432,357 each), which comprise dates, words, isolated letters, isolated digits, numeral strings, special symbols, and documents. The data were collected from 400 native Farsi writers, and the Farsi words were selected for their high frequency in financial documents. The dataset is provided in grouped and ungrouped subsets, so users can either adopt CENPARMI's pre-divided dataset (60% of the images form the Training set, 20% the Validation set, and the remaining 20% the Testing set) or partition the images themselves. Finally, experiments on the Farsi isolated digits yielded a recognition rate of 96.85%.
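For readers who start from the ungrouped images, a 60/20/20 partition like the one described above could be sketched as follows. This is only an illustration: the abstract does not say how CENPARMI assigned images to subsets (for example, randomly or by writer), so the seeded random shuffle, the function name `split_dataset`, and the file names are all assumptions.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and split them into training, validation, and testing lists.

    The fractions mirror the 60/20/20 proportions quoted in the abstract;
    the random shuffle itself is an assumption, not the authors' procedure.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after the first two
    return train, val, test


# Hypothetical file names standing in for the ungrouped images.
image_names = [f"digit_{i:05d}.tif" for i in range(1000)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))
```

A split by writer rather than by image would avoid the same person's handwriting appearing in both the training and testing sets; whether the pre-divided CENPARMI subsets do this is not stated here.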

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Haghighi, P.J., Nobile, N., He, C.L., Suen, C.Y. (2009). A New Large-Scale Multi-purpose Handwritten Farsi Database. In: Kamel, M., Campilho, A. (eds) Image Analysis and Recognition. ICIAR 2009. Lecture Notes in Computer Science, vol 5627. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02611-9_28

  • DOI: https://doi.org/10.1007/978-3-642-02611-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02610-2

  • Online ISBN: 978-3-642-02611-9

  • eBook Packages: Computer Science, Computer Science (R0)
