Abstract
Devanagari is the most widely used script in India and other Asian countries. A rich collection of ancient Devanagari manuscripts survives and represents a wealth of knowledge. Efforts are under way to digitize these manuscripts and make them available to the public, and Optical Character Recognition (OCR) plays an important role in recognizing their content. The Convolutional Neural Network (CNN) is a powerful model that has given very promising results in fields such as character recognition and pattern recognition. To the best of our knowledge, CNNs have not previously been used for the recognition of ancient Devanagari manuscripts. Our aim in this work is to use the power of CNNs to extract the knowledge contained in handwritten ancient Devanagari manuscripts. We also aim to experiment with various design options, such as the number of layers, stride size, number of filters, kernel size, and the functions used in the various layers, and to select the best of these. We propose using a deep learning model as both a feature extractor and a classifier for the recognition of 33 classes of basic characters from ancient Devanagari manuscripts. A dataset containing 5484 characters has been used for the experimental work. Various experiments show that the accuracy achieved using a CNN as a feature extractor is better than that of other state-of-the-art techniques. The proposed model achieves a recognition accuracy of 93.73% for ancient Devanagari character recognition.
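To make the design options named in the abstract concrete, the following minimal sketch shows how the number of layers, number of filters, kernel size, and stride determine the feature-map size and parameter count of a plain convolutional stack. It is an illustration only, not the architecture used in the paper: the 32×32 grayscale input, the candidate values, and the function names `conv_output_size` and `describe` are assumptions, and only the 33-class output comes from the abstract.

```python
NUM_CLASSES = 33  # from the abstract: 33 basic Devanagari character classes
INPUT_SIZE = 32   # assumption: square grayscale input; the abstract gives no image size


def conv_output_size(size, kernel, stride, padding=0):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def describe(num_layers, num_filters, kernel, stride):
    """Return (final feature-map size, trainable parameter count) for a
    hypothetical stack of identical, unpadded conv layers followed by one
    dense classification layer. Returns None if the feature map vanishes."""
    size, channels, params = INPUT_SIZE, 1, 0
    for _ in range(num_layers):
        size = conv_output_size(size, kernel, stride)
        if size < 1:
            return None  # kernel/stride choice shrinks the map to nothing
        # each filter has kernel*kernel*channels weights plus one bias
        params += (kernel * kernel * channels + 1) * num_filters
        channels = num_filters
    # dense layer: one weight per flattened feature per class, plus biases
    params += (size * size * channels + 1) * NUM_CLASSES
    return size, params


if __name__ == "__main__":
    # Compare a few hypothetical configurations (layers, filters, kernel, stride).
    for cfg in [(1, 16, 3, 1), (2, 32, 3, 1), (3, 16, 5, 2), (4, 16, 5, 2)]:
        print(cfg, describe(*cfg))
```

A search over such options would train and validate one model per valid configuration; the sketch only filters out configurations that are geometrically impossible and reports how large each remaining model would be.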
Ethics declarations
Conflict of interest
The authors declared that they have no conflict of interest in this work.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Narang, S.R., Kumar, M. & Jindal, M.K. DeepNetDevanagari: a deep learning model for Devanagari ancient character recognition. Multimed Tools Appl 80, 20671–20686 (2021). https://doi.org/10.1007/s11042-021-10775-6
DOI: https://doi.org/10.1007/s11042-021-10775-6