Abstract
Optical character recognition (OCR) is an important research area in the field of pattern recognition, and a great deal of research has been done on it in the last 60 years. Libraries and offices hold a large volume of paper-based data, and ancient text documents contain a wealth of knowledge. Maintaining and searching this paper-based data is a challenge, and efforts are under way in many places to digitize it. Paper-based documents are scanned for digitization, but the scanned data is in pictorial form. Computers cannot process it as text, because they represent text as standard alphanumeric characters encoded in ASCII or some other code. The alphanumeric information must therefore be retrieved from the scanned images. An OCR system converts a document into electronic text, which can then be edited, searched, and otherwise processed. OCR is the machine replication of human reading and has been the subject of intensive research for more than six decades. This paper presents a comprehensive survey of the work done in the various phases of an OCR system, with special focus on OCR for ancient text documents. It will help novice researchers by providing a comprehensive study of the techniques required in these phases, namely segmentation, feature extraction, and classification, especially for ancient documents. It has been observed that only limited work has been done on the recognition of ancient documents, especially those in Devanagari script. This article also presents future directions for upcoming researchers in the field of ancient text recognition.
Ethics declarations
Conflict of interest
The authors have declared that they have no conflict of interest in this work.
Cite this article
Narang, S.R., Jindal, M.K. & Kumar, M. Ancient text recognition: a review. Artif Intell Rev 53, 5517–5558 (2020). https://doi.org/10.1007/s10462-020-09827-4
DOI: https://doi.org/10.1007/s10462-020-09827-4