Automatic recognition of printed Oriya script

Chaudhuri, B. B.; Pal, U.; Mitra, M.

doi:10.1007/BF02703310

Published: February 2002

Sadhana, Volume 27, pages 23–34 (2002)


Abstract

This paper deals with an Optical Character Recognition (OCR) system for printed Oriya script. Developing OCR for this script is difficult because a large number of character shapes must be recognized. In the proposed system, the document image is first captured with a flat-bed scanner and then passed through preprocessing modules such as skew correction, line segmentation, zone detection, and word and character segmentation. These modules combine conventional techniques with newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of water overflow from a reservoir. The feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning. A prototype of the system has been tested on a variety of printed Oriya material and currently achieves 96.3% character-level accuracy on average.
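
To make two of the stages named in the abstract concrete, the following is a minimal sketch of line segmentation and of a per-row run count on a binary image. It is an illustration only, not the authors' method: the abstract does not specify how either step is computed, so the row-projection rule used for line segmentation, the definition of a "run number" as the count of black runs in a row, and the names `segment_lines`, `run_numbers` and `page` are all assumptions.

```python
from typing import List, Tuple

# Assumed representation: a binary image as rows of pixels, 1 = ink, 0 = background.
Image = List[List[int]]


def segment_lines(img: Image) -> List[Tuple[int, int]]:
    """Return inclusive (top, bottom) row ranges of text lines.

    Assumption: a text line is a maximal block of consecutive rows
    that each contain at least one ink pixel.
    """
    lines: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(img):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))   # the line ended on the previous row
            start = None
    if start is not None:                  # the image ends inside a text line
        lines.append((start, len(img) - 1))
    return lines


def run_numbers(img: Image) -> List[int]:
    """Count the black runs in each row (assumed meaning of 'run number')."""
    counts: List[int] = []
    for row in img:
        runs = 0
        prev = 0
        for px in row:
            if px and not prev:            # background-to-ink transition starts a run
                runs += 1
            prev = px
        counts.append(runs)
    return counts


# Hypothetical 6x6 test image with two text lines.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(segment_lines(page))  # [(1, 2), (4, 4)]
print(run_numbers(page))    # [0, 2, 2, 0, 1, 0]
```

Both functions operate directly on the binary pixels, which is in keeping with the abstract's remark that the features need no thinning or pruning. A real system would also have to cope with skew and noise before a simple row projection becomes reliable.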



Author information

Authors and Affiliations

Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203, B T Road, 700 108, Kolkata, India
B. B. Chaudhuri, U. Pal & M. Mitra


About this article

Cite this article

Chaudhuri, B.B., Pal, U. & Mitra, M. Automatic recognition of printed Oriya script. Sadhana 27, 23–34 (2002). https://doi.org/10.1007/BF02703310
