Elsevier

Pattern Recognition

Volume 58, October 2016, Pages 172-189
A multi-objective approach towards cost effective isolated handwritten Bangla character and digit recognition

https://doi.org/10.1016/j.patcog.2016.04.010

Highlights

  • A cost-effective approach to handwritten character recognition has been developed.

  • A multi-objective region sampling methodology for isolated handwritten Bangla character and digit recognition has been proposed.

  • A non-dominated sorting harmony search algorithm based region sampling and a non-dominated sorting genetic algorithm based region sampling methodology have been developed.

  • An AFS theory based fuzzy logic is used to develop a model for combining the Pareto-optimal solutions from two multi-objective heuristic algorithms.

  • Maximum recognition accuracies of 86.6478% and 98.23% have been achieved with 0.234% and 12.60% decrease in recognition cost for handwritten Bangla characters and digits respectively.

Abstract

Identifying the most informative local regions of a handwritten character image is necessary for a robust handwritten character recognition system, but identifying them from a character image is a difficult task. Performing this task at the minimum possible cost is more challenging still, because two independent and apparently conflicting objectives must be optimized simultaneously: maximizing the recognition accuracy and minimizing the associated recognition cost. Addressing the problem requires a multi-objective approach. In the present work, two popular multi-objective optimization algorithms, (1) a Non-Dominated Sorting Harmony-Search Algorithm (NSHA) and (2) a Non-Dominated Sorting Genetic Algorithm-II (NSGA-II, Deb et al., 2002 [18]), are employed separately for region sampling. The method objectively selects the most informative set of local regions, using the framework of Axiomatic Fuzzy Set (AFS) theory, from the sets of Pareto-optimal solutions provided by the multi-objective region sampling algorithms. The system has been evaluated separately on two isolated handwritten Bangla datasets, (1) a dataset of randomly mixed handwritten Bangla Basic and Compound characters and (2) a dataset of handwritten Bangla numerals, with an SVM based classifier, using a feature set containing convex-hull based features and CG based quad-tree partitioned longest-run based local features extracted from the selected local regions. The results show a significant increase in recognition accuracy and a decrease in recognition cost for all the datasets. Thus the present system introduces a cost-effective approach to isolated handwritten character recognition.

Graphical abstract

Schematic representation of the integrated system developed under present work.


Introduction

Optical Character Recognition (OCR) is an active area of research. While many systems are commercially available for recognizing printed text [1], [2], [3], [4], their success is yet to be extended to handwritten characters. Several reasons can be cited to explain this apparent anomaly. The shape and size of handwritten characters vary from one individual to another. They may even vary for a single individual from time to time, depending on various factors. These challenges make the task of recognizing handwritten characters very difficult. Researchers all around the world have proposed several methods [5] for handwritten character recognition, but most of them focus on Roman scripts [6], concentrating on English and other European languages. Among Asian languages, Chinese [7], Japanese, and Korean are dominant in the literature. Indian scripts like Malayalam, Tamil, Telugu, and Hindi have started to receive researchers' attention during the past decade [8], [9], but the development of OCR for the complete Bangla script [10] did not receive much attention until recently. Bangla is the second most popular script in India and the fifth most popular script in the world [11]. The Bangla alphabet contains some of the most intricate and complex characters, which differ from one another only by a single period, a modifier ref or an upper horizontal line or Matra, as shown in an example in Fig. 1. The Bangla alphabet contains about 50 Basic characters (11 vowels and 39 consonants) and more than 334 Compound characters [12]. Samples of a few Bangla Basic and Compound characters are shown in Fig. 2.

One of the most common approaches taken up by OCR researchers is zoning, i.e. dividing the character image into several zones or local regions [13] and generating the invariant local feature set by extracting features from every local region. Several different zoning methods [13] are mentioned in the literature, but most of them can be classified into two major categories: static [4], [10] and dynamic zoning methods [13]. Static zoning methods divide a handwritten character image into a set of overlapping or non-overlapping windows, where the number of windows is fixed. Basu et al. used a static zoning method in [14]: they sub-divided the image of a handwritten numeral into 9 fixed-sized, overlapping local regions and extracted longest-run based features from each sub-region. On the other hand, dynamic zoning methods sub-divide a handwritten character image into local regions by dynamically creating windows based on some statistical or topological feature of that specific character. Cao et al. [8] proposed a similar technique to generate a hierarchical feature space based on a quin-tree partition of the character image, where zones were dynamically created based on the centroid of the contour segment of the character residing in the parent zone. Das et al. [10], [15] used a GA based selection mechanism to find an optimal set of local regions for recognition of handwritten Bangla numerals.
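To make static zoning concrete, the following is a minimal sketch under simplifying assumptions, not the method of any cited paper: it uses a 3 x 3 grid of non-overlapping windows (Basu et al. used overlapping ones) and a foreground-density feature as a stand-in for longest-run features. The function names and the toy image are hypothetical.

```python
def static_zones(image, rows=3, cols=3):
    """Split a 2D binary image (list of lists) into rows x cols non-overlapping zones."""
    height, width = len(image), len(image[0])
    zones = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            zones.append([row[left:right] for row in image[top:bottom]])
    return zones


def foreground_density(zone):
    """Fraction of foreground (1) pixels in a zone; a stand-in local feature."""
    pixels = [p for row in zone for p in row]
    return sum(pixels) / len(pixels) if pixels else 0.0


# Toy 6x6 "character": a vertical stroke in column 2 (hypothetical data).
image = [[1 if x == 2 else 0 for x in range(6)] for _ in range(6)]
zones = static_zones(image)
features = [foreground_density(z) for z in zones]
```

The number of zones is fixed in advance and does not depend on the character, which is what separates static from dynamic zoning.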

In those papers, the researchers emphasized achieving better recognition accuracies, but the associated recognition costs incurred in the process were not taken into consideration. For example, Das et al. presented a two-pass approach to handwritten character recognition in [10], which produced a significant increase in recognition accuracy, but at a recognition cost almost 8.5 times the average per-character recognition cost incurred by the traditional single-pass approach. This may prove undesirable to users who want to use such a system for real-life applications. An extensive study of recognition accuracy versus associated recognition cost is undertaken in our experimental setup to investigate the scope of a practical optical character recognition system, in terms of both recognition accuracy and associated recognition cost.

In the present work, a multi-objective approach to optical character recognition (OCR) is proposed, which attempts to find a trade-off between the recognition accuracy achieved by the system and its associated recognition costs. In real-life applications of an OCR system, an insignificant increase in recognition accuracy at the expense of a high recognition cost may not be acceptable to the users of the system. In such cases, a multi-objective approach can provide the user with a set of good solutions. To this end, the framework of a novel, multi-objective isolated handwritten character recognition system is proposed. Several variants of multi-objective Evolutionary Algorithms [16] are present in the literature. A Non-dominated Sorting Harmony-search Algorithm (NSHA [17]) based region sampling method and a Non-dominated Sorting Genetic Algorithm-II (NSGA-II [18]) based region sampling method are introduced in our present work. These two multi-objective region sampling algorithms mark one of the contributions of the present work. Both region sampling algorithms are employed over the decision space separately. These algorithms have two objective functions: (1) maximizing handwritten character recognition accuracy and (2) minimizing associated recognition costs. In our experimental setup, recognition accuracy is measured using an SVM based classifier, and recognition costs are measured by (i) the average time taken by the recognition system to recognize each handwritten character in the test set and (ii) the number of local regions used to represent each handwritten character in the test set. The two sets of Pareto-optimal solutions provided by these two algorithms are then combined using Axiomatic Fuzzy Set (AFS) theory [19]. This AFS theory based approach to objectively combining the Pareto-optimal solutions provided by the multi-objective algorithms marks another contribution of the present work.
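The notion of a Pareto-optimal set under these two objectives can be illustrated with a small sketch. It is not NSHA or NSGA-II; it only shows the dominance test that non-dominated sorting relies on. The candidate values are invented for illustration, and cost is assumed here to be the number of local regions.

```python
def dominates(a, b):
    """Return True if solution a dominates b; each is (accuracy, cost).

    a dominates b when it is no worse on both objectives (higher accuracy,
    lower cost) and strictly better on at least one.
    """
    acc_a, cost_a = a
    acc_b, cost_b = b
    no_worse = acc_a >= acc_b and cost_a <= cost_b
    strictly_better = acc_a > acc_b or cost_a < cost_b
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical (accuracy, number_of_regions) pairs, not results from the paper.
candidates = [(0.90, 12), (0.92, 20), (0.88, 15), (0.92, 25), (0.85, 8)]
front = pareto_front(candidates)
```

Here `(0.88, 15)` is dropped because `(0.90, 12)` is both more accurate and cheaper, and `(0.92, 25)` is dropped because `(0.92, 20)` matches its accuracy at lower cost. The three remaining points are mutually incomparable trade-offs, which is why a further selection step over the front is needed.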

The proposed method tries to find an objective solution over the decision space while providing an optimal trade-off between recognition accuracy and the corresponding recognition costs, making it suitable for practical applications. The present work has been evaluated separately on datasets of isolated handwritten Bangla characters and handwritten Bangla numerals. Results from these experiments have been compared with some of the other popular handwritten character recognition methods in the literature to demonstrate its superiority.

The rest of the paper is organized as follows: Section 2 presents a brief overview of multi-objective evolutionary algorithm based region sampling techniques; Section 3 introduces the basics of Axiomatic Fuzzy Set (AFS) theory; Section 4 describes the feature set; Section 5 discusses the present work in detail; and Section 6 presents the experimental results. Finally, a brief conclusion is drawn based on the results gathered from the experiments.

Motivation behind using multi-objective evolutionary algorithms for region sampling

Region sampling based OCR systems try to identify the most discriminative set of local regions from handwritten character images. The simplest approach is to exhaustively enumerate every possible combination of local regions until the best combination is found. This approach, however, takes time exponential in the number of local regions. Therefore, Evolutionary Algorithm based meta-heuristic approaches are generally employed, so that a good enough solution is found within a reasonable amount of time.
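The exponential growth is easy to see by counting: n candidate regions have 2^n - 1 non-empty subsets, and each subset would need its own training and evaluation run. The sketch below only enumerates and counts the subsets; the function name and the region counts are illustrative assumptions.

```python
from itertools import combinations


def all_region_subsets(n_regions):
    """Yield every non-empty subset of region indices 0..n_regions-1."""
    regions = range(n_regions)
    for k in range(1, n_regions + 1):
        yield from combinations(regions, k)


# Number of candidate subsets for a few hypothetical region counts.
counts = {n: sum(1 for _ in all_region_subsets(n)) for n in (4, 8, 12)}
```

Each additional region doubles the search space, so exhaustive search stops being practical after a few dozen regions.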

Most of the

A brief overview of axiomatic set theory based fuzzy logic

A priori identification of the set of local regions containing the most informative set of features is a difficult task. Local region or feature selection methods, described in the literature, can be broadly classified into two categories: wrappers and filters [30]. Wrappers identify the most informative set of features from the discourse with the help of a training dataset and an efficient learning algorithm; whereas filters use some kind of heuristics to identify the feature set that has the

Dataset of the experiment

The proposed method has been evaluated on four publicly available benchmark datasets [12], [34]. A brief overview of the datasets used in the present work is given in Table 1. More details about these databases, such as sample collection and preparation techniques, can be found in the sources listed in the ‘Reference’ column of Table 1.

Design of the feature set

The extracted features, used in the present work can be classified into two categories: global and local. While the number of global features extracted from a handwritten

Present work

As discussed before, the objectives of our present work are threefold: (a) proposing a region sampling methodology that returns a subset of local regions containing the most discriminating features while incurring the minimum possible recognition cost; (b) evaluating the performance of the proposed method on two separate, publicly available datasets of Bangla handwritten characters; and (c) comparing the performance of the proposed method against some of the popular contemporaries present in

Experimental results

The objectives of the present work are (a) increasing the recognition accuracy and (b) decreasing the corresponding recognition cost for the recognition of isolated Bangla handwritten characters. The recognition cost is represented in our experimental setup as (i) the average per-character recognition time and (ii) the number of local regions used to represent each handwritten digit/character. The integrated system shown in Fig. 4 is implemented for the experimental setup. As discussed in Table 1 of

Conclusion

A novel region sampling method based on multi-objective evolutionary algorithms is presented here. The proposed method tries to identify the set of most discriminative local regions while incurring minimal recognition cost. A non-dominated sorting harmony search based region sampling method and a non-dominated sorting genetic algorithm based region sampling methodology have been employed in our experimental setup. It provided 1.3178% and 0.73% increase in recognition accuracy, while decreasing

Conflict of interest

None declared.

Acknowledgment

The authors are thankful to the CVPR unit, ISI, Kolkata for providing the dataset for the experiment. The authors are also thankful to the “Centre for Microprocessor Application for Training Education and Research” and the Department of Computer Science & Engineering, Jadavpur University, Kolkata for kindly providing the resources and infrastructural facilities that helped to complete this work.

Ritesh Sarkhel received his B.C.S.E degree from Jadavpur University in 2012. He worked as an R&D Engineer in Samsung Research Institute, Noida from 2012 to 2014. He is currently pursuing M.C.S.E degree from Jadavpur University. His areas of current research interest are OCR of handwritten text, optimization techniques and computer vision.

References (62)

  • S. Sivasubramani et al., Environmental/economic dispatch using multi-objective harmony search algorithm, Electr. Power Syst. Res. (2011)
  • M. Ramze Rezaee et al., Fuzzy feature selection, Pattern Recognit. (1999)
  • S. Basu et al., A novel framework for automatic sorting of postal documents with multi-script address blocks, Pattern Recognit. (2010)
  • S. Salcedo-Sanz et al., A coral reefs optimization algorithm with harmony search operators for accurate wind speed prediction, Renew. Energy (2015)
  • Y. Li et al., Fuzzy feature selection based on min–max learning rule and extension matrix, Pattern Recognit. (2008)
  • N. Das et al., A statistical–topological feature combination for recognition of handwritten numerals, Appl. Soft Comput. (2012)
  • Y. Wen et al., Handwritten Bangla numeral recognition system and its application to postal automation, Pattern Recognit. (2007)
  • S. Basu et al., A hierarchical approach to recognition of handwritten Bangla characters, Pattern Recognit. (2009)
  • A. Ul-Hasan, S. Bin Ahmed, F. Rashid, F. Shafait, T.M. Breuel, Offline printed Urdu Nastaleeq script recognition with...
  • A. Ray, S. Rajeswar, S. Chaudhury, Text recognition using deep BLSTM networks, in: Proceedings of the 2015 Eighth...
  • A. Park, Offline text recognition without intraword character segmentation based on two-dimensional low frequency...
  • A. Vinciarelli et al., Offline recognition of unconstrained handwritten texts using HMMs and statistical language models, IEEE Trans. Pattern Anal. Mach. Intell. (2004)
  • R.M. Bozinovic et al., Off-line cursive script word recognition, IEEE Trans. Pattern Anal. Mach. Intell. (1989)
  • P.-K. Wong et al., Off-line handwritten Chinese character recognition as a compound Bayes decision problem, IEEE Trans. Pattern Anal. Mach. Intell. (1998)
  • S.V. Rajashekararadhya et al., Efficient zone based feature extraction algorithm for handwritten numeral recognition of four popular South Indian scripts, J. Theor. Appl. Inf. Technol. (2008)
  • N. Das et al., Handwritten Bangla character recognition using a soft computing paradigm embedded in two pass approach, Pattern Recognit. (2014)
  • R. Sarkar et al., CMATERdb1: a database of unconstrained handwritten Bangla and Bangla–English mixed script document image, Int. J. Doc. Anal. Recognit. (2011)
  • S. Basu, N. Das, R. Sarkar, M. Kundu, M. Nasipuri, D.K. Basu, Handwritten ‘Bangla’ alphabet recognition using an MLP...
  • K. Deb, Multi-objective optimization using evolutionary algorithms: an introduction, 2011, pp....
  • K. Deb et al., A fast and elitist multiobjective genetic algorithm, IEEE Trans. Evol. Comput. (2002)
  • V. Chankong et al., Multiobjective Decision Making: Theory and Methodology (1983)
    Nibaran Das received his B.Tech degree in Computer Science and Technology from Kalyani Govt. Engineering College under Kalyani University, in 2003. He received his M.C.S.E. degree from Jadavpur University, in 2005. He received his Ph.D. (Engg.) degree thereafter from Jadavpur University, in 2012. He joined J.U. as a lecturer in 2006. His areas of current research interest are OCR of handwritten text, Bengali fonts, optimization techniques and image processing. He has been an editor of Bengali monthly magazine “Computer Jagat” since 2005.

    Amit K. Saha received his B.Tech degree in Information Technology from WBUT, in 2011. He received his M.T.C.T. degree from Jadavpur University, in 2015. His areas of current research interest are OCR of handwritten text, Nature Inspired Computing and Multi-Objective Optimization.

    Mita Nasipuri received her B.E.Tel.E., M.E.Tel.E., and Ph.D. (Engg.) degrees from Jadavpur University, in 1979, 1981 and 1990, respectively. Prof. Nasipuri has been a faculty member of J.U. since 1987. Her current research interest includes image processing, pattern recognition, and multimedia systems. She is a senior member of the IEEE, U.S.A., Fellow of I.E. (India) and W.B.A.S.T., Kolkata, India.
