A GA based hierarchical feature selection approach for handwritten word recognition

  • Original Article
  • Neural Computing and Applications

Abstract

Feature selection plays a key role in reducing the dimensionality of a feature vector by discarding redundant and irrelevant features. In this paper, a Genetic Algorithm (GA)-based hierarchical feature selection (HFS) model is designed to optimize the local and global features extracted from each handwritten word image under consideration. Two recently developed feature descriptors, based on the shape and texture of the word images, are used. Experiments are conducted on an in-house dataset of 12,000 handwritten word samples written in Bangla script, comprising the names of 80 popular cities of West Bengal, a state of India. The proposed model not only reduces the feature dimension by nearly 28% but also improves the performance of the handwritten word recognition (HWR) technique by 1.28% over the recognition performance obtained with the unreduced feature set. Moreover, the proposed HFS-based HWR system outperforms some recently developed methods on the present dataset.
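The abstract does not spell out how the hierarchy is organised, so the following is only a minimal sketch of the general idea of GA-based feature selection applied in two stages. Everything in it is an assumption made for illustration: the synthetic data, the split into two feature groups, the nearest-centroid fitness, the per-feature penalty, and the function names (`ga_select`, `hierarchical_select`) are hypothetical and are not taken from the paper. Stage one evolves a binary mask inside each feature group; stage two evolves a mask over the features that survived stage one.

```python
import random

# Hypothetical toy setup: 12 features in two groups of 6; columns 0, 1, 6, 7 carry class signal.
INFORMATIVE = {0, 1, 6, 7}

def make_toy_data(rng, n_per_class=30, n_features=12):
    """Two-class synthetic data; only the INFORMATIVE columns differ between classes."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label if j in INFORMATIVE else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y

def columns(X, idx):
    """Keep only the columns listed in idx."""
    return [[row[i] for i in idx] for row in X]

def accuracy(mask, X, y):
    """Training-set accuracy of a nearest-centroid classifier on the masked features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[i] for r in rows) / len(rows) for i in idx]
    correct = 0
    for x, t in zip(X, y):
        pred = min(centroids, key=lambda c: sum((x[i] - m) ** 2
                                                for i, m in zip(idx, centroids[c])))
        correct += pred == t
    return correct / len(y)

def fitness(mask, X, y, penalty=0.01):
    """Reward accuracy and lightly penalise each selected feature (assumed trade-off)."""
    return accuracy(mask, X, y) - penalty * sum(mask)

def ga_select(X, y, rng, pop_size=20, generations=30, p_mut=0.05):
    """Evolve binary feature masks; return the fittest mask of the final population."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection; the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, X, y))

def hierarchical_select(X, y, groups, rng):
    """Assumed two-stage scheme: GA inside each group, then GA over the survivors."""
    survivors = []
    for group in groups:
        mask = ga_select(columns(X, group), y, rng)
        survivors += [g for g, bit in zip(group, mask) if bit]
    mask = ga_select(columns(X, survivors), y, rng)
    return [s for s, bit in zip(survivors, mask) if bit]

rng = random.Random(0)
X, y = make_toy_data(rng)
groups = [list(range(0, 6)), list(range(6, 12))]
selected = hierarchical_select(X, y, groups, rng)
print("selected feature indices:", selected)
```

The fitness here is measured on the training samples only to keep the sketch short; a realistic setup would score each mask on held-out data or by cross-validation, and would use the actual classifier of the recognition system.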

Acknowledgement

We would like to thank the CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India, for providing infrastructural support. This work is partially supported by the PURSE-II and UPE-II projects of Jadavpur University. Showmik Bhowmik is thankful to the Ministry of Electronics and Information Technology (MeitY), Govt. of India, for providing him a PhD fellowship under the Visvesvaraya PhD scheme. Ram Sarkar is partially funded by a DST grant (EMR/2016/007213).

Author information

Correspondence to Samir Malakar.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Malakar, S., Ghosh, M., Bhowmik, S. et al. A GA based hierarchical feature selection approach for handwritten word recognition. Neural Comput & Applic 32, 2533–2552 (2020). https://doi.org/10.1007/s00521-018-3937-8

  • DOI: https://doi.org/10.1007/s00521-018-3937-8
