Abstract
In this article, we propose a lexicon-free, script-dependent approach to segmenting online handwritten isolated Tamil words into their constituent symbols. The proposed segmentation strategy comprises two modules: (1) the Dominant Overlap Criterion Segmentation (DOCS) module and (2) the Attention Feedback Segmentation (AFS) module. In the DOCS module, the input word is first segmented into stroke groups based on a bounding box overlap criterion. A stroke group may at times correspond to a part of a valid symbol (over-segmentation) or a merger of valid symbols (under-segmentation). In the AFS module, attention to specific features serves to detect stroke groups that are possibly over-segmented or under-segmented. Thereafter, feedback from the SVM classifier likelihoods and stroke-group-based features is used to modify the suspected stroke groups so that they form valid symbols.
The proposed scheme is tested on a set of 10,000 isolated handwritten words (containing 53,246 Tamil symbols). The results show that the DOCS module achieves a symbol-level segmentation accuracy of 98.1%, which improves to 99.7% after the AFS strategy. This in turn yields a symbol recognition rate of 83.9% (at the DOCS module) and 88.4% (after the AFS module). The resulting word recognition rates at the DOCS and AFS modules are 50.9% and 64.9%, respectively, without any postprocessing.
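To make the idea of grouping strokes by bounding box overlap concrete, the following is a minimal Python sketch. It is not the authors' DOCS implementation: the abstract does not state the exact overlap measure or threshold, so the function names (`horizontal_extent`, `overlap_ratio`, `group_strokes`), the normalisation by the narrower box, and the threshold of 0.2 are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]      # (x, y) pen sample
Stroke = Sequence[Point]         # samples between one pen-down and pen-up
Extent = Tuple[float, float]     # (x_min, x_max) of a bounding box


def horizontal_extent(stroke: Stroke) -> Extent:
    """Return the horizontal span of a stroke's bounding box."""
    xs = [x for x, _ in stroke]
    return min(xs), max(xs)


def overlap_ratio(a: Extent, b: Extent) -> float:
    """Horizontal overlap of two extents, normalised by the narrower one.

    The normalisation is an assumption made for this sketch.
    """
    inter = min(a[1], b[1]) - max(a[0], b[0])
    if inter < 0:
        return 0.0               # the extents are disjoint
    narrower = min(a[1] - a[0], b[1] - b[0])
    if narrower == 0:
        return 1.0               # zero-width stroke (e.g. a dot) inside the other extent
    return inter / narrower


def group_strokes(strokes: Sequence[Stroke], threshold: float = 0.2) -> List[List[int]]:
    """Merge temporally consecutive strokes whose extents overlap enough.

    Returns lists of stroke indices; the threshold value is hypothetical.
    """
    groups: List[List[int]] = []
    current: Optional[Extent] = None
    for index, stroke in enumerate(strokes):
        extent = horizontal_extent(stroke)
        if current is not None and overlap_ratio(current, extent) > threshold:
            groups[-1].append(index)
            current = (min(current[0], extent[0]), max(current[1], extent[1]))
        else:
            groups.append([index])
            current = extent
    return groups


# Toy word with four strokes: the first two overlap, the last is a dot over the third.
word = [
    [(0.0, 0.0), (10.0, 5.0)],
    [(6.0, 0.0), (12.0, 3.0)],
    [(20.0, 0.0), (30.0, 5.0)],
    [(25.0, 8.0), (25.0, 9.0)],
]
print(group_strokes(word))  # [[0, 1], [2, 3]]
```

A purely geometric rule like this can still split a symbol or merge two symbols, which is the kind of over-segmentation and under-segmentation that the AFS module is described as detecting and correcting.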
Index Terms
- Attention-Feedback Based Robust Segmentation of Online Handwritten Isolated Tamil Words