
Attention-Feedback Based Robust Segmentation of Online Handwritten Isolated Tamil Words

Authors:
Suresh Sundaram, Indian Institute of Science
A. G. Ramakrishnan, Indian Institute of Science

ACM Transactions on Asian Language Information Processing, Volume 12, Issue 1, Article No. 4, pp. 1–25. https://doi.org/10.1145/2425327.2425331

Published: 01 March 2013

Abstract

In this article, we propose a lexicon-free, script-dependent approach to segment online handwritten isolated Tamil words into their constituent symbols. Our proposed segmentation strategy comprises two modules: (1) the Dominant Overlap Criterion Segmentation (DOCS) module and (2) the Attention Feedback Segmentation (AFS) module. In the DOCS module, the input word is first segmented into stroke groups based on a bounding box overlap criterion. A stroke group may at times correspond to a part of a valid symbol (over-segmentation) or a merger of valid symbols (under-segmentation). In the AFS module, attention to specific features serves to detect possibly over-segmented or under-segmented stroke groups. Thereafter, feedback from the SVM classifier likelihoods and stroke-group-based features is used to modify the suspected stroke groups to form valid symbols.
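
The bounding box overlap idea behind the DOCS module can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so everything specific here is an assumption made for illustration: strokes arrive in writing order, overlap is measured horizontally and normalised by the narrower extent, and the threshold of 0.5 and the names `x_extent`, `overlap_ratio`, and `group_strokes` are hypothetical rather than taken from the article.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Stroke = List[Point]
Extent = Tuple[float, float]


def x_extent(stroke: Stroke) -> Extent:
    """Horizontal extent (min x, max x) of a stroke's bounding box."""
    xs = [x for x, _ in stroke]
    return min(xs), max(xs)


def overlap_ratio(a: Extent, b: Extent) -> float:
    """Horizontal overlap of two extents, normalised by the narrower one.

    Assumed criterion for illustration; not the article's definition.
    """
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap < 0:
        return 0.0
    narrower = min(a[1] - a[0], b[1] - b[0])
    if narrower == 0:
        # Zero-width stroke (for example a dot) lying within the other extent.
        return 1.0
    return overlap / narrower


def group_strokes(strokes: List[Stroke], threshold: float = 0.5) -> List[List[int]]:
    """Merge each stroke into the current group when it overlaps it enough.

    Returns groups of stroke indices; the threshold is a hypothetical value.
    """
    groups: List[List[int]] = []
    extent: Optional[Extent] = None
    for i, stroke in enumerate(strokes):
        cur = x_extent(stroke)
        if extent is not None and overlap_ratio(extent, cur) > threshold:
            groups[-1].append(i)
            # Grow the group's extent to cover the newly merged stroke.
            extent = (min(extent[0], cur[0]), max(extent[1], cur[1]))
        else:
            groups.append([i])
            extent = cur
    return groups


strokes = [
    [(0.0, 0.0), (10.0, 5.0)],   # first stroke
    [(2.0, 0.0), (8.0, 3.0)],    # lies within the first stroke's extent
    [(20.0, 0.0), (30.0, 5.0)],  # well separated from the first two
]
print(group_strokes(strokes))
```

In this toy input the first two strokes fall into one stroke group and the third starts a new one. A purely geometric rule of this kind can split one symbol or merge two, which is the over- and under-segmentation that the AFS module is described as correcting.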

The proposed scheme is tested on a set of 10,000 isolated handwritten words (containing 53,246 Tamil symbols). The results show that the DOCS module achieves a symbol-level segmentation accuracy of 98.1%, which improves to 99.7% after the AFS strategy. This in turn yields a symbol recognition rate of 83.9% (at the DOCS module) and 88.4% (after the AFS module). The resulting word recognition rates at the DOCS and AFS modules are 50.9% and 64.9%, respectively, without any postprocessing.
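
To put the segmentation figures in terms of error counts, the short calculation below converts the reported accuracies into approximate numbers of mis-segmented symbols. It assumes that the symbol-level accuracy is computed over all 53,246 symbols; the resulting counts are estimates derived from the rounded percentages, not figures reported in the article, and `error_summary` is a hypothetical helper name.

```python
def error_summary(n_items: int, acc_before: float, acc_after: float) -> dict:
    """Estimate error counts and relative error reduction from two accuracies (in %)."""
    err_before = 100.0 - acc_before
    err_after = 100.0 - acc_after
    return {
        # Approximate counts, since the accuracies are rounded to one decimal.
        "errors_before": round(n_items * err_before / 100.0),
        "errors_after": round(n_items * err_after / 100.0),
        "relative_error_reduction_pct": round(
            100.0 * (err_before - err_after) / err_before, 1
        ),
    }


# Segmentation accuracy: 98.1% after DOCS, 99.7% after AFS, over 53,246 symbols.
summary = error_summary(53246, 98.1, 99.7)
print(summary)
```

Under that assumption, the segmentation error rate falls from 1.9% to 0.3%, which is roughly 1,012 versus 160 mis-segmented symbols, a relative error reduction of about 84%.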

Index Terms

1. Computing methodologies
  1. Machine learning

Published in

ACM Transactions on Asian Language Information Processing Volume 12, Issue 1
March 2013
102 pages
ISSN:1530-0226
EISSN:1558-3430
DOI:10.1145/2425327

Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 March 2013
- Revised: 1 March 2012
- Accepted: 1 March 2012
- Received: 1 December 2011
Author Tags
Attention Feedback Segmentation (AFS) module
Dominant Overlap Criterion Segmentation (DOCS) module
Handwriting recognition
Support Vector Machines (SVM)
Tamil
online Tamil words
stroke group
Qualifiers
- research-article
- Research
- Refereed