DOI: 10.1145/2432553.2432564
research-article

Content directed enhancement of degraded document images

Published: 16 December 2012

ABSTRACT

Most document pre-processing techniques are parameter dependent. In this paper, we present a novel framework that learns optimal parameters for binarization and text/graphics segmentation according to the nature of the document image content. The learning problem is formulated as an optimization problem, and the EM algorithm is used to learn the optimal parameters adaptively. Experimental results demonstrate the effectiveness of our approach.
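The abstract does not give the formulation, so the following is only an illustrative sketch of the general idea of learning binarization parameters from image content with EM, not the authors' framework. It assumes, purely for illustration, that gray levels follow a two-component Gaussian mixture (ink and background); the function names `fit_two_class_em`, `binarize`, and `synthetic_pixels` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_class_em(pixels, iterations=50):
    """Fit a two-component 1-D Gaussian mixture to gray levels with EM.

    Component 0 starts on the dark side (assumed ink) and component 1 on
    the bright side (assumed background). Returns (means, variances, weights).
    """
    lo, hi = min(pixels), max(pixels)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    variances = [((hi - lo) / 4.0) ** 2 + 1e-6] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability that each pixel belongs to component 0.
        resp = []
        for x in pixels:
            p0 = weights[0] * gaussian_pdf(x, means[0], variances[0])
            p1 = weights[1] * gaussian_pdf(x, means[1], variances[1])
            total = p0 + p1
            resp.append(p0 / total if total > 0.0 else 0.5)
        # M-step: re-estimate each component from its weighted pixels.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            n_k = sum(r) + 1e-12
            means[k] = sum(ri * x for ri, x in zip(r, pixels)) / n_k
            variances[k] = (
                sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, pixels)) / n_k + 1e-6
            )
            weights[k] = n_k / len(pixels)
    return means, variances, weights


def binarize(pixels, means, variances, weights):
    """Label a pixel 0 (ink) if the dark component is more probable, else 1."""
    labels = []
    for x in pixels:
        p0 = weights[0] * gaussian_pdf(x, means[0], variances[0])
        p1 = weights[1] * gaussian_pdf(x, means[1], variances[1])
        labels.append(0 if p0 > p1 else 1)
    return labels


def synthetic_pixels(seed=0):
    """Made-up test data: 200 dark 'ink' pixels and 800 bright 'paper' pixels."""
    rng = random.Random(seed)
    dark = [rng.gauss(50.0, 10.0) for _ in range(200)]
    bright = [rng.gauss(200.0, 15.0) for _ in range(800)]
    return dark + bright


if __name__ == "__main__":
    pixels = synthetic_pixels()
    means, variances, weights = fit_two_class_em(pixels)
    print("estimated class means:", means)
    print("labels for gray levels 40 and 210:", binarize([40.0, 210.0], means, variances, weights))
```

The sketch learns the decision rule from the data instead of using a fixed global threshold, which is the sense in which the parameters adapt to content. A real system would more likely fit such a model per region or per document class; that choice is not specified in the abstract.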


Published in

ACM Other conferences
DAR '12: Proceeding of the workshop on Document Analysis and Recognition
December 2012
162 pages
ISBN: 9781450317979
DOI: 10.1145/2432553

Copyright © 2012 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
