DOI: 10.1145/2034617.2034619
research-article

New method for the selection of binarization parameters based on noise features of historical documents

Published: 17 September 2011

ABSTRACT

Historical documents generally contain different kinds of degradation. Because of these degradations, noise removal during a preprocessing stage seems necessary. Since the noise present in the original document cannot be eliminated using a simple noise removal algorithm, and since it influences preprocessing results such as the binarization, a noise detection function is also needed. In this paper we present a method for selecting the input parameters of binarization methods according to the noise type detected in the image. The tests were carried out on the benchmarking datasets used at DIBCO 2009 and H-DIBCO 2010. The results returned by the binarization methods using the noise features are promising.
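The idea of choosing binarization parameters from a detected noise type can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the noise feature (median absolute difference between horizontal neighbours), the two noise labels, the values in `SAUVOLA_PARAMS`, and the choice of Sauvola's local threshold T = m · (1 + k · (s / R − 1)) as the binarization method.

```python
from statistics import mean, median, pstdev

# Hypothetical lookup table: noise labels and (window, k) values are
# illustrative assumptions, not settings reported in the paper.
SAUVOLA_PARAMS = {
    "low": {"window": 3, "k": 0.5},
    "high": {"window": 5, "k": 0.2},
}


def estimate_noise_type(image, threshold=20.0):
    """Stand-in noise detector (assumption): median absolute difference
    between horizontally adjacent pixels. Sparse text edges barely move
    the median, while dense pixel-level noise raises it."""
    diffs = [abs(row[x] - row[x + 1]) for row in image for x in range(len(row) - 1)]
    return "high" if median(diffs) > threshold else "low"


def local_window(image, y, x, window):
    """Return the grey values in a window centred on (y, x), clipped at borders."""
    half = window // 2
    rows = image[max(0, y - half): y + half + 1]
    return [v for row in rows for v in row[max(0, x - half): x + half + 1]]


def sauvola_binarize(image, window, k, dynamic_range=128.0):
    """Sauvola thresholding: T = m * (1 + k * (s / R - 1)), where m and s are
    the local mean and standard deviation. Output is 0 (ink) or 255 (paper)."""
    result = []
    for y, row in enumerate(image):
        out_row = []
        for x, pixel in enumerate(row):
            values = local_window(image, y, x, window)
            m, s = mean(values), pstdev(values)
            t = m * (1.0 + k * (s / dynamic_range - 1.0))
            out_row.append(255 if pixel > t else 0)
        result.append(out_row)
    return result


def binarize_with_noise_features(image):
    """Detect the noise type, then binarize with the matching parameters."""
    noise_type = estimate_noise_type(image)
    return noise_type, sauvola_binarize(image, **SAUVOLA_PARAMS[noise_type])


if __name__ == "__main__":
    # Synthetic 5x9 greyscale patch: bright paper with one dark vertical stroke.
    page = [[200, 200, 200, 200, 30, 200, 200, 200, 200] for _ in range(5)]
    noise_type, binary = binarize_with_noise_features(page)
    print(noise_type)
    for row in binary:
        print(row)
```

The point of the sketch is the structure: a noise feature is computed first, and its result selects the parameter set passed to the binarization step. A real system would replace the stand-in detector with a proper noise classifier and tune the table on benchmark data.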


Published in

MOCR_AND '11: Proceedings of the 2011 Joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data
September 2011
144 pages
ISBN: 9781450306850
DOI: 10.1145/2034617

Copyright © 2011 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
