Abstract
Document analysis systems often begin with binarization as a first processing stage. Although numerous techniques for binarization have been proposed, the results produced can vary in quality and often prove sensitive to the settings of one or more control parameters. This paper examines a promising approach to binarization based upon simple principles, and shows that its success depends most significantly upon the values of two key parameters. It further describes an automatic technique for setting these parameters in a manner that tunes them to the individual image, yielding a final binarization algorithm that can cut total error by one-third with respect to the baseline version. The results of this method advance the state of the art on recent benchmarks.
Notes
Some ambiguity remains over how to define the ground truth binarization. For instance, should an unintended ink spill made by a scribe be included? For the record, this paper stipulates that the ground truth should identify the set of all pixels at least 50% covered with ink during the production of the document. In practice, the ground truth used on most benchmark datasets probably does not correspond perfectly with this ideal, but instead represents the imperfect judgment of a human expert [4].
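As a minimal sketch of the idealized definition in this note, the snippet below labels a pixel as ink when its coverage fraction is at least one half. The coverage values and the function name `ideal_ground_truth` are hypothetical and serve only as illustration; real benchmark ground truth is produced by human annotators, not from a measured coverage map.

```python
# Hypothetical per-pixel ink coverage: the fraction of each pixel's area
# covered by ink, in [0, 1]. These values are invented for illustration.
coverage = [
    [0.00, 0.10, 0.55],
    [0.50, 0.49, 1.00],
]


def ideal_ground_truth(coverage, threshold=0.5):
    """Label a pixel as ink (1) when its coverage is at least `threshold`."""
    return [[1 if c >= threshold else 0 for c in row] for row in coverage]


ground_truth = ideal_ground_truth(coverage)
print(ground_truth)  # [[0, 0, 1], [1, 0, 1]]
```

Note that a pixel with exactly 50% coverage counts as ink under this reading of "at least 50%".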
The numbers for Lelore & Bouchara exclude image PR6 because its results were not available.
References
Agrawal, M., Doermann, D.: Stroke-like pattern noise removal in binary document images. In: International Conference on Document Analysis and Recognition, pp. 17–21 (2011)
Badekas, E., Papamarkos, N.: Estimation of proper parameter values for document binarization. Int. J. Robot. Autom. 24(1), 66–78 (2009)
Bar-Yosef, I., Beckman, I., Kedem, K., Dinstein, I.: Binarization, character extraction and writer identification of historical Hebrew calligraphy documents. Int. J. Doc. Anal. Recogn. 9(2), 89–99 (2007)
Barney-Smith, E.: An analysis of binarization ground truthing. In: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, pp. 27–33. Boston (2010)
Boykov, Y., Kolmogorov, V.: An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. Pattern Anal. Mach. Intell. 26(9), 1124–1137 (2004)
Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8(6), 679–698 (1986)
Chen, Y., Leedham, G.: Decompose algorithm for thresholding degraded historical document images. IEE Proc. Vis. Image Signal Process. 152(6), 702–714 (2005)
Dawoud, A.: Iterative cross section sequence graph for handwritten character segmentation. IEEE Trans. Image Process. 16(8), 2150–2154 (2007)
Gatos, B., Pratikakis, I., Perantonis, S.J.: Adaptive degraded document image binarization. Pattern Recogn. 39(3), 317–327 (2006)
Gatos, B., Pratikakis, I., Perantonis, S.J.: Improved document image binarization by using a combination of multiple binarization techniques and adapted edge information. In: International Conference on Pattern Recognition, pp. 1–4 (2008)
Howe, N.: A Laplacian energy for document binarization. In: International Conference on Document Analysis and Recognition, pp. 6–10 (2011)
Lelore, T., Bouchara, F.: Document image binarization using Markov field model. In: International Conference on Document Analysis and Recognition, pp. 551–555 (2009)
Lelore, T., Bouchara, F.: Super-resolved binarization of text based on FAIR algorithm. In: International Conference on Document Analysis and Recognition, pp. 839–843 (2011)
Lu, S., Su, B., Tan, C.L.: Document image binarization using background estimation and stroke edges. Int. J. Doc. Anal. Recogn. 13(4), 303–314 (2010)
Mishra, A., Alahari, K., Jawahar, C.V.: An MRF model for binarization of natural scene text. In: International Conference on Document Analysis and Recognition (2011)
Niblack, W.: An Introduction to Digital Image Processing. Prentice-Hall, Englewood Cliffs (1986)
Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9(1), 62–66 (1979)
Peng, X., Setlur, S., Govindaraju, V., Sitaram, R.: Markov random field based binarization for hand-held devices captured document images. In: Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing, pp. 71–76 (2010)
Pratikakis, I., Gatos, B., Ntirogiannis, K.: ICDAR 2011 document image binarization contest (DIBCO 2011). In: International Conference on Document Analysis and Recognition, pp. 1506–1510 (2011)
Ramírez-Ortegón, M.A., Tapia, E., Ramírez-Ramírez, L.L., Rojas, R., Cuevas, E.: Transition pixel: A concept for binarization based on edge detection and gray-intensity histograms. Pattern Recogn. 43(4), 1233–1243 (2010)
Sauvola, J., Pietikäinen, M.: Adaptive document image binarization. Pattern Recogn. 33(2), 225–236 (2000)
Su, B., Lu, S., Tan, C.L.: Binarization of historical document images using the local maximum and minimum. In: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, pp. 159–166 (2010)
Vonikakis, V., Andreadis, I., Papamarkos, N., Gasteratos, A.: Adaptive document binarization: A human vision approach. In: 2nd International Conference on Computer Vision Theory and Applications, pp. 104–110. Barcelona (2007)
Acknowledgments
The author thanks those who kindly shared results or implementation details of their algorithms for comparison purposes, including Basilis Gatos, Su Bolan, and Frédéric Bouchara.
Cite this article
Howe, N.R. Document binarization with automatic parameter tuning. IJDAR 16, 247–258 (2013). https://doi.org/10.1007/s10032-012-0192-x
DOI: https://doi.org/10.1007/s10032-012-0192-x