Abstract
Document image analysis refers to algorithms and techniques that are applied to images of documents to obtain a computer-readable description from pixel data. A well-known document image analysis product is the Optical Character Recognition (OCR) software that recognizes characters in a scanned document. OCR makes it possible for the user to edit or search the document’s contents. In this paper we briefly describe various components of a document analysis system. Many of these basic building blocks are found in most document analysis systems, irrespective of the particular domain or language to which they are applied. We hope that this paper will help the reader by providing the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.
Kasturi, R., O’Gorman, L. & Govindaraju, V. Document image analysis: A primer. Sadhana 27, 3–22 (2002). https://doi.org/10.1007/BF02703309