A review on document image analysis techniques directly in the compressed domain

  • Published:
Artificial Intelligence Review

Abstract

The rapid growth of digital libraries, e-governance, and internet-based applications has caused an exponential escalation in the volume of ‘Big-data’, particularly texts, images, audio, and video that are archived and transmitted on a daily basis. To make their storage and transfer efficient, different data compression techniques are used in the literature. The ultimate motive behind data compression is to transform large data into small data, which means less space for archiving and less time for transfer. However, to operate on or analyze compressed data, it is usually necessary to decompress it so as to bring the data back to its original form, which unfortunately incurs an additional computing cost. Against this backdrop, if operating on the compressed data itself can be made possible without going through the stage of decompression, the advantage gained from compression would increase. Further, because compression destroys the original visible structure of the data from the data structure and storage perspectives, tracing the original information in the compressed representation becomes a challenge. This challenge is the motivation behind exploring the idea of direct processing on the compressed data itself in the literature. This survey paper focuses specifically on compressed document images and makes two original contributions. First, it presents a critical study of different image analysis and image compression techniques, and highlights the motivations for pursuing document image analysis in the compressed domain. Second, it summarizes the compressed domain techniques reported in the literature so far, organized by the type of compression and the operations they perform.
Overall, the paper aims to provide a perspective for pursuing further research in document image analysis and pattern recognition performed directly on the compressed data.


References

  • Adjeroh D, Bell T, Mukherjee A (2008) The burrows-wheeler transform: data compression, suffix arrays and pattern matching. Springer, New York

    Google Scholar 

  • Adjeroh D, Bell T, Mukherjee A (2013) Pattern matching in compressed texts and images. Now Publishers, Hanover

    MATH  Google Scholar 

  • Adjeroh DA, Lee MC, King I (1999) A distance measure for video sequence similarity matching. Comput Vis Image Underst 75(1):25–45

    Google Scholar 

  • Ahmed N, Natarajan T, Rao K (1974) Discrete cosine transform. IEEE Trans Comput 23:90–93

    MathSciNet  MATH  Google Scholar 

  • Ahmed R, Al-Khatib WG, Mahmoud S (2017) A survey on handwritten documents word spotting. Int J Multimed Inf Retr 6(1):31–47

    Google Scholar 

  • Aho AV, Corasick M (1975) Efficient string matching: an aid to bibliographic search. Commun ACM 18(6):333–340

    MathSciNet  MATH  Google Scholar 

  • Akutsu T (1994) Approximate string matching with dont care characters. In: Proceedings combinatorial pattern matching, LNCS, vol  807, pp 240–249

    Google Scholar 

  • Alvarez S, Salvatella A, Vanrell M, Otazu X (2012) Low-dimensional and comprehensive color texture description. Comput Vis Image Underst 116(1):54–67

    Google Scholar 

  • Amir A, Benson G (1992) Efficient two-dimensional compressed matching. In: IEEE proceedings of data compression conference, pp 279–288

  • Amir A, Calinescu G (1996) Alphabet independent and dictionary scaled matching. In: Proceedings of combinatorial pattern matching (LNCS 1075), pp 320–334

    Google Scholar 

  • Amir A, Landau G, Vishkin U (1992) Efficient pattern matching with scaling. J Algorithms 13:2–32

    MATH  Google Scholar 

  • Amir A, Bensonb G, Farach M (1996) Let sleeping files lie: pattern matching in z-compressed files. J Comput Syst Sci 52(2):299–307

    MathSciNet  MATH  Google Scholar 

  • Amir A, Kapah O, Tsur D (2006) Faster two-dimensional pattern matching with rotations. Theor Comput Sci 368(3):196–204

    MathSciNet  MATH  Google Scholar 

  • Anantharaman B (2001) Compressed domain processing of MPEG audio. PhD thesis, Indian Institute of Science, Bangalore

  • Andrews H (1970) Computer techniques in image processing. Academic Press, New York

    Google Scholar 

  • Angadi SA (2007) An intelligent integrated automation system for efficient processing of postal mail. PhD thesis, Department of Studies in Computer Science, University of Mysore

  • Antonacopoulos A, Bridson D, Papadopoulos C, Pletschacher S (2009) A realistic dataset for performance evaluation of document layout analysis. In: Proceedings of the 10th international conference on document analysis and recognition, (ICDAR2009). Barcelona, pp 296–300

  • Apostolico A, Landau GM, Skiena S (1997) Matching for run-length encoded strings. In: Proceedings of complexity and compression of sequences

  • Ascher R, Nagy G (1974) A means for achieving a high degree of compaction on scan-digitized printed text. IEEE Trans Comput 23:1174–1179

    MATH  Google Scholar 

  • Asghari E, KeyvanPour M (2015) Xml document clustering: techniques and challenges. Artif Intell Rev 43(3):417–436

    Google Scholar 

  • Avcibas I, Kharrazi M, Memon ND, Sankur B (2005) Image steganalysis with binary similarity measures. EURASIP J Appl Signal Process 17:2749–2757

    MATH  Google Scholar 

  • Avrithis YS, Doulamis AD, Doulamis ND, Kollias SD (1999) Astochastic framework for optimal key frame extraction from mpeg video databases. Comput Vis Image Underst 75(1/2):3–24

    Google Scholar 

  • Baird H (1987) Skew angle of printed documents. In: Proceedings of SPSE’s 40th annual conference and symposium on hhybrid imaging systems, pp 21–24

  • Baird HS, Bunke H, Yamamoto K (eds) (1992) Structured document image analysis. Springer, New York

    MATH  Google Scholar 

  • Baird HS, Nagy G (1994) Self-correcting 100-font classifier. Doc Recognit 2181:106–115

    Google Scholar 

  • Baird HS, Tombre K (2014) The evolution of document image analysis. In: Doermann D, Tombre K (eds) Handbook of document image processing and recognition, pp 63–71

    Google Scholar 

  • Bell T, Powell M, Mukherjee A, Adjeroh DA (2002) Searching bwt compressed text with the boyer-moore algorithm and binary search. In: IEEE proceedings of data compression conference, pp 112–121

  • Berry M W (2013) Survey of text mining: clustering, classification, and retrieval. Springer, New York

    Google Scholar 

  • Bhaskaran V, Konstantinides K, Beretta G (1997) Text and image sharpening of scanned images in the jpeg domain. In: Proceedings of international conference on image processing, vol 2, pp 326–329

  • Bolan S (2012) Document image enhancement. PhD thesis, National University of Singapore

  • Breuel TM (2003) High performance document layout analysis. In: Proceedings of symposium on document image understanding technology

  • Breuel TM (2008) Binary morphology and related operations on run-length representations. In: International conference on computer vision theory and applications - VISAPP, pp 159–166

  • Bunke H, Csirik J (1993) An algorithm for matching run-length coded strings. Computing 50:297–314

    MathSciNet  MATH  Google Scholar 

  • Bunke H, Csirik J (1995) An improved algorithm for computing the edit distance of run-length coded strings. Inf Process Lett 54:93–96

    MATH  Google Scholar 

  • Ceci M, Berardi M, Malerba D (2005) Relational learning techniques for document image understanding: comparing statistical and logical approaches. In: Proceedings of the eighth international conference on document analysis and recognition, pp 473–477

  • Chang S (1995a) Compressed domain techniques of image/ video indexing and manipulation. In: IEEE international conference on image processing (ICIP95), special session on digital library and video on demand

  • Chang S (1995b) Some new algorithms for processing images in the transform compressed domain. In: SPIE symposium on visual communications and image processing

  • Chang S, Messerschmitt D (1995) Manipulation and compositing mc-dct compressed video. IEEE J Sel Areas Commun 13(1):1–11

    Google Scholar 

  • Chang S, Chen W, Messerschmitt D (1992) Video compositing in the dct domain. In: IEEE workshop on visual signal processing and communications

  • Chen B, Wornell GW (2001) Quantization index modulation: a class of provably good methods for digital watermarking and information embedding. IEEE Trans Inf Theory 47(4):1423–1443

    MathSciNet  MATH  Google Scholar 

  • Chen N, Blostein D (2007) A survey of document image classification: problem statement, classifier architecture and performance evaluation. IJDAR 10(1):1–16

    Google Scholar 

  • Chen B, Latifi S, Kanai J (1999) Edge enhancement of remote sensing image data in the dct domain. Image Vis Comput Elsevier 17:913–921

    Google Scholar 

  • Chen K, Yin F, Liu C-L (2013) Page segmentation with efficient whitespace rectangles extraction and grouping. In: 12th international conference on document analysis and recognition, pp 958–962

  • Chiptrasert B, Rao K (1990) Discrete cosine transform filtering. Signal Process 19(3):233–245

    MathSciNet  MATH  Google Scholar 

  • Chua TS, Zhao Y, Kankanhalli MS (2002) Detection of human faces in a compressed domain for video stratification. Vis Comput 18:121–133

    MATH  Google Scholar 

  • Chung K-L, Huang H-L, Lu H-I (2004) Efficient region segmentation on compressed gray images using quadtree and shading representation. Pattern Recognit 37:1591–1605

    MATH  Google Scholar 

  • Cleary JG, Teahan WJ (1997) Unbounded length contexts for ppm. Comput J 40(2/3):67–75

    Google Scholar 

  • Crochemore M, Hancart C, Lecroq T (2007) Algorithms on strings. Cambridge University Press, Cambridge

    MATH  Google Scholar 

  • Cvision Technologies (2015). Reduce tiff file size (http://www.cvisiontech.com/file-formats/tiff/reduce-tiff-file-size.html)

  • Dash KS, Puhan NB, Panda G (2016) Odia character recognition: a directional review. Artif Intell Rev, pp 1–25

  • de Queiroz RL (1998) Processing jpeg-compressed images and documents. IEEE Trans Image Process 7(12):1661–1672

    Google Scholar 

  • de Queiroz RL, Eschbach R (1997) Segmentation of compressed documents. In: Proceedings of international conference on image processing, vol 3, pp 70–73

  • de Queiroz RL, Eschbach R (1998) Fast segmentation of the jpeg compressed documents. J Electron Imaging 7(2):367–377

    Google Scholar 

  • Deng S, Latifi S, Kanai J (1998) Manipulation of text documents in the modified group 4 domain. In: Multimedia signal processing, IEEE second workshop, pp 438–443

  • Deng S, Latifi S, Kanai J (1999) Document image analysis using a new compression algorithm. In: Document analysis systems: theory and practice (Lecture notes in computer science), vol 1655, pp 32–41

    Google Scholar 

  • Dhandra BV, Nagabhushan P, Hangarge M, Hegadi R, Malemath VS (2006) Script identification based on morphological reconstruction in document images. In: Proceedings of the 18th international conference on pattern recognition, vol 2, pp 950–953

  • Ding S, Zhu H, Jia W, Su C (2012) A survey on feature extraction for pattern recognition. Artif Intell Rev 37:169–180

    Google Scholar 

  • Doermann D (1998) The indexing and retrieval of document images: a survey. Comput Vis Image Underst 70(3):287–298

    Google Scholar 

  • Doermann D, Li H, Kia O (1998) The detection of duplicates in document image database. Image Vis Comput 16:907–920

    Google Scholar 

  • Doermann D, Tombre K (eds) (2014) Handbook of document image processing and recognition. Springer, London

    MATH  Google Scholar 

  • Dong Y, Tao D, Li X (2015a) Nonnegative multiresolution representation-based texture image classification. ACM Trans Intell Syst Technol 7(1):4:1–4:21

    Google Scholar 

  • Dong Y, Tao D, Li X, Ma J, Pu J (2015b) Texture classification and retrieval using shearlets and linear regression. IEEE Trans Cybern 45(3):358–369

  • Dugad R, Ahuja N (2001) A fast scheme for image size change in the compressed domain. IEEE Trans Circuits Syst Video Technol 11(4):461–474

    Google Scholar 

  • Eilam-Tzoreff T, Vishkin U (1988) Matching patterns in strings subject to multi-linear transformations. Theor Comput Sci 60:231–254

    MathSciNet  MATH  Google Scholar 

  • Farach M, Thorup M (1995) String matching in lempel-ziv compressed strings. In: Proceedings of annual ACM symposium on the theory of computing, pp 703–712

  • Farahmand A, Sarrafzadeh A, Shanbehzadeh J (2013) Document image noises and removal methods. In: Proceedings of the international multiconference of engineers and computer scientists, vol 1, pp 436–440

  • Faro S, Lecroq T (2013) The exact online string matching problem: a review of the most recent results. ACM Comput Surv 45(2):13:1–13:42

    MATH  Google Scholar 

  • Fredriksson K, Mozgovoy M (2006) Efficient parameterized string matching. Inf Process Lett 100(3):91–96

    MathSciNet  MATH  Google Scholar 

  • Gambhir M, Gupta V (2016) Recent automatic text summarization techniques: a survey. Artif Intell Rev 47:1–66

    Google Scholar 

  • Garain U, Chakraborty MP, Chanda B (2006a) Lossless compression of textual images: a study on indic script documents. In: ICPR, vol 3, pp 806–809

  • Garain U, Datta AK, Bhattacharya U, Parui SK (2006b) Summarization of jbig2 compressed indian language textual images. In: ICPR, vol 3, pp 344–347

  • Gargi U, Antani S, Kasturi R (1998) Indexing text events in digital video databases. In: IEEE proceedings of ICPR, pp 916–918

  • Gasieniec L, Rytter W (1999) Almost optimal fully lzw-compressed pattern matching. In: IEEE proceedings of data compression conference, pp 316–325

  • Gawrychowski P (2011) Optimal pattern matching in lzw compressed strings. In: Proceedings of symposium on discrete algorithms, pp 362–372

    Google Scholar 

  • Gawrychowski P (2012) Tying up the loose ends in fully lzw-compressed pattern matching. In: Proceedings of symposium on theoretical aspects of computer sciences, vol 14, pp 624–635

  • Ghosh D, Dube T, Shivaprasad A (2010) Script recognition-a review. IEEE Trans Pattern Anal Mach Intell 32(12):2142–2161

    Google Scholar 

  • Giancarlo R, Gross R (1997) Multi-dimensional pattern matching with dimensional wildcards: data structures and optimal on-line search algorithm. J Algorithms 24:223–265

    MathSciNet  MATH  Google Scholar 

  • Gonzalez RC, Woods RE (2009) Digital Image Processing, 3rd edn. Pearson, New Delhi

  • Habibi A (1977) Survey of adaptive image coding techniques. IEEE Trans Commun 25:1275–1284

    MATH  Google Scholar 

  • Hendahewa A (2010) 8 Image enhancement techniques in document capture. EIM BLOG (http://www.docudude.com/2010/04/8-image-enhancement-techniques-in.html)

  • Hernndez JR, Amado M, Gonzlez FP (2000) Dct-domain watermarking techniques for still images: detector performance analysis and a new structure. IEEE Trans Image Process 9(1):55–68

    Google Scholar 

  • Hinds S, Fisher J, D’Amato D (1990) A document skew detection method using run-length encoding and the hough transform. In: Proceedings of 10th international conference on pattern recognition, vol 1, pp 464–468

  • Hull JJ (1997) Document matching on ccitt group 4 compressed images. In: SPIE conference on document recognition IV, pp 8–14

  • Hull JJ, Cullen J (1997) Document image similarity and equivalence detection. In: IEEE proceedings of ICDAR, vol 1, pp 308–312

  • Hull JJ (1998) Document image similarity and equivalence detection. Int J Doc Anal Recognit 1:37–42

    Google Scholar 

  • Inglis S, Witten I (1994) Compression based template matching. In: IEEE proceedings of data compression conference, pp 106–115

  • Ito I, Kiya H (2007) Dct sign-only correlation with application to image matching and the relationship with phase-only correlation. In: IEEE proceedings of international conference on speech, acoustic and signal processing, pp 1237–1240

  • Iwamura M, Shafait F (2013) Camera-based document analysis and recognition. In: 5th international workshop on camera-based document analysis and recognition

  • Jain A (1989) Fundamentals of digital image processing. Prentice Hall, New Jersey

  • Jathanna VE, Nagabhushan P (2015) Microcontroller based mechanised videographing of text and auto-generation of voice text in real time. IJCSIT 6(3):2419–2425

    Google Scholar 

  • Javed M, Nagabhushan P, Chaudhuri BB (2013) Extraction of projection profile, run-histogram and entropy features straight from run-length compressed text-documents. In: Second IAPR Asian conference on pattern recognition (ACPR2013), pp 813–817

  • Javed M, Nagabhushan P, Chaudhuri BB (2015a) Automatic extraction of correlation-entropy features for text document analysis directly in run-length compressed domain. In: 13th international conference on document analysis and recognition (ICDAR), pp 1–5

  • Javed M, Nagabhushan P, Chaudhuri BB (2015b) A direct approach for word and character segmentation in run-length compressed documents with an application to word spotting. In: 13th international conference on document analysis and recognition (ICDAR), pp 216–220

  • Javed M (2016) On the possibility of processing document images in compressed domain. PhD thesis, Department of Studies in Computer Science, University of Mysore

  • Javed M, Krishnanand SH, Nagabhushan P, Chaudhuri BB (2016a) Visualizing ccitt group 3 and group 4 tiff documents and transforming to run-length compressed format enabling direct processing in compressed domain. Procedia Comput Sci 85:213–221

    Google Scholar 

  • Javed M, Nagabhushan P, Chaudhuri BB (2016b) Spotting of keyword directly in run-length compressed documents. In: Proceedings of Computer Vision and Image Processing (CVIP), vol 459. Springer, pp 367–376

  • Jawahar CV, Meshesha M, Balasubramanian A (2004a) Searching in document images. In: Proceedings of the international conference on visualization, graphics and image processing, pp 622–627

  • Jawahar CV, Million M, Balasubramanian A (2004b) Word level access to document image datasets. In: Proceedings of the workshop on computer vision, graphics and image processing, pp 73–76

  • Jayadevan R, Kolhe SR, Patil PM, Pal U (2012) Automatic processing of handwritten bank cheque images: a survey. Int J Doc Anal Recognit 15(4):267–296

    Google Scholar 

  • Jing XY, Zhang D (2004) A face and palmprint recognition approach based on discriminant dct feature extraction. IEEE Trans Syst Man Cybern 34(6):2405–2415

    Google Scholar 

  • Kanai J, Bagdanov AD (1998) Projection profile based skew estimation algorithm for jbig compressed images. Int J Doc Anal Recognit 1:43–51

    Google Scholar 

  • Kasturi R, Gorman LO, Govindaraju V (2002) Document image analysis: a primer. Sadhana Part 1(27):3–22

    Google Scholar 

  • Kia O (1997) Document compression and analysis. PhD thesis, Institute for Advanced Computer Studies, University of Maryland

  • Kieffer JC, Yang EH (2000) Grammar-based codes: a new class of universal lossless source codes. IEEE Trans Inf Theory 46(3):737–754

    MathSciNet  MATH  Google Scholar 

  • Klein B, Agne S, Dengel A (2004) Results of a study on invoice-reading systems in germany. Lecture notes in computer science, vol 3163, pp 451–462

  • Klein ST, Shapira D (2005) Pattern matching in huffman encoded texts. Inf Process Manag Elsevier 41:829–841

    MATH  Google Scholar 

  • Klein ST, Shapira D (2011) Compressed matching in dictionaries. Algorithms 4(1):61–74

    MathSciNet  Google Scholar 

  • Knight JR, Myers, EW (1999) Super-pattern matching. Technical Report TR-92-29, Department of Computer Science, University of Arizona

  • Kou W (1995) Digital Image compression: algorithms and standards. Kluwer Academic Publishers, Amsterdam

    Google Scholar 

  • Kresch R, Merhav N (1999) Fast dct domain filtering using the dct and the dst. IEEE Trans Image Process 8:821–833

    Google Scholar 

  • Latifi S, Kanai J (1997) Rapid manipulation of images compressed by the ccitt group iii 1-d coding scheme. In: Proceedings of international conference on imaging sciences, systems, and technology (CISST’97), pp 351–354

  • Lee DS, Hull JJ (2001) Detecting duplicates among symbolically compressed images in a large document database. Pattern Recognit Lett 22:545–550

    MATH  Google Scholar 

  • Lee I, On B-W (2011) An effective web document clustering algorithm based on bisection and merge. Artif Intell Rev 36(1):69–85

    Google Scholar 

  • Lee J, Lee B (1992) Transform domain filtering based on pipelining structure. IEEE Trans Signal Process 40(8):2061–2064

    Google Scholar 

  • Lee JS, Kim DK, Park K, Cho Y (1997) Efficient algorithms for approximate string matching with swaps. In: Proceedings of combinatorial pattern matching (LNCS), vol 1264, pp 28–39

    Google Scholar 

  • Lee MS, Shen M, Yoneyama A, Kuo CCJ (2005) Dct-domain image registration techniques for compressed video. In: IEEE proceedings of international symposium on circuit systems, vol 5, pp 4562–4565

  • Lee S (2007) An efficient content-based image enhancement in the compressed domain using retinex theory. IEEE Trans Circuits Syst Video Technol 17(2):199–213

    Google Scholar 

  • Li L, Tong CS, Choy SK (2010) Texture classification using refined histogram. IEEE Trans Image Process 19(5):1371–1378

    MathSciNet  MATH  Google Scholar 

  • Li M, Han J (2009) Streaming audio retrieval based on fuzzy classification in mpeg-1 compressed domain. In: International conference on mechatronics and automation, pp 5035–5039

  • Li X, Cui G, Dong Y (2016) Graph regularized non-negative low-rank matrix factorization for image clustering. IEEE Trans Cybern PP(99):1–14

    Google Scholar 

  • Lim J (1990) Two dimensional signal and image processing. Prentice Hall, New Jersey

  • Lloret E, Palomar M (2012) Text summarisation in progress: a literature review. Artif Intell Rev 37(1):1–41

    Google Scholar 

  • Lu CS (2002) Block dct-based robust watermarking using side information extracted by mean filtering. In: IEEE proceedings of ICPR, vol 2, pp 1001–1004

  • Lu J, Jiang D (2011) Survey on the technology of image processing based on dct compressed domain. In: ICMT, pp 786–789

  • Lu S, Su B, Tan CL (2010) Document image binarization using background estimation and stroke edges. IJDAR 13(4):303–314

    Google Scholar 

  • Lu Y, Tan CL (2003a) Document retrieval from compressed images. Pattern Recognit 36:987–996

    Google Scholar 

  • Lu Y, Tan CL (2003b) Word searching in ccitt group 4 compressed document images. In: IEEE proceedings of ICDAR, pp 467–471

  • Lu Y, Tan CL, Huang W, Fan L (2001) An approch to word image matching based on weighted hausdorff distance. In: Proceedings of ICDAR, pp 921–925

  • Maa CY (1994) Identifying the existence of bar codes in compressed images. CVGIP. Graph Models Image Process 56(4):352–356

    Google Scholar 

  • Makinen V, Ukkonen E, Navarro G (2003) Approximate matching of run length compressed strings. Algorithmica 35:347–369

    MathSciNet  MATH  Google Scholar 

  • Manber U (1997) A text compression scheme that allows fast searching directly in the compressed file. ACM Trans Inf Syst 15(2):124–136

    Google Scholar 

  • Mandal MK, Idris F, Panchanathan S (1999) A critical evaluation of image and video indexing techniques in the compressed domain. J Image Vis Comput 17:513–529

    Google Scholar 

  • Marinai S, Gori M, Soda G (2005) Artificial neural network s for document analysis and recognition. IEEE Trans PAMI 27(1):23–35

    Google Scholar 

  • Marinai S (2008a) Introduction to document analysis and recognition. Stud Comput Intell 90:1–20

  • Marinai S (2008b) Machine learning in document analysis and recognition. Springer, Heidelberg

    MATH  Google Scholar 

  • Marti UV, Wymann D, Bunke H (2000) Ocr on compressed images using pass modes and hidden markov models. In: Proceedings of IAPR workshop on document analysis systems, pp 77–86

  • Martucci SA (1995) Image resizing in the discrete cosine transform domain. In: IEEE proceedings of internation conference on image processing, vol 2, pp 224–227

  • Mazzarri A, Leonardi R (1995) Perceptual embedded image coding using wavelet tranforms. ICIP, pp 586–587

  • Merhav N, Bhaskaran V (1997) Fast algorithms for dct-domain image down-sampling and for inverse motion compensation. IEEE Trans Circuits Syst Video Technol 7(6):468–476

    Google Scholar 

  • Meunier JL (2005) Optimized xy-cut for determining a page reading order. In: International conference on document analysis and recognition, vol 1, pp 347–351

  • Miano J (1999) Compressed image file formats: JPEG, PNG, GIF, XBM, BMP. ACM Press, New York

    Google Scholar 

  • Moiron S, Faria S, Navarro A, Silva V, Assunc P (2009) Video transcoding from h.264/avc to mpeg-2 with reduced computational complexity. Signal Process Image Commun 24:637–650

    Google Scholar 

  • Moura ES, Navarro G, Baeza-Yates R (2000) Fast and flexible word searching on compressed text. ACM Trans Inf Syst 18(2):113–139

    Google Scholar 

  • Mukherjee A, Acharya T (1994) Compressed pattern-matching. In: IEEE proceedings of data compression conference, p 468

  • Mukherjee J, Mitra SK (2006) Image filtering in the compressed domain. In: Proceedings of the 5th Indian conference on computer vision, graphics and image processing (ICVGIP’06), LNCS, vol 4338, pp 194–205

    Google Scholar 

  • Mukherjee J, Mitra SK (2008) Enhancement of color images by scaling the dct coefficients. IEEE Trans Image Process 17(10):1783–1794

    MathSciNet  MATH  Google Scholar 

  • Mukhopadhyay J, Mitra SK (2009) Color constancy in the compressed domain. In: IEEE proceedings of internation conference on image processing, pp 705–708

  • Mukhopadhyay J (2011) Image and video processing in compressed domain. Chapman and Hall/CRC, Boca Raton

    MATH  Google Scholar 

  • Murugappan A, Ramachandran B, Dhavachelvan P (2011) A survey of keyword spotting techniques for printed document images. Artif Intell Rev 35(2):119–136

    Google Scholar 

  • Na S, Jinxiao P (2011) Fast and robust skew detection for scanned documents. In: International conference on electronic and mechanical engineering and information technology (EMEIT), vol 8, pp 4170–4173

  • Nagy G (2000) Twenty years of document image analysis in pami. IEEE Trans PAMI 22(1):38–62

    MathSciNet  Google Scholar 

  • Nagy G, Seth S, Viswanathan M (1992) A prototype document image analysis system for technical journals. Computer 25(7):10–22

    Google Scholar 

  • Namboodiri AM, Jain AK (2007) Document structure and layout anal. Digit Doc Process, pp 29–48

  • Navarro G, Raffinot M (1999) A general practical approach to pattern matching over ziv-lempel compressed text. In: Proceedings of combinatorial pattern matching (LNCS 1645), pp 14–36

    MATH  Google Scholar 

  • Navarro G (2001) A guided tour to approximate string matching. ACM Comput Surv 33(1):31–88

    Google Scholar 

  • Navarro G, Raffinot M (2004) Practical and flexible pattern matching over ziv-lempel compressed text. J Discrete Algorithms 2(3):347–371

    MathSciNet  MATH  Google Scholar 

  • Ngan K, Clarke R (1980) Lowpass filtering in the cosine transform domain. In: International conference on communication

  • Nixon MS, Aguado AS (2012) Feature extraction and image processing. Elsevier, Oxford

    Google Scholar 

  • Ogier JM, Liu W, Llados J (2009) Graphics recognition: achievements, challenges and evolution. In: ICDAR 2009

  • Pirsch S (1982) Adaptive intra/interframe dpcm coder. Bell Syst Tech J 61:747–764

    Google Scholar 

  • Provos N (2001) Defending against statistical steganalysis. In: Proceedings of 10th USENIX security symposium, vol 10, pp 323–335

  • Ramanathan R, Soman KP, Thaneshwaran L, Viknesh V, Arunkumar T, Yuvaraj P (2009) A novel technique for english font recognition using support vector machines. In: International conference on advances in recent technologies in communication and computing, pp 766–769

  • Rath T, Manmatha R (2003) Features for word spotting in historical manuscripts. In: International conference on document analysis and recognition, pp 218–222

  • Rath TM, Manmatha R (2007) Word spotting for historical documents. IJDAR 9(2–4):139–152

    Google Scholar 

  • Reeves R, Kubik K, Osberger W (1997) Texture characterization of compressed aerial images using dct coefficients. In: Proceedings of SPIE: storage and retrieval for image and video databases, vol 3022, pp 398–407

  • Regentova E, Latifi S, Deng S, Yao D (2002) An algorithm with reduced operations for connected components detection in itu-t group 3/4 coded images. IEEE Trans Pattern Anal Mach Intell 24(8):1039–1047

    Google Scholar 

  • Regentova E, Latifi S, Chen D, Taghva K, Yao D (2005) Document analysis by processing jbig-encoded images. IJDAR 7:260–272

    Google Scholar 

  • Rehman A, Saba T (2012) Off-line cursive script recognition: current advances, comparisons and remaining problems. Artif Intell Rev 37:261–288

    Google Scholar 

  • Rehman A, Saba T (2014) Neural networks for document image preprocessing: state of the art. Artif Intell Rev 42(2):253–273

    Google Scholar 

  • Rizzi A, Buccino M, Panella M, Uncini A (2006) Optimal short-time features for music/speech classification of compressed audio data. In: International conference on computational intelligence for modelling, control and automation, p 210

  • Ronse C, Devijver P (1984) Connected components in binary images: the detection problem. Research Studies Press, Letchworth

    Google Scholar 

  • Rosenbaum R, Taubman D (2003) Merging images in jpeg domain. In: ICIP, vol 1, pp 249–252

  • Saini K, Kaur S (2016) Forensic examination of computer-manipulated documents using image processing techniques. Egypt J Forensic Sci 6(3):317–322

    Google Scholar 

  • Salomon D, Motta G, Bryant D (2010) Handbook of data compression. Springer, London

    MATH  Google Scholar 

  • Salton G (1988) Automatic text processing. Addison-Wesley Longman Publishing Co, Boston

    Google Scholar 

  • Saragiotis P, Papamarkos N (2008) Local skew correction in documents. IJPRAI 22(4):691–710

    Google Scholar 

  • Sayood K (2012) Introduction to data compression, 4th edn. Morgan Kaufmann, Burlington

    Google Scholar 

  • Schaefer G (2010) Content-based retrieval of compressed images. In: International workshop on databases, texts, specifications and objects (DATESO2010), pp 175–185

  • Schuller G, Gruhne M, Friedrich T (2011) Fast audio feature extraction from compressed audio data. IEEE J Sel Top Signal Process 5:1262–1271

    Google Scholar 

  • Scotney BW, Coleman S, Herron M (2005) Direct feature detection on compressed images. Pattern Recogn Lett 26:2336–2345

    Google Scholar 

  • Shahnaz F, Berry MW, Pauca VP, Plemmons RJ (2006) Document clustering using nonnegative matrix factorization. Inf Process Manag 42(2):373–386

    MATH  Google Scholar 

  • Shao X, Xu C, Wang Y, Kankanhall MS (2004) Automatic music summarization in compressed domain. In: IEEE proceedings of acoustics, speech, and signal processing, vol 4, pp 261–264

  • Shen B, Sethi I (1995) Inner-block operations on compressed images. In: Proceedings of ACM multimedia’95 San Francisco, pp 490–499

  • Shen B, Sethi I (1996) Direct feature extraction from compressed images. In: Proceedings of SPIE, storage & retrieval for image and video databases IV, vol 2670, pp 404–414

  • Shen K, Delp E (1995) A fast algorithm for video parsing using mpeg compressed sequences. In: IEEE proceedings of international conference on image processing, vol 2, pp 252–255

  • Shibata Y, Takeda M, Shinohara A, Arikawa S (1999) Pattern matching in text compressed by using anti-dictionaries. In: Proceedings, combinatorial pattern matching, vol 1645, pp 37–49

  • Shima Y, Kashioka S, Higashino J (1989) A high-speed rotation method for binary images based on coordinate operation of run data. Syst Comput Jpn 20(6):91–102

  • Shima Y, Kashioka S, Higashino J (1990) A high-speed algorithm for propagation-type labeling based on block sorting of runs in binary images. In: Proceedings of 10th international conference on pattern recognition (ICPR), vol 1, pp 655–658

  • Shiraishi S, Feng Y, Uchida S (2013) Skew estimation by parts. IEICE Trans Inf Syst 96:1503–1512

  • Shneier M, Mottaleb MA (1996) Exploiting the jpeg compression scheme for image retrieval. IEEE Trans Pattern Anal Mach Intell 18(8):849–853

  • Slimane F, Kanoun S, Hennebert J, Alimi AM, Ingold R (2013) A study on font-family and font-size recognition applied to arabic word images at ultra-low resolution. Pattern Recognit Lett 34(2):209–218

  • Smith B, Rowe L (1993) Algorithms for manipulating compressed images. IEEE Comput Graph Appl 13:34–42

  • Smith JR, Chang SF (1994) Transform features for texture classification and discrimination in large image databases. In: IEEE proceedings of ICPR, pp 407–411

  • Spitz AL (1998) Analysis of compressed document images for dominant skew, multiple skew, and logotype detection. Comput Vis Image Underst 70(3):321–334

  • T.4-Recommendation (1985) Standardization of group 3 facsimile apparatus for document transmission, terminal equipments and protocols for telematic services, vol. vii, fascicle, vii. 3, Geneva. Technical report

  • T.6-Recommendation (1985) Standardization of group 4 facsimile apparatus for document transmission, terminal equipments and protocols for telematic services, vol. vii, fascicle, vii. 3, Geneva. Technical report

  • Tamakoshi Y, Tomohiro I, Inenaga S, Bannai H, Takeda M (2013) From run length encoding to lz78 and back again. In: IEEE proceedings of data compression conference, pp 143–152

  • Tang J, Peli E, Acton S (2003) Image enhancement using a contrast measure in the compressed domain. IEEE Signal Process Lett 10:289–292

  • Tang YY, Lee S-W, Suen CY (1996) Automatic document processing: a survey. Pattern Recognit 29(12):1931–1952

  • Tao T, Mukherjee A (2005) Pattern matching in lzw compressed file. IEEE Trans Comput 54(8):929–938

  • TIFF (1992) (tagged image file format) revision 6.0 specification. Technical report

  • Tzanetakis G, Cook P (2000) Sound analysis using mpeg compressed audio. In: IEEE proceedings of acoustics, speech, and signal processing, vol 2, pp 761–764

  • Vasudev T (2007) Automatic data extraction from pre-printed input data forms: some new approaches. PhD thesis, University of Mysore

  • Venter F, Stein A (2012) Images & videos: really big data. Anal Mag, pp 15–20

  • Vetterli M (1984) Multi-dimensional sub-band coding: some theory and algorithms. Signal Process 6(2):97–112

  • Viswanath K (2009) Image transcoding in transform domain. PhD thesis, Dept. of Computer Science and Engineering, Indian Institute of Technology, Kharagpur

  • Viswanath K, Mukherjee J, Biswas PK, Pal RN (2010) Wavelet to dct transcoding in transform domain. Signal Image Video Process 4(2):129–144

  • Wang H, Chang SF (1997) A highly efficient system for automatic face region detection in mpeg video. IEEE Trans Circuits Syst Video Technol 7(4):615–628

  • Woods J, O’Neil S (1986) Subband coding of images. IEEE Trans Acoust Speech Signal Process 34:1278–1288

  • Wshah S, Kumar G, Govindaraju V (2012a) Multilingual word spotting in offline handwritten documents. In: ICPR, pp 310–313

  • Wshah S, Kumar G, Govindaraju V (2012b) Script independent word spotting in offline handwritten documents based on hidden markov models. In: ICFHR, pp 14–19

  • Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, pp 267–273

  • Yang EH, Kaltchenko A, Kieffer JC (2001) Universal lossless data compression with side information by using a conditional mpm grammar transform. IEEE Trans Inf Theory 47(6):2130–2150

  • Ye Q, Gao W, Zeng W, Zhang T, Wang W, Liu Y (2003) Objectionable image recognition system in compression domain. In: 4th international conference on intelligent data engineering and automated learning (IDEAL 2003), LNCS, vol 2690, pp 1131–1135

  • Yeo BL, Liu B (1995a) Rapid scene analysis on compressed video. IEEE Trans Circuits Syst Video Technol 5(6):533–544

  • Yeo BL, Liu B (1995b) Visual content highlighting via automatic extraction of embedded captions on mpeg compressed video. In: Proceedings of SPIE digital video compression, algorithms and technologies, pp 142–149

  • Yim C (2004) An efficient method for dct-domain separable symmetric 2-d linear filtering. IEEE Trans Circuits Syst Video Technol 14(4):517–521

  • Yong X, Guangri Q, Yongdong X, Yushan S (2010) Keyword spotting in degraded document using mixed ocr and word shape coding. In: IEEE international conference on intelligent computing and intelligent systems, pp 411–414

  • Yucun P, Qunfei Z, Kamata S (2010) Document layout analysis and reading order determination for a reading robot. In: IEEE proceedings of TENCON, pp 1607–1612

  • Zeng K, Yu J, Li C, You J, Jin T (2014) Image clustering by hyper-graph regularized non-negative matrix factorization. Neurocomputing 138:209–217

  • Zhang HJ, Low CY, Smoliar SW (1995) Video parsing and browsing using compressed data. Multimed Tools Appl 1:89–111

  • Zirari F, Ennaji A, Nicolas S, Mammass D (2013) A document image segmentation system using analysis of connected components. In: 12th international conference on document analysis and recognition, pp 753–757

  • Ziviani N, Moura ES, Navarro G, Baeza-Yates R (2000) Compression: a key for next generation text retrieval systems. IEEE Comput 33(11):37–44

Author information

Corresponding author

Correspondence to Mohammed Javed.

About this article

Cite this article

Javed, M., Nagabhushan, P. & Chaudhuri, B.B. A review on document image analysis techniques directly in the compressed domain. Artif Intell Rev 50, 539–568 (2018). https://doi.org/10.1007/s10462-017-9551-9

  • DOI: https://doi.org/10.1007/s10462-017-9551-9
