Abstract.

Real-world text on street signs, nameplates, etc. often lies in an oblique plane and hence cannot be recognized by traditional OCR systems due to perspective distortion. Furthermore, such text often comprises only one or two lines, preventing the use of existing perspective rectification methods that were primarily designed for images of document pages. We propose an approach that reliably rectifies and subsequently recognizes individual lines of text. Our system, which includes novel algorithms for extraction of text from real-world scenery, perspective rectification, and binarization, has been rigorously tested on still imagery as well as on MPEG-2 video clips in real time.

Author information

Correspondence to Gregory K. Myers.

Additional information

Received: 15 December 2003, Published online: 14 December 2004

About this article

Cite this article

Myers, G.K., Bolles, R.C., Luong, QT. et al. Rectification and recognition of text in 3-D scenes. IJDAR 7, 147–158 (2005). https://doi.org/10.1007/s10032-004-0133-4

  • DOI: https://doi.org/10.1007/s10032-004-0133-4
