Content-Based Representation and Retrieval of Visual Media: A State-of-the-Art Review

Abstract

This paper reviews a number of recently available techniques in content analysis of visual media and their application to the indexing, retrieval, abstracting, relevance assessment, interactive perception, annotation and re-use of visual documents.

This work was performed while this author was with the Institute of Systems Science, Singapore.



Copyright information

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Aigrain, P., Zhang, H., Petkovic, D. (1996). Content-Based Representation and Retrieval of Visual Media: A State-of-the-Art Review. In: Zhang, H., Aigrain, P., Petkovic, D. (eds) Representation and Retrieval of Visual Media in Multimedia Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-34549-9_2

  • DOI: https://doi.org/10.1007/978-0-585-34549-9_2

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-7923-9771-7

  • Online ISBN: 978-0-585-34549-9

  • eBook Packages: Springer Book Archive
