Content-Based Representation and Retrieval of Visual Media: A State-of-the-Art Review

Aigrain, Philippe; Zhang, Hongjiang; Petkovic, Dragutin

doi:10.1007/978-0-585-34549-9_2

Abstract

This paper reviews a number of recently available techniques in content analysis of visual media and their application to the indexing, retrieval, abstracting, relevance assessment, interactive perception, annotation and re-use of visual documents.

This work was performed while this author was with the Institute of Systems Science, Singapore.

Author information

Authors and Affiliations

Institut de Recherche en Informatique de Toulouse, Université Paul Sabatier, 118, route de Narbonne, F-31062, Toulouse Cedex, France
Philippe Aigrain
Broadband Information Systems Lab., Hewlett-Packard Labs., 1501 Page Mill Road, Palo Alto, CA, 94304, USA
Hongjiang Zhang
IBM Almaden Research Center, San Jose, CA, 95120-6099, USA
Dragutin Petkovic

Editor information

Editors and Affiliations

Hewlett Packard Laboratories, USA
HongJiang Zhang
Université Paul Sabatier, France
Philippe Aigrain
IBM Almaden Research Center, Almaden
Dragutin Petkovic

Cite this chapter

Aigrain, P., Zhang, H., Petkovic, D. (1996). Content-Based Representation and Retrieval of Visual Media: A State-of-the-Art Review. In: Zhang, H., Aigrain, P., Petkovic, D. (eds) Representation and Retrieval of Visual Media in Multimedia Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-34549-9_2

DOI: https://doi.org/10.1007/978-0-585-34549-9_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-9771-7
Online ISBN: 978-0-585-34549-9
eBook Packages: Springer Book Archive