Content-based multimedia information retrieval: State of the art and challenges

Published: 01 February 2006

Abstract

Extending beyond the boundaries of science, art, and culture, content-based multimedia information retrieval provides new paradigms and methods for searching through the myriad variety of media all over the world. This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques. Based on the current state of the art, we discuss the major challenges for the future.

