Content-based multimedia information retrieval: State of the art and challenges

Authors:
Michael S. Lew, Leiden University, The Netherlands
Nicu Sebe, University of Amsterdam, Amsterdam, The Netherlands
Chabane Djeraba, LIFL, France
Ramesh Jain, University of California at Irvine, USA

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 2, Issue 1, pp 1–19. https://doi.org/10.1145/1126004.1126005

Published: 01 February 2006

Abstract

Extending beyond the boundaries of science, art, and culture, content-based multimedia information retrieval provides new paradigms and methods for searching through the myriad variety of media all over the world. This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques. Based on the current state of the art, we discuss the major challenges for the future.

Index Terms

Content-based multimedia information retrieval: State of the art and challenges
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
  2. Machine learning
2. Information systems
  1. Information retrieval

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 2, Issue 1
February 2006
89 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/1126004

Copyright © 2006 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Multimedia information retrieval
audio retrieval
human-computer interaction
image databases
image search
multimedia indexing
video retrieval
Qualifiers
- article
Article Metrics
- Total Citations: 1,144
- Total Downloads: 13,259
- Downloads (Last 12 months): 244
- Downloads (Last 6 weeks): 35