Abstract
Extending beyond the boundaries of science, art, and culture, content-based multimedia information retrieval provides new paradigms and methods for searching through the myriad variety of media all over the world. This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions, which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques. Based on the current state of the art, we discuss the major challenges for the future.