ABSTRACT
Libraries have traditionally relied on manual image annotation to index and later retrieve their image collections. However, manual annotation is expensive and labor-intensive, and hence there has been great interest in automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as that of a model based on word-blob co-occurrence and twice as good as that of a state-of-the-art model derived from machine translation. Our approach shows the usefulness of formal information retrieval models for the task of image annotation and retrieval.
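The idea of scoring a word against the blobs of an unannotated image can be illustrated with a minimal sketch. It assumes each training image is stored as a pair of lists (annotation words, blob identifiers), uses a uniform prior over training images, and smooths per-image counts with collection-wide counts. The function name `annotate`, the smoothing weights `alpha` and `beta`, and the toy data are all hypothetical choices made for illustration. This is not necessarily the exact estimator used in the paper.

```python
from collections import Counter


def annotate(training, query_blobs, alpha=0.1, beta=0.9):
    """Return a normalized score per word for an image given its blobs.

    training: list of (words, blobs) pairs, one per annotated image.
    alpha, beta: assumed smoothing weights for words and blobs.
    """
    word_totals, blob_totals = Counter(), Counter()
    total_size = 0
    for words, blobs in training:
        word_totals.update(words)
        blob_totals.update(blobs)
        total_size += len(words) + len(blobs)

    prior = 1.0 / len(training)  # uniform prior over training images (assumption)
    scores = {}
    for w in word_totals:
        score = 0.0
        for words, blobs in training:
            size = len(words) + len(blobs)
            # Smoothed probability of the word under this training image.
            p_word = ((1 - alpha) * words.count(w) / size
                      + alpha * word_totals[w] / total_size)
            # Smoothed probability of every query blob under the same image.
            p_blobs = 1.0
            for b in query_blobs:
                p_blobs *= ((1 - beta) * blobs.count(b) / size
                            + beta * blob_totals[b] / total_size)
            score += prior * p_word * p_blobs
        scores[w] = score

    # Normalize so the scores over the vocabulary sum to one.
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}


# Toy training set: blob identifiers are arbitrary integers.
training = [
    (["tiger", "grass"], [1, 2, 2]),
    (["sky", "plane"], [3, 4]),
    (["tiger", "water"], [1, 5]),
]
scores = annotate(training, query_blobs=[1, 2])
best_word = max(scores, key=scores.get)
```

Ranking the vocabulary by these scores gives candidate annotations for the image. For retrieval, the same scores could be computed for every image in a collection and the images ranked by the score of the query word.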
Index Terms
- Automatic image annotation and retrieval using cross-media relevance models