DOI: 10.1145/860435.860459
Article

Automatic image annotation and retrieval using cross-media relevance models

Published: 28 July 2003

ABSTRACT

Libraries have traditionally used manual image annotation to index and later retrieve their image collections. However, manual image annotation is an expensive and labor-intensive procedure, and hence there has been great interest in automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as a model based on word-blob co-occurrence and twice as good as a state-of-the-art model derived from machine translation. Our approach shows the usefulness of formal information retrieval models for the task of image annotation and retrieval.
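
To make the idea of predicting a word from the blobs in an image concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not give the estimation formulas, so the sketch assumes one common relevance-model formulation: the joint probability of a word and the query blobs is summed over training images, with each per-image estimate smoothed against collection-wide counts. The function names, the smoothing weights `alpha` and `beta`, the uniform prior over training images, and the toy data are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy training set: (annotation words, blob ids) per image.
# Blob ids stand in for cluster labels of image-region features.
TRAINING = [
    (["tiger", "grass"], [1, 2]),
    (["tiger", "water"], [1, 3]),
    (["sky", "plane"], [4, 5]),
]


def collect_counts(training_set):
    """Gather per-image and collection-wide counts of words and blobs."""
    images = []
    word_total, blob_total = Counter(), Counter()
    for words, blobs in training_set:
        word_counts, blob_counts = Counter(words), Counter(blobs)
        images.append((word_counts, blob_counts, len(words) + len(blobs)))
        word_total.update(word_counts)
        blob_total.update(blob_counts)
    total = sum(word_total.values()) + sum(blob_total.values())
    return images, word_total, blob_total, total


def smoothed(in_image, image_size, in_collection, collection_size, weight):
    """Interpolate a per-image estimate with a collection-wide estimate."""
    return ((1 - weight) * in_image / image_size
            + weight * in_collection / collection_size)


def annotate(query_blobs, training_set, alpha=0.1, beta=0.9, top_k=2):
    """Rank words by an assumed estimate of P(word | query blobs).

    alpha and beta are assumed smoothing weights for words and blobs.
    """
    images, word_total, blob_total, total = collect_counts(training_set)
    prior = 1.0 / len(images)  # assumed uniform prior over training images
    scores = {}
    for word in word_total:
        joint = 0.0
        for word_counts, blob_counts, size in images:
            p = prior * smoothed(word_counts[word], size,
                                 word_total[word], total, alpha)
            for blob in query_blobs:
                p *= smoothed(blob_counts[blob], size,
                              blob_total[blob], total, beta)
            joint += p
        scores[word] = joint
    norm = sum(scores.values())
    if norm == 0:  # a query blob never seen in training gives no evidence
        return []
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return [(word, scores[word] / norm) for word in ranked]


if __name__ == "__main__":
    print(annotate([1, 2], TRAINING))
```

On this toy set, an unseen image described by blobs 1 and 2 ranks "tiger" and "grass" highest, because those words co-occur with the same blobs in the training images. Retrieval for a one-word query could reuse the same scores by ranking images according to the probability assigned to the query word.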

      • Published in

        SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
        July 2003
        490 pages
        ISBN: 1581136463
        DOI: 10.1145/860435

        Copyright © 2003 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        SIGIR '03 Paper Acceptance Rate: 46 of 266 submissions, 17%
        Overall Acceptance Rate: 792 of 3,983 submissions, 20%
