Content-based query of image databases: inspirations from text retrieval

https://doi.org/10.1016/S0167-8655(00)00081-7

Abstract

This paper reports the application of techniques inspired by text retrieval research to content-based image retrieval. In particular, we show how the use of an inverted file data structure permits the use of an extremely high-dimensional feature-space, by restricting search to the subspace spanned by the features present in the query. A suitably sparse set of colour and texture features is proposed. A weighting scheme based on feature frequencies is used to combine disparate features in a compatible manner, and naturally extends to incorporate relevance feedback queries. The use of relevance feedback is shown consistently to improve system performance.

Introduction

In recent years the use of digital image databases has become common, both on the web and in general publishing. Consequently, the efficient querying and browsing of large image databases is ever more important. Content-based retrieval from large text databases has been studied for decades, yet the insights and techniques of text retrieval (TR) have largely been ignored or reinvented by content-based image retrieval (CBIR) researchers. The utility of relevance feedback (RF) is long-established (Salton and Buckley, 1990), yet its application in CBIR systems (CBIRSs) is very recent (Wood et al., 1998). Similarly, many term-weighting approaches have been investigated, both empirically and theoretically (Salton and Buckley, 1988). Performance evaluation has also been thoroughly studied (Salton, 1992), yet precision and recall, the usual performance measures, are not widely used in CBIR.

TR systems usually treat each possible term as a search space dimension: O(10^4) dimensions are thus typical. Crucially, in such systems both queries and stored objects are sparse: they have only a small subset (O(10^2)) of all possible attributes. Search can thus be restricted to the subspace spanned by the query terms. The data structure which makes this efficient is the inverted file (IF) (Squire et al., 1999). Conversely, CBIR researchers have devoted considerable effort to the search for compact image representations (choosing the “right” features), and the use of dimensionality reduction techniques such as factor analysis (Pun and Squire, 1996).
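As an illustration of why an inverted file makes sparse, high-dimensional search cheap, the following is a minimal Python sketch, not the Viper implementation. The function names, the toy collection, and the simple shared-feature count used as a score are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_file(images):
    """Map each feature id to the set of image ids that contain it."""
    inverted = defaultdict(set)
    for image_id, features in images.items():
        for feature in features:
            inverted[feature].add(image_id)
    return inverted


def query(inverted, query_features):
    """Rank images by the number of query features they share.

    Only the posting lists of features present in the query are read,
    so the cost depends on the sparsity of the query rather than on
    the total number of possible features.
    """
    scores = defaultdict(int)
    for feature in query_features:
        for image_id in inverted.get(feature, ()):
            scores[image_id] += 1
    # Highest score first; ties broken by image id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy collection: image id -> set of discrete feature ids.
images = {
    "a": {1, 5, 9},
    "b": {2, 5},
    "c": {7, 8},
}
inverted = build_inverted_file(images)
ranking = query(inverted, {5, 9})
```

Image "c" shares no feature with the query, so its entries are never visited and it does not appear in the ranking at all.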

We present a CBIRS which uses an IF with more than 80 000 possible features (≈O(10^3) features per image). A feature weighting scheme based on feature frequencies in both the query image and the entire collection, commonly used in TR, is employed. RF is also used. Evaluation using precision and recall demonstrates a clear improvement over a previously reported system using a smaller feature set and nearest-neighbour search.
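The exact weighting formula is not given in this excerpt. The sketch below therefore assumes a generic scheme of the kind common in TR: the frequency of a feature in the query multiplied by the logarithm of its inverse collection frequency. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def collection_frequencies(images):
    """Fraction of images in the collection that contain each feature."""
    counts = Counter()
    for features in images.values():
        counts.update(set(features))  # count each feature once per image
    n = len(images)
    return {feature: count / n for feature, count in counts.items()}


def feature_weights(query_features, cf):
    """Assumed weighting: query frequency * log(1 / collection frequency)."""
    tf = Counter(query_features)
    total = sum(tf.values())
    return {
        feature: (count / total) * math.log(1.0 / cf[feature])
        for feature, count in tf.items()
        if feature in cf
    }


# Hypothetical toy collection: image id -> list of feature occurrences.
images = {
    "a": ["red", "red", "stripe"],
    "b": ["red", "blob"],
    "c": ["red", "stripe", "edge"],
    "d": ["red"],
}
cf = collection_frequencies(images)
weights = feature_weights(images["a"], cf)
```

Under this assumed scheme, a feature that occurs in every image ("red") receives zero weight however often it appears in the query, while a rarer feature ("stripe") contributes to the ranking.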

Section snippets

Related work

CBIR researchers generally acknowledge that semantic retrieval remains impossible. The usual approach is to attempt to capture image similarity using some function of a small set of low-level features. Most systems employ features based on colour, texture or shape. Features are often computed globally, and contain no spatial information. Some systems allow the user to influence the relative weights of these classes of features.

Features. The use of colour features, usually calculated in a space

Viper system overview

In this section, a brief overview of the Viper system is presented. A more detailed account may be found in Squire et al. (1999). Viper employs more than 80 000 simple colour and spatial frequency features, both local and global, extracted at several scales. These are intended to correspond (roughly) to features present in the retina and early visual cortex. The fundamental difference between traditional computer vision and image database applications is that there is a

Experiments

Viper performance was evaluated using a set of 500 heterogeneous colour images provided by Télévision Suisse Romande (see Fig. 1). Ten images were selected as queries. Five users examined all 500 images to determine their relevant sets for each query. These relevant sets varied greatly in size, and the degree of visual similarity within each set also varied greatly. Viper returned the top 20 ranked images for each query.
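For reference, precision and recall over a truncated ranking can be computed as in the following minimal sketch. The function name, the image ids and the relevant set are hypothetical, and the cutoff k defaults to 20 only to mirror the top-20 lists described above.

```python
def precision_recall(ranked, relevant, k=20):
    """Precision and recall of the top-k ranked images.

    precision = relevant images retrieved / images retrieved
    recall    = relevant images retrieved / all relevant images
    """
    top = ranked[:k]
    hits = sum(1 for image_id in top if image_id in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking and user-defined relevant set.
ranked = ["img3", "img7", "img1", "img9", "img4"]
relevant = {"img3", "img1", "img8", "img2"}
p, r = precision_recall(ranked, relevant, k=4)
```

Because the relevant sets vary in size, the recall attainable within a fixed top-20 list differs between queries: a relevant set of more than 20 images can never reach a recall of 1.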

Conclusion

In this paper we have shown how techniques inspired by text retrieval can be applied to the content-based query of image databases. We believe that there is much to be learnt from the decades of research in text retrieval, despite the fact that the terms of text queries (words) are much closer to the semantic level than the simple features usually used for image retrieval.

The use of inverted files, coupled with an appropriate choice of discrete features, allows feature spaces of extremely high

References (22)

  • Jain, A.K., et al., 1996. Image retrieval using color and shape. Pattern Recognition.
  • Pun, T., et al., 1996. Statistical structuring of pictorial databases for content-based image retrieval systems. Pattern Recognition Letters.
  • Salton, G., et al., 1988. Term weighting approaches in automatic text retrieval. Inf. Process. Manage.
  • Carson, C., Belongie, S., Greenspan, H., Malik, J., 1997. Region-based image querying. In: Proc. 1997 IEEE Conf....
  • Cohen, S.D., Guibas, L.J., 1997. Shape-based image retrieval using geometric hashing. In: Proc. ARPA Image...
  • Huang, J., Kumar, S.R., Mitra, M., 1997. Combining supervised learning with color correlograms for content-based image...
  • Jain, A., et al., 1998. A multiscale representation including opponent color features for texture recognition. IEEE Trans. Image Processing.
  • Ma, W., Manjunath, B., 1996. Texture features and learning similarity. In: Proc. 1996 IEEE Conf. Computer Vision and...
  • Mokhtarian, F., Abbasi, S., Kittler, J., 1996. Efficient and robust retrieval by shape content through curvature scale...
  • Niblack, W., Barber, R., Equitz, W., Flickner, M.D., Glasman, E.H., Petkovic, D., Yanker, P., Faloutsos, C., Taubin,...
  • Pentland, A., et al., 1996. Photobook: tools for content-based manipulation of image databases. Internat. J. Comput. Vision.
1 Supported by Swiss National Science Foundation grant No. 2000-052426.97.
