Evaluation and analysis of similarity measures for content-based visual information retrieval

  • Regular Paper
Multimedia Systems

Abstract

The selection of appropriate proximity measures is one of the crucial success factors of content-based visual information retrieval. In this area of research, proximity measures are used to estimate the similarity of media objects from the distance between their feature vectors. The research focus of this work is the identification of proximity measures that perform better than the usual choices (e.g., Minkowski metrics). We evaluate a catalogue of 37 measures selected from various areas (psychology, sociology, economics, etc.). The evaluation is based on content-based MPEG-7 descriptions of carefully selected media collections. Unfortunately, some proximity measures are only defined on predicates (e.g., most psychological measures). One major contribution of this paper is a model that allows such measures to be applied to continuous feature data. The evaluation results uncover proximity measures that perform better than others on content-based features. Some predicate-based measures clearly outperform the frequently used distance norms. Finally, the discussion of the evaluation leads to a catalogue of the mathematical terms that characterize successful retrieval and browsing measures.
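To make the contrast between a distance norm and a predicate-based measure concrete, the following minimal sketch compares a Minkowski metric with a Jaccard-style coefficient applied to continuous features. The abstract does not specify the paper's model, so the fuzzy-set generalization used here (minimum as "shared", positive differences as "distinctive") is an assumption chosen for illustration only, and the function names `minkowski`, `fuzzy_counts`, and `jaccard_like` are hypothetical. Feature values are assumed to be normalized to [0, 1].

```python
from typing import Sequence, Tuple


def minkowski(x: Sequence[float], y: Sequence[float], p: float = 2.0) -> float:
    """Minkowski distance of order p (p=1: city block, p=2: Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def fuzzy_counts(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Continuous stand-ins for the predicate counts a, b, c.

    ASSUMPTION (not taken from the paper): values lie in [0, 1] and
      a = amount of each feature shared by x and y,
      b = amount present in x but not in y,
      c = amount present in y but not in x.
    For binary vectors these reduce to the usual predicate counts.
    """
    a = sum(min(xi, yi) for xi, yi in zip(x, y))
    b = sum(max(xi - yi, 0.0) for xi, yi in zip(x, y))
    c = sum(max(yi - xi, 0.0) for xi, yi in zip(x, y))
    return a, b, c


def jaccard_like(x: Sequence[float], y: Sequence[float]) -> float:
    """Jaccard-style similarity a / (a + b + c) on continuous features."""
    a, b, c = fuzzy_counts(x, y)
    denom = a + b + c
    # Two all-zero vectors are treated as identical by convention.
    return a / denom if denom > 0 else 1.0


if __name__ == "__main__":
    q = [0.9, 0.1, 0.4]
    d = [0.8, 0.3, 0.4]
    print("Euclidean distance:", minkowski(q, d, p=2.0))
    print("Jaccard-like similarity:", jaccard_like(q, d))
```

Note that the first function returns a distance (smaller means more similar) while the second returns a similarity (larger means more similar), so a ranking based on one must be inverted before it is compared with the other.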

Author information

Correspondence to Horst Eidenberger.

About this article

Cite this article

Eidenberger, H. Evaluation and analysis of similarity measures for content-based visual information retrieval. Multimedia Systems 12, 71–87 (2006). https://doi.org/10.1007/s00530-006-0043-z

  • DOI: https://doi.org/10.1007/s00530-006-0043-z
