Seven Numeric Properties of Effectiveness Metrics

  • Conference paper
Information Retrieval Technology (AIRS 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8281)

Abstract

Search effectiveness metrics quantify the relevance of the ranked document lists returned by retrieval systems. In this paper we characterize metrics according to seven numeric properties: boundedness, monotonicity, convergence, top-weightedness, localization, completeness, and realizability. We demonstrate that these properties partition the commonly used evaluation metrics, and hence provide a framework in which the relationships between effectiveness metrics, including their relative merits for different applications, can be better understood.
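The abstract names the seven properties without defining them, so the following is only a minimal sketch under stated assumptions, not the paper's own formulation. It takes two widely used metrics, precision at depth k and rank-biased precision (RBP), and checks two informal readings: boundedness read as "every score lies in [0, 1]", and top-weightedness read as "moving a relevant document closer to the top strictly raises the score". The function names, the persistence value p = 0.8, and the depth-5 test rankings are all hypothetical choices made for illustration.

```python
from itertools import product


def precision_at_k(rels, k):
    """Fraction of the top-k documents that are relevant (binary relevance)."""
    return sum(rels[:k]) / k


def p_at_5(rels):
    """Precision at depth 5, wrapped so it takes a single argument."""
    return precision_at_k(rels, 5)


def rbp(rels, p=0.8):
    """Rank-biased precision: (1 - p) * sum_i rels[i] * p**i, ranks from 0.

    p = 0.8 is an illustrative persistence value, not taken from the paper.
    """
    return (1 - p) * sum(r * p**i for i, r in enumerate(rels))


def is_bounded(metric, depth=5):
    """Assumed reading of boundedness: 0 <= score <= 1 on every
    binary ranking of the given depth (checked exhaustively)."""
    return all(
        0.0 <= metric(list(r)) <= 1.0
        for r in product([0, 1], repeat=depth)
    )


def rewards_promotion(metric):
    """Assumed reading of top-weightedness: moving the only relevant
    document from rank 2 to rank 1 strictly raises the score."""
    return metric([1, 0, 0, 0, 0]) > metric([0, 1, 0, 0, 0])


if __name__ == "__main__":
    for name, metric in [("P@5", p_at_5), ("RBP", rbp)]:
        print(name, is_bounded(metric), rewards_promotion(metric))
```

Under these assumed readings both metrics pass the boundedness check, but only RBP distinguishes the two rankings in `rewards_promotion`; precision at depth 5 scores them identically because it ignores positions within the top five. This is the kind of contrast a property-based partition of metrics is meant to expose.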




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moffat, A. (2013). Seven Numeric Properties of Effectiveness Metrics. In: Banchs, R.E., Silvestri, F., Liu, TY., Zhang, M., Gao, S., Lang, J. (eds) Information Retrieval Technology. AIRS 2013. Lecture Notes in Computer Science, vol 8281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45068-6_1

  • DOI: https://doi.org/10.1007/978-3-642-45068-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45067-9

  • Online ISBN: 978-3-642-45068-6

  • eBook Packages: Computer Science, Computer Science (R0)
