DOI: 10.1145/2911451.2914803
research-article · Public Access

Counterfactual Evaluation and Learning for Search, Recommendation and Ad Placement

Published: 07 July 2016

ABSTRACT

Online metrics measured through A/B tests have become the gold standard for many evaluation questions. But can we get the same results as A/B tests without actually fielding a new system? And can we train systems to optimize online metrics without subjecting users to an online learning algorithm? This tutorial summarizes and unifies the emerging body of methods on counterfactual evaluation and learning. These counterfactual techniques provide a well-founded way to evaluate and optimize online metrics by exploiting logs of past user interactions. In particular, the tutorial unifies the causal inference, information retrieval, and machine learning view of this problem, providing the basis for future research in this emerging area of great potential impact. Supplementary material and resources are available online at http://www.cs.cornell.edu/~adith/CfactSIGIR2016.
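The central technique the abstract describes — estimating an online metric from logs of past user interactions — is commonly done with the inverse propensity score (IPS) estimator. Below is a minimal sketch of that estimator; the function name, data layout, and example policy are illustrative assumptions, not code from the tutorial. It assumes the logging policy recorded the propensity (probability) of each action it took.

```python
def ips_estimate(logs, target_policy):
    """Inverse propensity score (IPS) estimate of an online metric for
    `target_policy`, computed purely from logged interactions.

    Each log entry is (context, action, reward, logging_propensity),
    where logging_propensity is the probability the logging policy
    assigned to the action it actually took.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to choose the logged action than the
        # logging policy was.
        total += reward * target_policy(context, action) / propensity
    return total / len(logs)


# Toy example (hypothetical data): a uniform logging policy over two
# actions (propensity 0.5 each), evaluated for a deterministic target
# policy that always picks action 1.
logs = [(None, 1, 1.0, 0.5), (None, 0, 0.0, 0.5)]
always_one = lambda context, action: 1.0 if action == 1 else 0.0
estimate = ips_estimate(logs, always_one)
```

The estimate is unbiased whenever the logging policy gives every action the target policy might take a nonzero propensity; much of the tutorial's subject matter concerns controlling the variance of such reweighted estimates.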


Published in:
SIGIR '16: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2016, 1296 pages
ISBN: 9781450340694
DOI: 10.1145/2911451
Copyright © 2016 ACM

Publisher: Association for Computing Machinery, New York, NY, United States



Acceptance rates: SIGIR '16 paper acceptance rate 62 of 341 submissions (18%); overall acceptance rate 792 of 3,983 submissions (20%).
