ABSTRACT
How to optimize ranking metrics such as Normalized Discounted Cumulative Gain (NDCG) is an important but challenging problem, because ranking metrics are either flat or discontinuous everywhere, which makes them hard to optimize directly. Among existing approaches, LambdaRank is a novel algorithm that incorporates ranking metrics into its learning procedure. Though empirically effective, it still lacks theoretical justification: for example, the underlying loss that LambdaRank optimizes has remained unknown until now, so there is no principled way to advance the algorithm further. In this paper, we present LambdaLoss, a probabilistic framework for ranking metric optimization. We show that LambdaRank is a special configuration with a well-defined loss in the LambdaLoss framework, thus providing theoretical justification for it. More importantly, the LambdaLoss framework allows us to define metric-driven loss functions that have a clear connection to different ranking metrics. We present several such loss functions in this paper and evaluate them on three publicly available data sets. Experimental results show that our metric-driven loss functions significantly improve state-of-the-art learning-to-rank algorithms.
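To make the metrics concrete, the sketch below computes NDCG (with the standard exponential gain and logarithmic position discount) and the |ΔNDCG| pair-swap weight that LambdaRank-style methods use to scale pairwise gradients. This is an illustrative reconstruction, not code from the paper; the function names and the toy relevance labels are my own.

```python
import math

def dcg(relevances):
    # DCG with the common (2^rel - 1) gain and log2(position + 1) discount.
    return sum((2**rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the ideal DCG, i.e. items sorted by relevance descending.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def delta_ndcg(relevances, i, j):
    # |ΔNDCG| from swapping positions i and j: the weight that
    # LambdaRank-style gradients attach to the (i, j) document pair.
    swapped = list(relevances)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(ndcg(swapped) - ndcg(relevances))

# Toy example: relevance labels of documents in their current ranked order.
rels = [1, 3, 0, 2]
print(round(ndcg(rels), 4))            # quality of the current ranking
print(round(delta_ndcg(rels, 0, 1), 4))  # gain from swapping the top two
```

Because the highly relevant document (label 3) sits below a less relevant one, swapping the top two positions yields a large |ΔNDCG|, which is exactly why LambdaRank weights that pair's gradient heavily.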