On the local optimality of LambdaRank

Authors:
Pinar Donmez, Carnegie Mellon University, Pittsburgh, PA, USA
Krysta M. Svore, Microsoft Research, Redmond, WA, USA
Christopher J.C. Burges, Microsoft Research, Redmond, WA, USA

SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
July 2009
Pages 460–467
https://doi.org/10.1145/1571941.1572021

Published: 19 July 2009

ABSTRACT

A machine learning approach to learning to rank trains a model to optimize a target evaluation measure with respect to training data. Existing information retrieval (IR) measures are currently impossible to optimize directly except for models with a very small number of parameters. The IR community thus faces a major challenge: how to optimize IR measures of interest directly. In this paper, we present a solution. Specifically, we show that LambdaRank, which smoothly approximates the gradient of the target measure, can be adapted to work with four popular IR target evaluation measures using the same underlying gradient construction. It is therefore likely that this construction extends to other evaluation measures. We show empirically that LambdaRank finds a locally optimal solution for mean NDCG@10, mean NDCG, MAP, and MRR with a 99% confidence rate. We also show that the amount of effective training data varies with the IR measure and that, with a sufficiently large training set, matching the training optimization measure to the target evaluation measure yields the best accuracy.
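
For readers unfamiliar with the four measures named in the abstract, the following is a minimal Python sketch of how they are commonly computed for a single query. It is not taken from the paper. The function names are hypothetical, the NDCG gain `2**rel - 1` with a `log2` position discount is one common convention assumed here, and average precision is normalized by the relevant documents present in the ranked list (which assumes every relevant document was retrieved).

```python
import math

def dcg_at_k(rels, k=None):
    """DCG of graded labels given in ranked order (assumed gain: 2**rel - 1)."""
    top = rels if k is None else rels[:k]
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(top))

def ndcg_at_k(rels, k=None):
    """NDCG: DCG divided by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def average_precision(rels):
    """AP for one query; any label > 0 counts as relevant (assumption)."""
    hits, total = 0, 0.0
    for rank, r in enumerate(rels, start=1):
        if r > 0:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0

def reciprocal_rank(rels):
    """1 / rank of the first relevant document, or 0 if there is none."""
    for rank, r in enumerate(rels, start=1):
        if r > 0:
            return 1.0 / rank
    return 0.0

def mean_metric(metric, queries):
    """Average a per-query metric over queries (gives mean NDCG, MAP, MRR)."""
    return sum(metric(q) for q in queries) / len(queries)

# Two toy queries, each a list of relevance labels in ranked order.
queries = [[0, 1, 0, 1], [1, 0, 0, 0]]
print("mean NDCG@10:", mean_metric(lambda q: ndcg_at_k(q, 10), queries))
print("MAP:", mean_metric(average_precision, queries))
print("MRR:", mean_metric(reciprocal_rank, queries))
```

All four quantities depend on the model scores only through the sorted order of the documents, so they are flat or discontinuous as functions of the model parameters. This is why the abstract describes them as impossible to optimize directly and why a smooth approximation of the gradient is needed.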

Index Terms

On the local optimality of LambdaRank
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information retrieval

Published in
SIGIR '09: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
July 2009
896 pages
ISBN: 9781605584836
DOI: 10.1145/1571941
General Chairs:
James Allan, University of Massachusetts Amherst, USA
Javed Aslam, Northeastern University, USA
Program Chairs:
Mark Sanderson, University of Sheffield, UK
ChengXiang Zhai, University of Illinois at Urbana-Champaign, USA
Justin Zobel, University of Melbourne, Australia
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
learning to rank
web search
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%