DOI: 10.1145/1367497.1367529
research-article

Online learning from click data for sponsored search

Published: 21 April 2008

ABSTRACT

Sponsored search is one of the enabling technologies for today's Web search engines. It corresponds to matching and showing ads related to the user query on the search engine results page. Users are likely to click on topically related ads and the advertisers pay only when a user clicks on their ad. Hence, it is important to be able to predict if an ad is likely to be clicked, and maximize the number of clicks. We investigate the sponsored search problem from a machine learning perspective with respect to three main sub-problems: how to use click data for training and evaluation, which learning framework is more suitable for the task, and which features are useful for existing models. We perform a large scale evaluation based on data from a commercial Web search engine. Results show that it is possible to learn and evaluate directly and exclusively on click data encoding pairwise preferences following simple and conservative assumptions. We find that online multilayer perceptron learning, based on a small set of features representing content similarity of different kinds, significantly outperforms an information retrieval baseline and other learning models, providing a suitable framework for the sponsored search task.
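
The idea of learning from click data encoded as pairwise preferences can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes one conservative reading of a click log: a clicked ad is preferred over every unclicked ad displayed above it. The function names, the toy feature vectors, and the linear perceptron update are all hypothetical choices made for the example; the multilayer perceptron that the abstract reports as the best model is not reproduced here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def pairwise_preferences(clicked: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (preferred, other) index pairs for one ranked block of ads.

    Assumption (illustrative only): a clicked ad is preferred over every
    unclicked ad shown above it; nothing is inferred about ads below it.
    """
    pairs = []
    for i, was_clicked in enumerate(clicked):
        if was_clicked:
            pairs.extend((i, j) for j in range(i) if not clicked[j])
    return pairs


def dot(w: Vector, x: Vector) -> float:
    """Linear score of a feature vector under weights w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def perceptron_update(w: List[float], better: Vector, worse: Vector,
                      lr: float = 1.0) -> bool:
    """One online step of a linear ranking perceptron.

    If the preferred ad does not score strictly higher than the other ad,
    move the weights toward (better - worse). Returns True on an update.
    """
    if dot(w, better) > dot(w, worse):
        return False
    for k in range(len(w)):
        w[k] += lr * (better[k] - worse[k])
    return True


# Toy block of three ads (hypothetical features); only the third was clicked.
features = [[0.2, 0.1], [0.1, 0.0], [0.9, 0.7]]
clicks = [False, False, True]

weights = [0.0, 0.0]
for preferred, other in pairwise_preferences(clicks):
    perceptron_update(weights, features[preferred], features[other])
```

Because each update touches only one preference pair, the model can be trained in a single pass over a click log, which is what makes the online setting attractive for large volumes of data.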

                  Reviews

                  Julien Velcin

                  Ranking advertisements in sponsored search is a very strategic task nowadays. Using machine learning (ML) techniques based on queries and clicks is a very natural way to solve this kind of task. The paper proposes two contributions: using only users' logs for learning and evaluation, and comparing three ML methods, namely simple perceptron, ranking perceptron, and multilayer perceptron (MLP). A minor contribution consists of using different features to describe the instances of the learning problem, such as cosine and word overlap. The text is well structured and the related references are very accurate. Using only click data for learning is a valid approach. However, sponsored search is highly related to the economic model; this is clearly stated in the paper. Unfortunately, Ciaramita et al. do not go into detail on the question of aggregating the two models. Another weak point relates to the learning algorithms. It is well known that a binary classifier is not well suited to the ranking task; I wonder why the authors consider it in their experiments. The description of the MLP is not very clear and lacks a simple figure. In Equation 11, there seems to be a mistake in the back-propagation formula. Also, this paper lacks a comparison against other nonlinear classifiers, such as support vector machines (SVMs). Finally, the authors do not explain why they use two different stemming algorithms, Krovetz and Porter. Otherwise, this paper is of high quality. (Online Computing Reviews Service)
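
The cosine and word-overlap features mentioned in the review can be illustrated with a small sketch. It assumes lowercase whitespace tokenization and raw term counts, with no stemming or term weighting; the paper's actual feature definitions are not given on this page, so the function names and choices below are hypothetical.

```python
import math
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    # Assumption: lowercase whitespace tokenization, no stemming.
    return text.lower().split()


def cosine_similarity(query: str, ad: str) -> float:
    """Cosine of the angle between raw term-count vectors of query and ad."""
    q, a = Counter(tokens(query)), Counter(tokens(ad))
    dot = sum(q[t] * a[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in a.values())))
    return dot / norm if norm else 0.0


def word_overlap(query: str, ad: str) -> float:
    """Fraction of distinct query words that also occur in the ad."""
    q, a = set(tokens(query)), set(tokens(ad))
    return len(q & a) / len(q) if q else 0.0
```

For the query "cheap flights rome" and the ad text "cheap hotels rome", two of the three query words occur in the ad, so `word_overlap` returns 2/3.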

                  • Published in

                    WWW '08: Proceedings of the 17th international conference on World Wide Web
                    April 2008
                    1326 pages
                    ISBN:9781605580852
                    DOI:10.1145/1367497

                    Copyright © 2008 ACM

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States

                    Acceptance Rates

                    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
