ABSTRACT
Current research on web search has focused on optimizing and evaluating single queries. However, a significant fraction of user queries are part of more complex tasks [20] which span multiple queries across one or more search sessions [26,24]. An ideal search engine would not only retrieve relevant results for a user's particular query but also be able to identify when the user is engaged in a more complex task and aid the user in completing that task [29,1]. Toward optimizing whole-session or task relevance, we characterize and address the problem of intrinsic diversity (ID) in retrieval [30], a type of complex task that requires multiple interactions with current search engines. Unlike existing work on extrinsic diversity [30] that deals with ambiguity in intent across multiple users, ID queries often have little ambiguity in intent but seek content covering a variety of aspects on a shared theme. In such scenarios, the underlying needs are typically exploratory, comparative, or breadth-oriented in nature. We identify and address three key problems for ID retrieval: identifying authentic examples of ID tasks from post-hoc analysis of behavioral signals in search logs; learning to identify initiator queries that mark the start of an ID search task; and given an initiator query, predicting which content to prefetch and rank.
- E. Agichtein, R. White, S. Dumais, and P. Bennett. Search, Interrupted: Understanding and Predicting Search Task Continuation. In SIGIR '12, 2012.
- P. Bailey et al. User task understanding: a web search engine perspective. http://research.microsoft.com/apps/-pubs/default.aspx?id=180594, 2012.
- P. L. Bartlett, M. I. Jordan, and J. M. McAuliffe. Large Margin Classifiers: Convex Loss, Low Noise, and Convergence Rates. In NIPS 16, 2004.
- P. N. Bennett, K. Svore, and S. Dumais. Classification-Enhanced Ranking. In WWW '10, 2010.
- C. Brandt, T. Joachims, Y. Yue, and J. Bank. Dynamic Ranked Retrieval. In WSDM '11, 2011.
- J. Carbonell and J. Goldstein. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In SIGIR '98, 1998.
- H. Chen and D. R. Karger. Less is more: Probabilistic models for retrieving fewer relevant documents. In SIGIR '06, 2006.
- C. L. Clarke et al. Novelty and diversity in information retrieval evaluation. In SIGIR '08, 2008.
- W. Dakka et al. Automatic Discovery of Useful Facet Terms. In SIGIR 2006 Workshop on Faceted Search, 2006.
- V. Dang, M. Bendersky, and W. B. Croft. Learning to rank query reformulations. In SIGIR '10, 2010.
- S. Fox, K. Karnawat, M. Mydland, S. Dumais, and T. White. Evaluating implicit measures to improve web search. ACM TOIS, 23(2):147-168, 2005.
- J. Gao, W. Yuan, X. Li, K. Deng, and J. Nie. Smoothing clickthrough data for web search ranking. In SIGIR '09, 2009.
- S. Gollapudi, S. Ieong, A. Ntoulas, and S. Paparizos. Efficient query rewrite for structured web queries. In CIKM '11, 2011.
- A. Hassan, Y. Song, and L.-w. He. A task level metric for measuring web search satisfaction and its application on improving relevance estimation. In CIKM '11, 2011.
- A. Hassan and R. W. White. Task tours: helping users tackle complex search tasks. In CIKM '12, 2012.
- J. He et al. CWI at TREC 2011: session, web, and medical. In TREC '11, 2012.
- K. Jarvelin, S. L. Price, L. M. L. Delcambre, and M. L. Nielsen. Discounted cumulated gain based evaluation of multiple-query IR sessions. In ECIR '08, 2008.
- T. Joachims. Making large-scale support vector machine learning practical. In Advances in kernel methods, pages 169-184. MIT Press, 1999.
- T. Joachims, L. Granka, B. Pan, H. Hembrooke, F. Radlinski, and G. Gay. Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM TOIS, 25(2), Apr. 2007.
- R. Jones and K. Klinkner. Beyond the session timeout: Automatic hierarchical segmentation of search topics in query logs. In CIKM '08, 2008.
- E. Kanoulas, B. Carterette, P. D. Clough, and M. Sanderson. Evaluating multi-query sessions. In SIGIR '11, 2011.
- E. Kanoulas et al. Overview of the TREC 2011 Session Track. In TREC '11, 2012.
- C. Kohlschutter, P.-A. Chirita, and W. Nejdl. Using link analysis to identify aspects in faceted web search. In SIGIR '06, 2006.
- A. Kotov, P. N. Bennett, R. W. White, S. T. Dumais, and J. Teevan. Modeling and analysis of cross-session search tasks. In SIGIR '11, 2011.
- D. J. Liebling, P. N. Bennett, and R. W. White. Anticipatory search: using context to initiate search. In SIGIR '12, 2012.
- J. Liu and N. Belkin. Personalizing information retrieval for multi-session tasks: The roles of task stage and task type. In SIGIR '10, 2010.
- J. Liu, M. J. Cole, C. Liu, R. Bierig, J. Gwizdka, N. J. Belkin, J. Zhang, and X. Zhang. Search behaviors in different task types. In JCDL '10, 2010.
- H. Ma, M. R. Lyu, and I. King. Diversifying query suggestion results. In AAAI '10, 2010.
- D. Morris, M. R. Morris, and G. Venolia. SearchBar: A search-centric web history for task resumption and information re-finding. In CHI '08, 2008.
- F. Radlinski, P. N. Bennett, B. Carterette, and T. Joachims. Redundancy, diversity and interdependent document relevance. SIGIR Forum, 43(2):46-52, Dec. 2009.
- F. Radlinski and T. Joachims. Query chains: learning to rank from implicit feedback. In KDD '05, 2005.
- K. Raman, T. Joachims, and P. Shivaswamy. Structured learning of two-level dynamic rankings. In CIKM '11, 2011.
- A. Singla, R. White, and J. Huang. Studying trailfinding algorithms for enhanced web search. In SIGIR '10, 2010.
- A. Slivkins, F. Radlinski, and S. Gollapudi. Learning optimally diverse rankings over large document collections. In ICML '10, 2010.
- R. White and S. Drucker. Investigating behavioral variability in web search. In WWW '07, 2007.
- R. W. White, P. N. Bennett, and S. T. Dumais. Predicting short-term interests using activity-based search context. In CIKM '10, 2010.
- R. W. White, G. Marchionini, and G. Muresan. Editorial: Evaluating exploratory search systems. Inf. Process. Manage., 44(2):433-436, Mar. 2008.
- X. Yuan and R. White. Building the trail best traveled: effects of domain knowledge on web search trailblazing. In CHI '12, 2012.
- Y. Yue and T. Joachims. Predicting diverse subsets using structural SVMs. In ICML '08, 2008.
- C. X. Zhai, W. W. Cohen, and J. Lafferty. Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In SIGIR '03, 2003.
- L. Zhang and Y. Zhang. Interactive retrieval based on faceted feedback. In SIGIR '10, 2010.
- Q. Zhao et al. Time-dependent semantic similarity measure of queries using historical click-through data. In WWW '06, 2006.
Index Terms
- Toward whole-session relevance: exploring intrinsic diversity in web search