ABSTRACT
The PC desktop is a very rich repository of personal information that efficiently captures users' interests. In this paper we propose a new approach to the automatic personalization of web search in which user-specific information is extracted from such local desktops, allowing for higher-quality user profiling while sharing less private information with the search engine. More specifically, we investigate the opportunities to select personalized query expansion terms for web search using three different desktop-oriented approaches: summarizing the entire desktop data, summarizing only the desktop documents relevant to each user query, and applying natural language processing techniques to extract dispersive lexical compounds from relevant desktop resources. Our experiments with the Google API showed that at least the latter two techniques produce a very strong improvement over current web search.
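As a rough illustration of the second approach (expanding a query with terms drawn only from the desktop documents relevant to it), the following minimal sketch may help. It is not the method evaluated in the paper: it swaps in plain term-frequency counting for the summarization step, and the names `tokenize`, `expand_query`, `STOPWORDS`, and the sample documents are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "for", "on", "is", "with"}
)


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def expand_query(query, desktop_docs, num_terms=2):
    """Append the most frequent terms from query-relevant local documents.

    Assumption: a document counts as "relevant" if it shares at least one
    term with the query; the paper's own relevance and summarization steps
    are more elaborate than this.
    """
    query_terms = set(tokenize(query))
    counts = Counter()
    for doc in desktop_docs:
        tokens = tokenize(doc)
        if query_terms & set(tokens):
            # Count only terms that are not already part of the query.
            counts.update(t for t in tokens if t not in query_terms)
    # Sort by descending frequency, then alphabetically, for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    expansion = [term for term, _ in ranked[:num_terms]]
    return " ".join([query] + expansion)


docs = [
    "jaguar car engine repair manual",
    "notes on jaguar engine tuning",
    "holiday photos from rome",
]
print(expand_query("jaguar", docs))
```

In this toy setup the expansion happens entirely on the local machine, and only the expanded query string would be sent to the search engine, which mirrors the privacy argument made in the abstract.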
Index Terms
- Summarizing local context to personalize global web search