ABSTRACT
In the original PageRank algorithm for improving the ranking of search-query results, a single PageRank vector is computed, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. By using these (precomputed) biased PageRank vectors to generate query-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared.
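The two-stage approach described above (precompute one biased vector per topic, then combine them at query time) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common formulation of biased PageRank in which the random jump lands only on a topic's representative pages, and it assumes the query-time score is a weighted sum of the per-topic ranks. The function names, the toy link graph, the topic names, the teleport probability `alpha=0.25`, the dangling-page handling, and the topic weights are all hypothetical choices made for illustration.

```python
def biased_pagerank(links, topic_pages, alpha=0.25, iters=50):
    """Power iteration where the random jump lands only on topic_pages.

    links: dict mapping each page to the list of pages it links to
           (every link target must also be a key of the dict).
    alpha: teleport probability (an assumed value, not taken from the text).
    """
    nodes = sorted(links)
    bias = {n: (1.0 / len(topic_pages) if n in topic_pages else 0.0)
            for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Teleportation mass goes only to the topic's pages.
        nxt = {n: alpha * bias[n] for n in nodes}
        for src in nodes:
            out = links[src]
            if out:
                share = (1 - alpha) * rank[src] / len(out)
                for dst in out:
                    nxt[dst] += share
            else:
                # Assumption: a page with no out-links spreads its mass
                # according to the bias vector.
                for n in nodes:
                    nxt[n] += (1 - alpha) * rank[src] * bias[n]
        rank = nxt
    return rank


def query_scores(topic_ranks, topic_weights, candidates):
    """Blend precomputed topic vectors using per-query topic weights."""
    return {
        page: sum(w * topic_ranks[t][page] for t, w in topic_weights.items())
        for page in candidates
    }


# Hypothetical toy Web graph and topic seed sets.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
topics = {"sports": {"a"}, "health": {"d"}}

# Offline step: one biased vector per topic.
topic_ranks = {t: biased_pagerank(links, pages) for t, pages in topics.items()}

# Query-time step: the weights stand in for an estimate of how strongly the
# query (or its surrounding context) belongs to each topic.
topic_weights = {"sports": 0.8, "health": 0.2}
scores = query_scores(topic_ranks, topic_weights, candidates=["a", "c", "d"])

for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this sketch the expensive iteration runs once per topic ahead of time, so the per-query work is only a weighted sum over the pages that match the query. For a search done in context, the same `query_scores` call would apply with weights derived from the context text rather than from the query keywords alone.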