research-article

DivRank: the interplay of prestige and diversity in information networks

Authors:
Qiaozhu Mei, University of Michigan, Ann Arbor, MI, USA
Jian Guo, University of Michigan, Ann Arbor, MI, USA
Dragomir Radev, University of Michigan, Ann Arbor, MI, USA

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, July 2010, Pages 1009–1018. https://doi.org/10.1145/1835804.1835931

Published: 25 July 2010

ABSTRACT

Information networks are widely used to characterize the relationships between data items such as text documents. Many important retrieval and mining tasks rely on ranking the data items by their centrality or prestige in the network. Beyond prestige, diversity has been recognized as a crucial objective in ranking: the top-ranked results should be non-redundant and should cover as much of the available information as possible. Nevertheless, existing network-based ranking approaches either disregard diversity or handle it with non-optimized heuristics, usually based on greedy vertex selection.

We propose a novel ranking algorithm, DivRank, based on a reinforced random walk in an information network. This model automatically balances the prestige and the diversity of the top-ranked vertices in a principled way. DivRank not only has a clear optimization explanation, but also connects well to classical models in mathematics and network science. We evaluate DivRank empirically on three different networks as well as on a text summarization task. DivRank outperforms existing network-based ranking methods in terms of enhancing diversity in prestige.
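To make the idea of a reinforced random walk concrete, the following is a minimal sketch and not the authors' reference implementation. It assumes a pointwise update in which each vertex's current score stands in for its visit count, so transitions into already well-scored vertices are amplified and neighbouring vertices compete with each other. The function name `reinforced_walk_rank` and the parameters `alpha` (weight of following an edge rather than staying put) and `lam` (weight of the walk versus a uniform prior) are assumptions of this illustration; the abstract does not specify them.

```python
import numpy as np


def reinforced_walk_rank(adj, alpha=0.25, lam=0.9, iters=200, tol=1e-9):
    """Score vertices with a vertex-reinforced random walk (illustrative sketch).

    adj   : square array of non-negative edge weights (hypothetical input format)
    alpha : assumed probability mass for following an edge; 1 - alpha stays put
    lam   : assumed damping factor; 1 - lam jumps to a uniform prior
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]

    # Row-normalize the edge weights; isolated vertices keep an all-zero row.
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    p0 = alpha * adj / row_sums
    # Add a self-loop so every vertex can retain part of its own mass.
    p0[np.diag_indices(n)] += 1.0 - alpha

    prior = np.full(n, 1.0 / n)
    scores = prior.copy()
    for _ in range(iters):
        # Reinforcement: scale each transition into v by v's current score,
        # then renormalize every row so it is a probability distribution.
        reinforced = p0 * scores[np.newaxis, :]
        reinforced /= reinforced.sum(axis=1, keepdims=True)
        transition = (1.0 - lam) * prior[np.newaxis, :] + lam * reinforced
        new_scores = scores @ transition
        converged = np.abs(new_scores - scores).sum() < tol
        scores = new_scores
        if converged:
            break
    return scores


# Usage: a star graph whose hub (vertex 0) is linked to four leaves.
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0
print(reinforced_walk_rank(star))
```

Because the transition matrix depends on the current scores, this is not an ordinary time-homogeneous Markov chain. In this sketch the scores still sum to one after every step, since each row of the transition matrix is renormalized before it is applied.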

Supplemental Material

kdd2010_mei_dripd_01.mov (MOV, 70.9 MB)

Index Terms

Information systems > Information systems applications > Data mining

Published in
KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804
General Chairs: Bharat Rao (Siemens), Balaji Krishnapuram (Siemens)
Program Chairs: Andrew Tomkins (Google Inc.), Qiang Yang (Hong Kong University of Science and Technology)
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
diversity
information networks
ranking
reinforced random walk
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%