DOI: 10.1145/1835804.1835931
research-article

DivRank: the interplay of prestige and diversity in information networks

Published: 25 July 2010

ABSTRACT

Information networks are widely used to characterize the relationships between data items such as text documents. Many important retrieval and mining tasks rely on ranking the data items based on their centrality or prestige in the network. Beyond prestige, diversity has been recognized as a crucial objective in ranking: the top-ranked results should be non-redundant and cover as much of the relevant information as possible. Nevertheless, existing network-based ranking approaches either disregard diversity or handle it with non-optimized heuristics, usually based on greedy vertex selection.

We propose a novel ranking algorithm, DivRank, based on a reinforced random walk in an information network. This model automatically balances the prestige and the diversity of the top-ranked vertices in a principled way. DivRank not only has a clear optimization explanation, but also connects well to classical models in mathematics and network science. We evaluate DivRank using empirical experiments on three different networks as well as a text summarization task. DivRank outperforms existing network-based ranking methods in terms of enhancing diversity in prestige.
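
The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of a vertex-reinforced random walk, not the authors' exact formulation. It assumes a pointwise approximation in which a vertex's current score stands in for its visit count, so transitions toward already high-scoring vertices are reinforced and their neighbors lose mass. The function name `reinforced_walk_rank` and the parameters `damping`, `stay`, and `iters` are hypothetical choices made for illustration.

```python
def reinforced_walk_rank(adj, damping=0.85, stay=0.25, iters=100):
    """Score vertices with a reinforced random walk (illustrative sketch).

    adj maps each vertex to a dict {neighbor: weight}; every neighbor
    must also appear as a key of adj. All parameters are assumptions.
    """
    nodes = sorted(adj)
    n = len(nodes)

    # Base transitions: stay put with probability `stay`, otherwise
    # move to a neighbor in proportion to the edge weight.
    base = {}
    for u in nodes:
        total = sum(adj[u].values())
        if total == 0:
            base[u] = {u: 1.0}  # isolated vertex: only a self-loop
            continue
        row = {v: (1.0 - stay) * w / total for v, w in adj[u].items()}
        row[u] = row.get(u, 0.0) + stay
        base[u] = row

    score = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Uniform teleportation keeps every score strictly positive.
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            # Reinforcement: weight each base transition by the current
            # score of its target, then renormalize over u's targets.
            norm = sum(p * score[v] for v, p in base[u].items())
            for v, p in base[u].items():
                nxt[v] += damping * score[u] * p * score[v] / norm
        score = nxt
    return score


# Toy star graph: "c" is linked to every other vertex.
star = {
    "c": {"a": 1.0, "b": 1.0, "d": 1.0},
    "a": {"c": 1.0},
    "b": {"c": 1.0},
    "d": {"c": 1.0},
}
scores = reinforced_walk_rank(star)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because each vertex's outgoing probabilities are renormalized at every step, the scores remain a probability distribution. The self-loop term lets a vertex that is already scoring well retain its own mass, which is what produces the rich-get-richer effect among neighboring vertices.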

Supplemental Material

kdd2010_mei_dripd_01.mov (mov, 70.9 MB)


    Published in

      KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
      July 2010
      1240 pages
      ISBN: 9781450300551
      DOI: 10.1145/1835804

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
