DOI: 10.1145/988672.988715
Article

Link fusion: a unified link analysis framework for multi-type interrelated data objects

Published: 17 May 2004

ABSTRACT

Web link analysis has proven to be a significant enhancement for quality-based web search. Most existing links can be classified into two categories: intra-type links (e.g., web hyperlinks), which represent the relationship of data objects within a homogeneous data type (web pages), and inter-type links (e.g., user browsing logs), which represent the relationship of data objects across different data types (users and web pages). Unfortunately, most link analysis research considers only one type of link. In this paper, we propose a unified link analysis framework, called "link fusion", which considers both the inter- and intra-type link structure among multi-type interrelated data objects and brings order to the objects in each data type at the same time. The PageRank and HITS algorithms are shown to be special cases of our unified link analysis framework. Experiments on an instantiation of the framework that uses the user data and web pages extracted from a proxy log show that our proposed algorithm could improve search effectiveness over the HITS and DirectHit algorithms by 24.6% and 38.2%, respectively.
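
The idea of ranking two object types at once can be illustrated with a toy mutual-reinforcement loop. The sketch below is only an illustration under stated assumptions, not the formulation used in the paper: the function name `fuse_scores`, the mixing weight `alpha`, the update rules, and the tiny link matrices are all hypothetical. Pages receive score from pages that link to them (intra-type links) and from users who visited them (inter-type links), while users receive score from the pages they visited.

```python
def normalize(vec):
    """Scale a non-negative score vector so its entries sum to 1."""
    total = sum(vec)
    return [v / total for v in vec] if total > 0 else list(vec)


def fuse_scores(page_links, visits, alpha=0.5, iterations=50):
    """Toy ranking over two object types (hypothetical update rules).

    page_links[i][j] = 1 if page i links to page j (intra-type link).
    visits[u][j]     = 1 if user u visited page j (inter-type link).
    alpha            = assumed weight given to intra-type evidence.
    """
    n_pages, n_users = len(page_links), len(visits)
    pages = [1.0 / n_pages] * n_pages
    users = [1.0 / n_users] * n_users

    for _ in range(iterations):
        # Intra-type step: a page inherits score from pages linking to it.
        from_pages = normalize([
            sum(pages[i] * page_links[i][j] for i in range(n_pages))
            for j in range(n_pages)
        ])
        # Inter-type step: a page inherits score from users who visited it.
        from_users = normalize([
            sum(users[u] * visits[u][j] for u in range(n_users))
            for j in range(n_pages)
        ])
        # A user inherits score from the pages they visited (old page scores).
        new_users = normalize([
            sum(pages[j] * visits[u][j] for j in range(n_pages))
            for u in range(n_users)
        ])
        # Blend the two kinds of evidence for pages, then renormalize.
        pages = normalize([
            alpha * p + (1 - alpha) * q for p, q in zip(from_pages, from_users)
        ])
        users = new_users

    return pages, users


# Made-up data: three pages, two users.
page_links = [[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]]
visits = [[1, 0, 1],
          [0, 0, 1]]

pages, users = fuse_scores(page_links, visits)
print("page scores:", pages)
print("user scores:", users)
```

In this made-up example, page 2 is linked from both other pages and visited by both users, so it ends up with the highest page score. The weight `alpha` is only a knob in this sketch for trading off hyperlink evidence against browsing evidence.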


        Published in

        WWW '04: Proceedings of the 13th international conference on World Wide Web
        May 2004
        754 pages
        ISBN: 158113844X
        DOI: 10.1145/988672

        Copyright © 2004 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
