DOI: 10.1145/3459637.3482316
research-article

AdaSim: A Recursive Similarity Measure in Graphs

Published: 30 October 2021

ABSTRACT

Various link-based similarity measures have been proposed in the literature, such as Adamic/Adar (Ada for short), SimRank, and random walk with restart (RWR). Unlike SimRank and RWR, Ada is a non-recursive measure that exploits only the local graph structure when computing similarity. Motivated by Ada's promising results in various graph-related tasks, and by the fact that SimRank is a recursive generalization of the co-citation measure, in this paper we propose AdaSim, a recursive similarity measure based on the Ada philosophy. AdaSim provides accuracy identical to that of Ada on the first iteration and is applicable to both directed and undirected graphs. To accelerate the iterative form, we also propose a matrix form that is dramatically faster while producing the exact AdaSim scores. We conduct extensive experiments with five real-world datasets to evaluate both the effectiveness and the efficiency of AdaSim against existing similarity measures and graph embedding methods on the task of computing the similarity of nodes. Our experimental results show that 1) AdaSim significantly improves on the effectiveness of Ada and outperforms the other competitors, 2) its efficiency is comparable to that of SimRank* and better than that of the others, 3) AdaSim is not sensitive to parameter tuning, and 4) similarity measures are better than embedding methods at computing the similarity of nodes.
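For readers unfamiliar with the non-recursive baseline mentioned in the abstract, the following is a minimal sketch of the classic Adamic/Adar score on an undirected graph: the sum, over the common neighbors of two nodes, of the inverse logarithm of each neighbor's degree. It is not the AdaSim recursion or its matrix form, which this page does not specify. The function name `adamic_adar`, the adjacency-dict representation, and the toy graph are assumptions made purely for illustration.

```python
import math

def adamic_adar(adj, u, v):
    """Classic Adamic/Adar score for nodes u and v.

    adj is assumed (for this sketch) to be a dict mapping each node to the
    set of its neighbors in an undirected graph.
    """
    common = adj[u] & adj[v]
    # Each common neighbor z contributes 1 / log(degree(z)); rarer (low-degree)
    # neighbors count more. Degree-1 nodes are skipped to avoid dividing by log(1) = 0.
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)

# Hypothetical toy graph: a and b share the neighbors c and d, each of degree 3.
graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
}
score = adamic_adar(graph, "a", "b")
```

Because the score depends only on the immediate neighborhoods of the two nodes, it is local and non-recursive, which is the property the abstract contrasts with SimRank and RWR.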

      Published in

      CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
      October 2021
      4966 pages
      ISBN: 9781450384469
      DOI: 10.1145/3459637

      Copyright © 2021 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
