research-article

AdaSim: A Recursive Similarity Measure in Graphs

Authors: Masoud Reyhani Hamedani (Hanyang University, Seoul, Republic of Korea), Sang-Wook Kim (Hanyang University, Seoul, Republic of Korea)

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, Pages 1528–1537. https://doi.org/10.1145/3459637.3482316

Published: 30 October 2021

ABSTRACT

In the literature, various link-based similarity measures have been proposed, such as Adamic/Adar (Ada for short), SimRank, and random walk with restart (RWR). Unlike SimRank and RWR, Ada is a non-recursive measure that exploits the local graph structure when computing similarity. Motivated by Ada's promising results in various graph-related tasks, and by the fact that SimRank is a recursive generalization of the co-citation measure, we propose AdaSim, a recursive similarity measure based on the Ada philosophy. AdaSim provides accuracy identical to that of Ada on the first iteration and is applicable to both directed and undirected graphs. To accelerate the iterative form, we also propose a matrix form that is dramatically faster while producing the exact AdaSim scores. We conduct extensive experiments on five real-world datasets to evaluate both the effectiveness and the efficiency of AdaSim against existing similarity measures and graph embedding methods in the task of computing node similarity. Our experimental results show that 1) AdaSim significantly improves on the effectiveness of Ada and outperforms the other competitors, 2) its efficiency is comparable to that of SimRank* and better than that of the others, 3) AdaSim is not sensitive to parameter tuning, and 4) similarity measures are better than embedding methods for computing the similarity of nodes.
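The contrast the abstract draws between a non-recursive measure (Ada) and a recursive one (SimRank) can be illustrated with a minimal Python sketch. It follows the commonly published definitions of Adamic/Adar and SimRank; it is not the AdaSim formula, which this page does not state. The function names, the toy graph, the decay factor `c=0.8`, and the iteration count are all assumptions chosen for illustration.

```python
import math


def adamic_adar(adj, u, v):
    """Non-recursive Adamic/Adar score on an undirected graph.

    Sums 1 / log(degree(z)) over every common neighbor z of u and v,
    so rare (low-degree) common neighbors contribute more.
    """
    common = adj[u] & adj[v]
    # Skip degree-1 nodes to avoid dividing by log(1) = 0.
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)


def simrank(in_nbrs, c=0.8, iterations=5):
    """Naive iterative SimRank (all pairs), for small graphs only.

    s(a, a) = 1; otherwise s(a, b) is c times the average similarity
    over all pairs of in-neighbors of a and b from the previous iteration.
    """
    nodes = list(in_nbrs)
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new_sim[(a, b)] = 1.0
                elif not in_nbrs[a] or not in_nbrs[b]:
                    new_sim[(a, b)] = 0.0
                else:
                    total = sum(sim[(i, j)] for i in in_nbrs[a] for j in in_nbrs[b])
                    new_sim[(a, b)] = c * total / (len(in_nbrs[a]) * len(in_nbrs[b]))
        sim = new_sim
    return sim


# Hypothetical toy graph (undirected, stored as neighbor sets).
toy_graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d"},
}

if __name__ == "__main__":
    print("Ada(a, b)     =", adamic_adar(toy_graph, "a", "b"))
    print("SimRank(a, b) =", simrank(toy_graph)[("a", "b")])
```

In this sketch the Ada score is computed in a single pass over the common neighbors, whereas each SimRank iteration reuses the scores of neighbor pairs from the previous iteration, which is what makes the measure recursive. The all-pairs loop is quadratic in the number of nodes per iteration and is meant only to show the recurrence, not to be efficient.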

Supplemental Material

CIKM21-rgfp0668.mp4 (mp4, 38.5 MB)

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining

Published in
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637
General Chairs: Gianluca Demartini (The University of Queensland, Australia), Guido Zuccon (The University of Queensland, Australia)
Program Chairs: J. Shane Culpepper (RMIT University, Australia), Zi Huang (The University of Queensland, Australia), Hanghang Tong (University of Illinois at Urbana-Champaign, USA)
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher: Association for Computing Machinery, New York, NY, United States
Author Tags
adamic/adar
graph structure
link-based similarity
pairwise normalization
recursive measure
Qualifiers
- research-article
Acceptance Rates
Overall acceptance rate: 1,861 of 8,427 submissions, 22%