DOI: 10.1145/3394486.3403218
research-article

Understanding Negative Sampling in Graph Representation Learning

Published: 20 August 2020

ABSTRACT

Graph representation learning has been extensively studied in recent years, and sampling plays a critical role in it. Prior work usually focuses on sampling positive node pairs, while the strategy for negative sampling remains insufficiently explored. To bridge the gap, we systematically analyze the role of negative sampling from the perspectives of both objective and risk, theoretically demonstrating that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance. To the best of our knowledge, we are the first to derive the theory and quantify that a good negative sampling distribution is p_n(u|v) ∝ p_d(u|v)^α, 0 < α < 1. Guided by the theory, we propose MCNS, which approximates the positive distribution with self-contrast approximation and accelerates negative sampling with Metropolis-Hastings. We evaluate our method on 5 datasets that cover a broad range of downstream graph learning tasks, including link prediction, node classification, and recommendation, for a total of 19 experimental settings. These relatively comprehensive experimental results demonstrate its robustness and superiority.
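
As a rough illustration of the sampling rule stated in the abstract, the sketch below draws negatives with probability proportional to p_d(u|v)^α using a basic Metropolis-Hastings chain. It is not the MCNS algorithm itself: the function names, the toy score list standing in for an unnormalized p_d(u|v), and the uniform proposal are all assumptions made for this example.

```python
import random


def target_distribution(scores, alpha):
    """Exact normalized target p_n(u|v) proportional to scores[u] ** alpha (for checking)."""
    weights = [s ** alpha for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def mh_negative_samples(scores, alpha, num_samples, burn_in=100, seed=0):
    """Draw node indices u with probability roughly proportional to scores[u] ** alpha.

    scores: positive, unnormalized weights standing in for p_d(u|v)
            (a toy assumption; the paper approximates p_d differently).
    The proposal is uniform over all nodes, hence symmetric, so the
    acceptance ratio reduces to (scores[candidate] / scores[current]) ** alpha
    and no normalizing constant is ever computed.
    """
    rng = random.Random(seed)
    n = len(scores)
    current = rng.randrange(n)
    samples = []
    for step in range(burn_in + num_samples):
        candidate = rng.randrange(n)  # uniform proposal (assumption)
        accept_prob = min(1.0, (scores[candidate] / scores[current]) ** alpha)
        if rng.random() < accept_prob:
            current = candidate
        if step >= burn_in:  # discard the warm-up portion of the chain
            samples.append(current)
    return samples
```

For example, with the hypothetical scores `[1.0, 4.0, 9.0, 16.0]` and `alpha = 0.5`, the target weights are 1, 2, 3, 4, so a long chain should visit the four nodes with frequencies close to 0.1, 0.2, 0.3, and 0.4. Setting α between 0 and 1 flattens the positive distribution, so frequently co-occurring nodes are still sampled as negatives more often than rare ones, but less sharply than under p_d itself.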

Published in

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486

Copyright © 2020 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
