DOI: 10.1145/3485447.3512160
research-article
Open Access

CGC: Contrastive Graph Clustering for Community Detection and Tracking

Published: 25 April 2022

ABSTRACT

Given entities and their interactions in web data, which may have occurred at different times, how can we find communities of entities and track their evolution? In this paper, we approach this important task from a graph clustering perspective. Recently, deep clustering methods have achieved state-of-the-art clustering performance in various domains. In particular, deep graph clustering (DGC) methods have successfully extended deep clustering to graph-structured data by learning node representations and cluster assignments in a joint optimization framework. Despite some differences in modeling choices (e.g., encoder architectures), existing DGC methods are mainly based on autoencoders and use the same clustering objective with relatively minor adaptations. Also, while many real-world graphs are dynamic, previous DGC methods considered only static graphs. In this work, we develop CGC, a novel end-to-end framework for graph clustering that fundamentally differs from existing methods. CGC learns node embeddings and cluster assignments in a contrastive graph learning framework, where positive and negative samples are carefully selected in a multi-level scheme such that they reflect hierarchical community structures and network homophily. We also extend CGC to time-evolving data, where temporal graph clustering is performed in an incremental learning fashion, with the ability to detect change points. Extensive evaluation on real-world graphs demonstrates that the proposed CGC consistently outperforms existing methods.
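
To make the idea of contrasting positive and negative samples concrete, the following is a minimal sketch of a generic InfoNCE-style contrastive loss over node embeddings. It is not CGC's objective: the abstract does not specify the loss or the multi-level sampling scheme, so the function name `info_nce_loss`, the `temperature` parameter, and the choice of one same-community positive and two other-community negatives per node are all assumptions made for illustration.

```python
import numpy as np

def info_nce_loss(emb, pos_idx, neg_idx, temperature=0.5):
    """Generic InfoNCE loss (illustrative; not the paper's exact objective).

    emb:      (n, d) array of node embeddings
    pos_idx:  (n,) index of one positive sample per anchor node
    neg_idx:  (n, k) indices of k negative samples per anchor node
    """
    # L2-normalize so that dot products are cosine similarities.
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    pos_sim = np.sum(z * z[pos_idx], axis=1) / temperature          # (n,)
    neg_sim = np.einsum("nd,nkd->nk", z, z[neg_idx]) / temperature  # (n, k)
    logits = np.concatenate([pos_sim[:, None], neg_sim], axis=1)    # (n, 1+k)
    # Numerically stable log-softmax; the positive sits in column 0.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())

# Hypothetical toy graph: 4 nodes in two communities {0, 1} and {2, 3}.
emb = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
pos = np.array([1, 0, 3, 2])                      # same-community partner
neg = np.array([[2, 3], [2, 3], [0, 1], [0, 1]])  # other-community nodes
aligned = info_nce_loss(emb, pos, neg)
```

When the embeddings already separate the two communities, as in this toy input, the loss falls below the chance level of log(3) that uninformative embeddings would give with one positive and two negatives. Minimizing such a loss pulls same-community nodes together and pushes other nodes apart, which is the general mechanism that contrastive graph learning relies on.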
