ABSTRACT
Given entities and their interactions in web data, which may have occurred at different times, how can we find communities of entities and track their evolution? In this paper, we approach this important task from a graph clustering perspective. Recently, deep clustering methods have achieved state-of-the-art clustering performance in various domains. In particular, deep graph clustering (DGC) methods have successfully extended deep clustering to graph-structured data by learning node representations and cluster assignments in a joint optimization framework. Despite some differences in modeling choices (e.g., encoder architectures), existing DGC methods are mainly based on autoencoders and use the same clustering objective with relatively minor adaptations. Moreover, while many real-world graphs are dynamic, previous DGC methods considered only static graphs. In this work, we develop CGC, a novel end-to-end framework for graph clustering that fundamentally differs from existing methods. CGC learns node embeddings and cluster assignments in a contrastive graph learning framework, where positive and negative samples are carefully selected in a multi-level scheme such that they reflect hierarchical community structures and network homophily. We also extend CGC to time-evolving data, where temporal graph clustering is performed in an incremental learning fashion, with the ability to detect change points. Extensive evaluation on real-world graphs demonstrates that the proposed CGC consistently outperforms existing methods.
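To make the contrastive idea in the abstract concrete, the following is a minimal sketch of a generic InfoNCE-style loss and a softmax-based soft cluster assignment. It is not the CGC objective: the abstract does not specify the loss, the sampling scheme, or the assignment rule, so the function names (`info_nce`, `soft_assignments`), the cosine similarity, and the `temperature` parameter are all illustrative assumptions.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Generic InfoNCE-style loss for a single anchor (illustrative only).

    anchor:    (d,)   embedding of the anchor node
    positive:  (d,)   embedding of a node assumed similar (e.g., a neighbor)
    negatives: (k, d) embeddings of nodes assumed dissimilar
    """
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

    pos_logit = cos(anchor, positive) / temperature
    neg_logits = np.array([cos(anchor, n) for n in negatives]) / temperature
    logits = np.concatenate([[pos_logit], neg_logits])
    # Numerically stable log-softmax; the loss is -log p(positive).
    logits = logits - logits.max()
    return -(logits[0] - np.log(np.exp(logits).sum()))

def soft_assignments(embeddings, centroids, temperature=0.5):
    """Softmax over embedding-centroid dot products; one row per node.

    Hypothetical assignment rule, chosen here only for illustration.
    """
    logits = embeddings @ centroids.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Toy usage: a positive aligned with the anchor gives a lower loss
# than a positive pointing in the opposite direction.
anchor = np.array([1.0, 0.0])
loss_aligned = info_nce(anchor, np.array([1.0, 0.0]),
                        np.array([[-1.0, 0.0], [0.0, 1.0]]))
loss_opposed = info_nce(anchor, np.array([-1.0, 0.0]),
                        np.array([[1.0, 0.0], [0.0, 1.0]]))
```

In this sketch, the choice of which nodes count as positives or negatives is left to the caller; the abstract describes that selection as multi-level and guided by hierarchical community structure and homophily, which the toy example does not attempt to reproduce.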
Index Terms
- CGC: Contrastive Graph Clustering for Community Detection and Tracking