DOI: 10.1145/3394486.3403238
research-article

Parameterized Correlation Clustering in Hypergraphs and Bipartite Graphs

Published: 20 August 2020

ABSTRACT

Motivated by applications in community detection and dense subgraph discovery, we consider new clustering objectives in hypergraphs and bipartite graphs. These objectives are parameterized by one or more resolution parameters in order to enable diverse knowledge discovery in complex data.

For both hypergraph and bipartite objectives, we identify relevant parameter regimes that are equivalent to existing objectives and share their (polynomial-time) approximation algorithms. We first show that our parameterized hypergraph correlation clustering objective is related to higher-order notions of normalized cut and modularity in hypergraphs. It is further amenable to approximation algorithms via hyperedge expansion techniques.
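As a rough illustration of what a hyperedge expansion can look like, the sketch below replaces each hyperedge with a clique on its nodes and, separately, counts hyperedges that a clustering splits. This is only one common form of expansion and is assumed here for illustration; the abstract does not say which expansion the paper uses, and the function names `clique_expansion` and `cut_hyperedges` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def clique_expansion(hyperedges):
    """Replace each hyperedge with a clique on its nodes.

    Returns a dict mapping each sorted node pair to the number of
    hyperedges that contain both nodes (an assumed unit weighting).
    """
    weights = defaultdict(int)
    for edge in hyperedges:
        for u, v in combinations(sorted(set(edge)), 2):
            weights[(u, v)] += 1
    return dict(weights)


def cut_hyperedges(hyperedges, cluster_of):
    """Count hyperedges whose nodes fall into more than one cluster."""
    return sum(
        1 for edge in hyperedges
        if len({cluster_of[v] for v in edge}) > 1
    )


# Toy hypergraph: three hyperedges over five nodes.
hyperedges = [("a", "b", "c"), ("b", "c", "d"), ("d", "e")]
graph = clique_expansion(hyperedges)

# A candidate clustering: {a, b, c} and {d, e}.
cluster_of = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
num_cut = cut_hyperedges(hyperedges, cluster_of)
```

After expansion, any algorithm for ordinary weighted graphs can be run on `graph`; how well the resulting solution approximates the hypergraph objective depends on the expansion and weights chosen, which this sketch does not address.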

Our parameterized bipartite correlation clustering objective generalizes standard unweighted bipartite correlation clustering, as well as the bicluster deletion problem. For a certain choice of parameters it is also related to our hypergraph objective. Although in general it is NP-hard, we highlight a parameter regime for the bipartite objective where the problem reduces to the bipartite matching problem and thus can be solved in polynomial time. For other parameter settings, we present several approximation algorithms using linear program rounding techniques. These results allow us to introduce the first constant-factor approximation for bicluster deletion, the task of removing a minimum number of edges to partition a bipartite graph into disjoint bi-cliques.
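To make the two special cases concrete, the sketch below scores a given clustering under standard unweighted bipartite correlation clustering (count edges between clusters plus missing edges inside clusters) and under bicluster deletion (only edge deletions are allowed, so every cluster must already be a bi-clique). It evaluates a fixed clustering only; it is not the paper's parameterized objective or any of its approximation algorithms, and the function names are hypothetical.

```python
def bcc_disagreements(left, right, edges, cluster_of):
    """Return (cut_edges, missing_pairs) for a clustering of a bipartite graph.

    cut_edges: edges whose endpoints lie in different clusters.
    missing_pairs: left-right pairs in the same cluster with no edge.
    """
    edge_set = set(edges)
    cut_edges = 0
    missing_pairs = 0
    for u in left:
        for v in right:
            same_cluster = cluster_of[u] == cluster_of[v]
            has_edge = (u, v) in edge_set
            if has_edge and not same_cluster:
                cut_edges += 1
            elif same_cluster and not has_edge:
                missing_pairs += 1
    return cut_edges, missing_pairs


def bicluster_deletion_cost(left, right, edges, cluster_of):
    """Number of deleted edges, or None if some cluster is not a bi-clique."""
    cut_edges, missing_pairs = bcc_disagreements(left, right, edges, cluster_of)
    return cut_edges if missing_pairs == 0 else None


left = ["u1", "u2", "u3"]
right = ["v1", "v2"]
edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v2")]

# Clustering A: {u1, u2, v1, v2} is a bi-clique; u3 is left alone.
clustering_a = {"u1": 0, "u2": 0, "v1": 0, "v2": 0, "u3": 1}
# Clustering B: everything in one cluster; the pair (u3, v1) has no edge.
clustering_b = {node: 0 for node in left + right}

score_a = bcc_disagreements(left, right, edges, clustering_a)
score_b = bcc_disagreements(left, right, edges, clustering_b)
deletion_a = bicluster_deletion_cost(left, right, edges, clustering_a)
deletion_b = bicluster_deletion_cost(left, right, edges, clustering_b)
```

Both clusterings make one mistake under the standard objective, but only clustering A is feasible for bicluster deletion, since clustering B would require adding the edge between `u3` and `v1`.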

In several experiments, we highlight the flexibility of our framework and the diversity of results that can be obtained in different parameter settings. These include clustering bipartite graphs across a range of parameters, detecting motif-rich clusters in an email network and a food web, and forming clusters of retail products in a product review hypergraph that are highly correlated with known product categories.


        Published in

          KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
          August 2020
          3664 pages
          ISBN: 9781450379984
          DOI: 10.1145/3394486

          Copyright © 2020 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

