Parameterized Correlation Clustering in Hypergraphs and Bipartite Graphs

Authors:
Nate Veldt, Cornell University, Ithaca, NY, USA
Anthony Wirth, The University of Melbourne, Melbourne, VIC, Australia
David F. Gleich, Purdue University, West Lafayette, IN, USA

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 2020, Pages 1868–1876. https://doi.org/10.1145/3394486.3403238

Published: 20 August 2020

ABSTRACT

Motivated by applications in community detection and dense subgraph discovery, we consider new clustering objectives in hypergraphs and bipartite graphs. These objectives are parameterized by one or more resolution parameters in order to enable diverse knowledge discovery in complex data.

For both hypergraph and bipartite objectives, we identify relevant parameter regimes that are equivalent to existing objectives and share their (polynomial-time) approximation algorithms. We first show that our parameterized hypergraph correlation clustering objective is related to higher-order notions of normalized cut and modularity in hypergraphs. It is further amenable to approximation algorithms via hyperedge expansion techniques.

Our parameterized bipartite correlation clustering objective generalizes standard unweighted bipartite correlation clustering, as well as the bicluster deletion problem. For a certain choice of parameters, it is also related to our hypergraph objective. Although the problem is NP-hard in general, we highlight a parameter regime in which the bipartite objective reduces to the bipartite matching problem and can therefore be solved in polynomial time. For other parameter settings, we present several approximation algorithms based on linear program rounding techniques. These results allow us to introduce the first constant-factor approximation for bicluster deletion, the task of removing a minimum number of edges to partition a bipartite graph into disjoint bi-cliques.
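To make the bicluster deletion task concrete, the following is a minimal sketch that scores a candidate clustering of a small bipartite graph. It is an illustration only, not the approximation algorithm of the paper: the function name `bicluster_deletion_cost`, the `labels` dictionary encoding, and the toy graph are all assumptions introduced here. The sketch assumes node ids on the two sides are distinct. A clustering is feasible only if every same-cluster pair of nodes from opposite sides is already an edge, and its cost is the number of edges whose endpoints land in different clusters.

```python
from itertools import product


def bicluster_deletion_cost(left, right, edges, labels):
    """Return the number of edges deleted by a clustering, or None if the
    clustering is infeasible for bicluster deletion.

    left, right: lists of node ids on the two sides (assumed distinct)
    edges: iterable of (u, v) pairs with u in left and v in right
    labels: dict mapping every node id to a cluster id (hypothetical encoding)
    """
    edge_set = set(edges)
    # Only deletions are allowed, so every same-cluster (left, right) pair
    # must already be an edge for the cluster to be a bi-clique.
    for u, v in product(left, right):
        if labels[u] == labels[v] and (u, v) not in edge_set:
            return None
    # The deleted edges are exactly those separated by the clustering.
    return sum(1 for u, v in edge_set if labels[u] != labels[v])


# Toy example (hypothetical data): a path b - y - a - x.
left = ["a", "b"]
right = ["x", "y"]
edges = [("a", "x"), ("a", "y"), ("b", "y")]

# Keep {a, x, y} together and isolate b: only edge (b, y) is deleted.
print(bicluster_deletion_cost(left, right, edges, {"a": 0, "x": 0, "y": 0, "b": 1}))  # -> 1
```

Placing all four nodes in a single cluster returns `None` in this sketch, because the pair (b, x) is not an edge and so the cluster is not a bi-clique.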

In several experiments, we highlight the flexibility of our framework and the diversity of results that can be obtained in different parameter settings. These include clustering bipartite graphs across a range of parameters, detecting motif-rich clusters in an email network and a food web, and forming clusters of retail products in a product review hypergraph that are highly correlated with known product categories.

Index Terms

1. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Approximation algorithms
      2. Hypergraphs
2. Theory of computation
  1. Design and analysis of algorithms

Published in
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486
General Chairs: Rajesh Gupta (UC San Diego, USA), Yan Liu (USC, USA)
Program Chairs: Mohak Shah (LG Electronics, USA), Suju Rajan (LinkedIn, USA)
Publications Chairs: Jiliang Tang (Michigan State, USA), B. Aditya Prakash (Georgia Tech, USA)
Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
bipartite graphs
correlation clustering
hypergraphs