research-article

The community-search problem and how to plan a successful cocktail party

Authors:
Mauro Sozio, Max-Planck-Institut für Informatik, Saarbrücken, Germany
Aristides Gionis, Yahoo! Research, Barcelona, Spain

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, July 2010, Pages 939–948
https://doi.org/10.1145/1835804.1835923

Published: 25 July 2010

ABSTRACT

A lot of research in graph mining has been devoted to the discovery of communities. Most of the work has focused on the scenario where communities are discovered with reference only to the input graph. However, many interesting applications call for finding the community formed by a given set of nodes. In this paper we study a query-dependent variant of the community-detection problem, which we call the community-search problem: given a graph G and a set of query nodes in the graph, we seek a subgraph of G that contains the query nodes and is densely connected.

We motivate a measure of density based on minimum degree and distance constraints, and we develop an optimum greedy algorithm for this measure. We then characterize a class of monotone constraints and generalize our algorithm to compute optimum solutions satisfying any set of monotone constraints. Finally, we modify the greedy algorithm and present two heuristic algorithms that find communities of size no greater than a specified upper bound. Our experimental evaluation on real datasets demonstrates the efficiency of the proposed algorithms and the quality of the solutions we obtain.
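
To make the minimum-degree measure concrete, the following is a minimal sketch of a peeling-style greedy procedure, written under our own assumptions rather than taken from the paper's pseudocode. It assumes an undirected graph stored as a dictionary mapping each node to its set of neighbours, ignores the distance constraints, and uses the hypothetical names `component_of` and `greedy_community`. It repeatedly deletes a node of minimum degree and stops once a query node attains the minimum degree (further deletions cannot raise that node's degree) or the query nodes become disconnected.

```python
from collections import deque


def component_of(adj, alive, start):
    """Return the set of alive nodes reachable from `start` (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def greedy_community(adj, query):
    """Peel minimum-degree nodes and keep the best subgraph containing `query`.

    `adj` maps each node to a set of neighbours (undirected, assumed symmetric).
    `query` is a non-empty collection of nodes present in `adj`.
    Returns (nodes, min_degree), or (None, -1) if the query nodes are not connected.
    """
    query = set(query)
    alive = set(adj)
    best_nodes, best_score = None, -1
    while True:
        # Restrict attention to the connected component holding the query nodes.
        comp = component_of(adj, alive, next(iter(query)))
        if not query <= comp:
            break  # the query nodes are no longer connected
        alive = comp
        deg = {u: sum(1 for v in adj[u] if v in alive) for u in alive}
        min_deg = min(deg.values())
        if min_deg > best_score:
            best_nodes, best_score = set(alive), min_deg
        if any(deg[q] == min_deg for q in query):
            break  # a query node is a bottleneck; peeling cannot improve further
        # Remove one (non-query) node of minimum degree.
        alive.remove(next(u for u in alive if deg[u] == min_deg))
    return best_nodes, best_score


if __name__ == "__main__":
    # A 4-clique {a, b, c, d} with a pendant path d - e - f.
    graph = {
        "a": {"b", "c", "d"},
        "b": {"a", "c", "d"},
        "c": {"a", "b", "d"},
        "d": {"a", "b", "c", "e"},
        "e": {"d", "f"},
        "f": {"e"},
    }
    nodes, score = greedy_community(graph, {"a", "b"})
    print(sorted(nodes), score)  # ['a', 'b', 'c', 'd'] 3
```

The sketch recomputes degrees and connectivity from scratch in every round, which keeps it short but makes it quadratic in the worst case; an efficient implementation would maintain degrees incrementally.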

Supplemental Material

kdd2010_sozio_csp_01.mov (mov, 150.1 MB)

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining
2. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms

Published in
KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804
General Chairs: Bharat Rao (Siemens), Balaji Krishnapuram (Siemens)
Program Chairs: Andrew Tomkins (Google Inc.), Qiang Yang (Hong Kong University of Science and Technology)
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
community detection
graph algorithms
graph mining
social networks
Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%