ABSTRACT
A lot of research in graph mining has been devoted to the discovery of communities. Most of the work has focused on the scenario where communities need to be discovered with reference only to the input graph. However, for many interesting applications one is interested in finding the community formed by a given set of nodes. In this paper we study a query-dependent variant of the community-detection problem, which we call the community-search problem: given a graph G and a set of query nodes in the graph, we seek to find a subgraph of G that contains the query nodes and is densely connected.
We motivate a measure of density based on minimum degree and distance constraints, and we develop an optimum greedy algorithm for this measure. We proceed by characterizing a class of monotone constraints, and we generalize our algorithm to compute optimum solutions satisfying any set of monotone constraints. Finally, we modify the greedy algorithm and present two heuristic algorithms that find communities of size no greater than a specified upper bound. Our experimental evaluation on real datasets demonstrates the efficiency of the proposed algorithms and the quality of the solutions we obtain.
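To make the minimum-degree objective concrete, here is a minimal Python sketch of a greedy peeling procedure in the spirit of the one the abstract describes. It is an illustration under stated assumptions, not the paper's implementation: the function names `connected_component` and `greedy_community_search` are hypothetical, the graph is assumed to be an undirected adjacency dictionary of sets, the distance constraint is left out, and the code favors clarity over efficiency by recomputing degrees at every step.

```python
from collections import deque


def connected_component(adj, alive, start):
    """Return the set of nodes in `alive` reachable from `start` (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in alive and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def greedy_community_search(adj, query):
    """Illustrative sketch (names are hypothetical, not from the paper).

    adj   : dict mapping each node to the set of its neighbours (undirected).
    query : non-empty iterable of query nodes, all assumed to be keys of adj.

    Repeatedly deletes a minimum-degree node and remembers the intermediate
    connected subgraph, containing all query nodes, whose minimum degree is
    largest. Returns (nodes, min_degree), or (None, -1) if the query nodes
    are not connected in the input graph.
    """
    query = set(query)
    if not query:
        raise ValueError("query must contain at least one node")
    anchor = next(iter(query))
    alive = set(adj)
    best_nodes, best_min_degree = None, -1
    while True:
        # Keep only the component that contains the query nodes.
        alive = connected_component(adj, alive, anchor)
        if not query <= alive:
            break  # query nodes have been separated; stop peeling
        degree = {u: sum(1 for v in adj[u] if v in alive) for u in alive}
        u = min(alive, key=degree.get)
        if degree[u] > best_min_degree:
            best_nodes, best_min_degree = set(alive), degree[u]
        if u in query:
            break  # a query node has minimum degree; it cannot be removed
        alive.remove(u)
    return best_nodes, best_min_degree


if __name__ == "__main__":
    # A 4-clique {0, 1, 2, 3} with a path 3 - 4 - 5 attached.
    graph = {
        0: {1, 2, 3},
        1: {0, 2, 3},
        2: {0, 1, 3},
        3: {0, 1, 2, 4},
        4: {3, 5},
        5: {4},
    }
    print(greedy_community_search(graph, {0}))
```

In this toy graph, querying node 0 peels off the attached path and leaves the 4-clique, while querying nodes 0 and 5 together forces the whole graph to be kept, because node 5 has the minimum degree from the start.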
Index Terms
- The community-search problem and how to plan a successful cocktail party