Elsevier

Data & Knowledge Engineering

Volume 117, September 2018, Pages 183-194

An overlapping community detection algorithm in complex networks based on information theory

https://doi.org/10.1016/j.datak.2018.07.009

Abstract

In this paper, a new algorithm for overlapping community detection is proposed. First, we propose a node importance evaluation matrix to calculate the importance degree of each node; second, we put forward a difference function to detect overlapping points in complex networks; finally, we use the triangle principle to detect communities in complex networks. We adopt two measures, Normalized Mutual Information and Modularity, to evaluate the algorithm. The experimental results show that our algorithm performs well at detecting overlapping communities.

Introduction

Many real-world systems take the form of complex networks, such as interpersonal relationship networks in social systems, food chain networks in biological systems and the World Wide Web. These complex networks exhibit community structure, i.e. groups of vertices that have a higher density of edges within them and a lower density of edges between them. Community detection is useful for understanding the properties of network structure and predicting the behavior of networks [1].
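The density-based notion of community structure described above can be illustrated with a minimal sketch. It assumes an undirected graph without self-loops and a given set of disjoint communities in which at least one community has two or more nodes; the function name `edge_densities` and the toy graph are hypothetical and are not taken from the paper.

```python
def edge_densities(edges, communities):
    """Return (intra_density, inter_density) for an undirected graph.

    edges: iterable of (u, v) pairs without self-loops.
    communities: list of disjoint node sets covering every node.
    """
    # Map each node to the index of its (single) community.
    label = {node: i for i, group in enumerate(communities) for node in group}
    edge_set = {frozenset(e) for e in edges}  # drop duplicate/reversed edges
    intra_edges = sum(1 for e in edge_set if len({label[n] for n in e}) == 1)
    inter_edges = len(edge_set) - intra_edges
    # Count the node pairs that could be linked inside and across communities.
    intra_pairs = sum(len(g) * (len(g) - 1) // 2 for g in communities)
    n = len(label)
    inter_pairs = n * (n - 1) // 2 - intra_pairs
    return intra_edges / intra_pairs, inter_edges / inter_pairs


# Toy graph: two triangles joined by the single edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
intra, inter = edge_densities(edges, [{1, 2, 3}, {4, 5, 6}])
print(intra, inter)  # intra is 1.0, inter is 1/9
```

On this toy graph every possible edge inside a community is present, while only one of the nine possible cross-community edges exists, which matches the informal definition.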

The problem of detecting community structure in complex networks has attracted considerable attention, and many community detection algorithms have been proposed. These methods have been successfully applied to some real complex networks. However, much of this work focuses on hard partitioning of complex networks, in which each node is assigned to exactly one community. In real networks, a node may belong to multiple communities. For example, in social networks, one person is usually involved in several social groups such as family, colleagues and friends [2]. Kelley et al. [3] pointed out that overlap is indeed an important feature of many real-world networks. Palla et al. [4] introduced the concept of overlapping communities and proposed the clique percolation method.

Overlapping community detection algorithms mainly include the clique percolation method [2,5,6], the block model [7,8], the edge clustering algorithm [9,10] and the label propagation algorithm [11,12]. Application requirements differ mainly in the degree of overlap between communities. Some approaches strictly limit the nodes that may belong to multiple communities, whereas others favor highly overlapping community structures. In this paper, a new overlapping community detection algorithm based on information theory is proposed. Important nodes are taken as cluster centers, which brings the result closer to the real community structure. The contributions of the paper are as follows.

  • (1) The node importance contribution matrix is proposed to evaluate the node importance.

  • (2) The difference function is proposed to detect the overlapping points.

The community detection algorithm we propose uses the triangle principle for community partition. The algorithm is simple and easy to understand and implement. The remainder of the paper is organized as follows. Section 2 introduces related work. Section 3 describes our algorithm in detail. Section 4 presents the experimental analysis. Finally, Section 5 concludes the paper.

Related works

Research on complex networks has attracted considerable attention, and one of the most important tasks in complex network analysis is community detection. In the past decade, scholars have proposed many effective community detection algorithms. Among them, the representative methods include clustering-based methods [13,14], modularity-based methods [15], spectral algorithms [16–18], dynamic algorithms [17–22], statistical inference-based methods [18] and

Algorithm

In this section, we propose an overlapping community discovery algorithm based on the triangle principle. Two adjacent nodes and their shared neighbor nodes can form triangles, which are used to determine whether the two nodes belong to the same community. As part of the community detection algorithm, the node importance contribution matrix is proposed.
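The triangle idea can be sketched as follows. This excerpt does not give the paper's actual decision rule, so the threshold `min_triangles` and the function names below are hypothetical and serve only to illustrate how shared neighbors of an edge can be collected and counted.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shared_neighbors(adj, u, v):
    """Nodes that close a triangle with the pair (u, v)."""
    return (adj[u] & adj[v]) - {u, v}


def same_community_guess(adj, u, v, min_triangles=1):
    # Hypothetical rule (an assumption, not the paper's criterion):
    # adjacent nodes with at least `min_triangles` shared neighbors
    # are guessed to lie in the same community.
    return v in adj[u] and len(shared_neighbors(adj, u, v)) >= min_triangles


# Toy graph: two triangles joined by the bridge edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
adj = build_adjacency(edges)
print(shared_neighbors(adj, 1, 2))      # {3}
print(same_community_guess(adj, 1, 2))  # True
print(same_community_guess(adj, 3, 4))  # False
```

In the toy graph the bridge edge (3, 4) lies in no triangle, so under this assumed rule its endpoints are not grouped together, whereas every edge inside a triangle is.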

Experiment

We evaluate the proposed algorithm experimentally. All experiments are performed on a PC with Windows XP, an i3 CPU (2.16 GHz) and 1 GB main memory. The programming environment is JDK 1.7.
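The abstract names Modularity as one of the two evaluation measures. As a rough illustration, the sketch below computes the standard Newman modularity Q = Σ_c [L_c/m − (d_c/2m)²] for a hard partition of an undirected, unweighted graph. This is an assumption: the paper may use an extended modularity for overlapping communities, which this excerpt does not specify, and the experiments themselves were run in Java rather than Python.

```python
def modularity(edges, communities):
    """Newman modularity Q of a hard partition (undirected, unweighted).

    edges: list of (u, v) pairs, each edge listed once, no self-loops.
    communities: list of disjoint node sets covering every node.
    """
    label = {node: i for i, group in enumerate(communities) for node in group}
    m = len(edges)
    internal = [0] * len(communities)    # L_c: edges inside community c
    degree_sum = [0] * len(communities)  # d_c: total degree of community c
    for u, v in edges:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree_sum))


# Toy graph: two triangles joined by the bridge edge (3, 4).
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
print(modularity(edges, [{1, 2, 3}, {4, 5, 6}]))  # 5/14, about 0.357
print(modularity(edges, [{1, 2, 3, 4, 5, 6}]))    # 0.0
```

Splitting the toy graph into its two triangles scores higher than placing all nodes in one community, which is the behavior a modularity-based evaluation relies on.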

Conclusion

In this paper, a new algorithm for overlapping community detection is proposed. First, we propose a node importance contribution matrix to calculate the similarity between each pair of nodes. Second, a difference function is proposed to detect the overlapping points. Finally, we use the triangle principle to detect communities in complex networks. The proposed algorithm can effectively detect overlapping communities in real network datasets. Next, we will perform overlapping community

Acknowledgments

The corresponding author would like to acknowledge the support of the National Natural Science Foundation of China under Grant 61402363, the Education Department of Shaanxi Province Key Laboratory Project under Grant 15JS079, the Xi'an Science Program Project under Grant 2017080CG/RC043(XALG017), the Ministry of Education of Shaanxi Province Research Project under Grant 17JK0534, and the Beilin District of Xi'an Science and Technology Project under Grant GX1625.

HongFang Zhou received her B.S. and M.S. degrees from Xi'an University of Technology in 1999 and 2002, respectively, and her Ph.D. degree from Xi'an Jiaotong University in 2006. She is now an associate professor at Xi'an University of Technology, China. Her research interests include artificial computing, software and theory, and heterogeneous information networks.

References (33)

  • P.K. Gopalan et al., Efficient discovery of overlapping communities in massive networks, Proc. Natl. Acad. Sci. Unit. States Am. (2013)
  • Y.Y. Ahn et al., Link communities reveal multiscale complexity in networks, Nature (2010)
  • T. Evans et al., Line graphs, link partitions and overlapping communities, Phys. Rev. E (2009)
  • A. Lancichinetti et al., Finding statistically significant communities in networks, PLoS One (2011)
  • T. Népusz et al., Fuzzy communities and the concept of bridgeness in complex networks, Phys. Rev. E (2008)
  • A. Lancichinetti et al., Consensus clustering in complex networks, Sci. Rep. (2012)

    Yao Zhang received his B.S. degree in computer science and technology from Xi'an University of Posts & Telecommunications in 2015. He is now studying for a master's degree in the School of Computer Science and Engineering, Xi'an University of Technology. His research interests focus on statistical machine learning and data mining.

    Jin Li received his B.S. degree from Xi'an University of Technology in 2014. He is a postgraduate student at Xi'an University of Technology. His research interests include artificial computing and data mining.