A discrete shuffled frog-leaping algorithm to identify influential nodes for influence maximization in social networks

doi:10.1016/j.knosys.2019.07.004

Knowledge-Based Systems

Volume 187, January 2020, 104833

https://doi.org/10.1016/j.knosys.2019.07.004

Abstract

The influence maximization problem aims to select a subset of k most influential nodes from a given network such that the spread of influence triggered by the seed set is maximized. Greedy-based algorithms are time-consuming because they must approximate the expected influence spread of a given node set accurately, and they do not scale well to large networks, especially when the propagation probability is large. Conventional heuristics based on network topology or confined diffusion paths tend to suffer from low solution accuracy or huge memory cost. In this paper, an effective discrete shuffled frog-leaping algorithm (DSFLA) is proposed to solve the influence maximization problem more efficiently. A novel encoding mechanism and discrete evolutionary rules are conceived for the virtual frog population based on network topology. To facilitate the global exploratory solution, a novel local exploitation mechanism combining deterministic and random walk strategies is put forward to improve the suboptimal meme of each memeplex in the frog population. Experimental results on influence spread in six real-world networks, together with statistical tests, show that DSFLA selects influential seed nodes effectively and is superior to several state-of-the-art alternatives.

Introduction

Social networks have become powerful platforms for information diffusion and viral marketing by attracting billions of loyal users. An underlying driver of these capabilities is social influence, which maps the interactions between individuals in the network and can be evaluated based on trust and reputation [1]. A typical application promoted by social networks is viral marketing [2], which exploits the ‘word-of-mouth’ effect that is rooted in the interpersonal influence relationships among consumers and can reshape consumers’ attitudes and behaviors [3]. The influence maximization problem is to select a subset of $k$ influential seed nodes that maximizes the spread of influence in the network. The problem was first posed from a network perspective by Domingos and Richardson [4], who sought to identify the customers with the greatest potential in order to maximize the expected profit of a product promotion.

As emphasized in [5], [6], there are two challenges in tackling the influence maximization problem. The first is to estimate the influence spread of a given node set accurately, which has been proved to be #P-hard. The second is to provide effective and efficient algorithms for selecting a subset of influential nodes that maximizes the spread of influence in the network. Kempe et al. [7] first formulated influence maximization as a discrete optimization problem and proposed a greedy approach with guaranteed solution accuracy. However, experimental results [8], [9] showed that the greedy algorithm is time-consuming, especially in large-scale networks, because it has to run $k$ rounds to select the targeted seed nodes. In each round, the algorithm carries out $R$ ( $R \geq 10{,}000$ ) Monte-Carlo simulations to approximate the marginal gain of each of the $N$ nodes in the network, and each simulation has to traverse the $M$ edges of the network. Consequently, the time complexity of the original greedy algorithm is $O(kNMR)$.
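To make the source of the $O(kNMR)$ cost concrete, the following is a minimal sketch of greedy seed selection with Monte-Carlo spread estimation. It assumes an independent cascade model with a single uniform propagation probability `p` and a graph stored as an adjacency dictionary that lists every node as a key; the function names `simulate_ic`, `estimate_spread`, and `greedy_seeds` are illustrative and are not taken from the paper or from [7].

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """Run one independent-cascade simulation and return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # A newly active node gets one chance to activate each inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p, runs, rng):
    """Average the cascade size over `runs` simulations (the R in the text)."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, p, runs, seed=0):
    """Pick k seeds, adding in each round the node with the largest estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):                       # k rounds
        best_node, best_spread = None, -1.0
        for v in graph:                      # up to N candidates per round
            if v in chosen:
                continue
            # R simulations per candidate, each touching up to M edges.
            spread = estimate_spread(graph, chosen + [v], p, runs, rng)
            if spread > best_spread:
                best_node, best_spread = v, spread
        chosen.append(best_node)
    return chosen
```

The three nested levels (rounds, candidates, simulations) together with the edge traversal inside each simulation give the four factors of the complexity bound. With the $R \geq 10{,}000$ quoted above, even a modest network makes this loop expensive, which is the motivation for the cheaper estimators discussed next.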

Following up on this seminal work, novel influence estimators and influential node selection approaches have emerged to solve the influence maximization problem more efficiently. Chen [10] proposed an improved greedy algorithm that prunes the edges that hardly take part in influence spread in the network. Jiang et al. [11] proposed an expected diffusion value estimator to evaluate the spread of influence within the one-hop area of given candidate nodes. However, it performs less effectively than the local influence estimator, which optimizes the expected influence spread within the two-hop area of given candidate nodes [12]. Kimura et al. [13] assumed that influence only spreads along the shortest and second shortest paths, and proposed a shortest path-based influence maximization algorithm. Furthermore, by assuming that influence spreads along paths that are independent of each other, Kim et al. [14] proposed a parallel influence path-based algorithm to identify the seed nodes faster. Cao et al. [15] systematically studied the influence maximization problem based on community detection. They transformed influence maximization into an optimal resource allocation problem and proposed a dynamic programming algorithm to find an optimal seed allocation. As demonstrated in [16], community-based influence maximization algorithms are generally faster than traditional greedy algorithms, but their accuracy and scalability need improvement. Compared with the original simple greedy algorithm, these methods are more efficient because they reduce or avoid Monte-Carlo simulations. However, this efficiency comes at the expense of solution accuracy or memory cost. Therefore, developing effective and efficient methods for influence maximization in large-scale networks remains an open research topic in social network analysis and is of great significance due to its promising applications in the spread of information, such as innovation diffusion and viral marketing.

The effectiveness and robustness of meta-heuristic algorithms based on swarm intelligence have been widely validated in applications to complex optimization problems such as symbolic regression [17], feature selection in data mining and machine learning [18], planning sports training sessions [19], and influence maximization [20], [21]. In this paper, a discrete shuffled frog-leaping algorithm (DSFLA) is proposed based on network topology characteristics to identify influential nodes for influence maximization. The main contributions of our paper are as follows.

• An encoding mechanism for virtual frog individuals and discrete evolutionary rules for the frog population are conceived based on network topology structure. The framework of the discrete shuffled frog-leaping algorithm for the influence maximization problem is then presented.

• To facilitate the global exploratory solution during the evolutionary process, a local exploitation mechanism combining deterministic and random walk strategies is put forward to improve the suboptimal meme of each memeplex in the frog population.

• The orthogonal experimental design method is adopted to optimize the parameter settings of DSFLA. The experimental results and statistical tests on six real-world networks show that the proposed DSFLA is effective and efficient for influence maximization and scales to large networks.
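For readers unfamiliar with the vocabulary of frogs, memes, and memeplexes, the sketch below shows the general shape of a shuffled frog-leaping loop applied to seed-set selection. It is not the paper's DSFLA: the encoding (a frog is a list of `k` distinct node ids), the toy fitness `one_hop_cover`, and the `leap` update rule are all assumptions made for illustration, and the paper's own discrete rules and local influence estimator are not reproduced in this excerpt.

```python
import random


def one_hop_cover(graph, seeds):
    """Toy fitness (an assumption, not the paper's estimator):
    count nodes that are seeds or direct neighbors of a seed."""
    covered = set(seeds)
    for s in seeds:
        covered.update(graph[s])
    return len(covered)


def leap(worst, best, nodes, rng):
    """Hypothetical discrete leap: overwrite one random position of `worst`
    with a node from `best` that `worst` lacks (or any unused node)."""
    candidates = [v for v in best if v not in worst]
    if not candidates:
        candidates = [v for v in nodes if v not in worst]
    new = list(worst)
    new[rng.randrange(len(new))] = rng.choice(candidates)
    return new


def sfla(graph, k, frogs=12, memeplexes=3, local_steps=5, rounds=10, seed=0):
    """Generic shuffled frog-leaping loop over seed sets of size k < |V|."""
    rng = random.Random(seed)
    nodes = list(graph)

    def fitness(frog):
        return one_hop_cover(graph, frog)

    population = [rng.sample(nodes, k) for _ in range(frogs)]
    for _ in range(rounds):
        # Shuffle step: rank all frogs, then deal them round-robin into memeplexes.
        population.sort(key=fitness, reverse=True)
        groups = [population[i::memeplexes] for i in range(memeplexes)]
        for group in groups:
            for _ in range(local_steps):
                group.sort(key=fitness, reverse=True)
                candidate = leap(group[-1], group[0], nodes, rng)
                # Keep the leap only if it improves the memeplex's worst frog.
                if fitness(candidate) > fitness(group[-1]):
                    group[-1] = candidate
        population = [frog for group in groups for frog in group]
    return max(population, key=fitness)
```

The sketch keeps only the two ideas that the contributions above rely on: local improvement of the weakest member inside each memeplex, and periodic reshuffling so that good memes spread across memeplexes. The classic algorithm's fallback steps (leaping toward the global best, then random reset) are omitted for brevity.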

The remainder of this paper is organized as follows: Section 2 reviews related works. Influence maximization problem, the independent cascade model and an effective influence estimator used in this paper are introduced in Section 3. Section 4 gives the proposed discrete shuffled frog-leaping algorithm and the framework of DSFLA for influence maximization. Performance validation of DSFLA and statistical tests are provided in Section 5. Section 6 concludes this paper with future works.

Section snippets

Related works

Since the seminal work by Domingos and Richardson [4], great attention has been paid to this problem. In general, most existing influence maximization algorithms fall into three categories: greedy-based algorithms, reverse influence sampling algorithms, and advanced heuristic algorithms.

Influence maximization problem

Definition 1

Let $G = (V, E)$ be a network, where $V$ is the node set and $E$ is the edge set of the network. The influence maximization problem aims to select $k$ ( $1 \leq k < |V|$ ) influential nodes as the seed set $S$ such that the number of influenced nodes triggered by $S$, denoted as the influence spread $\sigma(S)$, is maximum under a given propagation model:

$$S^{*} = \operatorname*{arg\,max}_{S \subseteq V,\ |S| = k} \sigma(S)$$

where $S$ is a candidate seed set, $\sigma(S)$ is the expected number of influenced nodes that are triggered by $S$, and $S^{*}$ is the best seed set that

Proposed method

As discussed above, the expected influence spread of given candidate nodes can be evaluated according to the local influence estimator, so optimization algorithms can be utilized to maximize the fitness value of the LIE function. Shuffled frog-leaping algorithm [40] is an advanced meta-heuristic algorithm, and its effectiveness on optimization problems has been validated in many studies [41], [42]. Inspired by the efficient evolutionary mechanism based on swarm intelligence, we try to make

Datasets and baseline algorithms

To validate the performance of the proposed DSFLA on influence maximization problem, experiments are carried out on six real-world social networks, as shown in Table 1.

AstroPh and CondMat [46] are two undirected collaboration networks that cover scientific collaborations between authors of papers submitted to the arXiv Astro Physics and Condensed Matter categories, respectively. Slashdot [47] is a technology-related news social network known for its specific user community, and it is treated as an undirected

Conclusions and future works

The shuffled frog-leaping algorithm which combines deterministic and random search strategies shows excellent performance on various complex optimization problems. In this paper, a discrete shuffled frog-leaping algorithm is proposed specially to identify influential nodes for influence maximization. In the proposed framework, discrete encoding mechanism and evolutionary rules are conceived based on network topology, and a local degree-based replacement strategy is presented to cooperate with

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant Nos. 21503101 and 61702240) and the CERNET Innovation Project, China (NGII20170422).

References (49)

Ureña R. et al., A review on trust propagation and opinion dynamics in social networks and group decision making frameworks, Inform. Sci. (2019)

Peng S. et al., Influence analysis in social networks: A survey, J. Netw. Comput. Appl. (2018)

Zhu T. et al., Maximizing the spread of influence ranking in social networks, Inform. Sci. (2014)

Kundu S. et al., Deprecation based greedy strategy for target set selection in large scale social networks, Inform. Sci. (2015)

Gong M. et al., Influence maximization in social networks based on discrete particle swarm optimization, Inform. Sci. (2016)

Shang J. et al., CoFIM: A Community-based framework for influence maximization on large-scale networks, Knowl.-Based Syst. (2017)

Golafshani E.M., Introduction of biogeography-based programming as a new algorithm for solving problems, Appl. Math. Comput. (2015)

Wei J. et al., A BPSO-SVM algorithm based on memory renewal and enhanced mutation mechanisms for feature selection, Appl. Soft Comput. (2017)

Fister I. et al., Planning the sports training sessions with the bat algorithm, Neurocomputing (2015)

Cui L. et al., DDSE: A novel evolutionary algorithm based on degree-descending search strategy for influence maximization in social networks, J. Netw. Comput. Appl. (2018)

Gui C. et al., Overlapping communities detection based on spectral analysis of line graphs, Physica A (2018)

Shang J. et al., IMPC: Influence maximization based on multi-neighbor potential in community networks, Physica A (2018)

Tang J. et al., Maximizing the spread of influence via the collective intelligence of discrete bat algorithm, Knowl.-Based Syst. (2018)

Newman M.J., A measure of betweenness centrality based on random walks, Social Networks (2005)

Al-garadi M.A. et al., Identification of influential spreaders in online social networks using interaction weighted K-core decomposition method, Physica A (2017)

Ureña R. et al., A social network based approach for consensus achievement in multiperson decision making, Inf. Fusion (2019)

Xue Y. et al., Fuzzy Rough set algorithm with binary shuffled frog-leaping (BSFL-frsa): An innovative approach for identifying main drivers of carbon exchange in temperate deciduous forests, Ecol. Indicators (2017)

Luo J. et al., A new hybrid memetic multi-objective optimization algorithm for multi-objective optimization, Inform. Sci. (2018)

Mao M. et al., Grid-connected modular PV-converter system with shuffled frog leaping algorithm based DMPPT controller, Energy (2018)

Brown J.J. et al., Social ties and word-of-mouth referral behavior, J. Consum. Res. (1987)

P. Domingos, M. Richardson, Mining the network value of customers, in: ACM SIGKDD International Conference on Knowledge...

Lee J.R. et al., A query approach for influence maximization on specific users in social networks, IEEE Trans. Knowl. Data Eng. (2015)

D. Kempe, J. Kleinberg, Maximizing the spread of influence through a social network, in: ACM SIGKDD International...

J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. Vanbriesen, N. Glance, Cost-effective outbreak detection in...

Cited by (92)

PHEE: Identifying influential nodes in social networks with a phased evaluation-enhanced search
2024, Neurocomputing
How to identify a small fraction of users with the best capacity to influence other users is central to the research of social network analysis. This problem is termed influence maximization (IM) and is known for its extensive applications. IM can be formulated as a combinatorial optimization problem, which has been shown to be NP-hard under some diffusion models. Therefore, seeking for high-accuracy IM algorithms with acceptable running time has attracted much attention in literature. However, most of existing IM algorithms only adopt a uniform mechanism in the whole solution search process, which lacks of flexible response when the algorithms trap in local optimum. This paper proposes a phased evaluation-enhanced (PHEE) approach for IM, which utilizes two distinct strategies to search the optimal solutions: The first one is a random range division-based evolutionary algorithm for improving solution quality; the second is a fast convergence strategy for searching an optimal solution. Two PHEE-based algorithms, MDD-PHEE and GCI-PHEE, are generated and are evaluated on 10 real-world social networks of different sizes and types. Experimental results demonstrate the effectiveness of PHEE; in particular, MDD-PHEE obtains the best influence spread on all networks compared with the state-of-the-art algorithms, and has a better performance than the time-consuming algorithm CELF on four datasets.
SRFA-GRL: Predicting group influence in social networks with graph representation learning
2023, Information Sciences
Group influence evaluation is the fundamental research in social network analysis, whose main task is evaluating the influence of the group consisting of arbitrary nodes in the social network. Many methods have been proposed to measure group influence, such as centrality-based, Monte Carlo, and path-based methods. Graph Representation Learning (GRL) has profound success in social network node-level, edge-level, and graph-level tasks. GRL integrates nodes, edges, and other information in the network structure to calculate the nodes' embeddings jointly and provide embedding features with richer information than traditional methods. Group influence is related to many factors; existing methods only focus on a single aspect but ignore the various properties. This paper proposes a Subgraph Reconstruction Feature Aggregation Graph Representation Learning based (SRFA-GRL) framework to evaluate the group influence. In the SRFA-GRL framework, a subgraph reconstruction method is proposed to capture the node distribution of the group, and a vector similarity method is proposed to compute the relative distance between the maximal connected subgraph and group embeddings. Vast experiments are conducted on eight real social networks to analyze the effectiveness of the SRFA-GRL model, and the experimental results show that the SRFA-GRL framework outperforms baseline methods.
Dominant coverage for target users at the lowest cost under competitive propagation in social networks
2023, Computer Networks
Multiple pieces of information disseminated in social networks can create a competitive environment where the influence of each message will be reduced. Thus, how to increase the influence under competitive propagation is significant for many applications such as product marketing and rumor suppression. In this paper, to make the dissemination targeted and maximize revenue, we propose the lowest cost problem to achieve dominant coverage for target users under competitive propagation(LC-CTU problem), using target user profiles and setting the objective to dominate coverage. To better model this scene, this paper proposes a competitive restricted propagation independent cascade model based on weights of attribute-generating edges(IC-CRA model), which adds competition, delay, and attributes to the basic independent cascade model. After proving that the LC-CTU problem under the IC-CRA model is NP-hard and the objective function is monotonic and submodular, the problem can be solved by a greedy algorithm. However, the greedy algorithm is time-consuming for large social networks. Thus, to improve the efficiency of the algorithm, we propose the lowest cost heuristic algorithm based on target users and restricted competition (LCH-TU algorithm), which uses the local influence update of each node for selecting the seed node. Finally, this paper demonstrates the effectiveness, efficiency, and scalability of the LCH-TU algorithm through extensive experiments on seven datasets, including real and artificial networks.
Identification of influential users with cost minimization via an improved moth flame optimization
2023, Journal of Computational Science
The issue of influence maximization has received great attention due to its application potential. Some traditional models used to solve the influence maximization problem only consider the maximum propagation range that the seed node set can reach but ignore the cost difference between the potential candidate nodes. This is not characteristic of real-world network behavior. Thus, in response to this research gap, a multi-objective optimization model based on maximizing influence spreading while minimizing cost is proposed in this paper. On the basis of maintaining the effective characteristics of non-dominated sorting moth flame optimization, the diversity weight and mutation mechanism are integrated into the algorithm to maintain the population diversity in the exploration stage. The evolutionary rules of the original moth flame optimization are redesigned to meet the desired needs of multi-objective influence optimization. By considering three types of real-world social networks, we show that our proposed method can generate a set of well-distributed Pareto optimal solutions.
TSIFIM: A three-stage iterative framework for influence maximization in complex networks
2023, Expert Systems with Applications
Citation Excerpt :
Unlike some existing researches, IMUD effectively avoids non-target users in viral marketing networks, i.e., those who “hate” the promotion of an activity. Based on network topology characteristic, Tang et al. (2020) proposed the discrete shuffled frog-leaping algorithm for solving the IM problem. Calio and Tagarelli (2021) put forward the ADITUM algorithm to determine the influential spreaders in complex networks, which disperses the seeds as much as possible according to the side-information available at node level, where the side-information corresponds to the categorical attribute values.
The problem of influence maximization is a classic issue that has been well-studied in the field of network science, but most of existing researches are compromising among computational complexity or result accuracy. In this work, a three-stage iterative framework for influence maximization (TSIFIM) is presented to find a set of seed spreaders in complex networks. In TSIFIM, the initial candidate seeds are first selected by considering the global communicability of each node and its importance in their local network. Then, in addition to the candidate seeds, other remained nodes are assigned to the specific communities based on the proposed local resource allocation similarity index, and the core node in each community which satisfies the local influence threshold condition are selected as the supplementary candidate seeds. Furthermore, we employ an adaptive search strategy to find the optimal solution among these candidates. The proposed algorithm is compared with eight popular influence maximization algorithms on nine real-world networks to verify the performance. Experimental results show that TSIFIM has better performance in terms of influence spreading, sensitivity analysis, seed dispersion and statistical test.
Improved shuffled Frog leaping algorithm with unsupervised population partitioning strategies for complex optimization problems
2024, Journal of Combinatorial Optimization


^☆: No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.07.004.
