# An Efficient Partition-Based Approach to Identify and Scatter Multiple Relevant Spreaders in Complex Networks

## Abstract

## 1. Introduction

## 2. Related Work

## 3. Proposed Method

#### 3.1. Measuring the Relevance of the Vertices

#### 3.2. Spreaders Distribution

- If a community is small, then we assign one spreader.
- The number of spreaders assigned to a community is proportional to its size.
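
The two rules above can be sketched as a small allocation routine. This is only an illustrative reading, not the authors' implementation: the function name `allocate_spreaders`, the choice to reserve one spreader per community before sharing the rest, the largest-remainder rounding, and the fallback used when the budget is smaller than the number of communities are all assumptions introduced here.

```python
def allocate_spreaders(community_sizes, rho):
    """Split a budget of `rho` spreaders across communities.

    Assumed policy (hypothetical, for illustration only):
      1. every community receives one spreader, so small ones get exactly one;
      2. the remaining budget is shared in proportion to community size,
         rounded with the largest-remainder rule so the total equals `rho`.
    """
    k = len(community_sizes)
    if rho < k:
        # Assumption: if the budget cannot cover every community,
        # seed only the `rho` largest communities.
        largest = set(sorted(range(k), key=lambda i: community_sizes[i],
                             reverse=True)[:rho])
        return [1 if i in largest else 0 for i in range(k)]

    total = sum(community_sizes)
    rest = rho - k  # budget left after one spreader per community

    # Integer arithmetic avoids floating-point ties in the remainders.
    floors = [rest * s // total for s in community_sizes]
    remainders = [rest * s % total for s in community_sizes]

    leftover = rest - sum(floors)
    by_remainder = sorted(range(k), key=lambda i: remainders[i], reverse=True)
    for i in by_remainder[:leftover]:
        floors[i] += 1

    return [1 + f for f in floors]


if __name__ == "__main__":
    sizes = [50, 30, 15, 5]
    print(allocate_spreaders(sizes, 10))
```

Reserving one spreader per community before the proportional split keeps the total exactly equal to the budget; a plain `max(1, round(share))` rule would overshoot when there are many small communities.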

## 4. Evaluation Design

#### 4.1. Benchmark Methods

#### 4.1.1. Degree Centrality (DEG)

#### 4.1.2. Closeness Centrality (CLO)

#### 4.1.3. Betweenness Centrality (BET)

#### 4.1.4. VoteRank Method (VR)

#### 4.1.5. HybridRank Centrality (HC)

#### 4.1.6. Indirect Spreading (SC)

#### 4.1.7. Improved K-Shell (IKS)

#### 4.2. Spreading Model
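
The figure captions state that spreading is simulated with an SIR model with recovery rate $\mu = 1$ and spreading rate $\beta$. A minimal sketch of such a simulation follows, assuming a discrete-time process on an undirected graph stored as an adjacency dictionary, and assuming the final spreading scope is the number of recovered nodes once no infected node remains. The function names and the averaging over repeated runs are illustrative choices, not taken from the paper.

```python
import random


def sir_final_scope(adj, seeds, beta, mu=1.0, rng=None):
    """Run one discrete-time SIR cascade and return the number of recovered nodes.

    adj   : dict mapping each node to an iterable of its neighbours
    seeds : initially infected nodes (the source spreaders)
    beta  : probability that an infected node infects a susceptible neighbour
    mu    : probability that an infected node recovers at the end of a step
    """
    if not 0.0 < mu <= 1.0:
        raise ValueError("mu must be in (0, 1] so the process terminates")
    rng = rng or random.Random()
    infected = set(seeds)
    recovered = set()

    while infected:
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        # With mu = 1 every infected node recovers after a single step.
        still_infected = {u for u in infected if rng.random() >= mu}
        recovered |= infected - still_infected
        infected = still_infected | newly_infected

    return len(recovered)


def mean_final_scope(adj, seeds, beta, runs=100, seed=0):
    """Average the final scope over several independent cascades (assumed protocol)."""
    rng = random.Random(seed)
    return sum(sir_final_scope(adj, seeds, beta, rng=rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(mean_final_scope(path, [0], beta=0.5))
```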

#### 4.3. Performance Metrics
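
The Figure 3 caption refers to the average shortest path length $L_S$ among the selected source spreaders. The sketch below shows one way to compute such a quantity, assuming it is the mean hop distance over all unordered pairs of spreaders in an unweighted, undirected graph. Skipping unreachable pairs is an assumption made here, and the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_spreader_distance(adj, spreaders):
    """Mean shortest-path length over all unordered pairs of spreaders.

    Pairs in different connected components are skipped (assumption).
    Returns 0.0 when fewer than two mutually reachable spreaders exist.
    """
    spreaders = list(spreaders)
    total, pairs = 0, 0
    for i, u in enumerate(spreaders):
        dist = bfs_distances(adj, u)
        for v in spreaders[i + 1:]:
            if v in dist:
                total += dist[v]
                pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(average_spreader_distance(path, [0, 1, 3]))
```

A larger value indicates that the chosen spreaders lie farther apart, which is the property a scattering strategy aims for.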

#### 4.4. Data Description

## 5. Results and Discussion

#### Computational Efficiency

## 6. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Moore, S. Gartner Top 10 Data and Analytics Trends 2020. Available online: https://www.gartner.com/smarterwithgartner/gartner-top-10-data-analytics-trends/ (accessed on 11 May 2019).
2. Costa, L.D.F.; Oliveira, O.N.; Travieso, G.; Rodrigues, F.A.; Villas Boas, P.R.; Antiqueira, L.; Viana, M.P.; Correa Rocha, L.E. Analyzing and modeling real-world phenomena with complex networks: A survey of applications. Adv. Phys. **2011**, 60, 329–412.
3. Barabási, A.L. Spreading Phenomena. In Network Science; Cambridge University Press: Cambridge, UK, 2016; Chapter 10; p. 474.
4. Kempe, D.; Kleinberg, J.; Tardos, E. Maximizing the spread of influence through a social network. Theory Comput. **2015**, 11, 105–147.
5. Lü, L.; Chen, D.; Ren, X.L.; Zhang, Q.M.; Zhang, Y.C.; Zhou, T. Vital nodes identification in complex networks. Phys. Rep. **2016**, 650, 1–63.
6. Yang, G.; Benko, T.P.; Cavaliere, M.; Huang, J.; Perc, M. Identification of influential invaders in evolutionary populations. Sci. Rep. **2019**, 9, 1–12.
7. Bucur, D.; Iacca, G. Influence Maximization in Social Networks with Genetic Algorithms. In Lecture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer International Publishing: Cham, Switzerland, 2016; Volume 9597, pp. 379–392.
8. Konotopska, K.; Iacca, G. Graph-Aware Evolutionary Algorithms for Influence Maximization. In Proceedings of the 2021 Genetic and Evolutionary Computation Conference Companion (GECCO ’21 Companion), Lille, France, 10–14 July 2021; Volume 1.
9. Erkol, B.; Castellano, C.; Radicchi, F. Systematic comparison between methods for the detection of influential spreaders in complex networks. Sci. Rep. **2019**, 9, 1–11.
10. Yang, G.; Cavaliere, M.; Zhu, C.; Perc, M. Ranking the invasions of cheaters in structured populations. Sci. Rep. **2020**, 10, 2231.
11. Kitsak, M.; Gallos, L.K.; Havlin, S.; Liljeros, F.; Muchnik, L.; Stanley, H.E.; Makse, H.A. Identification of influential spreaders in complex networks. Nat. Phys. **2010**, 6, 888–893.
12. Fortunato, S. Community detection in graphs. Phys. Rep. **2010**, 486, 75–174.
13. Opsahl, T.; Agneessens, F.; Skvoretz, J. Node centrality in weighted networks: Generalizing degree and shortest paths. Soc. Netw. **2010**, 32, 245–251.
14. Al-Garadi, M.A.; Varathan, K.D.; Ravana, S.D.; Ahmed, E.; Mujtaba, G.; Khan, M.U.S.; Khan, S.U. Analysis of Online Social Network Connections for Identification of Influential Users. ACM Comput. Surv. **2018**, 51, 1–37.
15. Wang, M.; Li, W.; Guo, Y.; Peng, X.; Li, Y. Identifying influential spreaders in complex networks based on improved k-shell method. Physica A **2020**, 554, 124229.
16. Liu, Y.; Tang, M.; Zhou, T.; Do, Y. Improving the accuracy of the k-shell method by removing redundant links: From a perspective of spreading dynamics. Sci. Rep. **2015**, 5, 13172.
17. Ma, L.L.; Ma, C.; Zhang, H.F.; Wang, B.H. Identifying influential spreaders in complex networks based on gravity formula. Physica A **2016**, 451, 205–212.
18. Zhang, J.X.; Chen, D.B.; Dong, Q.; Zhao, Z.D. Identifying a set of influential spreaders in complex networks. Sci. Rep. **2016**, 6, 1–9.
19. Ahajjam, S.; Badir, H. Identification of influential spreaders in complex networks using HybridRank algorithm. Sci. Rep. **2018**, 8, 1–9.
20. Yu, S.; Gao, L.; Xu, L.; Gao, Z.Y. Identifying influential spreaders based on indirect spreading in neighborhood. Physica A **2019**, 523, 418–425.
21. Batagelj, V.; Zaversnik, M. An O(m) Algorithm for Cores Decomposition of Networks. arXiv **2003**, arXiv:cs/0310049.
22. Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. **1978**, 1, 215–239.
23. Newman, M.E.J. The structure and function of complex networks. SIAM Rev. **2003**, 45, 167–256.
24. Bodendorf, F.; Kaiser, C. Detecting Opinion Leaders and Trends in Online Communities. In Proceedings of the 2010 Fourth International Conference on Digital Society, Saint Maarten, Netherlands Antilles, 10–16 February 2010; pp. 124–129.
25. Karunakaran, R.K.; Manuel, S.; Satheesh, E.N. Spreading Information in Complex Networks: An Overview and Some Modified Methods. In Graph Theory-Advanced Algorithms and Applications; Sirmacek, B., Ed.; InTech: London, UK, 2018; Volume 1, pp. 1–17.
26. Batagelj, V.; Mrvar, A. Pajek Datasets (USAir). Available online: http://vlado.fmf.uni-lj.si/pub/networks/data/ (accessed on 16 January 2020).
27. Newman, M. NetScience Dataset. Available online: http://networkrepository.com/netscience.php (accessed on 16 January 2020).
28. Leskovec, J.; Kleinberg, J.; Faloutsos, C. Graph Evolution: Densification and Shrinking Diameters (Email-EU Dataset). ACM Trans. Knowl. Discov. Data (ACM TKDD) **2007**, 1, 2-es.
29. Boguna, M.; Pastor-Satorras, R.; Diaz-Guilera, A.; Arenas, A. PGP Giant Component (PGP Dataset). Available online: http://networkrepository.com/PGPgiantcompo.php (accessed on 16 January 2020).
30. Newman, M.E.J. The structure of scientific collaboration networks (CondMat Dataset). Proc. Natl. Acad. Sci. USA **2001**.
31. Leskovec, J.; Adamic, L.A.; Huberman, B.A. The Dynamics of Viral Marketing (Amazon Dataset). ACM Trans. Web **2007**, 1, 5-es.
32. Yang, J.; Leskovec, J. Defining and Evaluating Network Communities based on Ground-truth (DBLP and YouTube Datasets). CoRR **2012**.
33. Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. **2008**, 2008, P10008.
34. Wang, J.; Hou, X.; Li, K.; Ding, Y. A novel weight neighborhood centrality algorithm for identifying influential spreaders in complex networks. Physica A **2017**, 475, 88–105.
35. Hu, Z.L.; Liu, J.G.; Yang, G.Y.; Ren, Z.M. Effects of the distance among multiple spreaders on the spreading. EPL (Europhys. Lett.) **2014**, 106, 18002.

**Figure 1.** The final spreading scope for PBSI and the benchmark methods on nine real networks when varying the spreading rate ($\beta$). The number of initial source spreaders $\rho$ is set as follows: $\rho = 20$ for (**a**,**b**); $\rho = 50$ for (**c**,**d**); $\rho = 100$ for (**e**,**f**); $\rho = 200$ for (**g**,**h**); and $\rho = 10\mathrm{k}$ for (**i**). SIR model with $\mu = 1$. The influence spread ($F_{SS}$) is on the y-axis and the propagation probability $\beta$ on the x-axis. Acronyms are defined in Section 4.1.

**Figure 2.** The final spreading scope for PBSI and the benchmark methods on six real networks and three spreading rates ($\beta$) when varying the number of spreaders ($\rho$). SIR model with $\mu = 1$. $\rho$ ranges up to 100 in PGP, CondMat, and Email-All, up to 400 in Amazon, up to 600 in DBLP, and up to 15k in the YouTube network. Plots (**a**–**f**) were computed with $\beta = 0.07$, plots (**g**–**l**) with $\beta = 0.10$, and plots (**m**–**r**) with $\beta = 0.13$. The number of spreaders $\rho$ is on the x-axis and $F_{SS}$ on the y-axis. Acronyms are defined in Section 4.1.

**Figure 3.** The average shortest path length $L_S$ among the source spreaders selected by PBSI and the benchmark methods on eight real networks when varying the number of spreaders ($\rho$). $\rho$ ranges from 1 to 50 in USAir (**a**), 1 to 100 in the next five networks (**b**–**f**), 50 to 400 in Amazon (**g**), and 1 to 600 in DBLP (**h**). The average shortest path length is on the y-axis and the number of spreaders $\rho$ on the x-axis. Acronyms are defined in Section 4.1.

**Figure 4.** The profit margin achieved by the reference methods combined with the distribution strategy, relative to their simple versions, on eight real complex networks. Acronyms are defined in Section 4.1.

**Figure 5.** Time in seconds consumed by the two steps of PBSI and by the benchmark methods for the mid-size and large-size networks. Axes are in log scale. Each of the five axes corresponds to one network, and each line corresponds to one evaluated method. Acronyms are defined in Section 4.1.

**Table 1.** Summary of basic topological properties of the 9 real networks. V and E are the total number of nodes and edges, respectively. $\langle k \rangle$ is the average degree. L is the average shortest path. $\langle Cc \rangle$ is the average clustering coefficient. $ks_{max}$ is the maximum k-shell value, and C is the number of communities identified in the graph.

Network | V | E | $\langle k \rangle$ | L | $\langle Cc \rangle$ | $ks_{max}$ | C |
---|---|---|---|---|---|---|---|
USAir [26] | 332 | 2,126 | 12.807 | 2.738 | 0.749 | 26 | 7 |
NetSci [27] | 379 | 914 | 4.82 | 6.042 | 0.798 | 8 | 18 |
Email-EU core [28] | 1,005 | 25,571 | 50.88 | 2.586 | 0.399 | 56 | 41 |
PGP [29] | 10,680 | 24,316 | 4.55 | 7.463 | 0.440 | 31 | 117 |
CondMat [30] | 23,133 | 93,497 | 8.51 | 5.352 | 0.633 | 25 | 62 |
Email-EU [28] | 265,214 | 420,045 | 3.03 | 4.118 | 0.486 | 39 | 115 |
Amazon [31] | 262,111 | 1,234,877 | 9.42 | 8.831 | 0.420 | 10 | 214 |
DBLP [32] | 317,080 | 1,049,866 | 6.62 | 6.792 | 0.632 | 113 | 573 |
YouTube [32] | 1,134,890 | 2,987,624 | 5.27 | 5.279 | 0.080 | 51 | 9,635 |

**Table 2.** Comparison between the final spreading scope achieved by the simple benchmark methods and by their versions combined with our proposed scattering method (marked with ⋆), which distributes the relevant nodes according to the partitioning induced by a community scheme. The highest values for each graph are shown in bold. (+) means a higher value than the simple method; (−) means a lower value than the simple method. Acronyms are defined in Section 4.1.

Method | USAir | NetSci | Email-Core | PGP | CondMat | Email-All | Amazon | DBLP |
---|---|---|---|---|---|---|---|---|
BET | 155.04 | 66.75 | 756.28 | 1401.24 | 7221.25 | 36,498.84 | 19,928.94 | 72,673.86 |
BET⋆ | 154.19 − | 70.28 + | 775.36 + | 1526.5 + | 7311.28 + | 36,852.86 + | 21,368.11 + | 73,896.33 + |
CLO | 153.75 | 53.10 | 755.08 | 1203.53 | 7221.54 | 35,627.50 | 18,996.47 | 72,647.98 |
CLO⋆ | 153.87 + | 72.76 + | 774 + | 1533.98 + | 7283.41 + | 36,825.19 + | 21,317.18 + | 73,863.33 + |
DEG | 151.42 | 70.38 | 755.76 | 1242.64 | 7205.79 | 36,464.72 | 20,892.47 | 72,679.94 |
DEG⋆ | 154.78 + | 74.21 + | 774.91 + | **1565.75** + | 7307.24 + | 36,863.84 + | 21,825.99 + | 74,039.21 + |
HC | 152.47 | 50.74 | 755.74 | 1157.59 | 7205.75 | 36,500.04 | 21,131.07 | 72,650.26 |
HC⋆ | **157.49** + | 71.79 + | 775.35 + | 1562.12 + | 7278.66 + | 36,707.82 + | 19,646.13 − | 73,661.48 + |
VR | 153.28 | 76.04 | 755.63 | 1494.00 | 7224.53 | 36,581.93 | 21,176.56 | 72,753.74 |
VR⋆ | 153.66 + | 76.06 + | 775.69 + | 1557.41 + | 7284.29 + | 36,672.45 + | 19,709.25 − | 73,621.61 + |
SC | 151.63 | 41.13 | 757.19 | 1147.16 | 7221.50 | 35,647.38 | 19,173.56 | 72,715.13 |
SC⋆ | 157.22 + | 55.81 + | 772.16 + | 1529.56 + | 7333.97 + | 36,750.59 + | 21,600.94 + | 73,680.69 + |
IKS | 150.78 | 63.94 | 757.34 | 1392.66 | 7264.97 | 36,811.25 | 20,041.84 | 73,038.16 |
IKS⋆ | 153.72 + | 71.13 + | 776.06 + | 1504.31 + | **7336.69** + | 36,845.56 + | 18,993.88 − | 74,100.79 + |
TSA | 153.97 | 51.91 | 758.44 | 1154.72 | 7197.31 | 35,597.56 | 18,806.53 | 72,612.16 |
TSA⋆ | 154.47 + | 61.09 + | 776.5 + | 1427.03 + | 7270.59 + | 36,744.88 + | 19,728.44 + | 73,735.09 + |
PBSI | 156.10 | **76.40** | **776.85** | 1556.82 | 7316.78 | **36,868.83** | **21,857.31** | **74,277.69** |

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yanez-Sierra, J.; Diaz-Perez, A.; Sosa-Sosa, V. An Efficient Partition-Based Approach to Identify and Scatter Multiple Relevant Spreaders in Complex Networks. *Entropy* **2021**, *23*, 1216.
https://doi.org/10.3390/e23091216
