# Weighted h-index for Identifying Influential Spreaders

## Abstract

## 1. Introduction

## 2. Methods

#### 2.1. Measures

#### 2.2. Single Seed SIR Model

#### 2.3. Evaluation Methods

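The **Kendall ${\tau}_{b}$ correlation coefficient** is adopted to measure the consistency between two rankings. Given ${R}_{\mu}$, the rank vector of a measure $\mu $, and ${R}_{SIR}$, that of the single seed SIR model, the Kendall ${\tau}_{b}$ correlation coefficient is defined as

$${\tau}_{b}=\frac{{n}_{c}-{n}_{d}}{\sqrt{\left({n}_{0}-{n}_{1}\right)\left({n}_{0}-{n}_{2}\right)}},$$

where ${n}_{c}$ and ${n}_{d}$ are the numbers of concordant and discordant node pairs, ${n}_{0}=N\left(N-1\right)/2$ is the total number of node pairs, and ${n}_{1}$ and ${n}_{2}$ are the numbers of pairs tied in ${R}_{\mu}$ and in ${R}_{SIR}$, respectively.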
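A minimal sketch of this consistency measure, assuming the standard tie-corrected form of ${\tau}_{b}$: the function counts concordant, discordant and tied pairs directly. The name `kendall_tau_b` and the toy scores are illustrative assumptions and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b between two equally long score (or rank) sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    n0 = n * (n - 1) // 2          # total number of node pairs
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1            # pair tied in the first ranking
        if dy == 0:
            tied_y += 1            # pair tied in the second ranking
        if dx * dy > 0:
            concordant += 1        # both rankings order the pair the same way
        elif dx * dy < 0:
            discordant += 1        # the rankings disagree on the pair
    denom = sqrt((n0 - tied_x) * (n0 - tied_y))
    if denom == 0:
        return float("nan")        # undefined when one ranking is constant
    return (concordant - discordant) / denom


# Toy data (hypothetical): a centrality score and a simulated spreading
# scope for four nodes.
measure_scores = [1, 2, 2, 3]
spreading_scope = [1, 2, 3, 3]
print(kendall_tau_b(measure_scores, spreading_scope))
```

The pairwise loop costs $O\left({N}^{2}\right)$, which is acceptable for a sketch. For networks of the size studied here, a library routine such as `scipy.stats.kendalltau`, which computes ${\tau}_{b}$ by default, is likely the more practical choice.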

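The **imprecision function** [10] is employed. Given a selection fraction $p\in \left[0,1\right]$, ${V}_{\mathrm{eff}}\left(p\right)$ is the set of the $pN$ most influential spreaders, i.e., the nodes with the largest spreading scope in the single seed SIR model, and ${V}_{\mu}\left(p\right)$ is the set of the $pN$ nodes with the highest value of measure $\mu $. Their average spreading scopes are denoted by ${S}_{\mathrm{eff}}\left(p\right)$ and ${S}_{\mu}\left(p\right)$, respectively. Then the imprecision function is defined as

$${\epsilon}_{\mu}\left(p\right)=1-\frac{{S}_{\mu}\left(p\right)}{{S}_{\mathrm{eff}}\left(p\right)},$$

so a smaller ${\epsilon}_{\mu}\left(p\right)$ indicates that measure $\mu $ identifies the most influential spreaders more accurately.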
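The imprecision function can be sketched as follows, assuming it takes the usual form of one minus the ratio ${S}_{\mu}\left(p\right)/{S}_{\mathrm{eff}}\left(p\right)$. The function name `imprecision`, the dictionary inputs, the rounding of $pN$ and the tie-breaking by insertion order are all assumptions made for illustration.

```python
def imprecision(measure, spread, p):
    """Imprecision of a measure at selection fraction p.

    measure: dict mapping node -> value of the centrality measure
    spread:  dict mapping node -> spreading scope from the SIR simulation
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    n = len(spread)
    # Assumption: pN is rounded to the nearest integer, with at least one node.
    k = max(1, round(p * n))
    # Ties are broken by insertion order because sorted() is stable.
    top_by_measure = sorted(measure, key=measure.get, reverse=True)[:k]
    top_by_spread = sorted(spread, key=spread.get, reverse=True)[:k]
    s_mu = sum(spread[v] for v in top_by_measure) / k    # S_mu(p)
    s_eff = sum(spread[v] for v in top_by_spread) / k    # S_eff(p)
    return 1 - s_mu / s_eff


# Toy data (hypothetical): four nodes with a spreading scope and a measure.
spread = {"a": 10, "b": 8, "c": 6, "d": 2}
measure = {"a": 3, "b": 1, "c": 2, "d": 0}
print(imprecision(measure, spread, 0.5))
```

In this toy case the measure selects nodes `a` and `c` while the best pair is `a` and `b`, so the result is positive. A measure that ranks nodes exactly as the simulation does would give zero.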

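The **monotonicity** of ranking vector ${R}_{\mu}$ is defined as [14]

$$M\left({R}_{\mu}\right)={\left(1-\frac{\sum _{r\in {R}_{\mu}}{N}_{r}\left({N}_{r}-1\right)}{N\left(N-1\right)}\right)}^{2},$$

where $N$ is the number of nodes and ${N}_{r}$ is the number of nodes sharing the same rank $r$.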
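A short sketch of the monotonicity computation, assuming the commonly used form in which the fraction of tied node pairs is subtracted from one and the result is squared. The function name `monotonicity` and the toy inputs are illustrative assumptions; nodes with equal measure values are treated as sharing a rank.

```python
from collections import Counter


def monotonicity(values):
    """Monotonicity of the ranking induced by a list of measure values."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two nodes are required")
    # Each group of c nodes with the same value contributes c * (c - 1)
    # ordered tied pairs.
    tied = sum(c * (c - 1) for c in Counter(values).values())
    return (1 - tied / (n * (n - 1))) ** 2


# Toy data (hypothetical): two of the four nodes share the same value.
print(monotonicity([1, 1, 2, 3]))
```

Under this form, a ranking with all values distinct gives 1 and a ranking in which every node shares one value gives 0.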

## 3. Results

#### 3.1. Accuracy

#### 3.2. Monotonicity

## 4. Discussion

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

**Figure 1.** The imprecision functions ${\epsilon}_{\mu}\left(p\right)$ as a function of node fraction $p\in \left[0.01,0.30\right]$ when $\beta =1.005{\beta}_{c}$ in the twelve real networks. The six measures are $k$ (purple), $B$ (blue), ${k}_{s}$ (cyan), $h$ (green), ${h}^{w}$ (red) and ${s}^{h}$ (black).

**Figure 2.** The imprecision functions ${\epsilon}_{\mu}\left(p\right)$ as a function of node fraction $p\in \left[0.01,0.30\right]$ when $\beta =1.5{\beta}_{c}$ in the twelve real networks. The six measures are $k$ (purple), $B$ (blue), ${k}_{s}$ (cyan), $h$ (green), ${h}^{w}$ (red) and ${s}^{h}$ (black).

**Figure 3.** Kendall ${\tau}_{b}$ correlation coefficient as a function of the infection probability $\beta $ for six real networks. The vertical dashed line shows the critical infection rate ${\beta}_{c}$. The six measures are $k$ (purple), $B$ (blue), ${k}_{s}$ (cyan), $h$ (green), ${h}^{w}$ (red) and ${s}^{h}$ (black).

**Figure 4.** Complementary cumulative distribution function (CCDF) of the rankings produced by six different measures. The six measures are $k$ (purple), $B$ (blue), ${k}_{s}$ (cyan), $h$ (green), ${h}^{w}$ (red) and ${s}^{h}$ (black).

**Table 1.** Properties of the twelve real networks. $N$: the number of nodes, $M$: the number of edges, ${\beta}_{c}$: the critical infection rate for the single seed SIR model, $\langle k\rangle $: the average degree, ${k}_{max}$: the maximum degree, and ${k}_{s,max}$: the maximum k-shell index.

| Network | $N$ | $M$ | ${\beta}_{c}$ | $\langle k\rangle $ | ${k}_{max}$ | ${k}_{s,max}$ |
|---|---|---|---|---|---|---|
| Power Grid | 4941 | 6594 | 0.26 | 2.6691 | 19 | 5 |
| AS | 3015 | 5156 | 0.01 | 3.4202 | 590 | 9 |
| Gnutella06 | 8717 | 31,525 | 0.07 | 7.2330 | 115 | 9 |
| Gnutella08 | 6301 | 20,777 | 0.06 | 6.5948 | 97 | 10 |
| C. elegans | 453 | 2025 | 0.02 | 8.9404 | 237 | 10 |
|  | 1133 | 5451 | 0.05 | 9.6222 | 71 | 11 |
| PGP | 10,680 | 24,316 | 0.05 | 4.5536 | 205 | 31 |
|  | 4039 | 88,234 | 0.01 | 43.6910 | 1045 | 115 |
| Hamster | 2426 | 16,630 | 0.02 | 13.7098 | 273 | 24 |
| CondMat | 23,133 | 93,497 | 0.05 | 8.0830 | 279 | 25 |
| NetSci | 379 | 914 | 0.12 | 4.8232 | 34 | 9 |
| Protein | 1870 | 2203 | 0.15 | 2.3562 | 56 | 5 |

**Table 2.** Kendall ${\tau}_{b}$ correlation coefficient for six measures in twelve real networks. In the single seed SIR model, the infection probability $\beta $ is set slightly above ${\beta}_{c}$, i.e., $\beta =1.005{\beta}_{c}$. The largest ${\tau}_{b}$ in each row is marked in boldface.

| Network | ${\tau}_{b}\left(k\right)$ | ${\tau}_{b}\left(B\right)$ | ${\tau}_{b}\left({k}_{s}\right)$ | ${\tau}_{b}\left(h\right)$ | ${\tau}_{b}\left({h}^{w}\right)$ | ${\tau}_{b}\left({s}^{h}\right)$ |
|---|---|---|---|---|---|---|
| Power Grid | 0.6020 | 0.4238 | 0.5142 | 0.6177 | 0.7466 | **0.8060** |
| AS | 0.4478 | 0.2896 | 0.4540 | 0.4522 | 0.3999 | **0.5023** |
| Gnutella06 | 0.6715 | 0.6393 | 0.6811 | 0.6940 | 0.7206 | **0.7578** |
| Gnutella08 | 0.6549 | 0.5987 | 0.6887 | 0.6913 | 0.7139 | **0.7527** |
| C. elegans | 0.5729 | 0.4361 | 0.5969 | 0.5820 | 0.5868 | **0.6289** |
|  | 0.7222 | 0.5862 | 0.7486 | 0.7483 | 0.7694 | **0.7868** |
| PGP | 0.6027 | 0.4160 | 0.5707 | 0.6051 | 0.6481 | **0.6566** |
|  | 0.6818 | 0.4491 | 0.7135 | 0.7074 | 0.7320 | **0.7575** |
| Hamster | 0.7477 | 0.5773 | 0.7378 | 0.7523 | **0.8390** | 0.8383 |
| CondMat | 0.6158 | 0.3884 | 0.6337 | 0.6432 | 0.7312 | **0.7564** |
| NetSci | 0.6391 | 0.4071 | 0.5830 | 0.6499 | 0.8256 | **0.8592** |
| Protein | 0.5642 | 0.5227 | 0.5598 | 0.5835 | 0.7690 | **0.8246** |

**Table 3.** Kendall ${\tau}_{b}$ correlation coefficient for six measures in twelve real networks. In the single seed SIR model, the infection probability is $\beta =1.5{\beta}_{c}$. The largest ${\tau}_{b}$ in each row is marked in boldface.

| Network | ${\tau}_{b}\left(k\right)$ | ${\tau}_{b}\left(B\right)$ | ${\tau}_{b}\left({k}_{s}\right)$ | ${\tau}_{b}\left(h\right)$ | ${\tau}_{b}\left({h}^{w}\right)$ | ${\tau}_{b}\left({s}^{h}\right)$ |
|---|---|---|---|---|---|---|
| Power Grid | 0.4241 | 0.2921 | 0.3987 | 0.4646 | 0.6206 | **0.6893** |
| AS | 0.4148 | 0.2409 | 0.4412 | 0.4237 | 0.5091 | **0.5927** |
| Gnutella06 | 0.8135 | 0.7626 | 0.8073 | 0.8438 | 0.8599 | **0.8645** |
| Gnutella08 | 0.7214 | 0.6597 | 0.7525 | 0.7627 | 0.7844 | **0.8254** |
| C. elegans | 0.5759 | 0.4137 | 0.6140 | 0.5867 | 0.6355 | **0.6842** |
|  | 0.7738 | 0.6171 | 0.7964 | 0.8050 | 0.8438 | **0.8601** |
| PGP | 0.5153 | 0.3500 | 0.5118 | 0.5287 | 0.6575 | **0.7099** |
|  | 0.6220 | 0.4251 | 0.6660 | 0.6526 | 0.7353 | **0.7875** |
| Hamster | 0.7151 | 0.5727 | 0.7110 | 0.7232 | 0.8484 | **0.8745** |
| CondMat | 0.6051 | 0.3942 | 0.6316 | 0.6422 | 0.7714 | **0.8152** |
| NetSci | 0.5335 | 0.3443 | 0.5019 | 0.5609 | 0.7747 | **0.8330** |
| Protein | 0.4718 | 0.4466 | 0.5147 | 0.5103 | 0.7452 | **0.8429** |

**Table 4.** The monotonicity $M$ of the node rankings produced by the six measures in the twelve real networks.

| Network | $M\left(k\right)$ | $M\left(B\right)$ | $M\left({k}_{s}\right)$ | $M\left(h\right)$ | $M\left({h}^{w}\right)$ | $M\left({s}^{h}\right)$ |
|---|---|---|---|---|---|---|
| Power Grid | 0.5927 | 0.8322 | 0.2460 | 0.4776 | 0.8523 | 0.9606 |
| AS | 0.4506 | 0.3728 | 0.3734 | 0.4336 | 0.9557 | 0.9803 |
| Gnutella06 | 0.8110 | 0.8990 | 0.5625 | 0.7945 | 0.9738 | 0.9986 |
| Gnutella08 | 0.7636 | 0.8511 | 0.5990 | 0.7575 | 0.9644 | 0.9979 |
| C. elegans | 0.7922 | 0.8743 | 0.6962 | 0.7599 | 0.9301 | 0.9961 |
|  | 0.8874 | 0.9400 | 0.8088 | 0.8661 | 0.9914 | 0.9996 |
| PGP | 0.6193 | 0.5099 | 0.4806 | 0.5836 | 0.9495 | 0.9920 |
|  | 0.9740 | 0.9855 | 0.9419 | 0.9674 | 0.9838 | 0.9998 |
| Hamster | 0.8980 | 0.7128 | 0.8714 | 0.8892 | 0.9796 | 0.9854 |
| CondMat | 0.8524 | 0.4506 | 0.7980 | 0.8268 | 0.9863 | 0.9974 |
| NetSci | 0.7642 | 0.3387 | 0.6421 | 0.6976 | 0.9472 | 0.9907 |
| Protein | 0.4264 | 0.4053 | 0.2534 | 0.3825 | 0.9084 | 0.9563 |

