
Symmetry 2019, 11(10), 1263; https://doi.org/10.3390/sym11101263

Article
Weighted h-index for Identifying Influential Spreaders
1 Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Beijing Jiaotong University, Beijing 100044, China
2 Beijing Institute of Science and Technology Information, Beijing 100048, China
3 National Science Library, Chinese Academy of Sciences, Beijing 100190, China
* Author to whom correspondence should be addressed.
Received: 12 September 2019 / Accepted: 8 October 2019 / Published: 10 October 2019

Abstract

In this paper, we propose the weighted h-index $h^w$ and the h-index strength $s^h$ to measure spreading capability and identify the most influential spreaders. Experimental results on twelve real networks reveal that $s^h$ is more accurate and more monotonic than $h^w$ and four previous measures in ranking the spreading influence of a node, as evaluated by the single seed SIR spreading model. We point out that how to improve monotonicity and how to determine a proper neighborhood range are two interesting future directions.
Keywords:
weighted h-index; influential spreaders; SIR model; complex networks

1. Introduction

Many real-world spreading phenomena, such as cascading failures [1], rumor diffusion [2], and viral advertising [3], can be described as spreading processes on complex networks [4,5]. Understanding the role that a single node plays provides insights into network structure [6,7,8,9] and function [10,11]. The problem of identifying influential spreaders, that is, how to identify and rank the most efficient spreaders in a complex network, has attracted much attention [12].
Degree, the most straightforward indicator, counts the number of links per node and is an often-used measure of the influence of a node in the spreading process. Early studies assumed that the node with the maximal degree was the most influential in a network [13,14]. There are also many traditional metrics, such as betweenness [15], closeness [17], and Katz centrality [16], which perform well in distinguishing nodes of different influence. However, their computational complexity is unacceptable for large-scale networks. Recently, Kitsak et al. [18] found that the most influential spreaders are likely to reside in the core part of a network. However, coreness is a metric based on k-shell decomposition, which assigns many nodes to the same shell, even though the nodes in the same shell actually have different spreading abilities [19]. The method in Ref. [18] was therefore extended to distinguish the spreading abilities of nodes in the same shell [19,20,21,22,23,24,25,26,27,28,29,30,31]. For example, Zeng et al. proposed a mixed degree decomposition (MDD) method that considers both the residual degree and the exhausted degree [20], but the optimal value of its parameter $\lambda$ is uncertain. Liu et al. took into account the shortest distance from a target node to the node with the highest coreness and obtained a more distinguishable identification of influential spreaders [21]. Bae and Kim proposed the neighborhood coreness centrality, which sums the coreness of a node's neighbors [22]. Wang et al. utilized the iteration information in the k-shell decomposition process to differentiate the spreading influence of nodes in the same shell [23]. Xu et al. designed an iterative neighbor information gathering (ING) process to rank node influence [24]. Other measures [25,26,27,28,29,30,31,32,33], such as the information index [32] and subgraph centrality [33], also perform well in finding important nodes.
These above-mentioned measures are structural centralities which measure the importance of a node based mainly on the topological structure of a network [10].
In this paper, we argue that edges can be quite different [34] and have different significances when defining the spreading influence of a node. For example, degree and betweenness have good performance for unweighted networks, and they have been extended to weighted networks [29,35,36]. Evidence theory has been employed to identify influential nodes in weighted networks [37]. The different importance of the direction of a link in spreading was taken into account by considering weighted neighborhood centrality [38] and asymmetric link weights [39].
Recently, Lü et al. showed that the h-index is a good tradeoff between degree and coreness when measuring a node's influence, although it is not the overall best performer when compared with coreness [40]. Recent work [41] found that these three fundamental measures (i.e., degree, h-index, and coreness) perform well in identifying influential nodes. Inspired by these findings, we propose a weighted h-index ($h^w$) by applying an operator $H$ to weighted edges. Furthermore, the sum of the weighted h-index over a node's neighborhood, the h-index strength ($s^h$), defines the node's spreading influence. To evaluate the effectiveness of the proposed measures, we apply the susceptible-infected-recovered (SIR) model to simulate an epidemic spreading process on twelve real-world networks.
The remainder of this paper is organized as follows. Section 2 reviews several centralities, describes our methods in detail, and introduces the SIR model and the evaluation methods. In Section 3, twelve real networks are employed to test the accuracy and monotonicity of our methods. Finally, Section 4 summarizes our conclusions.

2. Methods

Given a network $G(V, E)$ with $N = |V|$ nodes and $M = |E|$ edges, $e_{ij}$ represents the edge linking node $i$ and node $j$, the weight of $e_{ij}$ is $w_{ij} = k_i k_j$ (with $k_i$ the degree of node $i$, defined in Section 2.1), and $\Gamma_i$ denotes the set of neighbors of node $i$.

2.1. Measures

The degree of node $i$ is defined as $k_i = |\Gamma_i|$, where $|\cdot|$ indicates the number of elements in a set.
The betweenness of node $i$ is defined as
$$B_i = \sum_{j \in V,\, k \in V,\, j \neq k} \frac{n_{jk}(i)}{n_{jk}},$$
where $n_{jk}$ is the number of shortest paths connecting node $j$ and node $k$, while $n_{jk}(i)$ is the number of those shortest paths that pass through node $i$.
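As an illustration of the definition above, the following minimal Python sketch computes $B_i$ by brute force on a small unweighted, undirected graph stored as an adjacency dictionary. The function names, the toy graph, and the choice to skip pairs in which $i$ is an endpoint are assumptions made for this example. It is not the implementation used in the experiments, and faster algorithms are preferable for large networks.

```python
from collections import deque


def bfs_path_counts(adjacency, source):
    """Return (dist, sigma): BFS distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adjacency[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                sigma[u] = 0
                queue.append(u)
            if dist[u] == dist[v] + 1:
                # Every shortest path to v extends to a shortest path to u.
                sigma[u] += sigma[v]
    return dist, sigma


def betweenness(adjacency, i):
    """Sum of n_jk(i) / n_jk over ordered pairs (j, k), j != k, both different from i."""
    info = {v: bfs_path_counts(adjacency, v) for v in adjacency}
    dist_i, sigma_i = info[i]
    total = 0.0
    for j in adjacency:
        if j == i:
            continue
        dist_j, sigma_j = info[j]
        for k in adjacency:
            if k == i or k == j or k not in dist_j:
                continue
            # Node i lies on a shortest j-k path iff d(j,i) + d(i,k) = d(j,k).
            if i in dist_j and k in dist_i and dist_j[i] + dist_i[k] == dist_j[k]:
                total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print({v: betweenness(toy_graph, v) for v in toy_graph})
```

Because the sum runs over ordered pairs, every unordered pair is counted twice. In the toy graph only node c lies between other nodes (on the paths from a and b to d).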
The k-shell index ($k_s$) is obtained by k-shell decomposition. During k-shell decomposition, each node is assigned to a shell with a specific $k_s$ index.
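The text does not spell out the decomposition procedure. The sketch below follows the standard pruning scheme (repeatedly remove all nodes whose remaining degree is at most $k$, then increase $k$), which we assume is what is meant here. The function name and the toy graph are hypothetical.

```python
def k_shell(adjacency):
    """Return a dict mapping each node to its k-shell index (standard pruning)."""
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    k = 0
    while remaining:
        # Move to the smallest remaining degree; k never decreases.
        k = max(k, min(degree[v] for v in remaining))
        queue = [v for v in remaining if degree[v] <= k]
        while queue:
            v = queue.pop()
            if v not in remaining:
                continue  # already removed via an earlier queue entry
            shell[v] = k
            remaining.discard(v)
            for u in adjacency[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        queue.append(u)
    return shell


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(k_shell(toy_graph))
```

Here the pendant node d is peeled off in the first shell, while the triangle survives into the second shell, so a, b, and c all share $k_s = 2$ despite c having a larger degree. This is the kind of tie that motivates finer measures.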
The h-index of node $i$ is defined as [40]
$$h_i = H(k_{j_1}, k_{j_2}, \ldots, k_{j_{k_i}}),$$
where $j_1, j_2, \ldots, j_{k_i}$ are the neighbors of node $i$ and $H(\cdot)$ is an operator that returns the maximum integer $h$ such that at least $h$ neighbors have a degree no less than $h$.
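The operator $H$ can be sketched in a few lines of Python. The function names and the toy graph below are hypothetical and serve only to illustrate the definition.

```python
def h_operator(values):
    """Largest integer h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def node_h_index(adjacency, node):
    """h-index of a node: H applied to the degrees of its neighbors."""
    return h_operator(len(adjacency[nbr]) for nbr in adjacency[node])


# Hypothetical toy graph: hub a with leaves b and c, plus a short tail a-d-e.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
print({v: node_h_index(toy_graph, v) for v in toy_graph})
```

For node a, the neighbor degrees are (2, 1, 1), so only one neighbor has degree at least 1 and none reaches a second threshold of 2 neighbors with degree at least 2, giving $h_a = 1$.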
To reflect the spreading influence of a node in a network, we design a new measure called the weighted h-index ($h^w$). Instead of the neighbor degrees used in Ref. [40], the operator $H$ is applied to the weights of virtual edges incident with node $i$. Each neighbor $j$ of node $i$ is cloned $k_j$ times. Each cloned neighbor $j_c$ has a virtual edge $e_{ij_c}$ whose weight is $w_{ij_c} = w_{ij}$. The weights of all virtual edges incident with node $i$ are arranged in descending order to form the edge weight sequence of node $i$. This means that $w_{ij}$, the weight of the original edge $e_{ij}$, shows up $k_j$ times in this sequence. Then, the weighted h-index is
$$h_i^w = H(w_{ij_1,1}, \ldots, w_{ij_1,k_{j_1}}, w_{ij_2,1}, \ldots, w_{ij_2,k_{j_2}}, \ldots, w_{ij_{k_i},1}, \ldots, w_{ij_{k_i},k_{j_{k_i}}}).$$
Since the strength of a node [35] is often adopted in identifying influential nodes [14,39,42], we define the h-index strength of node $i$ as
$$s_i^h = \sum_{j \in \Gamma_i} h_j^w.$$
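The two definitions can be made concrete with a minimal Python sketch, assuming an unweighted, undirected graph given as an adjacency dictionary and the edge weight $w_{ij} = k_i k_j$ from the beginning of this section. The function names and the toy graph are hypothetical, and this is an illustration rather than the code used for the experiments.

```python
def h_operator(values):
    """Largest integer h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def weighted_h_index(adjacency, i):
    """h^w of node i: H over edge weights, each w_ij repeated k_j times."""
    k_i = len(adjacency[i])
    weights = []
    for j in adjacency[i]:
        k_j = len(adjacency[j])
        # Cloning neighbor j k_j times puts w_ij = k_i * k_j into the sequence k_j times.
        weights.extend([k_i * k_j] * k_j)
    return h_operator(weights)


def h_index_strength(adjacency, i):
    """s^h of node i: sum of h^w over the neighbors of i."""
    return sum(weighted_h_index(adjacency, j) for j in adjacency[i])


# Hypothetical toy graph: hub a with leaves b and c, plus a short tail a-d-e.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
for v in sorted(toy_graph):
    print(v, weighted_h_index(toy_graph, v), h_index_strength(toy_graph, v))
```

In this toy graph, nodes a, b, c, and d all obtain $h^w = 3$, whereas $s^h$ separates the hub a (9) and node d (5) from the leaves (3), which illustrates why summing over the neighborhood can break ties.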

2.2. Single Seed SIR Model

We employed the single seed SIR model to investigate the spreading process on complex networks. Initially, all nodes in a network are in the susceptible state (S) except for the seed node, which is in the infectious state (I). At each time step, each node in the I state infects each of its neighbors in the S state with probability $\beta$, and then changes from the infectious state (I) to the recovered state (R), which means that it cannot be infected again. The spreading process stops when no node in the network is in state I. The number of nodes in the recovered state reflects the final infected scope, and this is adopted to measure the infection strength of the seed node.
The higher the infection probability $\beta$, the larger the population that will be infected, wherever the seed is located in the network. According to a previous study [43], the critical infection probability $\beta_c$ of a network approximately equals $\langle k \rangle / \langle k^2 \rangle$, where $\langle k \rangle$ and $\langle k^2 \rangle$ are the first and second moments of the degree distribution. In this study, the infection probability $\beta$ is set larger than $\beta_c$.
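The threshold estimate is a one-line computation from the degree sequence. The sketch below assumes an adjacency-dictionary representation; the function name and the toy graph are hypothetical.

```python
def critical_infection_probability(adjacency):
    """Estimate beta_c as <k> / <k^2> from the degree sequence."""
    degrees = [len(nbrs) for nbrs in adjacency.values()]
    mean_k = sum(degrees) / len(degrees)
    mean_k2 = sum(k * k for k in degrees) / len(degrees)
    return mean_k / mean_k2


# Hypothetical toy graph with degree sequence (3, 1, 1, 2, 1).
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
beta_c = critical_infection_probability(toy_graph)
print(beta_c, 1.5 * beta_c)
```

For the toy degree sequence, $\langle k \rangle = 8/5$ and $\langle k^2 \rangle = 16/5$, so the estimate is 0.5. Heterogeneous networks with large hubs have a large $\langle k^2 \rangle$ and therefore a small threshold.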
When node $i$ is the seed for the single seed SIR model, its spreading scope ($S_i$) is quantified by the average number of recovered nodes over 200 independent simulations.
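A minimal simulation sketch of this procedure is given below, assuming synchronous updates and an adjacency-dictionary graph. The function names, the default random seed, and the toy graph are assumptions for illustration; the seed node itself is counted among the recovered nodes.

```python
import random


def sir_spreading_scope(adjacency, seed, beta, rng):
    """Run one single seed SIR realization and return the number of recovered nodes."""
    infected = {seed}
    recovered = set()
    while infected:  # stop when no node is in state I
        newly_infected = set()
        for node in infected:
            for nbr in adjacency[node]:
                susceptible = (
                    nbr not in infected
                    and nbr not in recovered
                    and nbr not in newly_infected
                )
                if susceptible and rng.random() < beta:
                    newly_infected.add(nbr)
        recovered |= infected  # infectious nodes recover after one time step
        infected = newly_infected
    return len(recovered)


def average_spreading_scope(adjacency, seed, beta, runs=200, rng_seed=0):
    """Average number of recovered nodes over independent realizations."""
    rng = random.Random(rng_seed)
    total = sum(sir_spreading_scope(adjacency, seed, beta, rng) for _ in range(runs))
    return total / runs


# Hypothetical toy graph: hub a with leaves b and c, plus a short tail a-d-e.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
print({v: average_spreading_scope(toy_graph, v, beta=0.6) for v in sorted(toy_graph)})
```

Two limiting cases give a quick sanity check: with $\beta = 0$ only the seed ever recovers, and with $\beta = 1$ the whole connected component of the seed is reached.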

2.3. Evaluation Methods

The Kendall $\tau_b$ correlation coefficient is adopted to measure the consistency between two rankings. Given $R_\mu$, the rank vector of a measure $\mu$, and $R_{SIR}$, that of the single seed SIR model, the Kendall $\tau_b$ correlation coefficient is defined as
$$\tau_b(R_\mu, R_{SIR}) = \frac{n_c - n_d}{\sqrt{(n_0 - n_1)(n_0 - n_2)}},$$
where $n_c$ is the number of concordant pairs, $n_d$ is the number of discordant pairs, $n_0 = n(n-1)/2$, $n_1 = \sum_i t_i(t_i - 1)/2$, and $n_2 = \sum_j u_j(u_j - 1)/2$. Here $n$ is the size of the rank vectors, and $t_i$ and $u_j$ are the numbers of tied values in the $i$th group of ties in $R_\mu$ and the $j$th group of ties in $R_{SIR}$, respectively. Since all measures are evaluated against $R_{SIR}$, $\tau_b(R_\mu, R_{SIR})$ is denoted by $\tau_b(\mu)$ for short.
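The formula can be checked with a direct, quadratic-time Python sketch. The function name is hypothetical, and the pairwise loop is chosen for clarity rather than speed, so it is only practical for small vectors.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b between two equally long score (or rank) vectors."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means the pair is tied in x or y and counts as neither.
    n0 = n * (n - 1) / 2
    n1 = sum(t * (t - 1) / 2 for t in Counter(x).values())
    n2 = sum(u * (u - 1) / 2 for u in Counter(y).values())
    # Undefined (division by zero) if either vector consists of one tied group.
    return (concordant - discordant) / sqrt((n0 - n1) * (n0 - n2))


print(kendall_tau_b([1, 1, 2, 3], [1, 2, 3, 4]))
```

In the printed example, the tie in the first vector removes one pair from the numerator (5 concordant, 0 discordant) and shrinks the denominator to $\sqrt{5 \cdot 6}$, so the coefficient is $5/\sqrt{30}$, slightly below 1.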
To quantify the accuracy of a measure, the imprecision function [10] is employed. Given a selection fraction $p \in [0, 1]$, $V_{\mathrm{eff}}(p)$ is the top $p$ fraction of the most influential spreaders (the $pN$ nodes with the largest spreading scope), and $V_\mu(p)$ is the $pN$ nodes with the highest value of measure $\mu$. Their average spreading scopes are denoted by $S_{\mathrm{eff}}(p)$ and $S_\mu(p)$, respectively. Then the imprecision function is defined as
$$\epsilon_\mu(p) = 1 - \frac{S_\mu(p)}{S_{\mathrm{eff}}(p)}.$$
A smaller $\epsilon_\mu$ means that the corresponding measure $\mu$ identifies the most influential spreaders more accurately. The measure $\mu$, as discussed in this work, could be $k$, $B$, $k_s$, $h$, $h^w$, or $s^h$.
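A short Python sketch of the imprecision function follows. The function name, the rounding of $pN$ to at least one node, the tie-breaking by input order, and the numbers in the example are all assumptions made for illustration, not values from the experiments.

```python
def imprecision(measure, scope, p):
    """epsilon_mu(p) = 1 - S_mu(p) / S_eff(p) for dicts keyed by node."""
    nodes = list(scope)
    top = max(1, round(p * len(nodes)))  # assumed handling of non-integer pN
    # sorted() is stable, so ties keep the input order (an arbitrary choice).
    by_measure = sorted(nodes, key=lambda v: measure[v], reverse=True)[:top]
    by_scope = sorted(nodes, key=lambda v: scope[v], reverse=True)[:top]
    s_mu = sum(scope[v] for v in by_measure) / top
    s_eff = sum(scope[v] for v in by_scope) / top
    return 1 - s_mu / s_eff


# Made-up spreading scopes and two made-up measures for four nodes.
scope = {"a": 10.0, "b": 8.0, "c": 4.0, "d": 2.0}
good_measure = {"a": 4, "b": 3, "c": 2, "d": 1}
poor_measure = {"a": 1, "b": 2, "c": 3, "d": 4}
print(imprecision(good_measure, scope, 0.5), imprecision(poor_measure, scope, 0.5))
```

A measure that selects exactly the best spreaders has zero imprecision, while the reversed measure selects c and d, whose mean scope of 3 is one third of the optimal 9, giving an imprecision of 2/3.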
The monotonicity of ranking vector $R_\mu$ is defined as [14]
$$M(R_\mu) = \left(1 - \frac{\sum_{r \in R_\mu} N(r)(N(r) - 1)}{N(N - 1)}\right)^2,$$
where $N$ is the size of ranking vector $R_\mu$, which is equal to the number of nodes in a network in this paper, and $N(r)$ is the number of nodes with the same rank $r$ in $R_\mu$. If every node is given a distinct rank, then $M(R_\mu) = 1$: $R_\mu$ is a completely monotonic ranking, and each node can be differentiated from all others. When $M(R_\mu) = 0$, all nodes have the same rank, and ranking nodes by the measure $\mu$ cannot distinguish nodes at all.
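The monotonicity can be computed directly from the multiset of measure values, since nodes with equal values share a rank. The function name and the example vectors below are hypothetical.

```python
from collections import Counter


def monotonicity(values):
    """M(R) = (1 - sum_r N(r)(N(r) - 1) / (N(N - 1)))^2 for a list of node scores."""
    n = len(values)  # requires n >= 2
    tied = sum(c * (c - 1) for c in Counter(values).values())
    return (1 - tied / (n * (n - 1))) ** 2


print(monotonicity([1, 2, 3, 4]))     # all ranks distinct
print(monotonicity([5, 5, 5, 5]))     # all nodes tied
print(monotonicity([3, 3, 3, 3, 2]))  # four of five nodes tied
```

The last example has one tied group of four nodes, so the fraction of tied ordered pairs is 12/20 and $M = 0.4^2 = 0.16$. Splitting that group, for instance into scores (9, 3, 3, 5, 3), raises $M$ to $0.7^2 = 0.49$.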

3. Results

Using the ranking of nodes by their spreading scope under the single seed SIR model as the benchmark, we evaluated the accuracy and the monotonicity of ranking nodes by degree ($k$), node betweenness ($B$), k-shell index ($k_s$), h-index ($h$), weighted h-index ($h^w$) and h-index strength ($s^h$) on twelve real networks. The twelve real networks were one power grid network (Power Grid [44]), one computer network at the autonomous systems level (AS), two file-sharing networks (Gnutella06, Gnutella08), one metabolic network (C. elegans [45]), one email communication network (Email [46]), three social networks (PGP [47], Facebook [48] and Hamster [49]), two collaboration networks (CondMat [50] and NetSci [51]) and one protein–protein interaction network (Protein [52]). Their primary features are summarized in Table 1.

3.1. Accuracy

Table 2 shows that, in terms of the consistency between the ranking by each measure and the averaged spreading scope, $s^h$ obtained the highest score in eleven networks and $h^w$ in one (Hamster). Compared to the four previous measures, $s^h$ and $h^w$ were better in twelve and ten real networks, respectively. Table 2 thus suggests that $s^h$ is a better measure than $h^w$ in all networks except Hamster. Similar results can be found in Table 3 for $\beta = 1.5\beta_c$, where $s^h$ achieves the highest score in all twelve networks.
The imprecision functions of the rankings by the six measures are shown in Figure 1. The imprecision of $s^h$ (black) and $h^w$ (red) is less than 0.1 for all $p \in [0.01, 0.3]$ in all cases. Moreover, the imprecision of $s^h$ is even less than 0.05 in all networks except Power Grid. Furthermore, $s^h$ is always the most accurate of the six measures in identifying the influential nodes in a network. Similar results are shown in Figure 2 (where $\beta = 1.5\beta_c$), which further supports that $s^h$ is the most accurate measure in identifying the influential spreaders.
To investigate the robustness of the accuracy of the rankings based on the proposed measures, we show the Kendall $\tau_b$ correlation coefficient as a function of the infection probability $\beta$ on six selected real networks in Figure 3. When the infection probability $\beta$ is around $\beta_c$, $s^h$ and $h^w$ are markedly more accurate than the other measures. For the Power Grid, NetSci and Protein networks, $s^h$ is more accurate than $h^w$ over the whole range of $\beta$, and both are more accurate than the other four measures. When $\beta < \beta_c$, the spreading is typically confined to the neighborhood of the initially infected seed node. Since a seed node with a larger degree has a larger neighborhood, it infects more nodes than seed nodes with a smaller degree. This is why degree ($k$) achieves the largest $\tau_b$ values when $\beta < \beta_c$ for C. elegans and Email. As $\beta$ increases, $s^h$ and $h^w$ gradually perform better. The results in Figure 3 indicate that the accuracy of $s^h$ in identifying the influential spreaders in a network is robust to the choice of $\beta$.

3.2. Monotonicity

Table 4 shows the monotonicity of the rankings based on the six measures. For all the networks, ranking nodes based on $s^h$ achieves the best performance. $h^w$ achieves the second best performance in every network except Facebook, where $B$ (0.9855) is slightly more monotonic than $h^w$ (0.9838). Overall, both $h^w$ and $s^h$ are more competitive than $k$, $B$, $k_s$, and $h$ in terms of monotonicity at the global scale of a network.
To depict the monotonicity at the local scale of a network, the distribution of nodes over ranks is plotted as a complementary cumulative distribution function (CCDF) in Figure 4. The more monotonic a measure is, the more slowly its CCDF decreases, and the better it distinguishes influential spreaders (nodes). In all twelve real networks, the CCDFs of $k$, $k_s$, and $h$ drop quickly at the beginning (left side), which means that they are poor at distinguishing nodes from each other based on the spreading influence. $s^h$ decreases more slowly than $h^w$, and both decrease more slowly than $k$, $k_s$, and $h$ in all twelve real networks.
Notably, $B$ is competitive with $s^h$ and $h^w$. For five real networks (Power Grid, PGP, Gnutella06, Gnutella08, and Protein), $B$ decreases more slowly than $s^h$ and $h^w$, which means that $B$ shows better monotonicity than $s^h$ and $h^w$ at the local scale. For the other seven networks, $s^h$ has the best monotonicity. AS is a typical example (Figure 4b), where $B$ decreases more slowly than $s^h$ at the beginning. However, the black line ($s^h$) then crosses the blue line ($B$), which means that ranking nodes by $s^h$ separates nodes into more distinct ranks and indicates that $s^h$ has better monotonicity than $B$.

4. Discussion

Spreading, whether of epidemics or of information, is a ubiquitous process in social, biological, and technological networks. Identifying influential nodes in the spreading process, as one of the primary problems in network science, remains an open issue. In this study, we propose the weighted h-index $h^w$ and the h-index strength $s^h$ to identify the influential nodes in the spreading process on complex networks. To evaluate the accuracy and monotonicity of the proposed measures, the single seed SIR model was employed and simulated on twelve real networks. The results show that, among the six measures compared, the h-index strength $s^h$ was the best and the weighted h-index $h^w$ the second best at identifying the influential nodes in the single seed SIR spreading process.
Although the weighted h-index $h^w$ and the h-index strength $s^h$ perform better in most of the conditions, their insufficient monotonicity at the local scale of a network cannot be neglected. How to improve monotonicity at the local scale of a network requires further study. Since we only extended a weighted edge to virtual edges according to the degree of the incident neighbor node, how to determine a proper neighborhood range is another noteworthy topic for the future.
Note that the criteria for essential nodes are diverse. For example, a node may act as an articulation point, whose removal splits a network into two or more components, and yet be totally unimportant in the spreading process. Although the proposed measures perform well in identifying vital nodes for the single seed SIR spreading process, further experiments are necessary before applying our methods to other situations.

Author Contributions

Conceptualization, Z.G.; methodology, L.G., S.Y., M.L., Z.S. and Z.G.; investigation, L.G. and S.Y.; writing—original draft preparation, L.G. and S.Y.; writing—review and editing, M.L. and Z.S.; supervision, L.G.; funding acquisition, L.G., M.L. and Z.G.

Funding

This work was supported by the Fundamental Research Funds for the Central Universities (2015JBM058).

Acknowledgments

L.G. and S.Y. are partially supported by the National Natural Science Foundation of China (No. 71571017, No. 91646124, No. 71621001, and No. 91746201). M.L. is partially supported by the National Natural Science Foundation of China (No. 71974017). Z.G. is partially supported by the National Natural Science Foundation of China (No. 71621001).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Motter, A.E. Cascade control and defense in complex networks. Phys. Rev. Lett. 2004, 93, 098701.
  2. Moreno, Y.; Nekovee, M.; Pacheco, A.F. Dynamics of rumor spreading in complex networks. Phys. Rev. E 2004, 69, 066130.
  3. Leskovec, J.; Adamic, L.A.; Huberman, B.A. The dynamics of viral marketing. ACM Trans. Web 2007, 1, 5.
  4. Shen, Z.; Wang, W.; Fan, Y.; Di, Z.; Lai, Y.C. Reconstructing propagation networks with natural diversity and identifying hidden sources. Nat. Commun. 2014, 5, 4323.
  5. Shen, Z.; Cao, S.; Wang, W.; Di, Z.; Stanley, H.E. Locating the source of diffusion in complex networks by time-reversal backward spreading. Phys. Rev. E 2016, 93, 032301.
  6. Gao, L.; Gao, G.; Ma, D.; Xu, L. Coreness variation rule and fast updating algorithm for dynamic networks. Symmetry 2019, 11, 477.
  7. Gao, L.; Shan, X.; Qin, Y.; Yu, S.; Xu, L.; Gao, Z.Y. Scaling tunable network model to reproduce the density-driven superlinear relation. Chaos 2018, 28, 033122.
  8. Qin, Y.; Zhao, Z.; Cai, S.; Gao, L.; Stanley, H.E. Dual-induced multifractality in online viewing activity. Chaos 2018, 28, 013114.
  9. Song, J.; Gao, L.; Shan, X. Historical street network GIS datasets of Beijing within 5th ring-road. China Sci. Data 2017, 2, 114.
  10. Lv, L.Y.; Chen, D.B.; Ren, X.L.; Zhang, Q.M.; Zhang, Y.C.; Zhou, T. Vital nodes identification in complex networks. Phys. Rep. 2016, 650, 1–63.
  11. Zeng, G.; Li, D.; Guo, S.; Gao, L.; Gao, Z.Y.; Stanley, H.E.; Havlin, S. Switch between critical percolation modes in city traffic dynamics. Proc. Natl. Acad. Sci. USA 2019, 116, 23–28.
  12. Yu, S.; Gao, L.; Wang, Y.; Xu, L.; Gao, Z.Y. Identifying influential spreaders based on indirect spreading in neighborhood. Physica A 2019, 523, 418–425.
  13. Albert, R.; Jeong, H.; Barabási, A.-L. Error and attack tolerance of complex networks. Nature 2000, 406, 378–382.
  14. Pastor-Satorras, R.; Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 2001, 86, 3200–3203.
  15. Freeman, L.C. A set of measures of centrality based on betweenness. Sociometry 1977, 40, 35–41.
  16. Katz, L. A new status index derived from sociometric analysis. Psychometrika 1953, 18, 39–43.
  17. Sabidussi, G. The centrality index of a graph. Psychometrika 1966, 31, 581–603.
  18. Kitsak, M.; Gallos, L.K.; Havlin, S.; Liljeros, F.; Muchnik, L.; Stanley, H.E.; Makse, H.A. Identification of influential spreaders in complex networks. Nat. Phys. 2010, 6, 888–893.
  19. Chen, D.B.; Lu, L.Y.; Shang, M.S.; Zhang, Y.-C.; Zhou, T. Identifying influential nodes in complex networks. Physica A 2012, 391, 1777–1787.
  20. Zeng, A.; Zhang, C.J. Ranking spreaders by decomposing complex networks. Phys. Lett. A 2013, 377, 1031–1035.
  21. Liu, J.-G.; Ren, Z.-M.; Guo, Q. Ranking the spreading influence in complex networks. Physica A 2013, 392, 4154–4159.
  22. Bae, J.; Kim, S. Identifying and ranking influential spreaders in complex networks by neighborhood coreness. Physica A 2014, 395, 549–559.
  23. Wang, Y.; Zhao, J.K.; Xi, C.J.; Du, Z.X. Fast ranking influential nodes in complex networks using a k-shell iteration factor. Physica A 2016, 461, 171–181.
  24. Xu, S.; Wang, P.; Lü, J.H. Iterative neighbour-information gathering for ranking nodes in complex networks. Sci. Rep. 2017, 7, 41321.
  25. Liu, Y.; Tang, M.; Zhou, T.; Do, Y.H. Improving the accuracy of the k-shell method by removing redundant links: From a perspective of spreading dynamics. Sci. Rep. 2015, 5, 13172.
  26. Liu, Y.; Tang, M.; Zhou, T.; Do, Y. Core-like groups result in invalidation of identifying super-spreader by k-shell decomposition. Sci. Rep. 2015, 5, 9602.
  27. Wang, Z.X.; Du, C.J.; Fan, J.P.; Xing, Y. Ranking influential nodes in social networks based on node position and neighborhood. Neurocomputing 2017, 260, 466–477.
  28. Yang, F.; Zhang, R.S.; Yang, Z.; Hu, R.J.; Li, M.T.; Yuan, Y.N.; Li, K.Q. Identifying the most influential spreaders in complex networks by an Extended Local K-Shell Sum. Int. J. Mod. Phys. C 2017, 28, 1750014.
  29. Al-garadi, M.A.; Varathan, K.D.; Ravana, S.D. Identification of influential spreaders in online social networks using interaction weighted K-core decomposition method. Physica A 2017, 468, 278–288.
  30. Hou, B.N.; Yao, Y.P.; Liao, D.S. Identifying all-around nodes for spreading dynamics in complex networks. Physica A 2012, 391, 4012–4017.
  31. Ren, Z.M.; Liu, J.-G.; Shao, F.; Hu, Z.-L.; Guo, Q. Analysis of the spreading influence of the nodes with minimum K-shell value in complex networks. Acta. Phys. Sin. 2013, 62, 108902.
  32. Poulin, R.; Boily, M.-C.; Mâsse, B.R. Dynamical systems to define centrality in social networks. Soc. Netw. 2000, 22, 187–220.
  33. Estrada, E.; Rodríguez-Velázquez, J.A. Subgraph centrality in complex networks. Phys. Rev. E 2005, 71, 056103.
  34. Grady, D.; Thiemann, C.; Brockmann, D. Robust classification of salient links in complex networks. Nat. Commun. 2012, 3, 864.
  35. Opsahl, T.; Agneessens, F.; Skvoretz, J. Node centrality in weighted networks: Generalizing degree and shortest paths. Soc. Netw. 2010, 32, 245–251.
  36. Chu, X.W.; Zhang, Z.Z.; Guan, J.H.; Zhou, S.G. Epidemic spreading with nonlinear infectivity in weighted scale-free networks. Physica A 2011, 390, 471–481.
  37. Wei, D.J.; Deng, X.Y.; Zhang, X.G.; Deng, Y.; Mahadevan, S. Identifying influential nodes in weighted networks based on evidence theory. Physica A 2013, 392, 2564–2575.
  38. Wang, J.Y.; Hou, X.N.; Li, K.Z.; Ding, Y. A novel weight neighborhood centrality algorithm for identifying influential spreaders in complex networks. Physica A 2017, 475, 88–105.
  39. Liu, Y.; Tang, M.; Do, Y.; Hui, P.M. Accurate ranking of influential spreaders in networks based on dynamically asymmetric link weights. Phys. Rev. E 2017, 96, 022323.
  40. Lu, L.Y.; Zhou, T.; Zhang, Q.-M.; Stanley, H.E. The H-index of a network node and its relation to degree and coreness. Nat. Commun. 2016, 7, 10168.
  41. Yu, S.B.; Gao, L.; Wang, Y.-F. Finding the proper node ranking method for complex networks. arXiv 2018, arXiv:1812.10616.
  42. Ma, L.-L.; Ma, C.; Zhang, H.-F.; Wang, B.-H. Identifying influential spreaders in complex networks based on gravity formula. Physica A 2016, 451, 205–212.
  43. Castellano, C.; Pastor-Satorras, R. Thresholds for epidemic spreading in Networks. Phys. Rev. Lett. 2010, 105, 218701.
  44. Watts, D.J.; Strogatz, S.H. Collective dynamics of ‘small-world’ networks. Nature 1998, 393, 440–442.
  45. Duch, J.; Arenas, A. Community identification using extremal optimization. Phys. Rev. E 2005, 72, 027104.
  46. Guimerà, R.; Danon, L.; Díaz-Guilera, A.; Giralt, F.; Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 2003, 68, 065103.
  47. Boguñá, M.; Pastor-Satorras, R.; Díaz-Guilera, A.; Arenas, A. Models of social networks based on social distance attachment. Phys. Rev. E 2004, 70, 056122.
  48. Leskovec, J.; McAuley, J.J. Learning to discover social circles in ego networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 539–547.
  49. Kunegis, J. Hamsterster Full Network Dataset. KONECT. 2015. Available online: http://konect.uni-koblenz.de/networks/petster-hamster (accessed on 1 October 2018).
  50. Leskovec, J.; Kleinberg, J.; Faloutsos, C. Graph evolution: Densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 2007, 1, 1556–4681.
  51. Newman, M.E.J. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 2006, 74, 036104.
  52. Jeong, H.; Mason, S.P.; Barabási, A.-L.; Oltvai, Z.N. Lethality and centrality in protein networks. Nature 2001, 411, 41–42.
Figure 1. The imprecision function $\epsilon_\mu(p)$ as a function of the node fraction $p \in [0.01, 0.30]$ when $\beta = 1.005\beta_c$ in the twelve real networks. The six measures are $k$ (purple), $B$ (blue), $k_s$ (cyan), $h$ (green), $h^w$ (red) and $s^h$ (black).
Figure 2. The imprecision function $\epsilon_\mu(p)$ as a function of the node fraction $p \in [0.01, 0.30]$ when $\beta = 1.5\beta_c$ in the twelve real networks. The six measures are $k$ (purple), $B$ (blue), $k_s$ (cyan), $h$ (green), $h^w$ (red) and $s^h$ (black).
Figure 3. Kendall $\tau_b$ correlation coefficient as a function of the infection probability $\beta$ for six real networks. The vertical dashed line shows the critical infection rate $\beta_c$. The six measures are $k$ (purple), $B$ (blue), $k_s$ (cyan), $h$ (green), $h^w$ (red) and $s^h$ (black).
Figure 4. Complementary cumulative distribution function (CCDF) of the rankings by the six measures. The six measures are $k$ (purple), $B$ (blue), $k_s$ (cyan), $h$ (green), $h^w$ (red) and $s^h$ (black).
Table 1. Properties of the twelve real networks. $N$: the number of nodes, $M$: the number of edges, $\beta_c$: the critical infection rate for the single seed SIR model, $\langle k \rangle$: the average degree, $k_{max}$: the maximum degree, and $k_{s,max}$: the maximum k-shell index.

| Network | $N$ | $M$ | $\beta_c$ | $\langle k \rangle$ | $k_{max}$ | $k_{s,max}$ |
|---|---|---|---|---|---|---|
| Power Grid | 4941 | 6594 | 0.26 | 2.6691 | 19 | 5 |
| AS | 3015 | 5156 | 0.01 | 3.4202 | 590 | 9 |
| Gnutella06 | 8717 | 31,525 | 0.07 | 7.2330 | 115 | 9 |
| Gnutella08 | 6301 | 20,777 | 0.06 | 6.5948 | 97 | 10 |
| C. elegans | 453 | 2025 | 0.02 | 8.9404 | 237 | 10 |
| Email | 1133 | 5451 | 0.05 | 9.6222 | 71 | 11 |
| PGP | 10,680 | 24,316 | 0.05 | 4.5536 | 205 | 31 |
| Facebook | 4039 | 88,234 | 0.01 | 43.6910 | 1045 | 115 |
| Hamster | 2426 | 16,630 | 0.02 | 13.7098 | 273 | 24 |
| CondMat | 23,133 | 93,497 | 0.05 | 8.0830 | 279 | 25 |
| NetSci | 379 | 914 | 0.12 | 4.8232 | 34 | 9 |
| Protein | 1870 | 2203 | 0.15 | 2.3562 | 56 | 5 |
Table 2. Kendall $\tau_b$ correlation coefficient for six measures in twelve real networks. In the single seed SIR model, the infection probability $\beta$ is set slightly larger than $\beta_c$, i.e., $\beta = 1.005\beta_c$. The largest $\tau_b$ in each row is marked in boldface.

| Network | $\tau_b(k)$ | $\tau_b(B)$ | $\tau_b(k_s)$ | $\tau_b(h)$ | $\tau_b(h^w)$ | $\tau_b(s^h)$ |
|---|---|---|---|---|---|---|
| Power Grid | 0.6020 | 0.4238 | 0.5142 | 0.6177 | 0.7466 | **0.8060** |
| AS | 0.4478 | 0.2896 | 0.4540 | 0.4522 | 0.3999 | **0.5023** |
| Gnutella06 | 0.6715 | 0.6393 | 0.6811 | 0.6940 | 0.7206 | **0.7578** |
| Gnutella08 | 0.6549 | 0.5987 | 0.6887 | 0.6913 | 0.7139 | **0.7527** |
| C. elegans | 0.5729 | 0.4361 | 0.5969 | 0.5820 | 0.5868 | **0.6289** |
| Email | 0.7222 | 0.5862 | 0.7486 | 0.7483 | 0.7694 | **0.7868** |
| PGP | 0.6027 | 0.4160 | 0.5707 | 0.6051 | 0.6481 | **0.6566** |
| Facebook | 0.6818 | 0.4491 | 0.7135 | 0.7074 | 0.7320 | **0.7575** |
| Hamster | 0.7477 | 0.5773 | 0.7378 | 0.7523 | **0.8390** | 0.8383 |
| CondMat | 0.6158 | 0.3884 | 0.6337 | 0.6432 | 0.7312 | **0.7564** |
| NetSci | 0.6391 | 0.4071 | 0.5830 | 0.6499 | 0.8256 | **0.8592** |
| Protein | 0.5642 | 0.5227 | 0.5598 | 0.5835 | 0.7690 | **0.8246** |
Table 3. Kendall $\tau_b$ correlation coefficient for six measures in twelve real networks. In the single seed SIR model, the infection probability is $\beta = 1.5\beta_c$. The largest $\tau_b$ in each row is marked in boldface.

| Network | $\tau_b(k)$ | $\tau_b(B)$ | $\tau_b(k_s)$ | $\tau_b(h)$ | $\tau_b(h^w)$ | $\tau_b(s^h)$ |
|---|---|---|---|---|---|---|
| Power Grid | 0.4241 | 0.2921 | 0.3987 | 0.4646 | 0.6206 | **0.6893** |
| AS | 0.4148 | 0.2409 | 0.4412 | 0.4237 | 0.5091 | **0.5927** |
| Gnutella06 | 0.8135 | 0.7626 | 0.8073 | 0.8438 | 0.8599 | **0.8645** |
| Gnutella08 | 0.7214 | 0.6597 | 0.7525 | 0.7627 | 0.7844 | **0.8254** |
| C. elegans | 0.5759 | 0.4137 | 0.6140 | 0.5867 | 0.6355 | **0.6842** |
| Email | 0.7738 | 0.6171 | 0.7964 | 0.8050 | 0.8438 | **0.8601** |
| PGP | 0.5153 | 0.3500 | 0.5118 | 0.5287 | 0.6575 | **0.7099** |
| Facebook | 0.6220 | 0.4251 | 0.6660 | 0.6526 | 0.7353 | **0.7875** |
| Hamster | 0.7151 | 0.5727 | 0.7110 | 0.7232 | 0.8484 | **0.8745** |
| CondMat | 0.6051 | 0.3942 | 0.6316 | 0.6422 | 0.7714 | **0.8152** |
| NetSci | 0.5335 | 0.3443 | 0.5019 | 0.5609 | 0.7747 | **0.8330** |
| Protein | 0.4718 | 0.4466 | 0.5147 | 0.5103 | 0.7452 | **0.8429** |
Table 4. The monotonicity $M$ of the node rankings based on six measures in twelve real networks.

| Network | $M(k)$ | $M(B)$ | $M(k_s)$ | $M(h)$ | $M(h^w)$ | $M(s^h)$ |
|---|---|---|---|---|---|---|
| Power Grid | 0.5927 | 0.8322 | 0.2460 | 0.4776 | 0.8523 | 0.9606 |
| AS | 0.4506 | 0.3728 | 0.3734 | 0.4336 | 0.9557 | 0.9803 |
| Gnutella06 | 0.8110 | 0.8990 | 0.5625 | 0.7945 | 0.9738 | 0.9986 |
| Gnutella08 | 0.7636 | 0.8511 | 0.5990 | 0.7575 | 0.9644 | 0.9979 |
| C. elegans | 0.7922 | 0.8743 | 0.6962 | 0.7599 | 0.9301 | 0.9961 |
| Email | 0.8874 | 0.9400 | 0.8088 | 0.8661 | 0.9914 | 0.9996 |
| PGP | 0.6193 | 0.5099 | 0.4806 | 0.5836 | 0.9495 | 0.9920 |
| Facebook | 0.9740 | 0.9855 | 0.9419 | 0.9674 | 0.9838 | 0.9998 |
| Hamster | 0.8980 | 0.7128 | 0.8714 | 0.8892 | 0.9796 | 0.9854 |
| CondMat | 0.8524 | 0.4506 | 0.7980 | 0.8268 | 0.9863 | 0.9974 |
| NetSci | 0.7642 | 0.3387 | 0.6421 | 0.6976 | 0.9472 | 0.9907 |
| Protein | 0.4264 | 0.4053 | 0.2534 | 0.3825 | 0.9084 | 0.9563 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).