Abstract
Community detection can help analyze the structural features and functions of complex networks, and plays important roles in many aspects such as project recommendation and network evolution analysis. Therefore, community detection has always been a hot topic in the field of complex networks. Although various community-detection methods have been proposed, how to improve their accuracy and efficiency is still an ambition pursued by researchers. In view of this, this paper proposes a community-detection method for complex networks based on node influence analysis. First, the influence of nodes is represented as a vector composed by neighborhood degree centrality, betweennes centrality and clustering coefficient. Then, Pareto dominance is used to rank the influence of nodes. After that, the community centers are selected by comprehensively considering the node influence and crowding degree. Finally, the remaining nodes are allocated to different communities using a labeling algorithm. The proposed method in this paper is applied to several actual networks. The comparison results with other methods demonstrate the effectiveness of the proposed method.
1. Introduction
Many complex systems in reality can be modeled as a complex network, such as power networks, aviation networks, transportation networks, social networks, etc. [1,2]. For undirected networks, this relationship satisfies transitivity and symmetry. Complex networks often exhibit obvious structural properties, such as small-world properties, scale-free properties and community structure properties [3]. Detecting the community structure of the network is important to help study its organizational functions and uncover the hidden inner connections between nodes [4].
Community detection is to partite the entire network into some sub-networks such that each sub-network is tightly connected to the others internally and sparsely connected between different sub-networks [5]. Community detection has very important applications in many fields, such as identifying criminal gangs [6] and product recommendation [7]. Although the community-detection problem of complex networks has been widely studied and a variety of classical algorithms have emerged [8,9], the quality of community detection of many existing methods largely depends on the selection of seed nodes, i.e., community centers. The search for more efficient and accurate community-detection methods is still a hot issue of research.
Generally speaking, the central nodes of communities should be individuals with high influence in the community. If influential nodes of the network can be identified, then the community centers can be determined based on these influential nodes, which is expected to greatly improve the efficiency and accuracy of community detection.
In view of this, this paper proposes a community-detection method of complex networks based on node influence analysis. Firstly, the influence of a node is represented by a vector consisting of three indicators. Then, Pareto dominance is used to divide the nodes into several levels. After that, the community centers are selected taking into account the node influence and the crowding degree. Finally, a labeling algorithm is proposed to partition nodes into various communities. The comparison results with other methods have verified the effectiveness of our method.
This paper has the following three-fold innovations and contributions:
- A node impact-ranking method based on Pareto dominance is proposed, which comprehensively considers multiple node influence indicators. We also propose a concept of neighborhood degree centrality, which is more suitable for detecting community centers than conventional degree centrality.
- A community-detection method based on node influence and crowding degree is presented. The algorithm first comprehensively considers node influence and crowding degree to determine community centers, and then assigns nodes to different communities by a label method.
- The proposed method is applied to different types of actual networks, and the experimental results demonstrate the superiority of the proposed method.
The remainder of this paper is organized as follows: Section 2 gives the related work; the node influence-ranking method based on Pareto dominance is presented in Section 3; Based on the node influence analysis, Section 4 proposes the community-detection method. The experiment to validate the proposed method is put forward in Section 5. Finally, Section 6 shortly summarizes the work of this paper.
2. Related Work
Since Girven and Newman introduced the concept of community in 2002 [8], many community-detection methods have been proposed [10]. Among them, the community-detection methods for undirected complex networks have been widely studied, which can be broadly classified into graph-partitioning methods [11,12], clustering methods [13,14] and methods based on heuristic algorithms [15,16]. Some of these methods require early identification of community centers, while others do not. In general, people mainly use the influence of nodes to determine the community centers [17]. Therefore, we will elaborate on the related work for both general community-detection methods and node influence-based community-detection methods.
2.1. Graph-Partitioning Method
The graph-partitioning method divides the nodes of a network into groups of subgraphs with the rule that the number of connected edges between the subgraphs is minimal, the most famous of which is the GN (Girvan–Newman) algorithm proposed by Girven and Newman [8]. This type of algorithm requires the number of communities to be specified in advance and is not applicable to community detection in general networks. Belim and Larionov [18] used a modularity function to evaluate the quality of community-partitioning and proposed a greedy algorithm to find communities. Mourchid et al. [19] proposed a new community-partitioning algorithm by maximizing a new centrality metric, local Fiedler vector centrality (LFVC), at each stage. Most graph-partitioning problems are NP-hard, and therefore, efforts have been devoted to finding algorithmic designs that are closer to the optimal solution. Zhang et al. [20] proposed a large-scale community-detection method based on core node and layer-by-layer label propagation (CNLP).
2.2. Clustering Method
The clustering method mainly clusters nodes based on some kind of metric that measures the result of community partition. The most commonly used metric is the modularity function proposed by Newman and Girvan [21]. Newman [22] first proposed a community-detection algorithm based on modularity optimization, i.e., fast Newman. Blondel et al. [23] proposed the Louvain algorithm based on hierarchical clustering, which is divided into two stages: the first stage forms the initial community by merging nodes, and the second stage obtains the final community-detection result by merging the initial community. Both stages use modularity as the optimization function. Besides modularity, some scholars also use other improved similarity functions for node clustering. Lv et al. [24] proposed a community-detection algorithm based on proximity ranking and signal transmission, which ranks nodes by proximity centrality, calculates the similarity between nodes based on the idea of signal transmission and then assigns each node to a candidate community.
2.3. Heuristic Algorithm-Based Method
The heuristic algorithm-based approach regards the community-detection problem as an optimization one and then solves it using some heuristic algorithm. The classical heuristic algorithms are genetic algorithm [25], particle swarm algorithm [26], simulated annealing algorithm [27], label propagation algorithm (CIDLPA) [28], etc. In terms of objective function construction, there is no widely accepted criterion for community quality evaluation, and most optimization objectives are to minimize or maximize some structural metrics, such as modularity, triangle participation ratio or conductivity of the community [29].
2.4. Node Influence-Based Method
In actual complex networks, the influences of different individuals varies. The nodes with high influence are often the leaders of various communities. Therefore, it is natural to first determine the community centers based on the influence of nodes, and then assign other nodes to the community centers.
Zhao et al. [30] introduced an algorithm for detecting overlapping communities by utilizing node influence propagation. Node influence refers to a node’s capacity to impact other nodes, which is determined by both the node’s location and its activity behavior. Liu et al. [31] proposed an algorithm for discovering communities that integrates the influence of seed nodes and the similarity of neighborhoods. Xu et al. [32] introduced a community-detection algorithm that utilizes node influence and node similarity for clustering. The algorithm comprises three fundamental steps: identification of the central nodes based on node influence, selection of candidate neighbors based on node similarity to expand the community and merging of small communities based on community similarity. Ma et al. [33] put forward a new method to identify the most influential nodes which are considered as cores of communities and achieve the initial communities. Subsequently, through an expansion strategy, unassigned nodes are incorporated into the initial communities to facilitate their growth and enlargement. In order to maximize modularity, Boroujeni and Soleimani [34] attempted to identify influential nodes, and then detect communities by estimating their influence domains.
The above research results have provided various ideas and methods for the community-detection problem of complex networks, which greatly enriched the theory of complex networks and effectively improved the accuracy of community detection. However, the search for more efficient and accurate community-detection methods is still the focus of attention in the field of complex network, and the in-depth research in this popular direction will continue to be promoted on the basis of the existing research results.
3. Analysis of Node Influence Based on Pareto Dominance
We first provide some basic knowledge on networks that need to be used. Then, the node’s influence vector is constructed. Finally, the node influence-ranking method based on Pareto dominance is presented.
3.1. Basic Concepts on Network
Generally, a network can be represented by a graph , where is the node set of the network, and n is the number of nodes in the network; E is called the edge set of the network, and represents the number of edges in the network.
If there is an edge between and , then one of them is called a neighbor of the other one. The edge is said to be incident with and . The neighbor set of is denoted as . The degree of refers to the number of edges of G incident with , denoted as . For a graph without loops, .
A path P in G is a non-null sequence , in which all nodes are distinct. k is called the length of P. Nodes and are said to be connected if there is a path in G. Distance of and in G is the smallest length of the path. If there is no path between and , then it is specified that . The distance between all nodes can form a matrix called the distance matrix , where . We can use the Dijkstra’ algorithm to find the shortest path from any node to all other nodes in a network [35].
3.2. Node’s Influence Vector
Some nodes play a vital role in the complex network. The influential and important nodes have a great impact on the structure and function of the network. Therefore, it has important theoretical significance and application value to evaluate the node influence of complex networks, and then explore the key nodes.
This paper synthesizes three commonly used indicators to evaluate the node influence of complex networks. In fact, there are many indicators that can measure the centrality of nodes. We have comprehensively considered the focus and computational cost of different indicators and selected three measurement methods, i.e., neighborhood degree centrality, closeness centrality and clustering coefficient.
3.2.1. Neighborhood Degree Centrality
Generally speaking, in social networks, the greater the influence of an individual, the stronger his social relationships, and therefore the greater its degree will be. So, people measure the influence of nodes based on degrees. The degree centrality of node is defined as [36]:
Degree centrality measures the degree of connection between a node and all other nodes in the network. However, some communities contain large number of nodes, and the degrees of core nodes are relatively high. But some communities have fewer nodes, and the degrees of core nodes are relatively small. Therefore, it is unreasonable to analyze the influence of nodes using traditional degree centrality. To address this issue, we propose the concept of neighborhood degree centrality, which is defined as follows:
where , and is the neighbor set of . Relative degree centrality measures the relative size of a node’s degree within its neighborhood. If a node has significant influence, its relative degree centrality tends to be closer to 1; conversely, the relative degree centrality approaches 0 if its influence is small.
3.2.2. Betweenness Centrality
Betweenness centrality quantifies the number of times a node acts as a bridge along the shortest path between two other nodes. The betweenness centrality of node is defined as [36]:
where represents the number of shortest paths between i and that pass , and refers to the number of all shortest paths between i and .
Nodes with high betweenness centrality have significant influence over the flow of information in the network, as many shortest paths pass through them.
3.2.3. Clustering Coefficient
Clustering coefficient is the quantity that represents the aggregation degree of nodes in the network. Let , which refers to the number of neighbors of . Let be the number of edges between all nodes of . Then, the clustering coefficient of is defined as [36]:
The clustering coefficient can quantitatively describe the probability that any two neighbors of a node in a network are also neighbors with each other. The larger the clustering coefficient, the higher the degree of aggregation between the neighbors of the node.
3.2.4. Influence Vector
The above three indicators of node influence can be regarded as a three-dimensional vector, called the influence vector, denoted as , i.e., , and .
3.3. Node Influence Ranking Based on Pareto Dominance
The above three indicators can reflect the influence of nodes from different aspects, and each have its own advantages and disadvantages. Some works combine different indicators into one indicator by weighting [37]. Although this kind of method can integrate the advantages of each indicator, the weights of different indicators has a great impact on the evaluation results. In addition, it is easy to ignore which nodes are particularly prominent in some aspects. Therefore, this paper uses Pareto dominance to sort the influence of different nodes.
Definition 1
([38]). For nodes and , if the following two relationships are satisfied:
- 1.
- , ;
- 2.
- , .
it is said that node dominates , denoted as , which means that node has greater influence than node . If a node is not dominated by any node, it is called a non-dominated node. We can sort all nodes based on their dominance relationship.
The set consisting of all non-dominated nodes in V is recorded as . The nodes in set have the highest level of influence.
Then, let . The set consisting of all non-dominated nodes in is recorded as . The nodes in set have the second level of influence.
This process continues, and we can divide all nodes into several layers. Assuming that a total of l layers are obtained in the end, then . We can also set the number of layers l in advance as needed. After obtaining the first layers, assign the remaining nodes to l-layer.
The process for sorting all nodes is shown as Algorithm 1. From line 1 to 2, the relativity degree centrality, closeness centrality and clustering coefficient of each node are calculated, respectively. In line 4, these three indicators are stitched together to form a matrix of size . Each row of this matrix represents the influence vector of a node. Lines 5–22 are to sort all nodes into several layers. To obtain the final set of non-dominant nodes, add an attribute called “WND” to each node in Line 8. First assume that all nodes are not dominated, i.e., WND = true. From line 9 to 16, the algorithm traverses all nodes in V and compares its influence vector with all other nodes. If there exists another node that dominates , then set WND[] = false. If the attribute WND of node does not change after the loop, then it belongs to the set . In this way, all nodes can be divided into l layers.
| Algorithm 1 Influence Ranking |
| Input: . Output: .
|
4. Community-Detection Method Based on Node Influence and Crowding Degree
This section provides specific community division, including the determination of community centers, process of community detection, algorithm refinement and time complexity.
4.1. Determination of Community Center
In the real world, community centers are generally individuals with high influence. Therefore, it is reasonable to detect the communities of the network with influential nodes as the centers. However, there may also be more than one influential node within the same community, and in general, these two nodes are closely connected. Additionally, it is also possible that some community center has the highest influence in its own community, but is relatively weak compared to the centers of other communities. In view of this, we consider both node influence and crowding degree to determine the community center.
The crowding degree of nodes and is defined as follows:
The crowding degree is only between 0 and 1. The greater the crowding degree, the more likely two nodes are to be in the same community; otherwise, they are more likely to belong to different communities.
We first consider the nodes in the first layer . Each node in has the highest influence. First, a node is randomly selected in as the first community center, recorded as . Then, and the nodes whose crowding degree with is less than a given threshold are removed from . The above process repeats until is empty.
We then consider the nodes in the second layer . The selection process of the center nodes is similar to .
In order to ensure that the community center has a high level of influence, this paper only selects nodes from the first or second layers as community centers.
The value of threshold will have a certain impact on the selection of community centers. Generally speaking, the smaller the threshold, the larger the number of community centers; conversely, the larger the threshold, the smaller the number of community centers. Because the influence of the first layer individuals is higher than that of the first layer individuals, the threshold should be set lower than that of the second layer, which is conducive to selecting more individuals from the second layer as community centers.
The specific selection process is shown in Figure 1. Firstly, based on non-dominated sorting, non-dominated sets for different layers are obtained. Then, when screening community centers, crowding non-dominated sorting method is used. Through two different sorting processes, all community centers are ultimately obtained.
Figure 1.
Selection Process of Community Centers.
4.2. Process of Community Detection
After the community centers are determined, other nodes will be allocated to the center node a labeling algorithm. The community with center is denoted as , . To achieve this, two parameters are defined to determine which set would be most suitable for a node to add in among .
Suppose that v is a node, H is a subset of V. The distance between v and H is defined as . If , then v has at least one neighbor node in H. Also note that for , .
If , let , which is called the attachment of v at H. is called the fitness of node v to set H, indicating the number of neighbors of node v in H. The greater the fitness, the tighter the connection between node v and set H.
The process of generating is shown as Algorithm 2, which is mainly implemented through a labeling process. The input of the algorithm is graph G and the set of community centers .
The label represents the minimum distance from v to all communities. First, lines 1–8 regard each central node as a community and set the label values of the center nodes as 1, and that of other nodes as ∞.
Lines 9–24 assign all other nodes to the community with the highest fitness. records the community code with the highest fitness for each node. Lines 20–22 update the label values of the neighbor nodes.
Theorem 1.
Suppose that G is a connected network. In Algorithm 1, if , there is at least one and , so that .
Proof.
Assume that the theorem does not hold. For any , let . According to the assumption, . So there is a path of length d, denoted as , such that . Then, contradicts the hypothesis. □
| Algorithm 2 Community Detection |
| Input: , C Output: Community division result .
|
Theorem 1 guarantees that the algorithm will not fall into a dead cycle for a connected graph. After the algorithm, all nodes are divided into s sets, that is, s communities are obtained. But when the network is disconnected, it is difficult to guarantee. In general, determining whether a network is connected also requires a lot of computation. So, we proposed a refinement for the algorithm.
4.3. Algorithm Refinement
In many cases, community-detection algorithms deal with networks that are not connected. If none of the nodes in a certain connected component is selected as the center of community, the nodes in that connected component cannot be classified into any community. At the same time, the algorithm will be stuck in a dead loop because there are always nodes that have not been classified into communities. In response to this scenario, algorithm augmentation and refinement have been undertaken. The fundamental idea involves recursively decomposing subgraphs not assigned to communities using the same method.
Firstly, the initial graph is subjected to community detection using Algorithm 2. After the number of iterations in Algorithm 2 exceeds the number of nodes, which means the algorithm enters a dead cycle, the algorithm terminates and outputs the set of undivided nodes, denoted as R. Take the nodes in R and edges between nodes to form a subgraph and apply Algorithm 2 for community detection within this subgraph. This process will continue recursively until all nodes are assigned to communities.
4.4. Overall Process
At this point, all the steps of this community-detection algorithm have been designed. The entire algorithm process includes four steps: node influence ranking, community center determination, community division and algorithm refinement. Given this, we denote the method proposed in this paper as the ICDR. The overall process of the proposed method is shown in Figure 2.
Figure 2.
Overall process of the proposed method.
Step 1 divides the nodes in V into several layers according the influence vector. The influence matrix of the 10 nodes in the mini network shown in Figure 1 is as follows:
As can be seen from matrix , is a non-dominated node in V. So . After removing from V, the non-dominated node in is . So . Similarly, . If we intend to divide all nodes into four layers, then .
Step 2 determines the community centers by simultaneously considering node influence and crowding degree. First, is selected as the first center from . Because only has one node, then we will consider . Thus, is selected as the second center. If we only consider the non-dominated nodes in the first two layers, the selection process of community centers is over. If we consider the third layer of non-dominated nodes, or will be selected as the third center. The result in Figure 1 only considers the first two layers.
Step 3 divides the remaining nodes into various communities in sequence using the labeling algorithm. From Figure 1, we can see that and . However, because nodes and are not connected to the other nodes, they will not be assigned to the above two communities.
Step 4 will modify the results if there are nodes that have not been divided into any community. We can see from Figure 1 that nodes and form a subgraph. Applying the same method, these two nodes will be divided into the same community.
In the above example, if we use the first three layers of nodes to determine the community centers, the refinement process will not be used, because we can run the algorithm to partition all nodes at one time.
4.5. Case Study
Random networks are also commonly used as test objects to evaluate algorithm performance [10]. We used a random network to illustrate the implementation process of our method and preliminarily verified its effectiveness. The network is shown in Figure 3, which contains 20 vertices and has three distinct community structures.
Figure 3.
Example network.
Firstly, using node influence analysis, we can obtain the first layer of non-dominated nodes as . From Figure 3, we can see that these three nodes do occupy important positions in the network. Similarly, the second layer of non-dominated nodes is . The remaining nodes form the third layer .
Using the selection method of community centers in this paper, we first select node 13 from as the community center. Since the crowding degree of nodes 16 and 20 is 0.5, we chose one of them as the community center. Suppose that we chose node 20. Next, we will make the selection in the second layer . Taking into account both node influence and crowding degrees, we ultimately chose node 11 as the third community center. We can see from Figure 3 that these three nodes are exactly located in different communities, so the selected result is reasonable.
Then, we use labeling algorithms to sequentially partition all vertices into different communities.
Through this example, we can see that using only the nodes in the first layer to select community centers may result in biased results. Neglecting crowded degree can also lead to the consequences of having too many community centers.
5. Experiment
5.1. Experimental Subjects
To verify the effectiveness of our method, we selected seven classic real world networks for experiments, whose information is listed in detail in Table 1. Except for Dolphin amd Polbboks, all the data about these networks can be downloaded from the website http://snap.stanford.edu/data/ (accessed on 15 April 2024). Four of these networks have ground-truth community structures, but the remaining ones do not.
Table 1.
Information of real world networks.
5.2. Evaluation Indicator
Since four networks have ground-truth community structures, we adopt generalized Normalized Mutual Information () and score to evaluate the performance of the proposed algorithm.
: This indicator is used to measure the similarity between the detected community structures and the ground-truth community structures, whose definition is given as follows [40]:
where and are two matrices that represent the detected community structures and the ground-truth community structures, respectively. The k-th rows and represent the node distribution of community and , respectively. and are, respectively, the entropy of and the conditional entropy of with respect to .
The value of is between 0 and 1, and the higher the value, the better the performance of the algorithm.
F1 score: This indicator measures the closeness between the community structures discovered by the algorithm and the ground-truth community structures, which is given by the following formula [41]:
where and are the detected and ground-truth communities, respectively. , and is the harmonic mean of Precision and Recall, i.e.,
where
and
Modularity is an indicator used to measure the degree of modularity in a network structure, which can help us evaluate the quality of community partitioning. The calculation formula for modularity is as follows [20]:
where Q represents the modularity, m is the edge number of the network, refers to the connection strength between nodes and , and , respectively, represent the degrees of nodes and , and are the modules to which nodes and belong, and equals 1 when and are equal; otherwise, it equals 0.
For all networks, we use modularity to measure the quality of community detection.
5.3. Experimental Results and Analysis
5.3.1. Correlation Analysis of Different Indicators
In order to verify the rationality of the centrality measurement indicators selected in this paper, we analyzed the correlation between different indicators. In addition to the three metrics we used, we also add the closeness centrality.
We generated 10 random networks, each containing three communities with varying numbers of nodes. Then, we calculate the node influence of each network separately. The final result is shown in Figure 4, in which, refers to clustering coefficients and refers to closeness centrality.
Figure 4.
Correlation analysis of different indicators.
By analyzing the correlation, we find that there is a significant correlation between indicators and . Combining other indicators and their correlations, we conclude that the three indicators we have chosen are quite reasonable.
5.3.2. Analysis of the Changing Trend of Non-Dominant Nodes
We obtain the number of non-dominated nodes for the Erdös–Rényi random graph of different node sizes n but fixed edge existence probability .
From Figure 5, it could be seen that as the network size n continues to increase, the quantity of non-dominant nodes tends to grow roughly in a logarithmic fashion. To test the conjecture, a linear regression experiment is now performed on these data, with as the predictor variable and as the response variable. The result of the linear regression model shows that the adjusted criterion is 0.9326 and the p-value for the F-statistic is . This shows that our model explains the relationship between n and well. Overall, the number of non-dominated nodes is approximately linearly related to the log of the network size, i.e., . This result justifies our assumptions in the complexity analysis about the trend in the number of non-dominated nodes and the correctness of the algorithm’s complexity in the average case.
Figure 5.
The changing trend of non-dominated nodes with network size.
5.3.3. The Community-Detection Results of Our Method
We used the method in this paper to divide the selected network into communities, and visualized partitioning results are shown as Figure 6 and Figure 7, in which different colors represent different communities. It should be noted that due to the large number of networks DBLP and Amazon, we are unable to display the detection results of the entire network, but only visualize the results of partial nodes (). However, the entire network was divided in the experiment.
Figure 6.
Community-detection results for small real world networks. (a) Dolphin; (b) Polbooks.
Figure 7.
Community-detection results for real world networks. (a) Facebook; (b) Ca-GrQc; (c) Feather-lastfm-social; (d) DBLP; (e) Amazon.
5.3.4. Comparison with Other Methods
To verify the effectiveness of the proposed algorithm, three classic community-detection algorithms are selected to compare with the algorithm proposed in this paper (ICDR): Girvan’s algorithm [8], CIDLPA [28] and CNLLP [20]. The final results of these four algorithms on four networks with ground-truth community structures are listed in Table 2.
Table 2.
and score of different methods on two networks.
From Table 2, it can be seen that among the four comparative algorithms, our method (ICDR) achieves the highest value and score in all networks. Among the other three algorithms, CNLLP is superior to CIDLPA, while CIDLPA is superior to Girvan’s algorithm. The experimental results fully demonstrate the superiority of the proposed method. In addition, all algorithms have better scores than values. Moreover, all algorithms achieve better results in network Amazon compared to network DBLP.
The results on modularity for all networks are listed in Table 3.
Table 3.
Modularity of different methods on all networks.
From Table 3, it can be seen that the overall modularity obtained by our method is the highest, and our results are the best for most networks. The experimental results indicate that the community-detection method proposed in this paper can effectively reveal the community structure of the network.
6. Conclusions
Community structure is an important feature of complex networks. Mining communities can help analyze the structural features and functions of complex networks, and therefore has important theoretical and practical significance. Although various community-partitioning methods have been proposed, most of them often rely on the selection of seed nodes.
In view of this, this article proposes a complex network community-partitioning method based on node influence analysis. Firstly, this article uses Pareto dominance to divide nodes into several Zeng. Then, we use node influence and crowding to determine the community center. Finally, we use a labeling algorithm to divide the network into several communities. Finally, the method proposed in this article is applied to several practical networks. The comparison results with other methods demonstrate the effectiveness of this method.
Although the proposed method can effectively improve the efficiency and accuracy of community detection, there are also some insufficiencies. Firstly, the determination of some parameters involved in the algorithm still needs to be carefully studied. At present, we mainly rely on experimental analysis to determine these parameters, and have not conducted an in-depth analysis from a theoretical perspective. Secondly, the effectiveness of the proposed method still needs to be promoted and applied in more practical networks in order to comprehensively verify its performance.
Author Contributions
J.Y. performed the formal analysis of the investigation, the methodology, the software, wrote the first draft and edited the paper. B.L. served as supervision, performed the validation of the method, as well as reviewed and edited the paper. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by National Natural Science Foundation of China under grant number 11971447.
Data Availability Statement
The authors declare that data supporting the findings of this study are available within the article, the figures are concrete expression.
Acknowledgments
The authors are thankful to the Deanship of National Natural Science Foundation of China.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Day, C.J.; Hobbs, B.F.; Pang, J.S. Oligopolistic competition in power networks: A conjectured supply function approach. IEEE Power Eng. Rev. 2002, 22, 68. [Google Scholar] [CrossRef]
- Bona, A.D.; Marcelo, D.; Fonseca, K.O.; Lüders, R. A reduced model for complex network analysis of public transportation systems. Phys. A Stat. Mech. Appl. 2021, 567, 125715. [Google Scholar] [CrossRef]
- Zhang, L.; Cao, J.; Li, J. Complex Networks: Statistical Properties, Community Structure, and Evolution. Math. Probl. Eng. 2015, 7, 1–7. [Google Scholar] [CrossRef]
- Fortunato, S.; Hric, D. Community detection in networks: A user guide. Phys. A Stat. Mech. Its Appl. 2016, 659, 1–44. [Google Scholar] [CrossRef]
- Cao, B.; Li, D.; Bing, L.; Chen, G. Complex network topology mining and community detection. Dyn. Contin. Discret. Impuls. Syst. 2021, 13, 361–370. [Google Scholar]
- Ferrara, E.; Meo, P.D.; Catanese, S.; Fiumara, G. Detecting criminal organizations in mobile phone networks. Expert Syst. Appl. 2014, 41, 5733–5750. [Google Scholar] [CrossRef]
- Bagci, H.; Karagoz, P. Context-aware friend recommendation for location based social networks using random walk. In Proceedings of the International Conference Companion on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; pp. 531–536. [Google Scholar]
- Girvan, M.; Newman, M.J. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 2002, 99, 7821–7826. [Google Scholar] [CrossRef] [PubMed]
- Javed, M.A.; Younis, M.S.; Latif, S.; Qadir, J.; Baig, A. Community detection in networks: A multidisciplinary review. J. Netw. Comput. Appl. 2018, 108, 87–111. [Google Scholar] [CrossRef]
- Lusseau, D.; Schneider, K.; Boisseau, O. Identification of the effects of the existing network properties on the performance of current community-detection methods. J. King Saud. Univ. Comput. Inf. Sci. 2022, 34, 1296–1304. [Google Scholar]
- Santos-Sierra, D.D.; SendinA-Nadal, I.; Leyva, I.; Almendral, J.A.; Ayali, A.; Anava, S.; Sánchez-Ávila, C.; Boccaletti, S. Graph-based unsupervised segmentation algorithm for cultured neuronal networks’ structure characterization and modeling. Cytom. A 2015, 123, 18–33. [Google Scholar] [CrossRef]
- Breve, F. Interactive image segmentation using label propagation through complex networks. Expert Syst. Appl. 2015, 87, 513–523. [Google Scholar] [CrossRef]
- Liu, Y.; Jin, J.; Yi, Z. A new clustering algorithm based on data field in complex networks. J. Supercomput. 2014, 67, 723–737. [Google Scholar] [CrossRef]
- Shi, C.; Cai, Y.; Fu, D.; Dong, Y.; Wu, B. A link clustering based overlapping community-detection algorithm. Data Knowl. Eng. 2013, 87, 394–404. [Google Scholar] [CrossRef]
- Guo, Y.; Li, X. Local heuristic genetic algorithm for discovering community of complex networks. In Proceedings of the International Forum on Electrical Engineering and Automation, Guangzhou, China, 26–27 December 2015; pp. 92–95. [Google Scholar]
- Chatterjee, A.; Das, D.; Naskar, M.K.; Pal, N.; Mukherjee, A. Heuristic for maximum matching in directed complex networks. In Proceedings of the International Conference on Advances in Computing, Mysore, India, 22–25 August 2013; pp. 1146–1151. [Google Scholar]
- Xing, Y.; Meng, F.; Zhou, Y.; Zhu, M.; Shi, M. A node influence based label propagation algorithm for community detection in networks. Sci. World J. 2014, 2014, 627581. [Google Scholar] [CrossRef] [PubMed]
- Belim, S.V.; Larionov, S.B. An algorithm of image segmentation based on community detection in graphs. Comput. Opt. 2016, 40, 904–910. [Google Scholar] [CrossRef]
- Sabir, E.; García, A.; Ghogho, M.; Debbah, M. Image segmentation by deep community detection approach. Comput. Opt. 2017, 10542, 607–618. [Google Scholar]
- Zhang, W.; Shang, R.; Jiao, L. Large-scale community detection based on core node and layer-by-layer label propagation. Inf. Sci. 2023, 632, 1–18. [Google Scholar] [CrossRef]
- Newman, M.; Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 2004, 69, 026113. [Google Scholar] [CrossRef] [PubMed]
- Newman, M. Fast algorithm for detecting community structure in networks. Phys. Rev. E 2004, 69, 066133. [Google Scholar] [CrossRef]
- Blondel, V.D.; Guillaume, J.L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 10, P10008. [Google Scholar] [CrossRef]
- Lv, H.; Lei, T.; Huang, X.; Gao, H.; Bi, Z. Detection algorithm based on closeness rank and signal transimission. In Proceedings of the IEEE Advanced Information Technology, Electronic and Automation Control Conference, Chongqing, China, 25–26 March 2017; pp. 443–447. [Google Scholar]
- He, D.; Zhou, X.; Wang, Z.; Zhou, C.G.; Jin, D. Community mining in complex networks-clustering combination based genetic algorithm. Acta Autom. Sin. 2010, 36, 1160–1170. [Google Scholar] [CrossRef]
- Jiang, W.; Pan, S.; Lu, C.; Zhao, Z.; Lin, S.; Xiong, M.; He, Z. Label entropy-based cooperative particle swarm optimization algorithm for dynamic overlapping community detection in complex networks. Int. J. Intell. Syst. 2022, 37, 1371–1407. [Google Scholar] [CrossRef]
- He, J.; Chen, D.; Sun, C. A fast simulated annealing strategy for community detection in complex networks. In Proceedings of the International Conference on Computer and Communications, Chengdu, China, 14–17 October 2016; pp. 2380–2384. [Google Scholar]
- Sattari, M.; Zamanifar, K. A cascade information diffusion based label propagation algorithm for community detection in dynamic social networks. J. Comput. Sci. 2018, 25, 122–133. [Google Scholar] [CrossRef]
- Wagenseller, P.; Wang, F. Size matters: A comparative analysis of community-detection algorithms. IEEE Trans. Comput. Soc. Syst. 2018, 5, 951–960. [Google Scholar] [CrossRef]
- Zhao, Y.; Wang, Y.; Wu, M. Overlapping community detection based on node influence propagation in heterogeneous social networks. J. Chin. Comput. Syst. 2015, 36, 2190–2196. [Google Scholar]
- Liu, M.; Yang, J.; Guo, J.; Chen, J. A label propagation community discovery algorithm combining seed node influence and neighborhood similarity. Knowl. Inf. Syst. 2024, 66, 2625–2649. [Google Scholar] [CrossRef]
- Xu, Y.; Ren, T.; Sun, S. Community Detection Based on Node Influence and Similarity of Nodes. Mathematics 2022, 10, 970. [Google Scholar] [CrossRef]
- Ma, T.; Liu, Q.; Cao, J.; Tian, Y.; Al-Dhelaan, A.; Al-Rodhaan, M. LGIEM: Global and local node influence based community detection. Future Gener. Comput. Syst. 2020, 105, 533–546. [Google Scholar] [CrossRef]
- Boroujeni, R.J.; Soleimani, S. The role of influential nodes and their influence domain in community detection: An approximate method for maximizing modularity. Expert Syst. Appl. 2022, 202, 117452. [Google Scholar] [CrossRef]
- Bondy, J.A.; Murty, U.S.R. Graph Theory; Springer: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
- Linton, C.C. Centrality in social networks conceptual clarification. Soc. Netw. 1979, 1, 215–223. [Google Scholar]
- Zhao, Q.; Yao, X.; Gong, D.; Dang, X. The Nodes influence maximization in open source software community based on probability propagation model. IEEE Trans. Netw. Sci. Eng. 2023, 10, 2386–2395. [Google Scholar] [CrossRef]
- Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197. [Google Scholar] [CrossRef]
- Lusseau, D.; Schneider, K.; Boisseau, O.; Haase, P.; Slooten, E.; Dawson, S.M. The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol. 2003, 54, 396–405. [Google Scholar] [CrossRef]
- Lancichinetti, A.; Fortunato, S. Kertész, J. Detecting the overlapping and hierarchical community structure in complex networks. New J. Phys. 2009, 11, 033015. [Google Scholar] [CrossRef]
- Yang, J.; Leskovec, J. Overlapping community detection at scale: A nonnegative matrix factorization approach. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, Chengdu, China, 14–17 October 2013; pp. 587–596. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).