Detecting Overlapping Communities in Modularity Optimization by Reweighting Vertices

Many algorithms have been proposed for detecting disjoint communities. A major challenge in detecting communities in real-world networks is identifying overlapping communities. An overlapping vertex belongs to more than one community, which makes it difficult to detect with the modularity maximization approach. The main problem is that the overlapping structure can barely be found by maximizing the fuzzy modularity function. In this paper, we first introduce a node weight allocation problem to formulate the overlapping property in community detection. We then propose an extension of modularity based on reweighting nodes, which is a better measure for overlapping communities, and use it to design the proposed algorithm. We use a genetic algorithm to solve the node weight allocation problem and to detect the overlapping communities. To fit the properties of various instances, we introduce three refinement strategies that increase the solution quality. In the experiments, the proposed method is applied to both synthetic and real networks, and the results show that it can detect nontrivial, valuable overlapping nodes that might be ignored by other algorithms.


Introduction
Identifying groups with particular properties helps analysts capture the properties shared by the members of a community. Many applications can be built on community detection. For example, precise information delivery, e.g., Google AdWords [1], increases transaction amounts by sending advertisements to the right people. Therefore, community detection is a popular research topic [2][3][4][5][6][7][8].
Many results focus on disjoint community sets in which each node belongs to exactly one community [2,3]. However, in real-world networks, many people belong to multiple communities, so the communities may overlap with each other. For example, an engineer may work on many projects in a company. Thus, instead of strict partitions, fuzzy partitions are more appropriate for understanding the network structure [9,10]. Fuzzy partitions allow a node to belong to multiple communities simultaneously. Consider a real-world situation in which staff members work together in a building and the manager would like to track the movement history of each staff member [11]. Each person may move between various rooms, and the purpose of each move depends on the person's role. When we treat the purposes of all staff members as the communities, a staff member may belong to different communities.
Figure 1. Benchmark networks with more than two communities and with two communities: (a) G4415, an example with three communities; (b) G415, an example with two communities.
In this paper, we focus on overlapping community detection and propose the node weight allocation problem, denoted by NWA OCD , to formulate the community overlap. Since computing the partition with maximum modularity is NP-complete, reducing the computation cost by seeking near-optimal partitions is the popular approach to overlapping community detection. Heuristic algorithms, especially genetic algorithms (GAs), perform well in finding high-quality solutions in a large search space [2,3,8]. Therefore, some works adopt GA as the core of their solutions. Mu et al. use a hybrid heuristic approach that combines GA and simulated annealing to find the communities [2]. Shang et al. use GA with an extra local search [3]. However, these results do not deal with overlapping properties. Overlapping networks have various properties, so some approaches adopt multi-objective optimization to find balanced results [4][5][6][31]. Balanced results take most properties into account, but the derived results may not be close to the real-world properties. Therefore, Behera et al. check the similarity between each pair of nodes [8]. Ezeh et al. also consider node similarity, applying it to the overlapping nodes and their neighbors [32]. To emphasize the community attribution of each node, Shakya et al. combine fuzzy communities with the GA to calculate the detailed properties of the nodes [7]: the GA reduces the computation time without decreasing the solution quality too much, and the fuzzy communities identify the overlapping nodes.
Even if some approaches provide solutions with high modularity, the partitions may not reflect the properties of real-world networks in some situations. We found that the solution quality could be refined by considering the following issues: ignoring overlapping nodes, merging clusters, and reweighting nodes. Therefore, we use the modularity to design the solution searcher of the approach GA NWA I MR . We first modify the fitness function in GA NWA I MR to reflect the network properties by considering the null model, so that the revised fitness function outputs partitions that are closer to the real-world behavior. Moreover, we design three refinement strategies to make the solutions reflect the real-world properties.
In the simulation, we use a synthetic network and popular real networks, including the Zachary Karate Club network, Books about American Politics, and American College Football, to evaluate the solution quality of GA NWA I MR and other approaches. The derived partitions correctly reflect the real-world properties in both the synthetic network and the real-world networks. Moreover, the proposed refinement strategies are also evaluated, and they improve the quality of the derived partitions from the perspective of real-world behavior. Therefore, the simulation results show that the partitions output by GA NWA I MR are close to the real-world properties. This paper is organized as follows. The overlapping communities and the problem definition are introduced and formulated in Section 2. The proposed approach GA NWA I MR and the refinement strategies are presented in Section 3. The simulations and comparisons, together with the resulting network partitions, are given in Section 4. Finally, the conclusion and future work are stated in Section 5.

Modularity in Overlapping Communities
Community detection in a given network involves two processes. The first is to find the network structure, and the other is to determine the number of communities. Here we introduce the work of Nepusz et al. [33] to explain the modularity in overlapping communities. Nepusz et al. consider a belonging coefficient matrix U = [α ic ] n×k , where n is the number of nodes and k is a given number of communities. Each entry α ic shows how strongly the node v i belongs to the community c. The constraint on the relationship between v i and all communities is:
\sum_{c=1}^{k} \alpha_{ic} = 1, \quad i = 1, \dots, n. (1)
So, the objective function is:
D_G(U) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (\tilde{s}_{ij} - s_{ij})^2, (2)
where w ij is the predefined weight, s ij = ∑ k c=1 α ic α jc , and s̃ ij is the prior similarity of v i and v j . By minimizing Equation (2), the nodes with high similarity are grouped together. So, the matrix U that attains the optimal value of D G (U) gives the overlapping community structure.
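To make the belonging coefficient matrix and the objective concrete, the following is a minimal sketch, assuming the squared-difference form D_G(U) = ∑ w_ij (s̃_ij − s_ij)² used by Nepusz et al. The function names `similarity` and `d_g` and the toy matrices are illustrative and not taken from the paper.

```python
import numpy as np

def similarity(U):
    """Fuzzy similarity s_ij = sum_c alpha_ic * alpha_jc, i.e. S = U U^T."""
    U = np.asarray(U, dtype=float)
    return U @ U.T

def d_g(U, W, S_prior):
    """Weighted squared gap between the prior similarity and s_ij (assumed form)."""
    S = similarity(U)
    return float(np.sum(np.asarray(W) * (np.asarray(S_prior) - S) ** 2))

# Toy example: 4 nodes, 2 communities; node index 1 is shared equally.
U = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0],
              [0.0, 1.0]])
assert np.allclose(U.sum(axis=1), 1.0)  # each row of U sums to one

S_prior = similarity(U)   # a prior similarity that U reproduces exactly
W = np.ones((4, 4))       # uniform predefined weights
U_crisp = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

print(d_g(U, W, S_prior))        # 0.0, since U matches the prior exactly
print(d_g(U_crisp, W, S_prior))  # positive: the crisp partition misses the overlap
```

In this toy setting the fuzzy matrix attains the minimum, while forcing the shared node into a single community leaves a positive residual.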
To determine an appropriate number of communities k, Nepusz et al. iteratively increase the value of k from 2, and then choose the value of k with the highest fuzzy modularity value calculated by Equation (3).

Problem Definition
The overlapping community detection problem is treated as a node weight allocation problem, denoted by NWA OCD for short. Given a network G(V, E), a maximum number of communities k, and a null model weight γ, find a modified belonging coefficient matrix M = [λ ic ] n×k such that the Q ov value is maximized. The objective function and constraints are:
We consider inc f as the increasing factor. Given inc f > 1, the total weight of an overlapping node over all communities is larger than one, i.e., ∑ k c=1 λ ic > 1. The total weight of a non-overlapping node is still exactly equal to one, i.e., ∑ k c=1 λ ic = 1. By solving the NWA OCD problem, the overlapping community structure is obtained by modifying the optimal solution. Note that if inc f = 1 and γ = 1, Equation (4) is the same as Equation (5), which means that the fuzzy modularity is a special case of the NWA OCD problem.
Although Griechisch et al. [34] apply the fuzzy modularity to find overlapping communities, some networks remain unresolved. We use two networks, one with more than two communities and one with two communities, to illustrate this issue. The benchmark is shown in Figure 1, and the values of Q ov for G4415 and G415 are shown in Table 1. We can see that v 9 belongs to B in G4415 while v 5 belongs to A in G415, and neither is identified as an overlapping node.
The major difference between Equations (4) and (5) is the coefficient matrix. The entries in Equation (5) are unweighted, while those in Equation (4) are weighted. Therefore, we need a mapping as shown in the following equations.
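The special-case remark can be checked numerically with a small sketch. The paper's Equation (4) is not reproduced in this text, so the form below is an assumption: the usual fuzzy modularity with the null-model term k_i k_j / 2m scaled by γ and with s_ij computed from the weights λ. The function name `q_ov` and the toy graph are illustrative only.

```python
import numpy as np

def q_ov(A, M, gamma=1.0):
    """Assumed form: (1/2m) * sum_ij (A_ij - gamma*k_i*k_j/2m) * sum_c M_ic*M_jc."""
    A = np.asarray(A, dtype=float)
    M = np.asarray(M, dtype=float)
    deg = A.sum(axis=1)
    two_m = deg.sum()
    null = gamma * np.outer(deg, deg) / two_m   # expected edges, scaled by gamma
    return float(np.sum((A - null) * (M @ M.T)) / two_m)

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0

# Crisp rows (every row sums to one, i.e. inc_f = 1): ordinary modularity.
M_crisp = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
print(q_ov(A, M_crisp, gamma=1.0))   # 5/14 for this graph

# Halving the null model raises the score of the same partition.
print(q_ov(A, M_crisp, gamma=0.5))

# An overlapping allocation: node 2 carries total weight 1.5 (<= inc_f = 1.5).
M_ov = M_crisp.copy()
M_ov[2] = [1.0, 0.5]
print(q_ov(A, M_ov, gamma=1.0))
```

With unit row sums and γ = 1 the function reduces to the standard modularity of the crisp partition, which is the sense in which the fuzzy modularity is a special case under this assumed form.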

Allocate Node Weight by Genetic Algorithms
Computing the partition with maximum modularity has been proved to be NP-complete [13]. Even with high-performance computing solutions, e.g., cloud computing [35,36] and parallel computing [37], computing the partition that maximizes the modularity still requires huge computation resources. Therefore, we propose a GA-based approach to obtain a near-optimal solution at a low computation cost. The proposed algorithm GA NWA I MR includes two steps. We first apply GA to obtain a high-quality feasible solution, and then apply three refinement strategies that modify the derived partition to be closer to the real-world behavior. In the following, we introduce the revised GA and the refinement strategies.

Genetic Algorithm
The iterative process of GA, shown in Algorithm 1, includes three major processes: crossover, mutation, and selection. Before the iterative process starts, the initial population P with indi n chromosomes is generated. Each chromosome is represented by M = [λ ic ] n×k , as shown in Figure 2. Each entry λ ic is a weight that indicates the assignment of v i to c. The initial population is generated randomly, and each row of M must satisfy the problem constraints. Given a maximum number of iterations max t , the GA then repeats the following processes.

1. Crossover: we randomly select two chromosomes C A and C B from P, and a random column. The offspring is generated from the selected column of C B and the remaining part of C A , as shown in Figure 3. The number of offspring is indi n ; in other words, we obtain 2 × indi n chromosomes after the crossover.

2. Mutation: the mutation process is launched with 80% probability after the crossover finishes. Once the mutation is invoked, one entry λ ic of a randomly selected chromosome is changed by a value picked from [−0.1, 0.1]. Finally, the offspring is normalized into a feasible solution that satisfies the requirements of NWA OCD .

3. Selection: we use the modularity as the objective function, and the purpose of the GA is to find the partition with maximum modularity. We use Q ov as the fitness function and calculate Q ov for each solution. All chromosomes are then sorted in descending order of Q ov , and the top indi n individuals survive to the next generation.
To keep the heavily overlapping nodes, a threshold α T in terms of α is given. We transform α T to the corresponding threshold λ T in terms of λ by Equation (6).
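The three processes above can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the authors' implementation: feasibility is taken to mean entries in [0, 1] with every row sum in [1, inc_f]; mutation is read as adding a uniform value from [−0.1, 0.1] to one entry of an offspring; and the fitness is passed in as a callable. The names `normalize`, `run_ga`, and `toy_fitness` are hypothetical.

```python
import numpy as np

def normalize(M, inc_f):
    """Project M onto the assumed feasible set: 0 <= entry <= 1, 1 <= row sum <= inc_f."""
    M = np.clip(np.asarray(M, dtype=float), 0.0, 1.0)
    k = M.shape[1]
    sums = M.sum(axis=1, keepdims=True)
    M = np.where(sums == 0.0, 1.0 / k, M)      # all-zero row -> uniform weights
    sums = M.sum(axis=1, keepdims=True)
    scale = np.ones_like(sums)
    low, high = sums < 1.0, sums > inc_f
    scale[low] = 1.0 / sums[low]               # lift light rows up to sum 1
    scale[high] = inc_f / sums[high]           # shrink heavy rows down to inc_f
    return M * scale

def run_ga(fitness, n, k, inc_f=1.5, indi_n=20, max_t=100, p_mut=0.8, seed=0):
    rng = np.random.default_rng(seed)
    pop = [normalize(rng.random((n, k)), inc_f) for _ in range(indi_n)]
    for _ in range(max_t):
        offspring = []
        for _ in range(indi_n):
            a, b = rng.choice(indi_n, size=2, replace=False)
            child = pop[a].copy()
            col = rng.integers(k)
            child[:, col] = pop[b][:, col]     # crossover: one column from C_B
            if rng.random() < p_mut:           # mutation with 80% probability
                i, c = rng.integers(n), rng.integers(k)
                child[i, c] += rng.uniform(-0.1, 0.1)
            offspring.append(normalize(child, inc_f))
        pool = pop + offspring                 # 2 * indi_n chromosomes
        pool.sort(key=fitness, reverse=True)   # descending fitness
        pop = pool[:indi_n]                    # top indi_n survive
    return max(pop, key=fitness)

# Toy fitness: modularity-style score on two triangles joined by one edge.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
deg = A.sum(axis=1)
B = A - np.outer(deg, deg) / deg.sum()

def toy_fitness(M):
    return float(np.sum(B * (M @ M.T)) / deg.sum())

best = run_ga(toy_fitness, n=6, k=2, inc_f=1.5, indi_n=20, max_t=50)
print(np.round(best, 2))
```

Because parents and offspring are pooled before selection, the best fitness never decreases from one generation to the next in this sketch.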

Refinement Strategies
GA provides an elite solution from the population, but this solution may not be suitable for all instances. In the pre-analysis phase, we observed three situations in the solutions derived by the GA for which some extra processing yields better solutions. The situations are (1) lightly overlapping nodes, (2) mergeable clusters, and (3) nodes that need reweighting. We call the processes that are used to get better solutions the "refinement strategies", and we provide one refinement strategy for each of the above situations.
Ignore slight overlapping nodes. The overlapping degree of each λ is important for splitting the communities. Determining the community for an entry with a low value of λ is easier than for one with a higher value. We use a threshold λ T , derived from α T by Equation (6), to decide whether an entry should be treated as non-overlapping. When λ < λ T , we set λ to zero. When λ T is set to a higher value, more nodes are assigned to a single community.
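The thresholding step can be sketched in a few lines. This is an illustrative sketch: the function name `ignore_slight_overlaps` is hypothetical, the rows are not renormalized afterwards, and keeping the largest entry of every row (so that no node loses all of its communities) is an added assumption rather than something the text specifies.

```python
import numpy as np

def ignore_slight_overlaps(M, lam_t):
    """Zero out entries below lam_t, always keeping each row's largest entry."""
    M = np.asarray(M, dtype=float)
    keep = M >= lam_t
    keep[np.arange(len(M)), M.argmax(axis=1)] = True   # assumed safeguard
    return np.where(keep, M, 0.0)

M = np.array([[0.95, 0.05],    # slight overlap: second entry is dropped
              [0.60, 0.55],    # heavy overlap: both entries survive
              [0.04, 0.03]])   # safeguard keeps the larger entry
print(ignore_slight_overlaps(M, lam_t=0.1))
```

Raising `lam_t` removes more of the small entries, so more nodes end up with a single nonzero weight.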
Merge clusters. Some small communities should be merged into a larger community. Given two non-empty communities C 1 and C 2 , we define the overlapping ratio as ov ratio = |C 1 ∩ C 2 |/min(|C 1 |, |C 2 |). When ov ratio is larger than a given merge threshold m T , C 1 and C 2 are merged into a single community.
Reweight node values. When the weight distribution of each overlapping node is calculated by directly converting λ to α via Equation (6), a node may belong to multiple communities while the majority of its weight is allocated to one community. To avoid this problem, we propose the reweight strategy. The weight should be proportional to the number of edges that link v i to nodes in c. Moreover, if v i has more neighbors in c than the average number of neighbors in c, then c is more important than the other communities for v i . Given a community c, avgNeighbor c = ∑ i,j∈V(c) A ij /|V(c)| represents the average number of neighbors, and α i = ∑ c∈C(i) ∑ j∈V(c) A ij /avgNeighbor c is the normalization term. Therefore, the new weight is:
\alpha_{ic} = \frac{1}{\alpha_i} \cdot \frac{\sum_{j \in V(c)} A_{ij}}{\mathrm{avgNeighbor}_c}, \quad c \in C(i), (7)
where V(c) is the set of nodes that belong to c and C(i) is the set of communities that v i belongs to. We use α i for normalization, so we have ∑ k c=1 α ic = 1.
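The merge and reweight strategies can be illustrated with a small sketch built directly on the definitions of ov ratio, avgNeighbor c , and α i given in the text. The function names are hypothetical, communities are represented as Python sets of node indices, and repeating the merge until no pair exceeds m T is an assumption about how pairwise merging is iterated.

```python
import numpy as np

def ov_ratio(c1, c2):
    """|C1 & C2| / min(|C1|, |C2|) for two non-empty node sets."""
    return len(c1 & c2) / min(len(c1), len(c2))

def merge_clusters(communities, m_t):
    """Merge pairs whose overlapping ratio exceeds m_t until none remain."""
    comms = [set(c) for c in communities if c]
    merged = True
    while merged:
        merged = False
        for a in range(len(comms)):
            for b in range(a + 1, len(comms)):
                if ov_ratio(comms[a], comms[b]) > m_t:
                    comms[a] |= comms.pop(b)   # absorb community b into a
                    merged = True
                    break
            if merged:
                break
    return comms

def reweight(A, communities):
    """Weight of node i in c: (edges from i into c) / avgNeighbor_c, row-normalized."""
    A = np.asarray(A, dtype=float)
    raw = np.zeros((A.shape[0], len(communities)))
    for c, members in enumerate(communities):
        idx = sorted(members)
        avg_neighbor = A[np.ix_(idx, idx)].sum() / len(idx)
        if avg_neighbor > 0:
            for i in idx:
                raw[i, c] = A[i, idx].sum() / avg_neighbor
    alpha_i = raw.sum(axis=1, keepdims=True)   # normalization term per node
    return np.divide(raw, alpha_i, out=np.zeros_like(raw), where=alpha_i > 0)

print(merge_clusters([{1, 2, 3, 4}, {3, 4, 5}, {6, 7}], m_t=0.5))

# Two triangles joined by the edge 2-3; node 2 is placed in both communities.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[i, j] = A[j, i] = 1.0
alpha = reweight(A, [{0, 1, 2}, {2, 3, 4, 5}])
print(np.round(alpha, 3))
```

In the toy graph, node 2 has two neighbors in the first community and one in the second, and both communities have an average of two neighbors, so its weight splits two to one in favor of the first community.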

Simulations
We consider a synthetic network and three real networks including Zachary Karate Club network, Books about American Politics, and American College Football to evaluate the performance of GA NWA I MR .
The evaluation criteria involve detecting overlapping community structure, detecting meaningful communities, detecting dense overlaps, and detecting heavily overlapping nodes.

Synthetic Network
We use G210 as our synthetic network, which has 210 nodes and four pre-defined communities A, B, C, and D. Each community has 60 nodes, and every two consecutive communities share 10 nodes, i.e., A = {v 1 : v 60 }, B = {v 51 : v 110 }, C = {v 101 : v 160 }, and D = {v 151 : v 210 }. Note that A and B share nodes {v 51 , . . . , v 60 }, B and C share nodes {v 101 , . . . , v 110 }, and so on. Each pair of nodes has a 3% chance of being linked, and each community that the two nodes share adds an extra 55% chance of a link. Thus, overlapping parts are denser than non-overlapping parts [38].
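A network of this kind can be generated with a short sketch. One detail is an assumption: the base 3% chance and the extra 55% chance per shared community are treated here as independent trials, so a pair sharing s communities is linked with probability 1 − 0.97 · 0.45^s. An additive reading would exceed one for pairs that share two communities. The function name `make_g210` is hypothetical.

```python
import numpy as np

def make_g210(seed=0, p_base=0.03, p_shared=0.55):
    """Return (A, B): adjacency matrix and 0/1 membership matrix of a G210-like graph."""
    rng = np.random.default_rng(seed)
    n = 210
    # 1-based node ranges of A, B, C, D as given in the text.
    ranges = [(1, 60), (51, 110), (101, 160), (151, 210)]
    B = np.zeros((n, len(ranges)), dtype=int)
    for c, (lo, hi) in enumerate(ranges):
        B[lo - 1:hi, c] = 1
    shared = B @ B.T                            # communities shared by each pair
    # Assumed independent-trials link probability.
    P = 1.0 - (1.0 - p_base) * (1.0 - p_shared) ** shared
    upper = np.triu(rng.random((n, n)) < P, k=1)   # sample each pair once
    A = (upper | upper.T).astype(int)           # symmetric, no self-loops
    return A, B

A, B = make_g210(seed=1)
print(A.sum() // 2, "edges;", int((B.sum(axis=1) == 2).sum()), "overlapping nodes")
```

The membership matrix `B` can serve as ground truth when checking which detected nodes are truly overlapping.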
Since the fuzzy modularity is a special case of the NWA OCD problem, we could use the same optimization strategy to solve the problem. The parameter settings are inc f = 1.5 and 1, α T = 0, m T = −1, k = 6, and γ = 1. Figure 4 shows the bitmaps of sorted adjacency matrices. The black and white points represent the entries of 1s and 0s respectively. The adjacency matrices are sorted by the following strategy:

1. Nodes are grouped by the detected community id. For overlapping nodes, only the smallest id is counted.

2. Within each community c, the nodes are sorted in descending order of λ ic . Therefore, the overlapping nodes appear in the bottom area of each community.
Figure 4b is the result of the fuzzy modularity. Four communities are detected too, but no overlapping nodes are identified.
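The two-step sorting strategy for the adjacency bitmaps can be sketched as follows. The function name `sorted_order` is hypothetical, and a node is taken to belong to every community in which its weight is nonzero.

```python
import numpy as np

def sorted_order(M):
    """Node order: by smallest community id, then by descending weight in it."""
    M = np.asarray(M, dtype=float)
    primary = (M > 0).argmax(axis=1)            # smallest id with nonzero weight
    weight = M[np.arange(len(M)), primary]
    # lexsort uses the last key as the primary key.
    return np.lexsort((-weight, primary))

M = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.6, 0.5],    # overlapping node, counted in community 0
              [0.9, 0.0]])
order = sorted_order(M)
print(order)                 # [1 3 2 0]

A = np.array([[0, 0, 1, 0],
              [0, 0, 1, 1],
              [1, 1, 0, 1],
              [0, 1, 1, 0]])
A_sorted = A[np.ix_(order, order)]   # reorder rows and columns together
print(A_sorted)
```

Reordering rows and columns with the same permutation keeps the matrix symmetric, so each community shows up as a block on the diagonal of the bitmap.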
Although the maximum number of communities is six, only four communities were detected, and the other two were empty. Since the number of communities can be captured by modularity [39], it is unnecessary to know the exact number of communities in our method.

Zachary Karate Club Network
The Zachary karate club network [40] is a popular benchmark for community detection algorithms. It has 34 nodes and 78 edges, where nodes are members and edges are friendships between them. This network splits into two groups due to a disagreement between the administrator and the instructor. Figure 5 is the result captured by the fuzzy modularity. In this experiment, we evaluate the results with different inc f settings, and show the importance of "ignore slight overlapping nodes" and "reweight node values". Finally, we apply our method to the case with k = 2 and a halved null model.
Figure 6a is the result with inc f = 1.2: we get four communities and three overlapping nodes, with λ shown in Table 2a. The network separation in Figure 6a is identical to that in Figure 5, but maximizing the modularity outputs a larger one than the one we derived. When inc f is increased from 1.2 to 1.5, we get two extra overlapping nodes, v 12 and v 34 . When inc f is set to 1.7, the values of λ change as shown in Table 2c, and the rest is identical to the result derived with inc f = 1.5. Therefore, larger settings of inc f result in more overlapping nodes.
Consider a node that has only one edge, which connects it to an overlapping node, e.g., v 12 : such an isolated node has the same overlapping property as the overlapping node it connects to. Moreover, we found that Q ov derived by GA NWA I MR is higher than the optimal Q. This implies that the overlapping structure is easier to detect when higher weights are assigned to the overlapping nodes.
Here we consider an extreme case in which all nodes are overlapped, i.e., inc f = 4. Analyzing the obtained result, we find "duplicate communities": two or more communities overlap with each other extremely, and some of them are even the same community. Figure 7 shows the fuzzy partition result. Four communities are detected, but the two denoted by dotted lines are subsets of the other two, denoted by solid lines. Therefore, each such pair should be merged into one correct community. After merging the communities, we derive two communities with only one overlapping node, v 10 . However, the value of Q ov decreases from 0.526 to 0.371 at the same time. Even the result with the maximized value of Q ov does not show the correct properties of the communities. We use the refinement strategies to get a solution with a lower Q ov that is closer to the real-world properties. Therefore, the refinement strategies are useful for improving the solution quality in terms of real-world considerations.

Effects of Ignoring Slight Overlapping Nodes
We use the network with inc f = 1.5 to evaluate the effects of the ignore step. The values of Q ov with and without the ignore step are 0.427114 and 0.427117, respectively. Figure 8 and Table 3 show the detected communities and the values of λ. Two overlapping nodes, v 28 and v 30 , are ignored. Since most of their weight was kept in a specific community, reducing the weights does not decrease Q ov dramatically. Therefore, the process of ignoring slight overlapping nodes helps to keep the heavily overlapping nodes.
Table 3. λ values of overlapping nodes in Figure 8 with inc f = 1.5 (before ignoring).

Effects of Reweight Strategy
To emphasize the importance of the communities, we propose a reweight strategy to assign various weights. The result with the reweight strategy is identical to that shown in Figure 6b. Table 4a,b show the values of λ without and with the reweight strategy, respectively. The reweight strategy reduces the gap between the numbers of edges that connect to inside-community nodes and to outside-community nodes. However, the structure of the main community may change after reweighting, because the values are inversely proportional to the average number of neighbors inside the communities relative to that outside the communities. For example, v 12 is unbalanced before reweighting, but after reweighting the value of λ of v 12 reflects the real-world behavior.

The Network with Two-Communities
We examine the network with exactly two communities to verify that the property illustrated in Figure 1b can be captured by GA NWA I MR . We use inc f = 1.5, α T = 0.01, m T = −1, k = 2, and γ = 0.5. In this case, we easily find the overlapping nodes. The results are shown in Figure 9 and Table 5.
GA NWA I MR derives three overlapping nodes, as shown in Table 5. From Figure 9, we have Q ov = 0.628, and the dotted curve is the real split of the club network. v 3 is the main overlapping node since it has a roughly balanced weight value. In summary, the two-community problem is solved by reducing the number of expected edges.
Figure 9. Detected communities with k = 2 and γ = 0.5.
Table 5. λ values of overlapping nodes in Figure 9.

Compare with Different Algorithms
In the above simulations, GA NWA I MR detects two communities, and we compare the result with previous algorithms on this dataset. Shen et al. captured three overlapping communities [30], and the overlapping nodes are v 1 , v 3 and v 9 . However, v 12 is missed by the method of Shen et al., so the overlapping property of v 12 is not discovered. Node v 12 has exactly one neighbor, v 1 , so v 12 should have the same overlapping properties as v 1 .
Chen et al. captured two overlapping communities [29], and their results are similar to ours, as shown in Figure 9. Chen et al. found one overlapping node, v 10 . Node v 10 has two edges: one connects to the left community and the other connects to the right community. Therefore, considering v 10 as an overlapping node is reasonable. However, node v 3 has five edges, of which three connect to the left community and two connect to the right community. Thus, v 3 is more appropriate than v 10 as the overlapping node.
From the above observations, the communities are split more precisely by GA NWA I MR than by the previous works. In terms of both the appropriateness of the split, e.g., the number of detected communities, and its correctness, e.g., the overlapping nodes, GA NWA I MR provides more precise results than the other approaches.

Books about American Politics
This network is built from transaction data from amazon.com [41]. The network has 105 nodes and 441 edges, where nodes are books and edges are frequent co-purchase events. The nodes are labeled with three categories, liberal, neutral, and conservative, which have 43, 13, and 49 nodes, respectively. In this simulation, we use inc f = 1.5, α T = 0.01, m T = 0.5, k = 8, and γ = 1. We evaluate the performance of the merge strategy. Figure 10a,b are the solutions with and without the merge strategy, respectively. The text on each node is the node id and the real label. The values of Q ov are 0.528 and 0.533 for the results with and without the merge strategy, respectively.

The Result with Merge Strategy
GA NWA I MR with the merge strategy detects four communities, denoted by W, X, Y, and Z. Most nodes belong to the two large communities W and X, which mainly consist of conservative and liberal books, respectively. Most neutral books belong to the two small communities. This result is similar to that obtained by Newman [39]. Table 6 lists the values of λ for the ten overlapping nodes. Four of them are neutral nodes, which is 40% of all overlapping nodes and roughly 30% of all neutral nodes. The result implies that neutral books are often co-purchased with books from different categories.

The Result without Merge Strategy
GA NWA I MR without the merge strategy splits W and X into two parts each, denoted by W 1 , W 2 , X 1 and X 2 . A small community including v 48 , v 49 and v 57 has been detected by modularity maximization [25]. We also found this community and labeled it W 2 .
Moreover, we also detect an extra community X 2 . Analyzing the edge density shows that X 1 and X 2 are both denser than the merged community X. Besides, the overlapped part is even denser, as shown in Table 7. The density function is defined as follows: (8)
Table 6. λ values of overlapping nodes in Figure 10a.
Table 7. Density value of each part of community X.
The overlapping ratios of (W 1 , W 2 ) and (X 1 , X 2 ) are 57% and 53%, respectively. High overlapping ratios indicate that we could merge each pair without decreasing Q ov too much. Therefore, modularity cannot detect X 2 because of the high overlapping ratio and the dense overlapped part. This result shows that dense overlaps can be discovered correctly by GA NWA I MR .

American College Football
This is the network of American football games between Division IA colleges in 2000 [42]. It has 115 nodes, 613 edges, and 12 conferences, as shown in Table 8. Nodes are teams and edges are games between the corresponding two teams, and nodes are labeled by the conferences they belong to. We apply inc f = 1.5, α T = 0.01, m T = −1, k = 15, and γ = 1 in this simulation.
Figure 11 shows the result with Q ov = 0.607; the true labels are on the nodes. Ten communities and 17 overlapping nodes are detected. Most conferences are well matched to the detected communities, except for the conferences Independents (Label 5) and Sun Belt (Label 10). There are seven overlapping nodes in total in these two conferences. From Table 9, 41% of the overlapping nodes are in the two conferences, and they make up 58% of the nodes of the two conferences.
Figure 11. Football communities.
Table 9. λ values of overlapping nodes in Figure 11.
The conference Independents has five teams, and only one game was held among them. This is the major reason why this conference is undetectable. However, these teams often play against teams in various other conferences, and this phenomenon results in the overlapping property. For example, v 82 is assigned to four communities although it connects to community G with four edges. v 82 still connects to the other three communities with a significant number of edges, which is why it belongs to many communities simultaneously, as shown in Figure 12. Sun Belt is in a similar situation. In this example, the heavily overlapping nodes can be detected by our method.

Dolphin Network
The Dolphin Network is a common benchmark for evaluating overlapping communities. Some works use the Dolphin Network to evaluate community quality [26,43]. We compare the proposed GA NWA I MR with related results in this simulation. The Dolphin Network includes 62 nodes and 159 edges, and two communities were eventually identified through long-term observation.
The distribution of λ for the overlapping nodes is listed in Table 10, and the separation with Q = 0.535 is illustrated in Figure 13. With the refinement strategy Ignore slight overlapping nodes, we get three overlapping nodes, v 20 , v 28 , and v 44 , after decreasing λ T from 1.0 to 0.9. The overlapping nodes, identified from the distribution of λ, are marked by red dotted circles. We also use m T = −1 for the Dolphin network, the same setting as in the above simulations. The communities B, C, D, and E are merged according to the refinement strategy Merge clusters, and we eventually get two communities.
Figure 13. Five communities are detected by the proposed approach. There are three overlapping nodes when using λ T = 0.9. Therefore, the community B, C, D, and E could be merged by refinement strategy Ignore slight overlapping nodes, and we find two communities eventually.
Nicosia et al. found four communities in the Dolphin network [26]. Overlapping nodes are mentioned, but the authors did not list them. Wang and Fleury provided a detailed analysis and found two communities in the Dolphin network with Q = 0.385 [43]. Their separation is acceptable, but the network structure is not as strong as that in Figure 13. After applying the refinement strategies, the separation derived by the proposed GA NWA I MR is similar to that provided by Wang and Fleury [43], but the structure of our network is stronger. In summary, the refinement strategies are useful for revising the network separation to be closer to the real-world behavior, and the strength of the network structure is also improved.
Table 10. λ values of overlapping nodes in Figure 13.

Conclusion and Discussion
Given a network, the modularity measures the partition quality, while fuzzy clustering recognizes the overlapping communities. Combining these concepts into the fuzzy modularity is an appropriate way to formulate the structure of a network with overlapping communities. Maximizing the modularity outputs a partition with a good network structure, but computing the partition with maximum modularity requires a huge computation cost. Therefore, heuristic algorithms, which perform well in seeking high-quality solutions in a large search space, have been used in several works to find partitions with maximum modularity. However, there are some special cases that we have to deal with. We identify three common situations in the partitions derived by the GA with modularity maximization and propose three refinement strategies, namely ignoring overlapping nodes, merging clusters, and reweighting nodes, to make the network separation closer to the real-world behavior. Moreover, we modify the fitness function of the GA to consider the null model for measuring the distance between the derived partition and the random graph. The simulation results show that the proposed GA NWA I MR provides a significant improvement compared with previous approaches. The derived partition may not always have maximum modularity, but the community structure is more reasonable than the partitions derived by previous works. GA NWA I MR measures the connectivity of nodes and reweights the overlapping nodes to reflect the correct properties of the given networks. Finally, GA NWA I MR determines the partitions appropriately, whereas other approaches may mark the heavily overlapping nodes as interior nodes.
The overlapping nodes can be detected and given an appropriate allocation by GA NWA I MR . During the simulations, we found some extensions that will be addressed in the future. They are listed as follows:
1. In our simulations, we got an interesting result, shown in Figure 14, from the karate network with inc f = 2. The result consists of three communities, grouped around v 33 , v 3 and v 1 . The community with v 3 , whose nodes are marked in red, could be considered an overlapping set. This means that networks have not only overlapping nodes but also overlapping groups. Thus, applying the fuzzy concept to the communities would eliminate the group with v 3 , and the result may be closer to the real-world behavior, since the members of the group with v 3 may belong to different communities depending on the situation, e.g., competitions or events. Therefore, assigning the red nodes to any single community may be inappropriate.

2. The proposed algorithm invokes GA to compute the preliminary partitions and then applies the proposed refinement strategies as secondary processes to correct the partitions. The refinement strategies could instead be used as a local search that improves the partition quality in each iteration. However, there is a tradeoff between the computation cost and the partition quality. Once the refinement strategies are moved from external processes to internal processes of the GA, the computation cost increases. Moreover, the given networks may not always have the target properties that the refinement strategies can improve. Therefore, the refinement strategies could be designed as local search approaches, but the conditions that trigger the local search should be analyzed in the future.