Sensors
  • Article
  • Open Access

11 March 2022

A Memetic Algorithm for Solving the Robust Influence Maximization Problem on Complex Networks against Structural Failures

School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen 518107, China
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue Geo-Distributed Big Data Analytics in Sensor Networks

Abstract

Many transport systems in the real world can be modeled as networked systems. Due to limited resources, only a few nodes can be selected as seeds in the system, whose role is to spread required information or control signals as widely as possible. This problem can be modeled as the influence maximization problem. Most existing selection strategies assume an invariable network structure and do not address the case in which the network suffers structural failures. Related studies indicate that such strategies may not fully handle complicated diffusion tasks in reality, and that the robustness of the information diffusion process against perturbations is significant. To give a numerical performance criterion for seeds under structural failure, a measure has been developed to define the robust influence maximization (RIM) problem. Further, a memetic optimization algorithm (MA) that includes several problem-oriented operators to improve the search ability, termed RIMMA, has been presented to deal with the RIM problem. Experimental results on synthetic and real-world networks validate the effectiveness of RIMMA and show its superiority over existing approaches.

1. Introduction

There are many networked systems in real life, such as transportation networks and robot networks, which are indispensable parts of human work and life [1]. Automatic guided vehicles (AGVs), which belong to the category of wheeled mobile robots, play a significant role in transportation, the logistics industry, and autonomous driving [2], and can also be modeled as networked systems. Network topology information is widely studied because it directly describes the structural characteristics of systems. Some structural characteristics of networks, including random connection and the power-law degree distribution, have been discovered and summarized in previous studies [3,4]. Network topology information therefore plays a crucial role in research and analysis tasks on networked systems.
Due to the limited cost, resources cannot be allocated to all nodes in a network, but some influential nodes tend to be selected from the network as seeds to spread the influence. How to use the topological information of a specific network to select the seeds that can achieve the optimal propagation effect is defined as the influence maximization problem [5], which is of great significance in both theoretical and realistic applications. Applications of the influence maximization theory can be found in transportation networks such as the selection of cluster heads in the vehicular networks [6] and traffic bottleneck identification in the city [7].
For the influence propagation process in networked systems, several information spreading models have been extensively studied, including the independent cascade (IC) model [8], the weighted cascade (WC) model [9], and the linear threshold (LT) model [10]. Kempe et al. modeled the seed selection problem as a discrete combinatorial optimization problem and proved that it is NP-hard [8]. One evaluation approach is Monte Carlo simulation, which directly estimates the influence range of seeds. Its deficiency lies in the prohibitive computational cost: as the number of nodes grows, the required budget increases sharply, so the method cannot be applied to large-scale networks. Lee et al. [11] proposed a fast approximation method for influence spreading, whose rationality has been verified through experiments and whose advantage is its lower computational cost. On the basis of these studies, the influence maximization problem can be regarded as an optimization problem, i.e., selecting the optimal set of nodes from the network guided by reasonable performance evaluation factors.
Networked systems are exposed to uncertainty and disturbances, and damage can even destroy their functionality. For vehicular networks and robot networks, hacking smart terminals, disrupting cloud computing platform services, and cracking communication protocols are common forms of attack. According to the attack target, attacks can be divided into node-based and link-based attacks [12,13]; both categories have been demonstrated to be common and may cause serious damage. In terms of the attack type, attacks can be roughly divided into random attacks and malicious attacks. In random attacks, targets in the network are attacked with the same probability. In malicious attacks, targets are attacked in the order of their importance; for example, nodes with larger degrees tend to be removed first [14]. Generally, malicious attacks are likely to cause more pronounced structural losses than random attacks; therefore, such attack models have been intensively studied, including attacks on connectivity [15,16,17], on community [18,19], and on the diffusion behavior [13]. Several reasonable robustness performance evaluation factors have been designed [20,21]; based on these, methods that can improve the robustness have also been developed [13,20,22,23,24].
Most of the existing studies only consider the situation in which the network structure stays stable, and the selected seeds are only suitable for the current network structure [25,26,27,28]. Regarding the network-related influence maximization problem, there are some studies on how to robustly select seeds against potential uncertainties in the propagation process. These studies focus on the situation in which the influence spreading probability or spreading model is uncertain [29,30]. Yet such factors have already been implied in the seed determination process, as shown in [5,7,9,25]. Meanwhile, the network structure is closely related to its performance. Changes in the network structure always impact the interaction between network nodes and further disturb the influence propagation process. Consequently, the selected seeds are expected to resist changes in the network structure and keep a relatively robust influential range. This important feature has not been thoroughly studied in the past literature. In other words, two problems remain to be solved: how to reasonably evaluate the influential performance of seeds when attacks on the network structure happen, and how to select the optimal seed set guided by the evaluation factor. Correspondingly, the robust influence maximization (RIM) problem is defined as the task of selecting a seed set that can maintain a good influence spreading ability under potential network structural damages.
Aiming at these deficiencies of the existing studies, this paper first analyzes how to robustly solve the problem of network influence maximization when malicious attacks on networks are considered. Based on the robustness performance evaluation factors of the existing studies, a factor that evaluates the influence performance of the selected seeds under nodal attacks was designed, in which a changeable parameter controls the damage extent. An experimental analysis was also conducted to determine a rational configuration of the parameter for multiple scenarios. In this manner, the factor assesses the influence performance of seeds in a numerical form, and thus provides guidance for the optimal seed selection process. Equipped with this factor, a memetic algorithm is devised to select robust seeds under malicious node-based attacks. The proposed algorithm, RIMMA, contains several problem-directed operators and exploits genetic information from both global and local areas. Experimental results on synthetic and real network data indicate the superiority of RIMMA over existing methods. Meanwhile, tests are carried out on land transportation networks such as logistics networks and robot networks. The obtained seeds can achieve considerable influence performance when the network structure is under attack.
The rest of this paper is organized as follows. Section 2 presents the related works. Section 3 introduces the evaluation factor of robust influence performance proposed in this paper and the parametric configuration process. RIMMA is described in detail in Section 4. Section 5 presents the experimental results and analysis. Finally, Section 6 summarizes this work and presents possible future work.

3. Robust Influence Evaluation of Seeds under Network Structure Damage

In terms of the influence maximization problem, most studies ignore the impact of structural failures on the propagation ability of the selected seeds. In realistic applications, decision makers benefit from robust seeds that can deal with diversified situations. As mentioned, network attacks can be divided into node-based attacks and link-based ones. Attacks on nodes are direct and effective at collapsing networks; as indicated by some previous studies [19,26], the failure of only a few nodes is enough to cause malfunction of the whole network. Considering the significance of nodes, this work concentrates on node-based attacks to investigate the robust influence performance.

3.1. Robust Influence Performance Evaluation Method

Malicious attacks have been shown to cause greater losses to the structure and performance of networks than random attacks. Under a malicious attack, nodes are ranked according to their importance, and the degree of a node is a widely used measure of that importance. From the perspective of the attacker, the destruction operation is also limited by the available resources. If resources are sufficient, all nodes in the network can be destroyed; otherwise, only part of the nodes can be destroyed. Because of their structural importance, nodes with a higher rank are destroyed first.
In terms of the influence spreading, if a removal operation causes changes in the network structure, the maximal influence evaluation of the seeds should be re-estimated. Different network damage scenarios should be considered in the performance evaluation process; in this manner, the robustness of seeds can be guaranteed. Referring to Equation (2), the influence performance evaluation factor of seeds under node-based attacks is defined as follows:
$$R_s = \frac{1}{N \times \rho} \sum_{P=1}^{N \times \rho} \hat{\sigma}(S \mid P)$$
where N is the number of nodes in the network, ρ is the ratio of attacked nodes, and σ̂(S | P) is the estimated influence of the selected seed set S when P nodes are attacked. For a given ρ, a seed set S with a larger R_s value retains a greater influence under the current attack. Here, 1/(N × ρ) works as the normalization factor.
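The following Python sketch illustrates how the factor could be evaluated on a small graph. It is not the authors' implementation: the names `robust_influence`, `remove_nodes`, and `one_hop_estimate` are hypothetical, the attack order is assumed to be a static descending-degree ranking of the intact network, N × ρ is rounded to an integer, and the 1-hop estimator is only a stand-in for the influence estimate σ̂ used in the paper.

```python
from typing import Dict, Iterable, Set

Graph = Dict[int, Set[int]]  # undirected graph stored as adjacency sets


def remove_nodes(graph: Graph, removed: Iterable[int]) -> Graph:
    """Return a copy of the graph without the attacked nodes and their links."""
    gone = set(removed)
    return {u: nbrs - gone for u, nbrs in graph.items() if u not in gone}


def one_hop_estimate(graph: Graph, seeds: Iterable[int], p: float) -> float:
    """Stand-in estimator (assumption): seeds plus expected 1-hop activations."""
    seed_set = set(seeds)
    seed_links: Dict[int, int] = {}
    for s in seed_set:
        for v in graph[s]:
            if v not in seed_set:
                seed_links[v] = seed_links.get(v, 0) + 1
    return len(seed_set) + sum(1 - (1 - p) ** c for c in seed_links.values())


def robust_influence(graph, seeds, rho, estimate=one_hop_estimate, p=0.01):
    """Average the estimated influence over attacks removing 1..N*rho nodes."""
    budget = round(len(graph) * rho)  # assumption: N * rho rounded to an integer
    if budget == 0:
        # Assumption: rho = 0 is read as "no attack", i.e. the intact network.
        return estimate(graph, seeds, p)
    # Assumption: static ranking by degree in the intact network, ties by node id.
    order = sorted(graph, key=lambda u: (-len(graph[u]), u))
    total = 0.0
    for attacked_count in range(1, budget + 1):
        attacked = set(order[:attacked_count])
        survivors = [s for s in seeds if s not in attacked]  # removed seeds cannot spread
        total += estimate(remove_nodes(graph, attacked), survivors, p)
    return total / budget
```

On a star graph, for instance, a seed placed on the hub scores zero as soon as the hub is attacked, whereas a leaf seed survives; this is the effect the factor is meant to capture.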

3.2. Parameter Calibration in R_s

In network attacks, decision makers cannot know the degree of structural damage in advance, which makes it difficult to determine the parameter ρ. Once the parameter ρ in R_s is determined, the factor can be used as an objective function to guide the seed selection process. Meanwhile, seeds obtained under different ρ are likely to be diverse. For example, when ρ is set to a small value, the selected seeds favor the condition in which the network stays steady or only suffers slight damage. With a larger value of ρ, however, the selected seeds favor the condition in which the network is severely damaged. Therefore, it is of great significance to select an appropriate ρ value so as to obtain a robust solution for possible scenarios.
In this subsection, a rational ρ value is determined through trial and error. The experiments are conducted on two common synthetic networks, namely the scale-free (SF) network [3] and the random (ER) network [4], where the number of nodes is set as N = 100 and the average degree is set as ⟨k⟩ = 4 [36]. The selection of seeds is carried out under the guidance of the factor R_s. The size of the seed set is configured as |S| = 10, and the probability of spreading influence between nodes is set as p = 0.01. The value of ρ is divided into 11 groups, lying in the range [0,1] with a step of 0.1, i.e., ρ = 0, 0.1, 0.2, …, 1. Seeds are selected guided by R_s with each ρ value, and the performance of the seeds obtained in the 11 groups is then verified. Each seed set selected under a specific ρ is tested in 11 scenarios with different removal situations, i.e., 121 tests in total. For each scenario, the difference between the result obtained by the seeds of a target ρ and the best performance achieved in that scenario is evaluated, and these differences are summed over all scenarios. A smaller sum means that the seeds selected under the target ρ achieve a relatively stable performance across multiple scenarios, indicating that they maintain better robustness against unknown attacks. The final experimental results are presented in Figure 2.
Figure 2. Numerical analyses of seeds’ performance guided by R_s with different ρ on the SF network and the ER network. The results are averaged over 5 independent realizations.
According to the experimental results in Figure 2, in the SF network, the seed sets obtained when ρ is set to 0.2 and 0.3 achieve a relatively stable propagation ability in all tested scenarios. In the ER network, the seed set selected with ρ = 0.2 has the most stable influence propagation ability in the test and shows distinct advantages over other configurations of ρ. In conclusion, the seed set selected when ρ is 0.2 achieves a relatively stable influence propagation ability on both synthetic networks. Therefore, in the subsequent process of selecting seeds, ρ in R_s is set as 0.2 to evaluate the performance of candidates.
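The calibration rule described in this subsection (sum, over all scenarios, of the gap to the best result in each scenario, then pick the ρ with the smallest sum) can be sketched as follows. The function names and the layout of the `perf` matrix are assumptions made for illustration; `perf[i][j]` is taken to hold the performance of the seeds selected under the i-th ρ when tested in the j-th removal scenario.

```python
from typing import List, Sequence, Tuple


def summed_gaps(perf: Sequence[Sequence[float]]) -> List[float]:
    """For each candidate rho (row), sum the gap to the best result per scenario (column)."""
    n_scenarios = len(perf[0])
    best = [max(row[j] for row in perf) for j in range(n_scenarios)]
    return [sum(best[j] - row[j] for j in range(n_scenarios)) for row in perf]


def pick_rho(rhos: Sequence[float], perf: Sequence[Sequence[float]]) -> Tuple[float, List[float]]:
    """Return the rho whose seeds have the smallest summed gap, plus all gaps."""
    gaps = summed_gaps(perf)
    best_index = min(range(len(rhos)), key=gaps.__getitem__)
    return rhos[best_index], gaps
```

With 11 candidate values of ρ and 11 scenarios, `perf` would be an 11 × 11 matrix, matching the 121 tests mentioned above.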

4. RIMMA Algorithm

In this section, the operators of RIMMA are introduced in detail. The goal is to find candidates with the maximal robust influence factor R_s. In RIMMA, R_s is used as the fitness function to evaluate the quality of chromosomes. The procedure of the RIMMA algorithm is shown in Figure 3. First, an initialization operation is performed to obtain a random initial population. Then, different individuals are randomly selected from the population to execute the crossover operator, which expands the population. The mutation operator generates new individuals in the population to replace the old ones. Finally, the local search operator takes into account the local characteristics of nodes and seeks local replacement operations that can improve the fitness of the individual. After the maximum number of iterations is reached, the best individual in the population is output as the optimal solution. Each operator is described in its own subsection, and the framework of the RIMMA algorithm is summarized at the end.
Figure 3. The procedure of the RIMMA algorithm.

4.1. Initialization

In a memetic algorithm, each chromosome represents a set of seeds, which are the limited nodes selected to spread influence, and each individual contains a specific seed set. A population with Ω chromosomes represents Ω seed sets, which are labeled as S_1, S_2, …, S_Ω. The initial population is generated by combining two strategies, i.e., random selection and degree preference selection. Specifically, the random selection strategy is adopted for the first half of the population: every chromosome stochastically selects K different seeds from the N nodes in the network. The selection strategy for the other half of the population is based on the degree information of nodes. The nodes whose degrees rank in the top 2% of the network are preserved in the TOP set. The first node of the seed set of the individual is randomly selected from the TOP set, and the remaining nodes are randomly selected from other nodes in the network. Note that no duplicate seeds are allowed in a seed set; if a duplicate occurs, it is replaced by a randomly generated seed. This strategy ensures that nodes with both high and low degrees are considered. The designed initialization operation generates potential solutions that are widely distributed in the solution space, which is conducive to the subsequent optimization operations. The details of the initialization operator are summarized in Algorithm 1. The procedure of initialization is summarized in Figure 4. P^1 represents the initial population containing Ω_0 individuals. S_i^1 represents the i-th individual in the population, and similarly S_k^1 represents the k-th individual.
Algorithm 1. Initialization
Input:
   Ω_0: Initial population size;
   K: Size of seed set;
   G: Target network;
Output:
   P^1 = {S_1^1, S_2^1, …, S_Ω0^1}: Initial population;
for i = 1 to Ω_0/2 do
  for j = 1 to K do
    Randomly select a node from the N nodes in G as the j-th element of S_i^1
    while (the j-th node in S_i^1 duplicates an earlier element of S_i^1) do
      Randomly select a node from the N nodes in G to replace the j-th node
    end while
  end for
end for
for k = Ω_0/2 + 1 to Ω_0 do
  The first node of S_k^1 is randomly selected from the TOP set
  for l = 2 to K do
    Randomly select a node from the N nodes in G as the l-th element of S_k^1
    while (the l-th node in S_k^1 duplicates an earlier element of S_k^1) do
      Randomly select a node from the N nodes in G to replace the l-th node
    end while
  end for
end for
Figure 4. The procedure of the initialization algorithm.
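As an illustration of the initialization strategy, the sketch below builds half of the population at random and half with a first seed drawn from the TOP set. It is a hedged reading of Algorithm 1, not the authors' code: the function names are hypothetical, the graph is assumed to be a dictionary of adjacency sets, and sampling without replacement stands in for the "redraw on duplicate" loop.

```python
import random
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]


def top_degree_nodes(graph: Graph, fraction: float = 0.02) -> List[int]:
    """Nodes whose degree ranks in the top `fraction` of the network (at least one)."""
    size = max(1, int(len(graph) * fraction))
    ranked = sorted(graph, key=lambda u: (-len(graph[u]), u))
    return ranked[:size]


def initialize_population(graph: Graph, pop_size: int, k: int,
                          rng: Optional[random.Random] = None) -> List[List[int]]:
    """First half: k random distinct seeds. Second half: first seed from the TOP set."""
    rng = rng or random.Random()
    nodes = list(graph)
    top = top_degree_nodes(graph)
    half = pop_size // 2
    population = []
    for _ in range(half):
        # sample() draws without replacement, so no duplicate check is needed.
        population.append(rng.sample(nodes, k))
    for _ in range(pop_size - half):
        first = rng.choice(top)
        others = [u for u in nodes if u != first]
        population.append([first] + rng.sample(others, k - 1))
    return population
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when averaging results over several independent realizations.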

4.2. Crossover Operator

The purpose of the crossover operator is to exchange partial information between two chromosomes, generating new chromosomes that further enrich the current population. The crossover method adopted in this paper is the single-point crossover, applied with a probability of p_c. Assuming that S_p1 and S_p2 are two randomly selected parent chromosomes, an integer L_1 is first randomly generated in the range [1, K], and then the genetic information at L_1 in S_p1 and S_p2 is exchanged to generate two new child chromosomes, S_c1 and S_c2.
It should be noted that the randomly generated integer L_1 needs to ensure that the generated child chromosomes S_c1 and S_c2 are legitimate. In other words, the gene at L_1 in chromosome S_p1 must not already appear in chromosome S_p2, and vice versa. If the random number L_1 cannot meet this condition, a new random number is generated. The details of the crossover operator are summarized in Algorithm 2. Examples of the operator are shown in Figure 5. The procedure of the crossover operator is summarized in Figure 6. S_t is a temporary chromosome, namely the fittest among the two parent chromosomes and the two child chromosomes. S_i represents the i-th individual in the population.
Algorithm 2. Crossover
Input:
   Ω_0: Initial population size
   Ω: Total population size
   P: Current generation population
   p_c: Crossover probability;
Output:
   P_c: Population after crossover;
P_c ← P
for i = Ω_0 + 1 to Ω do
  if (r < p_c) /* r is a random number drawn uniformly from [0,1] */
    Randomly select two different chromosomes from the population P as the parent chromosomes S_p1 and S_p2
    Randomly generate an integer L_1 in the range [1, K]
    while (the gene at L_1 on one parent chromosome also appears in the other parent) do
      Randomly generate an integer L_1 again
    end while
    S_c1 ← S_p1, S_c2 ← S_p2
    Remove the node at L_1 from S_c1 and add the node at L_1 from S_p2 to S_c1
    Remove the node at L_1 from S_c2 and add the node at L_1 from S_p1 to S_c2
    Calculate the fitness of S_p1, S_p2, S_c1, S_c2, and the chromosome with the largest fitness is denoted as S_t
    S_i ← S_t
    add S_i to P_c
  else
    randomly select a chromosome S_t from P
    S_i ← S_t
    add S_i to P_c
  end if
end for
Output the expanded population P_c;
Figure 5. Examples for the crossover operator. The seed set size of the chromosome was set as 5. The red dotted box represents the crossover position and the blue dotted box represents the position of repeated nodes. In (a), the crossover position L_1 was randomly selected as 4 to generate two legitimate child chromosomes and the crossover succeeded. In (b), L_1 was selected as 2 to generate illegitimate child chromosomes and the crossover failed.
Figure 6. The procedure of crossover operator.
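The legitimacy check and the "keep the fittest of the four" step can be sketched in Python as below. This is an illustrative reading of Algorithm 2 under stated assumptions: the names `single_point_swap` and `crossover_offspring` are hypothetical, positions are 0-based, a valid position is drawn directly from the list of legitimate positions instead of redrawing L_1 in a loop, and unchanged copies of the parents are returned when no legitimate position exists (a case the text does not discuss).

```python
import random
from typing import Callable, List, Sequence, Tuple


def single_point_swap(parent1: Sequence[int], parent2: Sequence[int],
                      rng: random.Random) -> Tuple[List[int], List[int]]:
    """Swap the genes at one position so that neither child contains a duplicate."""
    valid = [i for i in range(len(parent1))
             if parent2[i] not in parent1 and parent1[i] not in parent2]
    child1, child2 = list(parent1), list(parent2)
    if not valid:
        # Assumption: with no legitimate position, the parents are returned unchanged.
        return child1, child2
    pos = rng.choice(valid)
    child1[pos], child2[pos] = parent2[pos], parent1[pos]
    return child1, child2


def crossover_offspring(parent1: Sequence[int], parent2: Sequence[int],
                        fitness: Callable[[Sequence[int]], float],
                        rng: random.Random) -> List[int]:
    """Return the fittest chromosome among the two parents and the two children."""
    child1, child2 = single_point_swap(parent1, parent2, rng)
    return list(max([parent1, parent2, child1, child2], key=fitness))
```

In RIMMA the `fitness` argument would be the factor R_s; any callable that scores a seed set can be substituted when experimenting.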

4.3. Mutation Operator

For all chromosomes in the population, the mutation operator is performed with a probability of p_m. Similar to the crossover operation, for the chromosome S to be mutated, an integer L_2 in the range [1, K] is randomly generated, and then a node is randomly selected from all N nodes to replace the gene at L_2 in the chromosome S. To ensure that the mutated chromosome S_m is legitimate, the selected replacement node must not duplicate any node already in S; otherwise, the replacement node is selected again until it meets the condition. The details of the mutation operator are given in Algorithm 3. Examples of the mutation operator are shown in Figure 7. The procedure of the mutation operator is summarized in Figure 8.
Figure 7. Examples for the mutation operator. The seed set size of the chromosome was set as 5. The red dotted box represents the mutation position and the blue dotted box represents the position of repeated nodes. In (a), the mutation position L_2 was randomly selected as 3 to generate legitimate chromosomes and the mutation succeeded. In (b), L_2 was selected as 4 to generate illegitimate chromosomes and the mutation failed.
Figure 8. The procedure of mutation operator.
Algorithm 3. Mutation
Input:
   P: Population before mutation
   p_m: Mutation probability;
Output:
   P_m: Population after mutation;
P_m ← P
for (each chromosome S_i in P_m) do
  if (r < p_m) /* r is a random number drawn uniformly from [0,1] */
   Randomly generate an integer L_2 in the range [1, K]
   Randomly select a node v_t from all N nodes
   while (v_t duplicates a node already in S_i) do
     Randomly select a node v_t from all N nodes again
   end while
   Remove the node at L_2 from S_i and add the node v_t to S_i
  end if
end for
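A minimal Python sketch of the mutation step for a single chromosome is given below. The function name `mutate` and its signature are assumptions for illustration, positions are 0-based, and a guard is added for the degenerate case in which the chromosome already contains every node, which the pseudocode does not cover.

```python
import random
from typing import List, Sequence


def mutate(chromosome: Sequence[int], nodes: Sequence[int], p_m: float,
           rng: random.Random) -> List[int]:
    """With probability p_m, replace one random gene by a node not yet in the chromosome."""
    mutated = list(chromosome)
    if len(nodes) <= len(mutated):
        return mutated  # guard (assumption): no unused node is available
    if rng.random() < p_m:
        position = rng.randrange(len(mutated))
        candidate = rng.choice(nodes)
        while candidate in mutated:  # redraw until the replacement is legitimate
            candidate = rng.choice(nodes)
        mutated[position] = candidate
    return mutated
```

The input chromosome is copied rather than modified in place, so the caller can still compare the fitness before and after mutation.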

4.4. Local Search Operator

The local search operator is an important operation that distinguishes MA from GA. Two strategies are considered in the operator. Firstly, the operator should consider the local characteristics of the node, such as its 2-hop neighborhood. Secondly, nodes with larger degrees are preferred in the early stage of the algorithm, which is aimed at raising the fitness quickly. As the number of iterations increases, the probability of such operations is reduced to avoid premature convergence.
Based on the above strategies, the local search operator is divided into two phases. The first phase is the local search in the nodal neighborhood, which is performed with a probability of p_mi. For each seed in the chromosome, its neighbors or neighbors’ neighbors are searched to find better candidates. In order to limit the computational cost, this 2-hop replacement is only performed with a small probability. The fitness of the replaced seed set is evaluated, and only the operations that reach a better performance are retained. The second phase is the global search of nodes, which is performed with a probability that decreases as the number of iterations increases. The node to be replaced in the current chromosome is chosen by roulette wheel selection based on the degree of each seed: the smaller the degree, the higher the probability of being selected. Nodes with smaller degrees may also be important nodes in the network, so every seed keeps a chance of being retained, but the operation is inclined to replace low-degree seeds first. If the performance of the seed set improves, the replacement operation is kept. The specific details of the local search operator are given in Algorithm 4. The procedure of the local search operator is summarized in Figure 9. S_Nei represents the temporary set of 1-hop and 2-hop neighbors of the node. s_l represents the replacement node in S_Nei that improves the individual performance the most. s_g represents the replacement node in the TOP set that improves the individual performance the most.
Algorithm 4. Local Search
Input:
   P: Current generation population
   p_mi: Local search probability
   p_ma: Global search probability
   gen: Current iteration
   MaxGen: Maximum iterations;
Output:
   P_l: Population after local search;
P_l ← P
for (each chromosome S_i in P_l) do
  for (each seed s in S_i) do
    if (r < p_mi) /* r is a random number drawn uniformly from [0,1] */
      S_Nei ← ∅
      for (each neighbor node s_n of s) do
        Add s_n into the set S_Nei
        for (each neighbor node s_nn of s_n) do
          if (r < p_mi)
           Add s_nn into the set S_Nei
          end if
        end for
      end for
      Try to replace s with each node in the set S_Nei. If the fitness is improved, the neighbor node with the largest fitness is recorded as s_l
      If s_l exists, remove s and add s_l into S_i
    end if
    if (r < p_ma × (MaxGen − gen) / MaxGen)
      Obtain the TOP set of nodes with a large degree accounting for 2% of the total nodes;
      Select a node s_r from S_i using roulette wheel selection
      /* The smaller the degree, the higher the probability of being selected */
      Try to replace s_r with each node in the TOP set. If the fitness is improved, the node with the largest fitness is recorded as s_g
      If s_g exists, remove s_r and add s_g into S_i
    end if
  end for
end for
Figure 9. The procedure of local search operator.
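The two phases of the local search can be sketched as below for a single chromosome. This is an interpretation under explicit assumptions, not the authors' implementation: all function names are hypothetical, the graph is a dictionary of adjacency sets, the fitness function is passed in as a callable, the decay factor is taken as (MaxGen − gen)/MaxGen, and the roulette weights are assumed to be inversely proportional to degree (the text only states that smaller degrees are more likely to be picked).

```python
import random
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Set

Graph = Dict[int, Set[int]]
Fitness = Callable[[Sequence[int]], float]


def two_hop_candidates(graph: Graph, seed: int, p_mi: float, rng: random.Random) -> Set[int]:
    """All 1-hop neighbors of the seed, plus each 2-hop neighbor with probability p_mi."""
    candidates: Set[int] = set()
    for neighbor in graph[seed]:
        candidates.add(neighbor)
        for second in graph[neighbor]:
            if rng.random() < p_mi:
                candidates.add(second)
    candidates.discard(seed)
    return candidates


def best_replacement(chromosome: List[int], position: int, candidates: Iterable[int],
                     fitness: Fitness) -> Optional[List[int]]:
    """Best strictly improving replacement at `position`, or None if nothing improves."""
    best, best_fit = None, fitness(chromosome)
    for node in candidates:
        if node in chromosome:
            continue  # keep the seed set free of duplicates
        trial = list(chromosome)
        trial[position] = node
        trial_fit = fitness(trial)
        if trial_fit > best_fit:
            best, best_fit = trial, trial_fit
    return best


def local_search(chromosome: Sequence[int], graph: Graph, fitness: Fitness, p_mi: float,
                 p_ma: float, gen: int, max_gen: int, top: Sequence[int],
                 rng: random.Random) -> List[int]:
    current = list(chromosome)
    for position in range(len(current)):
        if rng.random() < p_mi:  # phase 1: neighborhood search
            candidates = two_hop_candidates(graph, current[position], p_mi, rng)
            improved = best_replacement(current, position, candidates, fitness)
            if improved is not None:
                current = improved
        if rng.random() < p_ma * (max_gen - gen) / max_gen:  # phase 2: decaying global search
            # Assumption: roulette weights inversely proportional to (degree + 1).
            weights = [1.0 / (len(graph[s]) + 1) for s in current]
            victim = rng.choices(range(len(current)), weights=weights, k=1)[0]
            improved = best_replacement(current, victim, top, fitness)
            if improved is not None:
                current = improved
    return current
```

Because only strictly improving replacements are accepted, the fitness of the returned chromosome is never lower than that of the input.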

4.5. RIMMA Framework

In RIMMA, the initialization operator is performed first to obtain the initial population. In each generation of RIMMA, the crossover operator is performed to enrich the population; then the mutation operator is performed. The local search operator follows, raising the fitness level of the whole population. At the end of each iteration, the fitness of each chromosome in the population is evaluated, and the best individual is preserved into the next population, while the other individuals are selected from the current population by roulette wheel selection according to their fitness: the higher the fitness, the greater the probability of being selected. The above process is repeated until the number of iterations reaches the pre-defined threshold; the overall best candidate is then the final output. The overall framework of RIMMA is summarized in Algorithm 5.
Algorithm 5. RIMMA
Input:
   G: Target network
   Ω_0: Initial population size
   Ω: Total population size
   K: Size of seed set
   p_c: Crossover probability
   p_m: Mutation probability
   p_mi: Local search probability
   p_ma: Global search probability
   MaxGen: Maximum iterations;
Output:
   S* = {s_1, s_2, …, s_K}: Optimal seed set;
P^1 = {S_1^1, S_2^1, …, S_Ω0^1} ← Initialization(G, Ω_0, K)
for g = 1 to MaxGen do
   P_t ← ∅
  Repeat
    Randomly select two different chromosomes from the population P^g as the parent chromosomes S_pi and S_pj
    (S_ci, S_cj) ← Crossover(S_pi, S_pj, p_c)
    P_t ← P_t ∪ {S_ci, S_cj}
  Until (all chromosomes in P^g have been selected)
  for (each chromosome S in P_t and P^g) do
    P_m^g ← Mutation(S, p_m)
  end for
  for (each chromosome S_m in P_m^g) do
    P_l^g ← Local_Search(S_m, p_mi, p_ma)
  end for
   P^(g+1) ← Selection_Operator(P_l^g);
end for
Output the current best individual;
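The selection step at the end of each generation (keep the best individual, fill the remaining slots by fitness-proportional roulette wheel selection) might look like the following sketch. The function name and signature are hypothetical, fitness values are assumed to be non-negative as R_s is, and a uniform draw is used as a fallback when all fitness values are zero.

```python
import random
from typing import Callable, List, Sequence


def select_next_generation(population: Sequence[Sequence[int]],
                           fitness: Callable[[Sequence[int]], float],
                           size: int, rng: random.Random) -> List[List[int]]:
    """Elitism plus roulette wheel: the best individual always survives."""
    scores = [fitness(individual) for individual in population]
    best_index = max(range(len(population)), key=scores.__getitem__)
    next_population = [list(population[best_index])]
    # Assumption: fall back to uniform weights when every fitness value is zero.
    weights = scores if sum(scores) > 0 else [1.0] * len(population)
    picks = rng.choices(list(population), weights=weights, k=size - 1)
    next_population.extend(list(individual) for individual in picks)
    return next_population
```

Keeping the elite individual guarantees that the best fitness found so far never decreases from one generation to the next.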

5. Experiments

In order to verify the performance of the designed RIMMA, experiments on three types of synthetic networks are conducted first, including scale-free (SF) networks [3], random (ER) networks [4], and small-world (SW) networks [37], followed by experiments on realistic networks. RIMMA is compared with several existing seed selection algorithms, namely a simplified memetic algorithm (MA-sim), a genetic algorithm (GA), a simulated annealing algorithm (SAA), and a degree-based algorithm (DBA). The factor R_s in Equation (2) is used to evaluate the performance of seeds selected by the algorithms. Further, experiments are conducted on several land transportation networks to validate the effectiveness of the developed algorithm.
To compare the Monte Carlo simulation method and the 2-hop fast approximation method, we also conduct experiments on a small-scale network. The two methods are used to evaluate the influence of the seeds obtained by the four algorithms. The numerical values and computation time of the two evaluation methods are shown in Table 1. σ(S) represents the influence of seeds evaluated using the Monte Carlo simulation method, and σ̂(S) represents the influence of seeds evaluated using the 2-hop fast approximation method. There is almost no difference between the values given by the two evaluation methods, but the computation time of the Monte Carlo simulation method is more than ten times that of the 2-hop fast approximation method. This shows that the Monte Carlo simulation method is not suitable as the fitness function of an evolutionary algorithm and cannot handle the evaluation task on large-scale networks.
Table 1. Differences between the two influence evaluation methods.
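For reference, a Monte Carlo estimate of influence under the independent cascade (IC) model can be sketched as follows; the cost of repeating the simulation many times is what makes this evaluation slow. The function names, the uniform spreading probability p, and the default number of runs are assumptions for illustration and are not taken from the paper.

```python
import random
from typing import Dict, Iterable, Optional, Set

Graph = Dict[int, Set[int]]


def simulate_ic(graph: Graph, seeds: Iterable[int], p: float, rng: random.Random) -> int:
    """One IC cascade: each newly activated node gets a single chance per inactive neighbor."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in graph[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active)


def monte_carlo_influence(graph: Graph, seeds: Iterable[int], p: float,
                          runs: int = 1000, rng: Optional[random.Random] = None) -> float:
    """Average number of activated nodes over `runs` independent cascades."""
    rng = rng or random.Random()
    seed_list = list(seeds)
    return sum(simulate_ic(graph, seed_list, p, rng) for _ in range(runs)) / runs
```

Each fitness evaluation costs `runs` full cascades, so using this estimator inside an evolutionary loop multiplies the cost by the population size and the number of generations.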
The parameters of RIMMA are set as follows. The maximum number of iterations MaxGen is 150, and the population size Ω is 50. We conducted a simple experiment to determine the parameters p_c, p_m, p_mi, and p_ma: on the same network, only the parameter under test is changed while the other parameters remain unchanged, and the performance of the algorithm is recorded. The final experimental results are shown in Figure 10. According to these results, when the parameter p_c is set to 0.6, the performance of the algorithm is better than when p_c is set to the other four values, and the same is true for p_m and p_mi. Additionally, when p_ma is set to 0.4, the performance of the algorithm is better than with the other four values. Therefore, we set p_c, p_m, and p_mi to 0.6, and p_ma to 0.4. In order to ensure comparability, GA also uses similar parameters. The parameters of MA-sim are consistent with those of RIMMA; the only difference is that the local search of the node neighborhood is omitted from the local search operator, and only the global search of the node is retained.
Figure 10. Performance of the algorithm with different parameters.

5.1. Experiments on the Synthetic Networks

SF networks, ER networks, and SW networks with different scales are generated to compare the proposed RIMMA with other existing algorithms in this experiment. The experiments are conducted on the three types of synthetic networks with N = 100, 300, 500, and 1000 nodes, where the average degree of the network is set to 4. Results of each algorithm on a specific network are averaged over 20 independent realizations. As mentioned above, the parameter ρ of the factor R_s is set to 0.2. The R_s values of the seed sets obtained by each algorithm are given in Table 2.
Table 2. R_s performance comparison of seeds selected by different algorithms on three synthetic networks with different scales.
It can be seen that in three artificial synthesis networks with different scales, the seed set selected by all evolutionary algorithms including RIM MA , MA-sim, and GA achieve a higher R s value than the seed set obtained by other algorithms. This phenomenon indicates that these seed sets have a better ability to spread influence under uncertain deliberate attacks. The seed sets selected by the degree-based algorithm have the worst performance of influence propagation. It also shows that only relying on the information of the original network such as the degree is inadequate, and the obtained seed set often cannot cope with the network structure damage when the network is attacked. Specifically, the degree-based algorithm preferentially selects the nodes with a higher degree of network, while the malicious attack also preferentially selects these nodes. When the number of attacked nodes is large, all the seeds have been attacked and cannot spread influence. When the number of attacked nodes is small, only a few seeds in the network can still spread influence. Therefore, the seed set selected by the degree-based algorithm scores a low R s , and such seeds may not tackle the robust influence maximization task.
Comparing the three evolutionary algorithms, RIM MA tends to achieve better results, and seeds selected by RIM MA obtain the best influence propagation performance when the network is attacked. MA-sim and GA are inferior. It is worth mentioning that on the SW network with 100 nodes, the influence propagation performance of the seed set selected by MA-sim is slightly better than that of the seed set selected by RIM MA, which may be caused by the small scale of the network and the unique neighboring connected structure of SW networks.
The convergence process of the three evolutionary algorithms is further analyzed. Figure 11 shows the convergence curves of the three algorithms on SF, ER, and SW networks with different scales. For both large-scale and small-scale SF and ER networks, the initial seed set selected by RIM MA performs significantly better than the sets selected by the other two evolutionary algorithms, owing to the diversified structural information added in the initialization, and RIM MA maintains its superiority over the whole evolution process. On SW networks, the advantage of RIM MA is less marked than on the other networks, but the algorithm is still effective at selecting powerful seeds. In general, the experimental results indicate that RIM MA can be applied well to common synthetic networks of different scales and that it generalizes well.
The computation time of each algorithm is shown in Figure 12. Compared with the other three algorithms, RIM MA is computationally expensive. However, the experimental results suggest that this extra cost is reasonable, since RIM MA provides a more competitive solution for decision makers.
Figure 11. The evolution process of three evolutionary algorithms on (a) large-scale SF network, (b) small-scale SF network, (c) large-scale ER network, (d) small-scale ER network, (e) large-scale SW network, (f) small-scale SW network.
Figure 12. Computation time of the four algorithms on the SF network with 100 nodes.

5.2. Experiments on the Realistic Land Transportation Networks

To further verify the performance of the algorithm, two real-world systems are considered in this section, and the above five algorithms are used to select seed sets for comparison. The first is a logistics transport network in a certain area of Berlin, denoted as G B [38]. This network consists of 224 nodes and 376 edges, where each node represents a freight station and each edge represents a viable transportation route between freight stations.
The second is a larger-scale robot network based on the existing robots at Sun Yat-sen University, shown in Figure 13. According to the spatial distribution of the robots, two types are considered: random distribution and cluster distribution. The random robot network is denoted as G R 1 and the cluster robot network as G R 2; both consist of 200 nodes. In these networks, each node represents a robot, and each edge represents a communication link between robots. Unlike in the other networks, communication between nodes in a robot network depends closely on the distance between them. In other words, owing to the limits of the physical equipment, two robots can communicate with each other only when the distance between them is within a threshold range.
Figure 13. Wheeled mobile robots (TurtleBot2) at Sun Yat-sen University.
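The distance rule for robot communication can be illustrated with a small sketch that links two robots whenever their Euclidean distance does not exceed a threshold. How the robot positions were actually generated is not stated here, so the two layout helpers below (uniform positions in the unit square, and Gaussian clusters around given centres) are assumptions, as are all function and parameter names.

```python
import math
import random


def threshold_graph(positions, radius):
    """Link every pair of robots whose distance is at most `radius`."""
    adj = {i: set() for i in range(len(positions))}
    for i, p in enumerate(positions):
        for j in range(i + 1, len(positions)):
            if math.dist(p, positions[j]) <= radius:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def random_positions(n, rng):
    """Assumed 'random distribution': uniform points in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def clustered_positions(n, centers, spread, rng):
    """Assumed 'cluster distribution': Gaussian scatter around centres."""
    points = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        points.append((cx + rng.gauss(0, spread), cy + rng.gauss(0, spread)))
    return points
```

For instance, `threshold_graph([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)], 1.5)` connects only the first two robots, because the third is more than 1.5 away from both. With 200 positions, the same function yields a network of the size used for G R 1 and G R 2, although the threshold and layouts used in the experiments may differ.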
Figure 14 shows the experimental results of the tested algorithms on the three realistic networks, and the detailed performance of each algorithm is given in Table 3. RIM MA is again superior to the other four algorithms, and the seeds it selects achieve the widest propagation when the network is under attack. Note that GA performs better than MA-sim on G R 1 and G R 2. This may be due to the particular connection pattern of robot networks, which limits the effectiveness of the global search operator in MA-sim. The local search operator in RIM MA can effectively find seeds with considerable propagation performance. Figure 15 shows the topologies of the G R 1 and G R 2 robot networks, in which the blue diamonds represent the selected seeds and from which the structural characteristics of seeds with robust influence ability can be seen. First, the degrees of the seeds are relatively even. If two seeds are closely connected, their influence may overlap, which wastes transmission resources. Given the limited number of seeds, inactive nodes in other areas may then go unreached. Second, the proportion of seeds with a large degree is small. Because a malicious attack removes the hubs of the network first, hub seeds cannot complete the spreading task, and a seed set that relies on them tends to obtain an inferior R s value.
Figure 14. The R s performance comparison of five algorithms on three networks.
Table 3. R s performance comparison of seeds selected by different algorithms on three realistic networks.
Figure 15. The topologies of robot networks and seeds selected by RIM MA .
Experiments on the three realistic networks further demonstrate the effectiveness of RIM MA and show that the proposed algorithm can offer countermeasures to decision makers facing realistic problems. For the Berlin logistics network, robust influential seeds can serve as an alternative solution to improve the overall transportation efficiency when attacks happen. For the robot networks, the seeds selected from the network are generally the key robots. These robots are crucial for completing communication and information interaction tasks between robots under structural attacks and other emergencies. In summary, the RIM MA designed in this paper can effectively solve the problem of robust influence maximization, both on common synthetic networks of different scales and on actual networks. In addition, the results show the significance of determining seeds with robust influence ability in real-world systems.

6. Conclusions

In this paper, based on the existing research on network influence maximization, the concept of robustness was introduced, and the problem of robust influence maximization was defined. Considering the challenges and problems in land transportation networks, the selection strategy for critical nodes under structural damage was studied. Both the information diffusion process and structural perturbations were considered, and the ultimate goal was to find seeds with robust influence ability against structural damage. Firstly, the IC model was adopted to simulate the diffusion process, with attention concentrated on the 2-hop influence range of seeds. Referring to the existing literature, an evaluation factor was designed to numerically evaluate the influence propagation performance of seeds under attacks. Then, RIM MA was designed to solve the robust influence maximization problem. The algorithm considers the optimal information from both neighboring and global areas and aims to find the seed set with the maximal robust influence in the network. Finally, experimental results on synthetic networks and realistic networks revealed that the performance of RIM MA is competitive compared with existing algorithms and that it yields valuable candidates for decision makers. The results provide references for solving problems such as knowledge mining, system control, and emergency management of several networks, which contribute to the development and application of land transportation systems.
Several difficult problems remain for future study. First of all, the propagation model employed in this paper is the IC model. More propagation models, such as the WC model and the LT model, are to be studied. Then, many parameters in the experiments are configured as fixed values, including the damage ratio ρ in the measure R s. Subsequent studies on the RIM problem with uncertain parameters are desired. Finally, how to solve the problem of robust influence maximization in complicated systems such as the multiplex network [39] and the interdependent network [40] is also worthy of further investigation.
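For readers unfamiliar with the IC model mentioned above, the following is a minimal Monte Carlo sketch of independent cascade spreading on an undirected network, together with a helper that removes failed nodes before the spread is evaluated. It is a generic illustration, not the evaluation used in this paper: it simulates the full cascade rather than the 2-hop influence range, it does not compute the R s measure, and the names `ic_spread`, `expected_spread`, `remove_nodes`, and the uniform activation probability `p` are assumptions.

```python
import random


def ic_spread(adj, seeds, p, rng):
    """One independent cascade run; returns the number of active nodes.

    Each newly activated node gets a single chance to activate each of
    its inactive neighbours, succeeding with probability p.
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def expected_spread(adj, seeds, p, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    return sum(ic_spread(adj, seeds, p, rng) for _ in range(runs)) / runs


def remove_nodes(adj, removed):
    """Copy of the network with the failed nodes and their edges deleted."""
    removed = set(removed)
    return {u: {v for v in nb if v not in removed}
            for u, nb in adj.items() if u not in removed}
```

On the path 0-1-2-3 with p = 1, a seed at node 0 activates all four nodes, while after `remove_nodes` deletes node 1 the same seed reaches only itself. Comparing the spread before and after such removals is the kind of comparison a robustness measure summarizes.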

Author Contributions

Conceptualization, D.H.; methodology, Z.F.; software, D.H. and N.C.; validation, X.T.; investigation, D.H.; writing—original draft preparation, D.H.; writing—review and editing, X.T.; supervision, X.T.; funding acquisition, X.T.; formal analysis, Z.F.; project administration, Z.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Key-Area Research and Development Program of Guangdong Province under Grant 2020B090921003.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest regarding the publication of this manuscript.

References

  1. Newman, M.E.J. Networks: An Introduction; Oxford University Press: New York, NY, USA, 2010.
  2. Wang, J.; Luo, H.; Tan, X. Path Planning for Automatic Guided Vehicles (AGVs) Fusing MH-RRT with Improved TEB. Actuators 2021, 10, 314.
  3. Barabási, A.-L.; Albert, R. Emergence of scaling in random networks. Science 1999, 286, 509–512.
  4. Erdős, P.; Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 1960, 5, 17–61.
  5. Gong, M.; Yan, J.; Shen, B.; Ma, L.; Cai, Q. Influence maximization in social networks based on discrete particle swarm optimization. Inf. Sci. 2016, 367, 600–614.
  6. Wang, C.; Ma, X.; Jiang, W.; Zhao, L.; Lin, N.; Shi, J. IMCR: Influence Maximisation-Based Cluster Routing Algorithm for SDVN. In Proceedings of the 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), Zhangjiajie, China, 10–12 August 2019; pp. 2580–2586.
  7. Zhao, B.; Xu, C.; Liu, S.; Zhao, J.; Li, L. A Congestion Diffusion Model with Influence Maximization for Traffic Bottlenecks Identification in Metrocity Scales. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 1717–1722.
  8. Kempe, D.; Kleinberg, J.; Tardos, É. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 24–27 August 2003; pp. 137–146.
  9. Chen, W.; Wang, Y.; Yang, S. Efficient influence maximization in social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 28 June–1 July 2009; pp. 199–208.
  10. Rahimkhani, K.; Aleahmad, A.; Rahgozar, M.; Moeini, A. A fast algorithm for finding most influential people based on the linear threshold model. Expert Syst. Appl. 2015, 42, 1353–1361.
  11. Lee, J.-R.; Chung, C.-W. A fast approximation for influence maximization in large social networks. In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Korea, 7–11 April 2014; pp. 1157–1162.
  12. Zhou, M.; Liu, J. A two-phase multiobjective evolutionary algorithm for enhancing the robustness of scale-free networks against multiple malicious attacks. IEEE Trans. Cybern. 2017, 47, 539–552.
  13. Wang, S.; Liu, J. Constructing robust community structure against edge-based attacks. IEEE Syst. J. 2019, 13, 582–592.
  14. Holme, P.; Kim, B.J.; Yoon, C.N.; Han, S.K. Attack vulnerability of complex networks. Phys. Rev. E 2002, 65, 056109.
  15. Zhang, J.; Wang, S.; Wang, X. Comparison analysis on vulnerability of metro networks based on complex network. Phys. A 2018, 496, 72–78.
  16. Yin, R.; Yuan, H.; Zhu, H.; Song, X. Model and analyze the cascading failure of scale-free network considering the selective forwarding attack. IEEE Access 2021, 9, 49025–49035.
  17. Tefek, U.; Tandon, A.; Lim, T.J. Malicious relay detection using sentinels: A stochastic geometry framework. J. Commun. Netw. 2020, 22, 303–315.
  18. Ma, L.; Gong, M.; Cai, Q.; Jiao, L. Enhancing community integrity of networks against multilevel targeted attacks. Phys. Rev. E 2013, 88, 022810.
  19. Luptáková, D.; Pospíchal, J. Community cut-off attack on malicious networks. In Proceedings of the Conference on Creativity in Intelligent Technologies and Data Science, Volgograd, Russia, 12–14 September 2017; pp. 697–708.
  20. Schneider, C.M.; Moreira, A.A.; Andrade, J.S.; Havlin, S.; Herrmann, H.J. Mitigation of malicious attacks on networks. Proc. Natl. Acad. Sci. USA 2011, 108, 3838–3841.
  21. Buldyrev, S.; Parshani, R.; Paul, G.; Stanley, H.; Havlin, S. Catastrophic cascade of failures in interdependent networks. Nature 2010, 464, 1025–1028.
  22. Zhou, M.; Liu, J. A memetic algorithm for enhancing the robustness of scale-free networks against malicious attacks. Phys. A 2014, 410, 131–143.
  23. Wang, S.; Liu, J. A multi-objective evolutionary algorithm for promoting the emergence of cooperation and controllable robustness on directed networks. IEEE Trans. Netw. Sci. Eng. 2018, 5, 92–100.
  24. Wang, J.; Li, J.; Shi, Y.; Lai, J.; Tan, X. AM3Net: Adaptive mutual-learning-based multimodal data fusion network. IEEE Trans. Circuits Syst. Video Technol. 2022. Early Access.
  25. Gong, M.; Song, C.; Duan, C.; Ma, L.; Shen, B. An efficient memetic algorithm for influence maximization in social networks. IEEE Comput. Intell. Mag. 2016, 11, 22–33.
  26. Saito, K.; Kimura, M.; Ohara, K.; Motoda, H. Super mediator—A new centrality measure of node importance for information diffusion over social network. Inf. Sci. 2016, 329, 985–1000.
  27. Li, D.; Wang, C.; Zhang, S.; Zhou, G.; Chu, D.; Wu, C. Positive influence maximization in signed social networks based on simulated annealing. Neurocomputing 2017, 260, 69–78.
  28. Zhang, K.; Du, H.; Feldman, M.W. Maximizing influence in a social network: Improved results using a genetic algorithm. Phys. A 2017, 478, 20–30.
  29. Chen, W.; Lin, T.; Tan, Z.; Zhao, M.; Zhou, X. Robust influence maximization. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 795–804.
  30. He, X.; Kempe, D. Stability and robustness in influence maximization. ACM Trans. Knowl. Disc. Data (TKDD) 2018, 12, 1–34.
  31. Zeng, A.; Liu, W. Enhancing network robustness against malicious attacks. Phys. Rev. E 2012, 85, 066130.
  32. Kim, J.; Sarkar, S.; Venkatesh, S.S.; Ryerson, M.S.; Starobinski, D. An epidemiological diffusion framework for vehicular messaging in general transportation networks. Transp. Res. Part B Methodol. 2020, 131, 160–190.
  33. Basu, P.; Redi, J. Movement control algorithms for realization of fault-tolerant ad hoc robot networks. IEEE Netw. 2004, 18, 36–44.
  34. Fazlollahtabar, H.; Saidi-Mehrabad, M. Methodologies to optimize automated guided vehicle scheduling and routing problems: A review study. J. Intell. Robot. Syst. 2015, 77, 525–545.
  35. Zhang, B.; Tang, L.; Decastro, J.; Roemer, M.J.; Goebel, K. A recursive receding horizon planning for unmanned vehicles. IEEE Trans. Ind. Electron. 2015, 62, 2912–2920.
  36. Wang, S.; Liu, J. Community robustness and its enhancement in interdependent networks. Appl. Soft Comput. 2019, 77, 665–677.
  37. Watts, D.J.; Strogatz, S.H. Collective dynamics of ‘small-world’ networks. Nature 1998, 393, 440–442.
  38. Farid, A.M. Symmetrica: Test case for transportation electrification research. Infrastruct. Complex. 2015, 2, 1–10.
  39. Wang, S.; Liu, J. Robustness of single and interdependent scale-free interaction networks with various parameters. Phys. A 2016, 460, 139–151.
  40. Wang, N.; Jin, Z.; Zhao, J. Cascading failures of overload behaviors on interdependent networks. Phys. A 2021, 574, 125989.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
