Article

Network Dismantling on Signed Network by Evolutionary Deep Reinforcement Learning

School of Statistics and Data Science, Nankai University, Tianjin 300074, China
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(24), 8026; https://doi.org/10.3390/s24248026
Submission received: 8 November 2024 / Revised: 8 December 2024 / Accepted: 14 December 2024 / Published: 16 December 2024
(This article belongs to the Special Issue Communications and Networking Based on Artificial Intelligence)

Abstract

Network dismantling is an important problem that has attracted much attention from many different research areas, including the disruption of criminal organizations and the maintenance of stability in sensor networks. However, almost all current algorithms focus on unsigned networks, and few studies explore the problem of signed network dismantling due to its complexity and the lack of data. Importantly, there is no effective quality function to assess the performance of signed network dismantling, which seriously restricts its deeper applications. To address these issues, in this paper we design a new objective function and further propose an effective algorithm named DSEDR, which searches for the best dismantling strategy based on evolutionary deep reinforcement learning. In particular, since evolutionary computation can solve global optimization and deep reinforcement learning can speed up the network computation, we integrate the two for efficient signed network dismantling. To verify the performance of DSEDR, we apply it to a series of representative artificial and real network data sets and compare its efficiency with some popular baseline methods. The experimental results show that DSEDR outperforms all other methods in both efficiency and interpretability.

1. Introduction

Although extensive research has been conducted in the fields of sensors and communication, ensuring the security of sensor networks remains an unresolved issue. This challenge is closely related to the technology of network dismantling. Network dismantling aims to determine a set of nodes (or edges) whose removal from the network, under specific constraints and various dismantling objective functions, minimizes the overall network performance. It has gained significant prominence in the field of network science owing to its broad applications across various domains [1,2]. For example, by finding the most efficient set of nodes to dismantle a network, we can solve a number of problems, such as dismantling criminal organizations [3,4], reducing the spread of rumors [5], controlling viruses [6], and maintaining the stability of sensor networks [7]. It is worth noting that network dismantling is a combinatorial optimization problem that has been proven to be NP-hard [8,9,10].
To find the optimal dismantling strategy, researchers have proposed various methods, described in detail in Section 2. While these methods have demonstrated efficiency in rapidly dismantling networks, they are primarily designed for unsigned networks. In fact, interactions between individuals in the real world may carry specific meanings [11,12]. For example, users may be friends or enemies in a social network, which necessitates the use of a signed network to represent the multi-type relationships between users [13]. In order to reflect the complexity of real-world networks accurately, the signed network dismantling problem has become an important and challenging area of research. Nevertheless, few studies focus on signed network dismantling, and the main challenge is the lack of an efficient objective function, and a corresponding algorithm, that considers both sign and topological information.
Deep reinforcement learning [14] is a powerful technique that searches for the optimal solution quickly through its reward mechanism. However, it is prone to becoming trapped in local optima due to the local search of gradient descent [15]. In this paper, we propose a new framework for Dismantling on Signed networks based on Evolutionary Deep Reinforcement learning (DSEDR) to achieve a fast search for globally optimal solutions by combining the global search capability of evolutionary computation with deep reinforcement learning. Specifically, we first propose a new objective function that considers both the connectivity of the network and the proportion of cooperative relationships (i.e., positive edges) in the signed network. By considering sign information and network topology simultaneously, it transforms the signed network dismantling problem into a multi-objective optimization problem. To minimize this objective function, inspired by evolutionary computation and deep reinforcement learning methods [15], DSEDR integrates the advantages of both to search for an optimal set of dismantled nodes and thus achieve the best dismantling result on a signed network. To obtain the network embedding, DSEDR initially employs an effective encoder to capture and learn the positive and negative topological characteristics. Then, it employs a Deep Q-Network (DQN) as a decoder to generate a dismantling strategy through a Markov decision process. To obtain the optimal weight parameters of the DQN, the algorithm optimizes it by combining the global search capability of evolutionary computation with the fast local search capability of deep reinforcement learning. Finally, to verify the efficiency of DSEDR, we apply it to multiple types of artificial and real network data sets. We compare its performance with twelve popular baseline methods, and the experimental results demonstrate that DSEDR achieves superior performance for signed network dismantling among all the algorithms.
The subsequent sections are organized as follows: Section 2 presents an overview of related works; Section 3 outlines the problem of dismantling on signed networks and then establishes the objective function; Section 4 and Section 5 introduce the main framework and the detailed algorithm procedures of DSEDR; Section 6 demonstrates the experimental results; and finally, Section 7 concludes the whole work and outlines future directions.

2. Related Works

2.1. Network Dismantling Methods

Recently, network dismantling has received extensive attention from various disciplines, such as operations research, network science, and computer science. Many scientists have focused on this important research area and proposed algorithms to explore this problem. Osat et al. [16] introduced a network dismantling method based on network embedding, using geometric space representations to address the network dismantling problem. Wandelt et al. [17] developed a solution for the rapid identification of critical nodes across various network types, employing an iterative process that converts stochastic fault tracing into targeted attacks for effective dismantling. Additionally, Braunstein et al. [18] proposed the Min-Sum algorithm, a three-stage method that closely links the network dismantling problem to the decycling problem. Yan et al. [19] presented the HITTER framework, which transforms hypernetwork dismantling into a sequential decision-making problem based on Deep Reinforcement Learning (DRL). Furthermore, Fan et al. [20] introduced FINDER, a Deep Reinforcement Learning-based approach designed to train intelligent agents applicable to a wide range of realistic networks. Overall, network dismantling is still a relatively new problem and has not been fully studied. Moreover, few studies focus on signed network dismantling, and the main challenge is the lack of efficient objective functions and algorithms that consider both sign information and network topology.

2.2. Evolutionary Deep Reinforcement Learning Algorithms

Reinforcement learning [21] is a learning mechanism that learns how to map from states to policies in order to maximize the reward obtained. The primary objective is to maximize cumulative rewards to identify a near-optimal solution for a given problem. Deep reinforcement learning algorithms integrate the perceptual capabilities of deep learning with the decision-making ability of reinforcement learning, optimizing neural network weights via backpropagation to conduct effective searches in the action space of optimization problems. However, traditional deep reinforcement learning approaches predominantly rely on local search methods based on gradient descent, which can lead to local optima [22]. To enhance the optimization performance of these algorithms, Dai et al. [23] proposed the S2V-DQN algorithm, which integrates graph neural networks with deep Q-learning. This approach allows agents to better comprehend the structural properties of graphs by embedding information about nodes and their neighbors into state representations.
Deep evolutionary algorithms are optimization methods that combine evolutionary algorithms with deep learning to address complex non-convex optimization problems. They explore and optimize solutions by applying evolutionary mechanisms to the policy search process [24]. From a policy search perspective, deep evolutionary algorithms can be classified into two main categories. The first category employs evolutionary algorithms to replace traditional policy gradient-based optimization methods (e.g., gradient descent), facilitating more efficient policy updates; it includes popular approaches such as OpenAI-ES [25] and DRL-GA [26]. The second category treats policies as evolvable individuals, merging the population search characteristics of evolutionary algorithms with the gradient optimization capabilities of reinforcement learning to optimize populations of policies, as in EPG [23] and CERL [27].

3. Problem Formulation

3.1. Network Connectivity

Let $G = (V, E)$ represent a given network, where $V = \{V_1, V_2, \ldots, V_N\}$ is the set of nodes and $E = \{e_1, e_2, \ldots, e_L\} \subseteq V \times V$ is the set of edges [28,29]. Here, $N = |V|$ denotes the total number of nodes, while $L = |E|$ denotes the total number of edges. To facilitate comprehension, the key symbols used in this paper are standardized and summarized in Table 1. There are multiple types of objective functions for traditional complex network dismantling problems, including the number of connected components, the size of the giant connected component, the number of connected node pairs, etc. [30]. In this paper, we use the size of the giant connected component after network dismantling as part of the dismantling objective owing to its favorable properties [18]. It is important to note that this metric can be substituted with other conventional dismantling objectives, depending on the problem being addressed.
Consider removing a subset of $K$ nodes, represented by the node sequence $\hat{V} = \{V_1, V_2, \ldots, V_K\}$, from the network. The size of the giant connected component (GCC), denoted as $\sigma$, and the GCC size ratio, denoted as $R_\sigma$, are defined as follows:

$$R_\sigma(\hat{V}) = \frac{\sigma(G \setminus \hat{V})}{\sigma(G)}, \tag{1}$$

where

$$\sigma(G) = \max\{\delta_i \mid C_i \subseteq G\}. \tag{2}$$

Here, $C_i$ represents the $i$-th connected component and $\delta_i$ denotes the size of $C_i$ on the current graph $G$. The term $\sigma(G \setminus \hat{V})$ refers to the GCC size of the residual graph after sequentially removing the nodes in $\hat{V} = \{V_1, V_2, \ldots, V_K\}$ from $G$, while $\sigma(G)$ represents the initial GCC size of $G$ before any node removals.
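To make these definitions concrete, the following minimal Python sketch computes $\sigma$ and $R_\sigma$; it assumes an undirected networkx graph, and the function names are illustrative rather than taken from the paper.

```python
import networkx as nx

def gcc_size(G: nx.Graph) -> int:
    """sigma(G) in Eq. (2): size of the giant connected component."""
    if G.number_of_nodes() == 0:
        return 0
    return max(len(c) for c in nx.connected_components(G))

def r_sigma(G: nx.Graph, removed: set) -> float:
    """R_sigma(V_hat) in Eq. (1): GCC size ratio after removing `removed`."""
    residual = G.copy()
    residual.remove_nodes_from(removed)
    return gcc_size(residual) / gcc_size(G)
```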

3.2. Signed Networks

Signed networks, a class of networked systems in which edges are labeled with “positive” or “negative” symbolic attributes, are an effective tool for modeling social networks with differing sentiments. The “positive” and “negative” symbols represent two opposing emotional attitudes. In particular, positive edges are typically employed to characterize positive relationships, such as friendship, support, trust, and liking, and are indicated with a positive symbol, “+”. Negative edges are commonly used to represent negative relationships, such as enmity, opposition, distrust, and dislike, and are marked with a negative symbol, “−” [31].
The aforementioned meanings of the positive and negative symbols can be applied to the signed network dismantling problem. In order to minimize the harmful capacity of a malignant network, the proportion of cooperative/friendly relationships, i.e., positive edges, should be driven as low as possible. Accordingly, the second optimization objective, denoted as $R_{pe}$, is defined as the percentage of positive edges in the signed network after the removal of the nodes $\hat{V}$. This metric is calculated as follows:

$$R_{pe}(\hat{V}) = \frac{k^+(G \setminus \hat{V})}{k(G \setminus \hat{V})}, \tag{3}$$

where $k^+(G \setminus \hat{V})$ is the number of positive edges of the residual graph after sequentially removing the nodes in $\hat{V} = \{V_1, V_2, \ldots, V_K\}$ from $G$, and $k(G \setminus \hat{V})$ is the total number of edges of the residual graph.
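A matching sketch for $R_{pe}$ is given below; it assumes each edge carries a numeric 'sign' attribute in $\{+1, -1\}$ (an assumed encoding, since the paper does not prescribe a data format), and returning 0 for an edgeless residual graph is a convention of ours.

```python
import networkx as nx

def r_pe(G: nx.Graph, removed: set) -> float:
    """R_pe(V_hat) in Eq. (3): share of positive edges in the residual graph."""
    residual = G.copy()
    residual.remove_nodes_from(removed)
    k_total = residual.number_of_edges()   # k(G \ V_hat)
    if k_total == 0:
        return 0.0                         # our convention for an edgeless residual
    k_pos = sum(1 for _, _, s in residual.edges(data="sign") if s > 0)
    return k_pos / k_total                 # k+(G \ V_hat) / k(G \ V_hat)
```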

3.3. The Objective Function

With the positive edge share $R_{pe}$ and the connectivity metric $R_\sigma$, the objective function for signed network dismantling (denoted as $\Phi$) is proposed as follows:

$$\Phi(\hat{V}) = R_\sigma(\hat{V}) + \lambda \cdot R_{pe}(\hat{V}), \tag{4}$$

where $R_\sigma$ is the GCC size ratio in Equation (1), $R_{pe}$ is the positive edge share in Equation (3), and $\hat{V}$ represents the set of nodes to be dismantled. The parameter $\lambda$ controls the importance of the positive edge share in the signed network dismantling problem: in general, the larger the value of $\lambda$, the greater the emphasis on reducing the positive edge share in the dismantled network. The value of $\lambda$ depends on the specific problem at hand and the graph to be dismantled. By minimizing the objective function above, it is possible to achieve a significant reduction in the connectivity of the network while also taking into account the distinctive positive-negative edge relationships of the signed network, thereby reducing the cooperation ability between its nodes. As a consequence, the objective function facilitates measuring the degree of dismantling of a signed network.
The definition of the objective function above casts the signed network dismantling problem as a multi-objective optimization problem. Our goal is to identify the optimal dismantling node set, denoted by $\hat{V}_{op}$, that minimizes the objective function $\Phi$ given the number of dismantling nodes $K$. The problem can be represented as follows:

$$\hat{V}_{op} = \operatorname*{argmin}_{\hat{V} \subseteq V} \Phi(\hat{V}). \tag{5}$$
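Combining the two metrics gives the objective of Equations (4) and (5). The sketch below, reusing r_sigma and r_pe from the previous sketches, also shows a brute-force search over all node subsets; this is only feasible on toy graphs, which is precisely why the learned search of DSEDR is needed.

```python
from itertools import combinations

def phi(G, removed, lam=1.0):
    """Phi(V_hat) = R_sigma + lambda * R_pe, Eq. (4); lam plays the role of lambda."""
    return r_sigma(G, removed) + lam * r_pe(G, removed)

def brute_force_dismantle(G, K, lam=1.0):
    """Exhaustive minimizer of Eq. (5): only feasible for tiny N and K,
    since it enumerates all C(N, K) candidate node sets."""
    return min((set(c) for c in combinations(G.nodes, K)),
               key=lambda s: phi(G, s, lam))
```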
In the next sections, we will propose a framework that provides a new strategy to remove the target set of nodes for optimizing the objective function, with the detailed process depicted in Figure 1.

4. Deep Reinforcement Learning

4.1. Network Embedding

In this paper, we reduce the network dimensionality by extracting low-dimensional features of the topological network through neural networks. This work uses the Deep Network Embedding for Graph Representation Learning in Signed Networks (DNESBP) algorithm [32] to perform dimensionality reduction on signed networks, obtaining a $d$-dimensional feature representation matrix $H = [h_{ij}]_{N \times d}$, $h_{ij} \in [-1, 1]$. The Deep Q-Network (DQN) can then take $H$ as input to fully learn the positive and negative topological characteristics and structural balance properties of the signed network.
The DNESBP algorithm first uses a neural network with $\ell$ layers to construct an autoencoder that reduces the dimensionality of the signed network adjacency matrix $A$ and obtains the representation vectors, as follows:

$$H^{(i)} = f\!\left(X^{(i)} W^{(i)} + B^{(i)}\right), \quad i = 1, \ldots, \ell, \tag{6}$$

where $X^{(0)} = A$ and $H^{(i)}$, $i \geq 1$, is the vector obtained by the $i$-th dimensionality reduction, with $X^{(i)} = H^{(i-1)}$. $H = [h_{ij}]_{N \times d}$ denotes the feature representation matrix obtained by the final dimensionality reduction. The activation function $f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ is used to generate both positive and negative feature values. $W^{(i)}$ is the weight matrix of the $i$-th encoder layer, and $B^{(i)}$ is the corresponding bias vector.
Next, the DNESBP algorithm reconstructs $H$ through a decoder, also an $\ell$-layer neural network, and obtains the reconstruction matrix $\hat{A}$ as follows:

$$\hat{X}^{(i)} = f\!\left(\hat{H}^{(i)} \hat{W}^{(i)} + \hat{B}^{(i)}\right), \quad i = 1, \ldots, \ell, \tag{7}$$

where $\hat{H}^{(1)} = H^{(\ell)}$ and $\hat{X}^{(\ell)} = \hat{A}$, $\hat{W}^{(i)}$ is the weight matrix of the $i$-th decoder layer, and $\hat{B}^{(i)}$ is the corresponding bias vector. The DNESBP algorithm further supervises the dimensionality reduction process by computing the reconstruction loss. For details of the loss function and the underlying principle, please refer to [32].
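As a rough illustration of Equations (6) and (7), the following PyTorch sketch builds a tanh autoencoder over the signed adjacency matrix; it is a simplification of our own, omitting the structural-balance terms of the full DNESBP loss, for which we refer to [32].

```python
import torch
import torch.nn as nn

class SignedAutoencoder(nn.Module):
    """Two-layer tanh encoder/decoder in the spirit of Eqs. (6) and (7).
    Tanh keeps the embedding entries in [-1, 1], matching H's range;
    the balance-preserving loss terms of full DNESBP [32] are omitted."""

    def __init__(self, n_nodes: int, d: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_nodes, hidden), nn.Tanh(),  # Eq. (6), first layer
            nn.Linear(hidden, d), nn.Tanh(),        # Eq. (6), final layer -> H
        )
        self.decoder = nn.Sequential(
            nn.Linear(d, hidden), nn.Tanh(),        # Eq. (7), first layer
            nn.Linear(hidden, n_nodes), nn.Tanh(),  # Eq. (7), final layer -> A_hat
        )

    def forward(self, A: torch.Tensor):
        H = self.encoder(A)      # N x d embedding
        A_hat = self.decoder(H)  # reconstruction of the signed adjacency matrix
        return H, A_hat

# One supervision step on a signed adjacency A with entries in {-1, 0, +1}:
#   H, A_hat = model(A);  loss = nn.functional.mse_loss(A_hat, A)
```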

4.2. Deep Q-Network

In our framework, the Deep Q-Network (DQN) [33] is implemented as a two-layer neural network specifically designed to approximate the optimal policy in reinforcement learning. The algorithm employs the dimensionality-reduced representation H as the input to DQN, which subsequently outputs a vector Q = [ q i ] . Here, each q i corresponds to the Q-value associated with a specific node, and the node with the highest Q-value is selected for removal in the subsequent step.
It is important to note that removing a node alters the original network's topology. Consequently, to reflect the changes in network dynamics resulting from node selection, it is necessary to update the input vectors of the DQN. Here, we propose a new approach to updating these inputs, as follows. At each seed node selection, $S = \{s_1, s_2, \ldots, s_N\}$ denotes the current seed selection state, where $s_i = 1$ and $s_i = 0$ indicate that node $V_i$ has and has not been selected as a seed node, respectively. $D = \{d_1, d_2, \ldots, d_N\}$ holds the degree of each node in the current network, while $P = \{p_1, p_2, \ldots, p_N\}$ holds the positive out-degree of each node. Upon the selection of node $V_i$ at time $t-1$, the corresponding embedding row $H[i]$, degree $d_i$, and positive out-degree $p_i$ are set to zero, while $s_i$ is set to 1. The remaining elements of $D$ and $P$ are updated by recomputing them on the residual network after removing node $V_i$. Then $H$, $S$, $D$, and $P$ are concatenated to form the new input vector. The detailed process can be seen in Figure 2.
The DQN takes a $(d+3)$-dimensional feature vector per node as input and computes the Q-values through its network structure as follows:

$$Q = \mathrm{DQN}(W_1, W_2, [H_t, S_t, D_t, P_t]) = \mathrm{ReLU}([H_t, S_t, D_t, P_t] \cdot W_1) \cdot W_2. \tag{8}$$

Here, $W_1$ represents the continuous weight matrix of the first layer of the neural network, while $W_2$ denotes the continuous weight vector of the second layer. The ReLU function is employed as the activation function to zero out negative values.
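Equation (8) corresponds to a small bias-free network, sketched below in PyTorch; shapes and names are illustrative.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Eq. (8): Q = ReLU([H, S, D, P] . W1) . W2, with no bias terms.
    Each of the N nodes contributes a (d + 3)-dimensional feature row."""

    def __init__(self, d: int, l: int):
        super().__init__()
        self.W1 = nn.Linear(d + 3, l, bias=False)  # first-layer weights W1
        self.W2 = nn.Linear(l, 1, bias=False)      # second-layer weights W2

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X: (N, d + 3) node features -> Q: (N,), one Q-value per node
        return self.W2(torch.relu(self.W1(X))).squeeze(-1)
```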

4.3. Markov Seed Selection

For the removed seed nodes, we employ a Markov decision process to optimize the iteration and avoid selecting duplicate seed nodes [34]. Specifically, at time $t-1$, we first obtain the seed node sequence $S_{t-1}$. Then we update $H$, $S$, $D$, and $P$ through the method described in the DQN module. At time $t$, $[H_t, S_t, D_t, P_t]$ is input into the DQN; meanwhile, we employ a greedy strategy to select the node selection action corresponding to the maximum Q-value, denoted as $a_t$ (Equations (9)-(11)):

$$Q(S_t, a) = \mathrm{DQN}(W_1, W_2, [H_t, S_t, D_t, P_t]), \tag{9}$$

and

$$Q(S_t, a_t) = \max_a Q([H_t, S_t, D_t, P_t], a), \tag{10}$$

where

$$a_t = \operatorname*{argmax}_a Q(S_t, a). \tag{11}$$

Here, $a$ ranges over the set of all possible actions when a removed seed node is selected, while $a_t$ denotes the specific action chosen at time $t$.
At time $t$, if node $V_i$ has the highest Q-value, the Markov decision process selects it as a seed node, sets $s_i = 1$, and updates the seed node sequence $S_t$ as well as $H_t$, $D_t$, and $P_t$. The current seed node sequence $S_t$ is then converted to the seed node set $V_t$ through the following function $O(S)$, and the objective function $\Phi$ of the current signed network is calculated as follows:

$$\Phi(V_t) = \Phi(O(S_t)), \quad \text{s.t.} \quad O(S_t) = \{V_i \in V : s_i = 1\}. \tag{12}$$

Next, the Markov decision process calculates the decision reward $r_t$, defined as the difference in the objective function $\Phi$ before and after taking the action $a_t$:

$$r_t = \Phi(V_t) - \Phi(V_{t+1}). \tag{13}$$

By using the DQN module and the Markov decision module, DSEDR transforms the discrete combinatorial optimization problem of selecting $K$ seed nodes in a signed network into the continuous parameter optimization problem of the DQN:

$$\min \Phi(S) = \Phi(W_1, W_2), \quad \text{s.t.} \quad W_1, W_2 \in \mathrm{DQN}. \tag{14}$$
The MSS algorithm is summarized in Algorithm 1. In the next section, we propose a new evolutionary deep reinforcement learning algorithm, DSEDR, to resolve this parameter optimization problem of the DQN.
Algorithm 1 MSS (Markov Seed Selection) Algorithm
Input: Parameters $(W_1, W_2)$, size of seed set $K$, and target network $G$.
Output: Seed set $\hat{V}$, $\Phi(\hat{V})$, and a set of Markov decision processes $\{(S_t, a_t, r_t, S_{t+1})\}_K$.
1: Initialization: set $t \leftarrow 1$ and $S_t \leftarrow [\,]$;
2: Compute the degree $d_i$ and positive out-degree $p_i$ of each node $V_i$ in $G$, then set $D_t \leftarrow [d_i]_N$ and $P_t \leftarrow [p_i]_N$;
3: Set $\mathrm{DQN}(W_1, W_2)$;
4: while $t \leq K$ do
5:   Compute $Q(S_t, a)$ based on Equation (9);
6:   Compute $a_t$ based on Equation (11), which selects the node with the highest Q-value, and obtain $S_{t+1}$;
7:   Update $H_t$, $D_t$, and $P_t$ according to $a_t$ to obtain $H_{t+1}$, $D_{t+1}$, and $P_{t+1}$;
8:   Compute $r_t = \Phi(V_t) - \Phi(V_{t+1})$;
9:   $t \leftarrow t + 1$;
10: end while
11: $\mathcal{B} \leftarrow \{(S_t, a_t, r_t, S_{t+1})\}_K$, $\hat{V} \leftarrow V_K$;
12: return $\hat{V}$, $\Phi(\hat{V})$;
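A compact Python sketch of this rollout is given below. It reuses the DQN and objective sketches from the previous sections; the transition records are abbreviated to (action, reward) pairs, and all names are illustrative.

```python
import numpy as np
import torch

def mss(dqn, G, H, K, phi):
    """Sketch of Algorithm 1: greedy seed selection driven by a DQN.
    `dqn` maps an (N, d+3) feature matrix to N Q-values, and `phi`
    evaluates Eq. (4) on a seed set, as in the earlier sketches."""
    nodes = list(G.nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    H = np.array(H, dtype=float)
    S = np.zeros(len(nodes))                               # selection state s_i
    D = np.array([G.degree(v) for v in nodes], float)      # degrees d_i
    P = np.array([sum(1 for _, _, s in G.edges(v, data="sign") if s > 0)
                  for v in nodes], float)                  # positive degrees p_i
    seeds, transitions, prev_phi = [], [], phi(G, set())
    for t in range(K):
        X = torch.tensor(np.column_stack([H, S, D, P]), dtype=torch.float32)
        q = dqn(X).detach().numpy()
        q[S == 1] = -np.inf                                # never reselect a seed
        a = int(q.argmax())                                # Eq. (11)
        S[a], H[a], D[a], P[a] = 1, 0.0, 0.0, 0.0          # zero out the chosen node
        seeds.append(nodes[a])
        residual = G.subgraph([u for u in nodes if S[idx[u]] == 0])
        for u in residual.nodes:                           # recompute D and P
            D[idx[u]] = residual.degree(u)
            P[idx[u]] = sum(1 for _, _, s in residual.edges(u, data="sign") if s > 0)
        cur_phi = phi(G, set(seeds))
        transitions.append((a, prev_phi - cur_phi))        # reward r_t, Eq. (13)
        prev_phi = cur_phi
    return seeds, prev_phi, transitions
```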

5. The DSEDR Algorithm

In this section, we introduce the detailed procedures of the DSEDR algorithm designed to search for the optimal solution for the signed network dismantling problem in Equation (5). We aim to integrate the advantages of evolutionary computation and deep reinforcement learning to facilitate the optimization process.

5.1. Evolution of DQN Populations

5.1.1. Solution Representation and Evaluation

In DSEDR, all individuals in a population $P = \{P_i\}_{n_p}$ evolve at the same time, where each solution $P_i$ represents a DQN. It can be represented as follows:

$$P_i = (W_{i1}, W_{i2}), \quad W_{i1} \in [-1, 1]^{(d+3) \times l}, \quad W_{i2} \in [-1, 1]^{l \times 1}, \tag{15}$$

where the weight parameters $W_{i1}$ and $W_{i2}$ correspond to the first and second layers of the DQN, respectively. The numbers of neurons in the first and second layers are $d+3$ and $l$, respectively.
For each solution, DSEDR combines the network embedding, DQN, and Markov seed selection modules described above with the objective in Equation (5) to generate two outputs. One output is the set of seed nodes (denoted by $\hat{V}_i$) together with $\Phi_i$, which are used in the evolutionary step to evaluate the fitness of the solution $P_i$. The other output is a set of Markov decision processes, i.e., sequences of states, actions, rewards, and subsequent states, which are used to accelerate the DQN optimization in the deep reinforcement learning step.

5.1.2. Initialization Operations

DSEDR initially randomizes the weight sequences of the DQN population to obtain the initial solution set. This random initialization ensures the diversity of the initial DQN population. The specific procedure is outlined in Algorithm 2: a value is randomly generated in the range $[-1, 1]$ for each weight parameter of the DQN, serving as its initial value.
Algorithm 2 Initialization Algorithm
Input: Initial population size $n_I$.
Output: Initial solutions $P$.
1: Set $P = \{P_i(0)\}_{n_I}$, $P_i(0) = (W_{i1}, W_{i2})$, $i = 1, 2, \ldots, n_I$;
2: for $i = 1, 2, \ldots, n_I$ do
3:   for each element in $W_{i1}, W_{i2}$ do
4:     Randomly generate a value in the range $[-1, 1]$;
5:   end for
6: end for
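A minimal sketch of this initialization, with the weights stored as numpy arrays, might look as follows.

```python
import numpy as np

def init_population(n_I: int, d: int, l: int, rng=None):
    """Algorithm 2: each solution is a pair (W1, W2) of uniform
    random weights in [-1, 1], matching Eq. (15)."""
    rng = rng or np.random.default_rng()
    return [(rng.uniform(-1, 1, size=(d + 3, l)),   # W_i1
             rng.uniform(-1, 1, size=(l, 1)))       # W_i2
            for _ in range(n_I)]
```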

5.1.3. Evolutionary Operations

Following the initial acquisition of the DQN population, the DSEDR algorithm initiates an iterative process of evolution, including crossover, mutation, and selection operations on DQN. These are conducted with the objective function Φ as the fitness value. These operations are illustrated in the subsequent description.
The crossover operation acts on the parent solutions $P(q)$. It first orders the population by fitness value and then generates new offspring using two operations: solution pairing and single-point crossover. Solution pairing divides the parent solutions $P(q)$ into a set with “good genes”, $\{P_i^b(q)\}$, and a set with “average genes”, $\{P_j^w(q)\}$. For each pair of solutions $(P_i^b(q), P_j^w(q))$, single-point crossover first generates a random bit value of 0 or 1 for each gene position and then exchanges the gene values at positions with bit value 1 between the two solutions, generating the offspring $\{P_i(q)_C\}$ and $\{P_j(q)_C\}$. We denote all the offspring generated by the crossover operation as follows:

$$\{P(q)_C\} = \{P_i(q)_C\} \cup \{P_j(q)_C\}. \tag{16}$$
For the mutation operation, a random mutation is applied through the following function $N$ to each gene $y_i$ in the population to be mutated, which enhances the diversity of the population's gene sequences and yields the mutated progeny $\{P(q)_M\}$:

$$y_i = N(-1, 1),$$

where $N$ is a normal random function used to generate values from $-1$ to $1$, obeying the probability density function $\frac{1}{\sqrt{2\pi}} e^{-y^2/2}$.
The selection operation is based on a greedy strategy, whereby the $n_p$ solutions with the best fitness values, i.e., the lowest $\Phi(V)$, are selected from $P(q)_C \cup P(q)_M$ as the population $P(q)_{EA}$. The population $P(q)_{EA}$, which comprises the “superior genes”, is then carried forward as the population to be evolved in the next generation. Refer to Algorithm 3 for details of the evolution procedure.
Algorithm 3 Evolution Algorithm
Input: Population size $n_p$, parent solutions $P(q)$, crossover probability $p_{crossover}$, and mutation probability $p_{mutation}$.
Output: Offspring solutions $P(q)_{EA}$.
1: Randomly divide $P(q)$ into two populations: $P(q)_C$ and $P(q)_M$;
2: $\hat{V}, \Phi(\hat{V}) \leftarrow \mathrm{MSS}(P(q)_C, K, H)$;
3: Sort the solutions in $P(q)_C$ in ascending order of $\Phi(\hat{V})$;
4: Form pairs $(P_i^b(q), P_j^w(q))$ from the first 50% of the solutions $\{P_i^b(q)\}$ and the last 50% of the solutions $\{P_j^w(q)\}$;
5: for each pair of solutions $(P_i^b(q), P_j^w(q))$ do
6:   if a randomly generated value $p_1 \leq p_{crossover}$ then
7:     Randomly assign each pair of genes $(x_i, x_j)$ in $P_i^b(q)$ and $P_j^w(q)$ a value of 0 or 1, constituting the sequence $M$;
8:     for each $(x_i, x_j)$ in $(P_i^b(q), P_j^w(q))$ do
9:       if $(x_i, x_j)$ corresponds to a value of 1 in the sequence $M$ then
10:        Exchange the values of $x_i$ and $x_j$ to obtain new $P_i(q)_C$ and $P_j(q)_C$;
11:      end if
12:    end for
13:  end if
14: end for
15: Set $\{P(q)_C\} = \{P_i(q)_C\} \cup \{P_j(q)_C\}$;
16: for each solution $P_i(q)_M$ in $P(q)_M$ do
17:   for each weight parameter $y$ in $P_i(q)_M$ do
18:     if a randomly generated value $p_2 \leq p_{mutation}$ then
19:       Update $y$ to a value drawn from $N(-1, 1)$, obtaining $P_i(q)_M$;
20:     end if
21:   end for
22: end for
23: $\hat{V}, \Phi(\hat{V}) \leftarrow \mathrm{MSS}(P(q)_C \cup P(q)_M, K, H)$;
24: Select the $n_p$ solutions from $P(q)_C \cup P(q)_M$ with the lowest $\Phi(\hat{V})$ as $P(q)_{EA}$;
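The crossover and mutation steps can be sketched on flattened weight vectors as follows; clipping the normal samples to $[-1, 1]$ is one way of realizing the bounded sampler $N(-1, 1)$ described above, and is our assumption.

```python
import numpy as np

def crossover(good, avg, p_crossover, rng):
    """Gene exchange from Algorithm 3 on flattened weight vectors:
    positions where the random bit sequence M equals 1 are swapped."""
    child_a, child_b = good.copy(), avg.copy()
    if rng.random() <= p_crossover:
        M = rng.integers(0, 2, size=good.shape).astype(bool)
        child_a[M], child_b[M] = avg[M], good[M]
    return child_a, child_b

def mutate(solution, p_mutation, rng):
    """Per-gene mutation: affected genes are resampled from a standard
    normal clipped to [-1, 1] (our reading of the N(-1, 1) sampler)."""
    mutated = solution.copy()
    mask = rng.random(size=solution.shape) <= p_mutation
    mutated[mask] = np.clip(rng.standard_normal(mask.sum()), -1.0, 1.0)
    return mutated
```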

5.2. Reinforcement Learning Operation

In large-scale networks, the large number of weight parameters in the DQN makes it difficult to find an optimal solution within a limited number of iterations, resulting in slow convergence. To address this problem, DSEDR introduces an n-step Q-learning technique based on deep reinforcement learning concepts to accelerate DQN training and evolution.
DSEDR stores the historical transition data, i.e., the four-tuples $\{(S_t, a_t, r_t, S_{t+1})\}_{n_g}$, in the cache pool $\mathcal{B}$ throughout the iterative seed node selection process conducted according to the Markov decision process. After $n_g$ iterations of training, the DSEDR algorithm samples transitions in batches from the cache pool based on the $n_b$-step Q-learning technique. These data are then used to compute the loss function $L(W_1, W_2)$ of the DQN with weight parameters $(W_1, W_2)$, which evaluates the expected difference between the target Q-values and the predicted Q-values. The loss function is computed as follows:
$$L(W_1, W_2) = \mathbb{E}_{C \sim \mathcal{B}}\left[\left(\underbrace{\sum_{i=0}^{n_b - 1} r_{t+i} + \gamma \max_{a^*} Q(S_{t+n_b}, a^*; W_1, W_2)}_{\text{target Q-value}} - \underbrace{Q(S_t, a_t; W_1, W_2)}_{\text{predicted Q-value}}\right)^2\right], \tag{17}$$

where $\gamma$ denotes the hyperparameter determining the importance of the rewards $r_{t+i}$, and $\mathbb{E}$ denotes the expectation. $\mathcal{B}$ denotes the cache pool used to store the transition data, and $C = \{(S_t, a_t, r_t, S_{t+1})\}_{n_b}$ denotes a batch of size $n_b$ taken from the cache pool $\mathcal{B}$ for updating the DQN. Based on the loss function $L(W_1, W_2)$, the DSEDR algorithm further computes its gradient $\Delta(W_1, W_2)$ as follows:

$$\Delta(W_1, W_2) = \alpha \left(\sum_{i=0}^{n_b - 1} r_{t+i} + \gamma \max_{a^*} Q(S_{t+n_b}, a^*; W_1, W_2) - Q(S_t, a_t; W_1, W_2)\right) \cdot \nabla_{(W_1, W_2)} Q(S_t, a_t; W_1, W_2), \tag{18}$$

where $\alpha$ denotes the learning rate for the backward update of the DQN, and $\nabla_{(W_1, W_2)}$ denotes the gradient with respect to $(W_1, W_2)$.
DSEDR then adopts the following stochastic gradient update to adjust the network weights $(W_1, W_2)$ of the DQN:

$$(W_1, W_2) \leftarrow (W_1, W_2) + \Delta(W_1, W_2).$$
Following the aforementioned operation, the convergence of the network weights of DQN can be accelerated towards superior weights. Refer to Algorithm 4 for details of the DRL procedure.
Algorithm 4 DRL (Deep Reinforcement Learning) Algorithm
Input: Cache pool $\mathcal{B}$, batch size $n_b$, population $P(q)_{EA}$.
Output: Population $P(q)_{DRL}$.
1: Randomly retrieve $C = \{(S_t, a_t, r_t, S_{t+1})\}_{n_b}$ of size $n_b$ from the cache pool $\mathcal{B}$;
2: Calculate $L(W_1, W_2)$ by Equation (17);
3: Calculate the gradient $\Delta(W_1, W_2)$ of $L(W_1, W_2)$ by Equation (18);
4: Based on $\Delta(W_1, W_2)$, update the weights $(W_1, W_2)$ corresponding to the most favorable population $P(q)_{EA}$, thus obtaining a new population $P(q)_{DRL}$;
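A sketch of the n-step temporal-difference loss of Equation (17) for a single sampled slice is shown below; the batch layout and field names are assumptions of ours.

```python
import torch

def n_step_td_loss(dqn, batch, gamma: float):
    """Eq. (17) for one sampled slice of n_b consecutive transitions.
    `batch` = (X_t, a_t, R_n, X_tn): start-state features, chosen action,
    summed n-step rewards, and features of the state n_b steps later."""
    X_t, a_t, R_n, X_tn = batch
    q_pred = dqn(X_t)[a_t]                        # predicted Q(S_t, a_t)
    with torch.no_grad():                         # target term is not back-propagated
        q_target = R_n + gamma * dqn(X_tn).max()  # n-step return plus bootstrap
    return (q_target - q_pred) ** 2
```

Calling `loss.backward()` on this quantity followed by an SGD step realizes the weight update described above.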

5.3. DSEDR Algorithm

DSEDR combines evolutionary computation with deep reinforcement learning to enhance the evolution of DQNs. The evolutionary computation concurrently evolves a population of individuals, where each individual embodies a DQN, and ultimately derives a solution to the signed network dismantling problem through the previously outlined seed selection strategy. In contrast, deep reinforcement learning leverages the combined information and network-specific insights of the DQNs to speed up their evolutionary process.
More specifically, in each iteration the DSEDR algorithm employs a two-step approach to optimize the DQN weight parameters. First, it searches globally through the crossover and mutation operations of the genetic algorithm. Second, it searches locally through the stochastic gradient update of reinforcement learning. Additionally, it introduces the n-step Q-learning technique, which uses the transitions stored in the cache pool for backpropagation to update the DQN weights, accelerating the training convergence and evolution of the DQNs. Furthermore, the DQNs are evaluated and screened through the Markov decision process and the objective function $\Phi$ in Equation (4). The part of the population with the best fitness values is selected greedily, while the other part is selected at random for the next iteration. This combination ensures both the evolution of the population and its stochastic diversity.

5.4. Complexity Analysis

As previously stated, DSEDR (Algorithm 5) includes four different sub-algorithms: MSS (Algorithm 1), Initialization (Algorithm 2), Evolution (Algorithm 3), and DRL (Algorithm 4). The computational complexity of each sub-algorithm will be calculated separately and subsequently combined to derive the overall algorithmic complexity of DSEDR.
  • MSS: The time complexity of MSS can be computed as $O(K \times (d \times l + \bar{K} \times \bar{K}))$, where $K$ is the number of nodes to be removed, $\bar{K}$ is the average connectivity of nodes in the target network, and $d$ and $l$ denote the numbers of neurons in the first and second layers of the DQN, respectively.
  • Initialization: The time complexity of Initialization can be computed as O ( n I × d × l ) , where n I is the initial population size.
  • EA: The time complexity of EA can be computed as O ( n I × K × ( d × l + K ¯ × K ¯ ) ) .
  • DRL: The time complexity of DRL can be computed as O ( n b × ( d × l + K ¯ × K ¯ ) ) , where n b is the batch size in the DRL Algorithm.
Algorithm 5 DSEDR Algorithm
Input: Initial population size $n_I$, maximum number of iterations $n_g$, population size $n_p$, parent solutions $P(q)$, crossover probability $p_{crossover}$, and mutation probability $p_{mutation}$.
1: $\{P_i(0)\}_{n_I} \leftarrow$ Initialization (Algorithm 2) $(n_I)$;
2: for $q = 0$ to $n_g - 1$ do
3:   $P(q)_{EA} \leftarrow$ Evolution (Algorithm 3) $(n_p, P(q), p_{crossover}, p_{mutation})$;
4:   $P(q)_{DRL} \leftarrow$ DRL (Algorithm 4) $(P(q)_{EA})$;
5:   $V, \Phi(V) \leftarrow$ MSS (Algorithm 1) $(P(q) \cup P(q)_{EA} \cup P(q)_{DRL}, K, H)$;
6:   Select the population of size $n_p/2$ with the lowest $\Phi(V)$ from $P(q) \cup P(q)_{EA} \cup P(q)_{DRL}$ as $P(q)_{Opt}$;
7:   Randomly select a population of size $n_p/2$ from $(P(q) \cup P(q)_{EA} \cup P(q)_{DRL}) \setminus P(q)_{Opt}$ as $P(q)_{Random}$;
8:   $P(q+1) \leftarrow P(q)_{Random} \cup P(q)_{Opt}$;
9: end for
By summing up the complexities of the four sub-algorithms over a total of $n_g$ iterations, the total time complexity of DSEDR can be computed as follows:

$$O\!\left(n_g \times (n_I \times K + n_b) \times (d \times l + \bar{K} \times \bar{K})\right).$$

It is worth noting that in sparse networks, since the number of links is nearly linear in the number of nodes, we can further reduce this complexity to $O(n_g \times (n_I \times K + n_b) \times (d \times l + \theta))$, where $\theta$ is a constant.
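Putting the pieces together, the following high-level sketch mirrors the main loop of Algorithm 5; evolve, drl_update, and mss_phi are hypothetical wrappers around Algorithms 3, 4, and 1, respectively, and only the selection logic at the end of each iteration is spelled out.

```python
import numpy as np

def dsedr(G, H, K, n_I, n_p, n_g, p_cross, p_mut, seed=0):
    """Main loop of Algorithm 5, composing the sketches above.
    evolve(), drl_update(), and mss_phi() are hypothetical helpers."""
    rng = np.random.default_rng(seed)
    population = init_population(n_I, d=H.shape[1], l=32, rng=rng)
    cache = []                                                # transition pool B
    for q in range(n_g):
        offspring = evolve(population, p_cross, p_mut, rng)   # Algorithm 3
        refined = drl_update(offspring, cache)                # Algorithm 4
        candidates = population + offspring + refined
        scored = sorted(candidates,                           # Algorithm 1 rollouts
                        key=lambda w: mss_phi(w, G, H, K, cache))
        elite = scored[:n_p // 2]                             # greedy half, lowest Phi
        pool = scored[n_p // 2:]
        picks = rng.choice(len(pool), size=n_p // 2, replace=False)
        population = elite + [pool[i] for i in picks]         # random half
    return min(population, key=lambda w: mss_phi(w, G, H, K, cache))
```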

6. Experiments

In this section, we compare the performance of DSEDR with those of various baseline methods on multiple-type network datasets to verify its efficiency. Specifically, we first provide a brief description of the experimental settings and then show the experiment results on artificial networks and a real-world network, respectively. All experiments were conducted on a server running Linux, equipped with a 16-core AMD EPYC 9354 CPU, 60.1 GB of memory, an NVIDIA RTX 4090 GPU with 25.2 GB of graphics memory, and a storage disk with 751.6 GB capacity.

6.1. Experimental Settings

6.1.1. Baseline Algorithm

Because few methods have been proposed for signed network dismantling, the baselines are drawn primarily from classical centrality methods and well-established algorithms for complex network dismantling. In this paper, we select 10 classical centrality metrics and 2 conventional algorithms, MinSum [18] and FINDER [20], as baselines. The centrality metrics include ones that ignore edge signs, such as Degree [35], Betweenness [36], K-shell [37], and Closeness [38], as well as ones that take edge signs into account, such as positive degree (P-DEG), negative degree (N-DEG), net degree (Net-DEG), ratio degree (Ratio-DEG), Prestige [39], and PageRank [40]. A brief description of these baselines is provided below.
  • Degree: The degree of a node, i.e., the number of neighboring nodes directly connected to the node [35].
  • Betweenness: Betweenness centrality (BC) reflects how often a node appears on the shortest paths between pairs of other nodes. The BC of node $i$ is defined as follows:

    $$BC(i) = \sum_{s \neq i,\, t \neq i,\, s \neq t} \frac{g_{st}^{i}}{g_{st}},$$

    where $g_{st}$ is the number of shortest paths from node $V_s$ to $V_t$, and $g_{st}^{i}$ is the number of those shortest paths that pass through $V_i$ [36].
  • K-shell: K-shell centrality categorizes nodes based on their degrees to assess their importance in a network. Assuming there are no isolated nodes in the network, we eliminate nodes with one connection until no such nodes remain and assign them to the 1-shell. Similarly, we recursively eliminate nodes with degree 2 to form the 2-shell. This process ends when all nodes have been assigned to one of the shells [37].
  • Closeness: Closeness centrality reflects the distance between a node and all other nodes in the network, measuring the average shortest path length from the node to all others. A higher closeness value indicates a more central position within the network. It is computed as follows:

    $$d_i = \frac{N - 1}{\sum_{j \neq i} d_{ij}},$$

    where $d_{ij}$ is the length of the shortest path between node $V_i$ and node $V_j$ [38].
  • Positive degree (P-DEG): The number of positive edges connected to the $i$-th node, denoted as $k_i^+$.
  • Negative degree (N-DEG): The number of negative edges connected to the $i$-th node, denoted as $k_i^-$.
  • Net degree (Net-DEG): The difference between P-DEG and N-DEG, i.e., $k_i^+ - k_i^-$.
  • Ratio degree (Ratio-DEG): The proportion of positive edges connected to node $V_i$ relative to its total edges, i.e., $\frac{k_i^+}{k_i^+ + k_i^-}$.
  • Prestige: Prestige is determined by both the positive and negative incoming links of a node [39] (a sketch of these signed-degree baselines follows this list). The prestige of node $i$ is calculated as follows:

    $$Pr_i = \frac{k_i^+ - k_i^-}{k_i^+ + k_i^-}.$$
  • PageRank: PageRank, proposed by Larry Page at Google, is among the most prevalent ranking algorithms in use today [40]. Denoting the PageRank score of node $i$ as $PR_i$, the rank value is computed iteratively as follows:

    $$PR_i(t+1) = \alpha \sum_{j \in IN_i} \frac{PR_j(t)}{|OUT_j|} + (1 - \alpha)\frac{1}{N},$$

    where $\alpha$ is a damping factor, $N$ is the number of nodes in the network, $IN_i$ is the set of nodes with edges pointing to node $i$, and $|OUT_j|$ is the number of outgoing links of node $j$.
  • MinSum and FINDER: Two outstanding algorithms in complex network dismantling. For more information, please refer to [18,20].
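The signed-degree baselines above reduce to simple counts over an edge 'sign' attribute, as in the following sketch; the attribute encoding is an assumption of ours, and the unsigned baselines are available directly in networkx.

```python
import networkx as nx

def signed_degree_scores(G: nx.Graph):
    """P-DEG, N-DEG, Net-DEG, Ratio-DEG, and Prestige per node, assuming a
    'sign' edge attribute in {+1, -1}; nodes are then removed in descending
    order of the chosen score."""
    scores = {}
    for v in G.nodes:
        k_pos = sum(1 for _, _, s in G.edges(v, data="sign") if s > 0)
        k_neg = G.degree(v) - k_pos
        k = max(k_pos + k_neg, 1)        # guard against isolated nodes
        scores[v] = {"P-DEG": k_pos, "N-DEG": k_neg, "Net-DEG": k_pos - k_neg,
                     "Ratio-DEG": k_pos / k, "Prestige": (k_pos - k_neg) / k}
    return scores

# The unsigned baselines come directly from networkx, e.g.,
# nx.degree_centrality(G), nx.betweenness_centrality(G),
# nx.closeness_centrality(G), nx.core_number(G), and nx.pagerank(G).
```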

6.1.2. Parameter Setting

Next, we present the specific parameter settings for the evolutionary algorithm and the DQN used in the experiments, including the number of iterations, population size, training batch size, etc., as detailed in Table 2.
Among these parameters, we report the influence of the crossover and mutation probabilities on the DSEDR algorithm in the experiments below, demonstrating that DSEDR maintains a stable dismantling efficiency under different evolutionary probability settings.

6.2. Artificial Network

In our experiments with artificial networks, we choose the famous LFR network [41] and the BA network [42].
The LFR network is a model for generating complex networks with an intrinsic community structure. In LFR networks, the mixing parameter $\mu$ controls the proportion of cross-community edges among the total edges of nodes. A smaller mixing parameter indicates a clearer community structure with more connections within each community. In our experiments, we set the power-law exponent of the node degree distribution $\beta$ to 3 and the power-law exponent of the community size distribution $\gamma$ to 1.5, and construct 5 medium-sized LFR networks with 1000 nodes by setting 5 different mixing parameters $\mu$.
The BA network is a model for generating scale-free networks, characterized by a power-law degree distribution. In this model, a few nodes (high-degree nodes) have a large number of connections, while the majority of nodes have only a few connections. In the BA model, the parameter ω represents the number of connections each new node makes to existing nodes. A larger value for this parameter increases the network’s overall connectivity, making the power-law feature more significant. In our experiments, we construct 5 medium-sized BA networks with 1000 nodes by setting 5 different parameters ω .
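For reference, both benchmark families can be generated with networkx as sketched below. The paper does not specify how edge signs are assigned to the artificial networks, so here each edge is made positive with a fixed probability purely for illustration; the LFR degree and community exponents follow the settings above, while the remaining generator arguments are illustrative.

```python
import networkx as nx
import numpy as np

def make_signed_benchmark(kind: str, param: float, n: int = 1000,
                          p_positive: float = 0.5, seed: int = 0):
    """Generate an LFR or BA benchmark and attach random edge signs:
    each edge is positive with probability p_positive (an assignment of
    ours for illustration). average_degree and min_community below are
    likewise illustrative choices, not taken from the paper."""
    rng = np.random.default_rng(seed)
    if kind == "LFR":
        # tau1 = beta = 3 (degree exponent), tau2 = gamma = 1.5 (community sizes)
        G = nx.LFR_benchmark_graph(n, tau1=3, tau2=1.5, mu=param,
                                   average_degree=6, min_community=30, seed=seed)
        G.remove_edges_from(nx.selfloop_edges(G))
    else:
        G = nx.barabasi_albert_graph(n, m=int(param), seed=seed)  # omega = m
    for u, v in G.edges:
        G.edges[u, v]["sign"] = 1 if rng.random() < p_positive else -1
    return G
```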
Due to the inherent randomness of our algorithm, we conduct 20 independent experiments using DSEDR to dismantle 10% of the nodes on each network. Figure 3 presents the results of the 20 experiments in a box plot and compares the efficiency of DSEDR with the 12 baselines. Figure 4 illustrates the impact of $p_{crossover}$ on the performance of DSEDR across the artificial networks. For each value of $p_{crossover}$, we conduct 20 experiments and plot the mean value of the objective function $\Phi$ in a bar plot for comparison. The results show that our algorithm consistently outperforms the 12 baselines on the 10 artificial networks under different $\mu$ and $\omega$ settings, and remains stable under different $p_{crossover}$ settings.
To verify that DSEDR still maintains its high efficiency on large-scale networks, we also conducted the same experiments on the LFR and BA networks with 10,000 nodes. The results are shown in Figure 5, which demonstrate that even in large-scale networks, DSEDR still exhibits excellent efficiency.

6.3. Real-World Network

For the real-network dismantling experiment, we choose the war network. The war network is a data set of military relationships extracted from the Militarized Interstate Dispute project [43], reflecting the alliances and antagonisms among 166 countries between 2000 and 2010. A positive edge represents an alliance between two countries, while a negative edge represents an antagonistic relationship. The detailed information of this network is shown in Table 3. In this part, we present the dismantling efficiency comparison of the DSEDR algorithm with the 12 baselines on the war network and the influence of the parameter $p_{crossover}$ on DSEDR's dismantling performance. Furthermore, a visualization of the dismantling procedure and an analysis of the real-world meaning of the dismantling are given as well.

6.3.1. Efficiency and Parameter Analysis

Figure 6 shows the efficiency comparison of DSEDR and the 12 baselines. In the experiment, we select 10 different numbers of dismantled nodes. Considering the randomness in the initialization part of the algorithm, we conduct 20 experiments for each number of dismantled nodes. We then present the results as box plots for comparison with the 12 baselines. It is evident that the DSEDR algorithm is much more effective in optimizing the objective function $\Phi$ than the baselines. In particular, when the number of removed nodes reaches 20% of the total nodes (i.e., 33 nodes), the DSEDR algorithm exhibits an apparent advantage over the baselines.
Figure 7 shows the influence of the parameter $p_{crossover}$ on DSEDR's performance when 33 nodes are dismantled. To compare DSEDR with the baselines more concisely, we choose only the two baselines with the best dismantling results, Degree and N-DEG. We conduct 20 experiments with DSEDR for each $p_{crossover}$ value and plot the mean values of the obtained $\Phi$ as a bar plot against the 2 baselines. According to the figure, the performance of the DSEDR algorithm is relatively stable under different $p_{crossover}$ values and remains significantly more efficient than the baselines under all parameter settings.

6.3.2. Visualization and Real Meaning Analysis

To further demonstrate the effect and application value of the DSEDR algorithm in signed network dismantling, this part visualizes the dismantling process of the war network using the DSEDR algorithm in Figure 8. The figure comprises three snapshots, (b), (c), and (d), each displaying the war network after dismantling different numbers of nodes during the process shown in (a).
As we can see from Figure 8d, DSEDR dismantles the complex alliance-versus-confrontation relationships among the 166 countries by removing 33 of the nodes (which can be interpreted as isolating the diplomatic relations of these countries), breaking them down into clearly structured components of alliance nodes, antagonistic nodes, and a number of independent nodes. Unlike traditional complex network dismantling, DSEDR does not only seek to reduce the size of the giant connected component. When the structure of the node clusters is relatively clear, DSEDR further seeks to increase the proportion of negative edges in the network so as to reduce the stability of the alliance-and-confrontation network formed by these 166 countries, which can inform efforts to maintain international peace.

7. Conclusions

In this paper, we propose a new framework named DSEDR to explore the signed network dismantling problem, integrating the advantages of both evolutionary computation and deep reinforcement learning. First, we propose a new objective function that integrates sign information and network topology. Then, we transform the minimization of this objective function into the continuous parameter optimization of a deep Q-network. To obtain the optimal parameters, DSEDR utilizes evolutionary computation by treating different DQN parameter settings as different population individuals and searches for the optimal DQN parameters by combining the global search capability of evolutionary algorithms with the fast local search capability of deep reinforcement learning. Finally, to verify the efficiency of DSEDR, we apply it to multiple types of artificial and real network data sets. We compare its performance with 12 popular baseline methods, and the experimental results demonstrate that DSEDR achieves superior performance for signed network dismantling among all the algorithms. Given its demonstrated superiority on signed network dismantling problems, DSEDR has great application value, such as disrupting rumor propagation networks, finding critical components in sensor networks, and maintaining sensor network stability.
In future research, we will further extend the framework to larger-scale networks with millions of nodes. Moreover, with the development of higher-order network studies, we can apply DSEDR to more complex topologies, such as directed networks and temporal networks. Meanwhile, we can evolve the parameters of the encoder together with the parameters of the DQN, which may further improve the performance of our framework.

Author Contributions

Conceptualization, Y.O. and F.X.; methodology, Y.O. and H.Z.; formal analysis, F.X. and H.Z.; data analysis, H.L. and H.Z.; writing—original draft preparation, Y.O., F.X. and H.Z.; writing—review and editing, H.L. and H.Z.; visualization, Y.O. and H.L.; supervision, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant 71871233) and Beijing Natural Science Foundation (Grant 9182015).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to restrictions on privacy and confidentiality, which prevent open sharing of the dataset.

Acknowledgments

We are grateful to the anonymous reviewers for their valuable suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Akhtar, M.U.; Liu, J.; Liu, X.; Ahmed, S.; Cui, X. NRAND: An efficient and robust dismantling approach for infectious disease network. Inf. Process. Manag. 2023, 60, 103221. [Google Scholar] [CrossRef]
  2. Qi, M.; Chen, P.; Wu, J.; Liang, Y.; Duan, X. Robustness measurement of multiplex networks based on graph spectrum. Chaos 2023, 33, 021102. [Google Scholar] [CrossRef] [PubMed]
  3. Collins, B.; Hoang, D.T.; Nguyen, N.T.; Hwang, D. A new model for predicting and dismantling a complex terrorist network. IEEE Access 2022, 10, 126466–126478. [Google Scholar] [CrossRef]
  4. Duijn, P.A.; Kashirin, V.; Sloot, P.M. The relative ineffectiveness of criminal network disruption. Sci. Rep. 2014, 4, 4238. [Google Scholar] [CrossRef]
  5. Tripathy, R.M.; Bagchi, A.; Mehta, S. A study of rumor control strategies on social networks. In Proceedings of the ACM International Conference on Information & Knowledge Management, Toronto, ON, Canada, 26–30 October 2010. [Google Scholar]
  6. Zhan, X.-X.; Zhang, K.; Ge, L.; Huang, J.; Zhang, Z.; Wei, L.; Sun, G.-Q.; Liu, C.; Zhang, Z.-K. Exploring the effect of social media and spatial characteristics during the COVID-19 pandemic in china. IEEE Trans. Netw. Sci. Eng. 2022, 10, 553–564. [Google Scholar] [CrossRef]
  7. Rahman, K.C. A survey on sensor network. J. Comput. Inf. Technol. 2010, 1, 76–87. [Google Scholar]
  8. Bui, T.N.; Jones, C. Finding good approximate vertex and edge partitions is np-hard. Inf. Process. Lett. 1992, 42, 153–159. [Google Scholar] [CrossRef]
  9. Buldyrev, S.V.; Parshani, R.; Paul, G.; Stanley, H.E.; Havlin, S. Catastrophic cascade of failures in interdependent networks. Nature 2010, 464, 1025–1028. [Google Scholar] [CrossRef] [PubMed]
  10. Osat, S.; Faqeeh, A.; Radicchi, F. Optimal percolation on multiplex networks. Nat. Commun. 2017, 8, 1540. [Google Scholar] [CrossRef] [PubMed]
  11. Leskovec, J.; Huttenlocher, D.; Kleinberg, J. Signed networks in social media. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Atlanta, GA, USA, 10–15 April 2010; pp. 1361–1370. [Google Scholar]
  12. Li, H.-J.; Feng, Y.; Xia, C.; Cao, J. Overlapping graph clustering in attributed networks via generalized cluster potential games. ACM Trans. Knowl. Discov. Data 2024, 18, 1–26. [Google Scholar] [CrossRef]
  13. Tang, J.; Chang, Y.; Aggarwal, C.; Liu, H. A survey of signed network mining in social media. ACM Comput. Surv. 2016, 49, 1–37. [Google Scholar] [CrossRef]
  14. Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep Reinforcement Learning: A Brief Survey. IEEE Signal Process. Mag. 2017, 34, 26–38. [Google Scholar] [CrossRef]
  15. Ma, L.; Shao, Z.; Li, X.; Lin, Q.; Li, J.; Leung, V.C.; Nandi, A.K. Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning. IEEE Trans. Emerg. Top. Comput. Intell. 2023, 7, 995–1009. [Google Scholar] [CrossRef]
  16. Osat, S.; Papadopoulos, F.; Teixeira, A.S.; Radicchi, F. Embedding-aided network dismantling. Phys. Rev. Res. 2023, 5, 013076. [Google Scholar] [CrossRef]
  17. Wandelt, S.; Lin, W.; Sun, X.; Zanin, M. From random failures to targeted attacks in network dismantling. Reliab. Eng. Syst. Saf. 2022, 218, 108146. [Google Scholar] [CrossRef]
  18. Braunstein, A.; Dall’Asta, L.; Semerjian, G.; Zdeborová, L. Network dismantling. Proc. Natl. Acad. Sci. USA 2016, 113, 12368–12373. [Google Scholar] [CrossRef]
  19. Yan, D.; Xie, W.; Zhang, Y.; He, Q.; Yang, Y. Hypernetwork dismantling via deep reinforcement learning. IEEE Trans. Netw. Sci. Eng. 2022, 9, 3302–3315. [Google Scholar] [CrossRef]
  20. Fan, C.; Zeng, L.; Sun, Y.; Liu, Y.Y. Finding key players in complex networks through deep reinforcement learning. Nat. Mach. Intell. 2020, 2, 317–324. [Google Scholar] [CrossRef] [PubMed]
  21. Deepali, J.J.; Ishaan, K.; Sadanand, G.; Omkar, K.; Divya, P.; Shivkumar, P. Reinforcement Learning: A Survey. In Machine Learning and Information Processing; Springer: Berlin, Germany, 2021; pp. 297–308. [Google Scholar]
Figure 1. Flowchart of the DSEDR algorithm. The algorithm consists of three parts. First, the Network Embedding procedure generates an embedding of the target network as the input of the DQN, which transforms the problem into the optimization of the DQN weights $(W_1, W_2)$. Then, Evolutionary Deep Reinforcement Learning is employed to find the best weights $(W_{F1}, W_{F2})$. Finally, Signed Network Dismantling is conducted by the DQN with the output weights.
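To make the three stages of Figure 1 concrete, the following Python skeleton mirrors the control flow of the pipeline. It is a minimal sketch only: the embedding, fitness, and scoring functions are toy placeholders (random features, a dummy fitness), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def embed_network(n_nodes, dim):
    # Stage 1 (Network Embedding): in the paper this is learned from the
    # signed graph; random features here are a toy stand-in.
    return rng.normal(size=(n_nodes, dim))

def evolve_weights(H, n_p, n_g):
    # Stage 2 (Evolutionary Deep Reinforcement Learning): the DQN weights
    # play the role of individuals; the fitness used here is a dummy
    # placeholder, not the paper's reward.
    pop = [rng.normal(size=H.shape[1]) for _ in range(n_p)]
    for _ in range(n_g):
        pop.sort(key=lambda w: -float(np.mean(H @ w)))  # dummy fitness
        pop[-1] = pop[0] + rng.normal(scale=0.1, size=pop[0].shape)
    return pop[0]

def dismantle(H, w, K):
    # Stage 3 (Signed Network Dismantling): remove the K nodes that the
    # scoring network ranks highest.
    return list(np.argsort(-(H @ w))[:K])

H = embed_network(n_nodes=20, dim=8)
w_best = evolve_weights(H, n_p=10, n_g=5)
print(dismantle(H, w_best, K=3))
```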
Figure 2. Flowchart of the Markov seed selection procedure, where $H_t$, $S_t$, $D_t$, and $P_t$ denote the network embeddings, the seed selection state vector, the degree vector, and the positive out-degree vector at time $t$, respectively. $Q_t$ is the Q-value vector; $r_t$ and $a_t$ are the reward and action at time $t$. Dashed lines represent positive edges, and solid lines represent negative edges.
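To make the roles of $H_t$, $S_t$, $Q_t$, and $a_t$ concrete, here is a minimal numpy sketch of greedy seed selection, assuming a two-layer feed-forward DQN with weights $(W_1, W_2)$ as in the notation of Table 1. The embeddings, layer sizes, and ReLU activation are illustrative assumptions, and the reward and weight-update steps are omitted.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy dimensions: N nodes, d-dimensional embeddings, l hidden units.
N, d, l = 6, 4, 8
H_t = rng.normal(size=(N, d))   # node embeddings at time t (random stand-in)
W1 = rng.normal(size=(d, l))    # first DQN layer, W1
W2 = rng.normal(size=(l, 1))    # second DQN layer, W2
S_t = np.zeros(N, dtype=bool)   # seed selection state: True = already removed

def q_values(H, W1, W2):
    # Two-layer scoring network: one Q-value per node.
    return (np.maximum(H @ W1, 0.0) @ W2).ravel()

for t in range(3):              # select three seed nodes in sequence
    Q_t = q_values(H_t, W1, W2)
    Q_t[S_t] = -np.inf          # mask nodes that are already selected
    a_t = int(np.argmax(Q_t))   # action a_t: pick the highest-Q node
    S_t[a_t] = True             # update the seed selection state S_t
    print(f"step {t}: remove node {a_t}")
```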
Figure 3. Efficiency comparison on artificial networks. Among the 13 algorithms, DSEDR performs best on LFR networks under 5 different mixing parameters $\mu$ and on BA networks under 5 different parameters $\omega$.
Figure 4. Influence of $p_{crossover}$ on artificial networks. Specifically, (a) shows the results for different values of $p_{crossover}$ on LFR networks with different $\mu$, and (b) shows the results for different values of $p_{crossover}$ on BA networks with different $\omega$.
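The caption does not spell out the crossover operator itself. The sketch below shows one common choice, uniform crossover plus gene-wise Gaussian mutation applied to flattened DQN weight vectors, purely as an illustrative assumption of how $p_{crossover}$ and $p_{mutation}$ enter the evolutionary step.

```python
import numpy as np

rng = np.random.default_rng(0)
p_crossover, p_mutation = 0.8, 0.2    # default values from Table 2

def crossover(parent_a, parent_b):
    # With probability p_crossover, perform uniform crossover on the
    # flattened DQN weights; otherwise copy the first parent.
    if rng.random() < p_crossover:
        mask = rng.random(parent_a.shape) < 0.5
        return np.where(mask, parent_a, parent_b)
    return parent_a.copy()

def mutate(child, scale=0.1):
    # Gaussian mutation applied gene-wise with probability p_mutation.
    mask = rng.random(child.shape) < p_mutation
    return child + mask * rng.normal(scale=scale, size=child.shape)

parent_a = rng.normal(size=10)        # flattened (W1, W2) of one individual
parent_b = rng.normal(size=10)
print(mutate(crossover(parent_a, parent_b)))
```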
Figure 5. Efficiency comparison on 10 different networks with 10,000 nodes after dismantling 10% of the nodes.
Figure 6. Efficiency comparison on the war network. DSEDR outperforms all 12 baseline algorithms for 10 different numbers of removed nodes.
Figure 7. Influence of $p_{crossover}$ on the war network, where red bars denote the efficiency of DSEDR with different values of $p_{crossover}$. Here, we select the top 2 baselines, Degree and Net-DEG, for comparison.
Figure 8. The process of dismantling the war network using DSEDR. DSEDR seeks a node removal sequence that minimizes the objective function $\Phi$ in Equation (4). (a) shows the objective function curve on the war network, with the horizontal axis giving the number of dismantled nodes and the vertical axis giving the $\Phi$ of the residual graph after removing those nodes. (b–d) show snapshots after removing the 8 (b), 17 (c), and 33 (d) key nodes (cyan) determined by DSEDR at the time points marked on the curve in (a). Red lines denote positive edges, and green lines denote negative edges.
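Equation (4) itself is not reproduced in this back matter. As a purely hypothetical stand-in, the sketch below combines a topology term (the relative size $\sigma/N$ of the giant connected component, Table 1) with a sign term (the share of positive edges) weighted by $\lambda$ (Table 2); the actual functional form of $\Phi$ may differ, so this serves only to illustrate how an objective of the residual graph is evaluated after each removal.

```python
import networkx as nx

def phi(G, lam=1.0):
    # Hypothetical stand-in for Equation (4), which is not reproduced
    # here: a topology term (relative giant-component size, sigma / N)
    # plus a sign term (share of positive edges) weighted by lambda,
    # matching the roles of sigma, N, and lambda in Tables 1 and 2.
    if G.number_of_nodes() == 0:
        return 0.0
    sigma = max(len(c) for c in nx.connected_components(G))
    topo = sigma / G.number_of_nodes()
    pos = sum(1 for *_, s in G.edges(data="sign") if s > 0)
    sign = pos / G.number_of_edges() if G.number_of_edges() else 0.0
    return topo + lam * sign

G = nx.Graph()
G.add_edges_from([(0, 1, {"sign": +1}), (1, 2, {"sign": -1}),
                  (2, 3, {"sign": +1}), (3, 4, {"sign": +1})])
print(phi(G))       # objective of the intact toy network
G.remove_node(2)    # remove one key node, as DSEDR would
print(phi(G))       # objective of the residual graph drops
```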
Table 1. Summary of notation.

Notation | Description
$G$ | Target network
$V$ | Set of nodes
$E$ | Set of edges
$N$ | Number of nodes
$L$ | Number of edges
$\sigma$ | Size of the giant connected component
$C_i$ | The $i$-th connected component of $G$
$\delta_i$ | Size of $C_i$
$\Phi$ | Objective function
$\hat{V}$ | Set of nodes to be dismantled
$W_1, W_2$ | Weights of the DQN neural network
$a_t$ | Action of the DQN at time $t$
$S_t$ | Seed node selection state of the DRL at time $t$
$V_t$ | Set of nodes selected for removal at time $t$
$D_t$ | Degree vector of the nodes
$P_t$ | Positive out-degree vector
$r_t$ | Decision reward
$n_I$ | Initial population size
$n_b$ | Batch size in the DRL algorithm
$n_g$ | Maximum number of iterations
$n_p$ | Population size in the EA algorithm
$P$ | Population in evolution
$K$ | Number of nodes to be removed
$\bar{K}$ | Average connectivity of the nodes
$d$ | Number of neurons in the first layer of the DQN
$l$ | Number of neurons in the second layer of the DQN
Table 2. Experimental parameter settings for the proposed DSEDR algorithm.

Parameter | Value
Iteration number $n_g$ | 100
Evolutionary population size $n_p$ | 100
Crossover probability $p_{crossover}$ | 0.8
Mutation probability $p_{mutation}$ | 0.2
Network embedding dimension $d$ | 64
Training batch size $n_b$ | 512
Training discount rate $\gamma$ | 0.8
Training learning rate $\alpha$ | 0.001
Weight $\lambda$ of the positive edge share in $\Phi$ | 1
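For readers reimplementing the method, the settings in Table 2 can be gathered into a single configuration object. The class below is only a hypothetical convenience wrapper around the published values, not part of the authors' code.

```python
from dataclasses import dataclass

@dataclass
class DSEDRConfig:
    # Values taken directly from Table 2; the container itself is
    # an illustrative assumption.
    n_g: int = 100            # iteration number
    n_p: int = 100            # evolutionary population size
    p_crossover: float = 0.8  # crossover probability
    p_mutation: float = 0.2   # mutation probability
    d: int = 64               # network embedding dimension
    n_b: int = 512            # training batch size
    gamma: float = 0.8        # training discount rate
    alpha: float = 0.001      # training learning rate
    lam: float = 1.0          # weight of the positive edge share in Phi

print(DSEDRConfig())
```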
Table 3. Detailed information of the war network.

Network Parameter | Value
Number of nodes $n$ | 166
Number of edges $k$ | 1433
Number of positive edges $k^+$ | 1295
Number of negative edges $k^-$ | 138