Article

Network Dismantling on Signed Network by Evolutionary Deep Reinforcement Learning

School of Statistics and Data Science, Nankai University, Tianjin 300074, China
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(24), 8026; https://doi.org/10.3390/s24248026
Submission received: 8 November 2024 / Revised: 8 December 2024 / Accepted: 14 December 2024 / Published: 16 December 2024
(This article belongs to the Special Issue Communications and Networking Based on Artificial Intelligence)

Abstract

Network dismantling is an important problem that has attracted much attention from many different research areas, including the disruption of criminal organizations and the maintenance of stability in sensor networks. However, almost all current algorithms focus on unsigned networks, and few studies explore the problem of signed network dismantling due to its complexity and the lack of data. Importantly, there is no effective quality function to assess the performance of signed network dismantling, which seriously restricts its deeper applications. To address these issues, in this paper we design a new objective function and further propose an effective algorithm named DSEDR, which searches for the best dismantling strategy based on evolutionary deep reinforcement learning. In particular, since evolutionary computation can solve global optimization and deep reinforcement learning can speed up the network computation, we integrate the two for efficient signed network dismantling. To verify the performance of DSEDR, we apply it to a series of representative artificial and real network data sets and compare its efficiency with some popular baseline methods. The experimental results show that DSEDR outperforms all other methods in both efficiency and interpretability.

1. Introduction

Although extensive research has been conducted in the fields of sensors and communication, ensuring the security of sensor networks remains an unresolved issue. This challenge is closely related to the technology of network dismantling. Network dismantling aims to determine a set of nodes (or edges) whose removal from the network, under specific constraints and various dismantling objective functions, minimizes the overall network performance. It has gained significant prominence in the field of network science owing to its broad applications across various domains [1,2]. For example, by finding the most efficient set of nodes to dismantle a network, we can solve a number of problems, such as dismantling criminal organizations [3,4], reducing the spread of rumors [5], controlling viruses [6], and maintaining the stability of sensor networks [7]. It is worth noting that network dismantling is a combinatorial optimization problem that has been proven to be NP-hard [8,9,10].
To find the optimal dismantling strategy, researchers have proposed various methods, described in detail in Section 2. While these methods have demonstrated efficiency in rapidly dismantling networks, they are primarily designed for unsigned networks. In fact, interactions between individuals in the real world may carry specific meanings [11,12]. For example, users may be friends or enemies in a social network, which necessitates the use of a signed network to represent the multi-type relationships between users [13]. In order to reflect the complexity of real-world networks accurately, the signed network dismantling problem has become an important and challenging area of research. Nevertheless, few studies focus on signed network dismantling, and the main challenge is the lack of an efficient objective function, and a corresponding algorithm, that considers both sign and topological information.
Deep reinforcement learning [14] is a powerful technique that searches for the optimal solution quickly through its reward mechanism. However, it is prone to becoming trapped in local optima due to the local search of gradient descent [15]. In this paper, we propose a new framework for Dismantling on Signed networks based on Evolutionary Deep Reinforcement learning (DSEDR) to achieve a fast search for globally optimal solutions by combining the global search capability of evolutionary computation with deep reinforcement learning. Specifically, we first propose a new objective function that considers both the connectivity of the network and the proportion of cooperative relationships (i.e., positive edges) in the signed network. By considering sign information and network topology simultaneously, it transforms the signed network dismantling problem into a multi-objective optimization problem. To minimize this objective function, inspired by evolutionary computation and deep reinforcement learning methods [15], DSEDR integrates the advantages of both to search for an optimal set of dismantled nodes and thus achieve the best dismantling result on a signed network. To obtain the network embedding, DSEDR initially employs an effective encoder to capture and learn the positive and negative topological characteristics. Then, it employs a Deep Q-Network (DQN) as a decoder to generate a dismantling strategy through a Markov decision process. To obtain the optimal weight parameters of the DQN, the algorithm optimizes it by combining the global search capability of evolutionary computation with the fast local search capability of deep reinforcement learning. Finally, to verify the efficiency of DSEDR, we apply it to multiple types of artificial and real network data sets. We compare its performance with twelve popular baseline methods, and the experimental results demonstrate that DSEDR achieves superior performance for signed network dismantling among all the algorithms.
The subsequent sections are organized as follows: Section 2 presents an overview of related works; Section 3 outlines the problem of dismantling on signed networks and then establishes the objective function; Section 4 and Section 5 introduce the main framework and the detailed algorithm procedures of DSEDR; Section 6 demonstrates the experimental results; and finally, Section 7 concludes the whole work and outlines future directions.

2. Related Works

2.1. Network Dismantling Methods

Recently, network dismantling has received extensive attention from various disciplines, such as operations research, network science, and computer science. Many scientists have focused on this important research area and proposed algorithms to explore this problem. Osat et al. [16] introduced a network dismantling method based on network embedding, using geometric space representations to address the network dismantling problem. Wandelt et al. [17] developed a solution for the rapid identification of critical nodes across various network types, employing an iterative process that converts stochastic fault tracing into targeted attacks for effective dismantling. Additionally, Braunstein et al. [18] proposed the Min-Sum algorithm, a three-stage method that closely links the network dismantling problem to the decycling problem. Yan et al. [19] presented the HITTER framework, which transforms hypernetwork dismantling into a sequential decision-making problem based on Deep Reinforcement Learning (DRL). Furthermore, Fan et al. [20] introduced FINDER, a Deep Reinforcement Learning-based approach designed to train intelligent agents applicable to a wide range of realistic networks. Overall, network dismantling is still a relatively new problem and has not been fully studied. Moreover, few studies focus on signed network dismantling, and the main challenge is the lack of efficient objective functions and algorithms that consider both sign information and network topology.

2.2. Evolutionary Deep Reinforcement Learning Algorithms

Reinforcement learning [21] is a learning mechanism that learns how to map from states to policies in order to maximize the reward obtained. The primary objective is to maximize cumulative rewards to identify a near-optimal solution for a given problem. Deep reinforcement learning algorithms integrate the perceptual capabilities of deep learning with the decision-making ability of reinforcement learning, optimizing neural network weights via backpropagation to conduct effective searches in the action space of optimization problems. However, traditional deep reinforcement learning approaches predominantly rely on local search methods based on gradient descent, which can lead to local optima [22]. To enhance the optimization performance of these algorithms, Dai et al. [23] proposed the S2V-DQN algorithm, which integrates graph neural networks with deep Q-learning. This approach allows agents to better comprehend the structural properties of graphs by embedding information about nodes and their neighbors into state representations.
Deep evolutionary algorithms are optimization methods that combine evolutionary algorithms with deep learning to address complex non-convex optimization problems. They explore and optimize solutions by applying evolutionary mechanisms to the policy search process [24]. From a policy search perspective, deep evolutionary algorithms can be classified into two main categories. The first category employs evolutionary algorithms to replace traditional policy gradient-based optimization methods (e.g., gradient descent), facilitating more efficient policy updates; it includes popular approaches such as OpenAI-ES [25] and DRL-GA [26]. The second category treats policies as evolvable individuals, merging the population search characteristics of evolutionary algorithms with the gradient optimization capabilities of reinforcement learning to optimize populations of policies, as in EPG [23] and CERL [27].

3. Problem Formulation

3.1. Network Connectivity

Let $G = (V, E)$ represent a given network, where $V = \{V_1, V_2, \ldots, V_N\}$ is the set of nodes and $E = \{e_1, e_2, \ldots, e_L\} \subseteq V \times V$ is the set of edges [28,29]. Here, $N = |V|$ denotes the total number of nodes, while $L = |E|$ denotes the total number of edges. To facilitate comprehension, the key symbols used in this paper are standardized and summarized in Table 1. There are multiple types of objective functions for traditional complex network dismantling problems, including the number of connected components, the size of the giant connected component, the number of connected node pairs, etc. [30]. In this paper, we use the size of the giant connected component after network dismantling as part of the dismantling objective owing to its favorable properties [18]. It is important to note that this metric can be substituted with other conventional dismantling objectives, depending on the problem being addressed.
Consider removing a subset of $K$ nodes, represented by the node sequence $\hat{V} = \{V_1, V_2, \ldots, V_K\}$, from the network. The size of the giant connected component (GCC), denoted as $\sigma$, and the GCC size ratio, denoted as $R_\sigma$, are defined as follows:

$$R_\sigma(\hat{V}) = \frac{\sigma(G \setminus \hat{V})}{\sigma(G)}, \tag{1}$$

where

$$\sigma(G) = \max\{\delta_i \mid C_i \subseteq G\}. \tag{2}$$

Here, $C_i$ represents the $i$-th connected component and $\delta_i$ denotes the size of $C_i$ on the current graph $G$. The term $\sigma(G \setminus \hat{V})$ refers to the GCC size of the residual graph after sequentially removing the nodes in $\hat{V} = \{V_1, V_2, \ldots, V_K\}$ from $G$, while $\sigma(G)$ represents the initial GCC size of $G$ before any node removals.
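To make these definitions concrete, the following minimal Python sketch computes $\sigma$ and $R_\sigma$; it assumes an undirected networkx graph, and the function names are illustrative rather than taken from the paper.

```python
import networkx as nx

def gcc_size(G: nx.Graph) -> int:
    """sigma(G) in Eq. (2): size of the giant connected component."""
    if G.number_of_nodes() == 0:
        return 0
    return max(len(c) for c in nx.connected_components(G))

def r_sigma(G: nx.Graph, removed: set) -> float:
    """R_sigma(V_hat) in Eq. (1): GCC size ratio after removing `removed`."""
    residual = G.copy()
    residual.remove_nodes_from(removed)
    return gcc_size(residual) / gcc_size(G)
```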

3.2. Signed Networks

Signed networks, a class of networked systems in which edges are labeled with “positive” or “negative” symbolic attributes, are an effective tool for modeling social networks with differing sentiments. The “positive” and “negative” symbols represent two opposing emotional attitudes. In particular, positive edges are typically employed to characterize positive relationships, such as friendship, support, trust, and liking, and are indicated with a positive symbol, “+”. Negative edges are commonly used to represent negative relationships, such as enmity, opposition, distrust, and dislike, and are marked with a negative symbol, “−” [31].
The aforementioned meanings of the positive and negative symbols can be applied to the signed network dismantling problem. In order to minimize the harmful capacity of a malignant network, the proportion of cooperative/friendly relationships, i.e., positive edges, should be driven as low as possible. Accordingly, the second optimization objective, denoted as $R_{pe}$, is defined as the percentage of positive edges in the signed network after the removal of the nodes $\hat{V}$. This metric is calculated as follows:

$$R_{pe}(\hat{V}) = \frac{k^+(G \setminus \hat{V})}{k(G \setminus \hat{V})}, \tag{3}$$

where $k^+(G \setminus \hat{V})$ is the number of positive edges of the residual graph after sequentially removing the nodes in $\hat{V} = \{V_1, V_2, \ldots, V_K\}$ from $G$, and $k(G \setminus \hat{V})$ is the total number of edges of the residual graph.
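A matching sketch for $R_{pe}$ is given below; it assumes each edge carries a numeric 'sign' attribute in $\{+1, -1\}$ (an assumed encoding, since the paper does not prescribe a data format), and returning 0 for an edgeless residual graph is a convention of ours.

```python
import networkx as nx

def r_pe(G: nx.Graph, removed: set) -> float:
    """R_pe(V_hat) in Eq. (3): share of positive edges in the residual graph."""
    residual = G.copy()
    residual.remove_nodes_from(removed)
    k_total = residual.number_of_edges()   # k(G \ V_hat)
    if k_total == 0:
        return 0.0                         # our convention for an edgeless residual
    k_pos = sum(1 for _, _, s in residual.edges(data="sign") if s > 0)
    return k_pos / k_total                 # k+(G \ V_hat) / k(G \ V_hat)
```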

3.3. The Objective Function

With the positive edge share $R_{pe}$ and the connectivity metric $R_\sigma$, the objective function for signed network dismantling (denoted as $\Phi$) is proposed as follows:

$$\Phi(\hat{V}) = R_\sigma(\hat{V}) + \lambda \cdot R_{pe}(\hat{V}), \tag{4}$$

where $R_\sigma$ is the GCC size ratio in Equation (1), $R_{pe}$ is the positive edge share in Equation (3), and $\hat{V}$ represents the set of nodes to be dismantled. The parameter $\lambda$ controls the importance of the positive edge share in the signed network dismantling problem: in general, the larger the value of $\lambda$, the greater the emphasis on reducing the positive edge share in the dismantled network. The value of $\lambda$ depends on the specific problem at hand and the graph to be dismantled. By minimizing the objective function above, it is possible to achieve a significant reduction in the connectivity of the network while also taking into account the distinctive positive-negative edge relationships of the signed network, thereby reducing the cooperation ability between its nodes. As a consequence, the objective function facilitates measuring the degree of dismantling of a signed network.
The definition of the objective function above casts the signed network dismantling problem as a multi-objective optimization problem. Our goal is to identify the optimal dismantling node set, denoted by $\hat{V}_{op}$, that minimizes the objective function $\Phi$ given the number of dismantling nodes $K$. The problem can be represented as follows:

$$\hat{V}_{op} = \operatorname*{argmin}_{\hat{V} \subseteq V} \Phi(\hat{V}). \tag{5}$$
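Combining the two metrics gives the objective of Equations (4) and (5). The sketch below, reusing r_sigma and r_pe from the previous sketches, also shows a brute-force search over all node subsets; this is only feasible on toy graphs, which is precisely why the learned search of DSEDR is needed.

```python
from itertools import combinations

def phi(G, removed, lam=1.0):
    """Phi(V_hat) = R_sigma + lambda * R_pe, Eq. (4); lam plays the role of lambda."""
    return r_sigma(G, removed) + lam * r_pe(G, removed)

def brute_force_dismantle(G, K, lam=1.0):
    """Exhaustive minimizer of Eq. (5): only feasible for tiny N and K,
    since it enumerates all C(N, K) candidate node sets."""
    return min((set(c) for c in combinations(G.nodes, K)),
               key=lambda s: phi(G, s, lam))
```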
In the next sections, we will propose a framework that provides a new strategy to remove the target set of nodes for optimizing the objective function, with the detailed process depicted in Figure 1.

4. Deep Reinforcement Learning

4.1. Network Embedding

In this paper, we reduce the network dimensionality by extracting low-dimensional features of the topological network through neural networks. This work uses the Deep Network Embedding for Graph Representation Learning in Signed Networks (DNESBP) algorithm [32] to perform dimensionality reduction on signed networks, obtaining a $d$-dimensional feature representation matrix $H = [h_{ij}]_{N \times d}$, $h_{ij} \in [-1, 1]$. The Deep Q-Network (DQN) can then take $H$ as input to fully learn the positive and negative topological characteristics and structural balance properties of the signed network.
The DNESBP algorithm first uses a neural network with $\ell$ layers to construct an autoencoder that reduces the dimensionality of the signed network adjacency matrix $A$ and obtains the representation vectors, as follows:

$$H^{(i)} = f\!\left(X^{(i)} W^{(i)} + B^{(i)}\right), \quad i = 1, \ldots, \ell, \tag{6}$$

where $X^{(0)} = A$ and $H^{(i)}$, $i \geq 1$, is the vector obtained by the $i$-th dimensionality reduction, with $X^{(i)} = H^{(i-1)}$. $H = [h_{ij}]_{N \times d}$ denotes the feature representation matrix obtained by the final dimensionality reduction. The activation function $f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ is used to generate both positive and negative feature values. $W^{(i)}$ is the weight matrix of the $i$-th encoder layer, and $B^{(i)}$ is the corresponding bias vector.
Next, the DNESBP algorithm reconstructs $H$ through a decoder, also an $\ell$-layer neural network, and obtains the reconstruction matrix $\hat{A}$ as follows:

$$\hat{X}^{(i)} = f\!\left(\hat{H}^{(i)} \hat{W}^{(i)} + \hat{B}^{(i)}\right), \quad i = 1, \ldots, \ell, \tag{7}$$

where $\hat{H}^{(1)} = H^{(\ell)}$ and $\hat{X}^{(\ell)} = \hat{A}$, $\hat{W}^{(i)}$ is the weight matrix of the $i$-th decoder layer, and $\hat{B}^{(i)}$ is the corresponding bias vector. The DNESBP algorithm further supervises the dimensionality reduction process by computing the reconstruction loss. For details of the loss function and the underlying principle, please refer to [32].
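As a rough illustration of Equations (6) and (7), the following PyTorch sketch builds a tanh autoencoder over the signed adjacency matrix; it is a simplification of our own, omitting the structural-balance terms of the full DNESBP loss, for which we refer to [32].

```python
import torch
import torch.nn as nn

class SignedAutoencoder(nn.Module):
    """Two-layer tanh encoder/decoder in the spirit of Eqs. (6) and (7).
    Tanh keeps the embedding entries in [-1, 1], matching H's range;
    the balance-preserving loss terms of full DNESBP [32] are omitted."""

    def __init__(self, n_nodes: int, d: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_nodes, hidden), nn.Tanh(),  # Eq. (6), first layer
            nn.Linear(hidden, d), nn.Tanh(),        # Eq. (6), final layer -> H
        )
        self.decoder = nn.Sequential(
            nn.Linear(d, hidden), nn.Tanh(),        # Eq. (7), first layer
            nn.Linear(hidden, n_nodes), nn.Tanh(),  # Eq. (7), final layer -> A_hat
        )

    def forward(self, A: torch.Tensor):
        H = self.encoder(A)      # N x d embedding
        A_hat = self.decoder(H)  # reconstruction of the signed adjacency matrix
        return H, A_hat

# One supervision step on a signed adjacency A with entries in {-1, 0, +1}:
#   H, A_hat = model(A);  loss = nn.functional.mse_loss(A_hat, A)
```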

4.2. Deep Q-Network

In our framework, the Deep Q-Network (DQN) [33] is implemented as a two-layer neural network specifically designed to approximate the optimal policy in reinforcement learning. The algorithm employs the dimensionality-reduced representation H as the input to DQN, which subsequently outputs a vector Q = [ q i ] . Here, each q i corresponds to the Q-value associated with a specific node, and the node with the highest Q-value is selected for removal in the subsequent step.
It is important to note that removing a node alters the original network's topology. Consequently, to reflect the changes in network dynamics resulting from node selection, it is necessary to update the input vectors of the DQN. Here, we propose a new approach to updating these inputs, as follows. At each seed node selection, $S = \{s_1, s_2, \ldots, s_N\}$ denotes the current seed selection state, where $s_i = 1$ and $s_i = 0$ indicate that node $V_i$ has and has not been selected as a seed node, respectively. $D = \{d_1, d_2, \ldots, d_N\}$ holds the degree of each node in the current network, while $P = \{p_1, p_2, \ldots, p_N\}$ holds the positive out-degree of each node. Upon the selection of node $V_i$ at time $t-1$, the corresponding embedding row $H[i]$, degree $d_i$, and positive out-degree $p_i$ are set to zero, while $s_i$ is set to 1. The remaining elements of $D$ and $P$ are updated by recomputing them on the residual network after removing node $V_i$. Then $H$, $S$, $D$, and $P$ are concatenated to form the new input vector. The detailed process can be seen in Figure 2.
The DQN takes a $(d+3)$-dimensional feature vector per node as input and computes the Q-values through its network structure as follows:

$$Q = \mathrm{DQN}(W_1, W_2, [H_t, S_t, D_t, P_t]) = \mathrm{ReLU}([H_t, S_t, D_t, P_t] \cdot W_1) \cdot W_2. \tag{8}$$

Here, $W_1$ represents the continuous weight matrix of the first layer of the neural network, while $W_2$ denotes the continuous weight vector of the second layer. The ReLU function is employed as the activation function to zero out negative values.
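Equation (8) corresponds to a small bias-free network, sketched below in PyTorch; shapes and names are illustrative.

```python
import torch
import torch.nn as nn

class DQN(nn.Module):
    """Eq. (8): Q = ReLU([H, S, D, P] . W1) . W2, with no bias terms.
    Each of the N nodes contributes a (d + 3)-dimensional feature row."""

    def __init__(self, d: int, l: int):
        super().__init__()
        self.W1 = nn.Linear(d + 3, l, bias=False)  # first-layer weights W1
        self.W2 = nn.Linear(l, 1, bias=False)      # second-layer weights W2

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X: (N, d + 3) node features -> Q: (N,), one Q-value per node
        return self.W2(torch.relu(self.W1(X))).squeeze(-1)
```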

4.3. Markov Seed Selection

For the removed seed nodes, we employ a Markov decision process to optimize the iteration and avoid selecting duplicate seed nodes [34]. Specifically, at time $t-1$, we first obtain the seed node sequence $S_{t-1}$. Then we update $H$, $S$, $D$, and $P$ through the method described in the DQN module. At time $t$, $[H_t, S_t, D_t, P_t]$ is input into the DQN; meanwhile, we employ a greedy strategy to select the node selection action corresponding to the maximum Q-value, denoted as $a_t$ (Equations (9)-(11)):

$$Q(S_t, a) = \mathrm{DQN}(W_1, W_2, [H_t, S_t, D_t, P_t]), \tag{9}$$

and

$$Q(S_t, a_t) = \max_a Q([H_t, S_t, D_t, P_t], a), \tag{10}$$

where

$$a_t = \operatorname*{argmax}_a Q(S_t, a). \tag{11}$$

Here, $a$ ranges over the set of all possible actions when a removed seed node is selected, while $a_t$ denotes the specific action chosen at time $t$.
At time $t$, if node $V_i$ has the highest Q-value, the Markov decision process selects it as a seed node, sets $s_i = 1$, and updates the seed node sequence $S_t$ as well as $H_t$, $D_t$, and $P_t$. The current seed node sequence $S_t$ is then converted to the seed node set $V_t$ through the following function $O(S)$, and the objective function $\Phi$ of the current signed network is calculated as follows:

$$\Phi(V_t) = \Phi(O(S_t)), \quad \text{s.t.} \quad O(S_t) = \{V_i \in V : s_i = 1\}. \tag{12}$$

Next, the Markov decision process calculates the decision reward $r_t$, defined as the difference in the objective function $\Phi$ before and after taking the action $a_t$:

$$r_t = \Phi(V_t) - \Phi(V_{t+1}). \tag{13}$$

By using the DQN module and the Markov decision module, DSEDR transforms the discrete combinatorial optimization problem of selecting $K$ seed nodes in a signed network into the continuous parameter optimization problem of the DQN:

$$\min \Phi(S) = \Phi(W_1, W_2), \quad \text{s.t.} \quad W_1, W_2 \in \mathrm{DQN}. \tag{14}$$
The MSS algorithm is summarized in Algorithm 1. In the next section, we propose a new evolutionary deep reinforcement learning algorithm, DSEDR, to resolve this parameter optimization problem of the DQN.
Algorithm 1 MSS (Markov Seed Selection) Algorithm
Input: Parameters $(W_1, W_2)$, size of seed set $K$, and target network $G$.
Output: Seed set $\hat{V}$, $\Phi(\hat{V})$, and a set of Markov decision processes $\{(S_t, a_t, r_t, S_{t+1})\}_K$.
1: Initialization: set $t \leftarrow 1$ and $S_t \leftarrow [\,]$;
2: Compute the degree $d_i$ and positive out-degree $p_i$ of each node $V_i$ in $G$, then set $D_t \leftarrow [d_i]_N$ and $P_t \leftarrow [p_i]_N$;
3: Set $\mathrm{DQN}(W_1, W_2)$;
4: while $t \leq K$ do
5:   Compute $Q(S_t, a)$ based on Equation (9);
6:   Compute $a_t$ based on Equation (11), which selects the node with the highest Q-value, and obtain $S_{t+1}$;
7:   Update $H_t$, $D_t$, and $P_t$ according to $a_t$ to obtain $H_{t+1}$, $D_{t+1}$, and $P_{t+1}$;
8:   Compute $r_t = \Phi(V_t) - \Phi(V_{t+1})$;
9:   $t \leftarrow t + 1$;
10: end while
11: $\mathcal{B} \leftarrow \{(S_t, a_t, r_t, S_{t+1})\}_K$, $\hat{V} \leftarrow V_K$;
12: return $\hat{V}$, $\Phi(\hat{V})$;
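A compact Python sketch of this rollout is given below. It reuses the DQN and objective sketches from the previous sections; the transition records are abbreviated to (action, reward) pairs, and all names are illustrative.

```python
import numpy as np
import torch

def mss(dqn, G, H, K, phi):
    """Sketch of Algorithm 1: greedy seed selection driven by a DQN.
    `dqn` maps an (N, d+3) feature matrix to N Q-values, and `phi`
    evaluates Eq. (4) on a seed set, as in the earlier sketches."""
    nodes = list(G.nodes)
    idx = {v: i for i, v in enumerate(nodes)}
    H = np.array(H, dtype=float)
    S = np.zeros(len(nodes))                               # selection state s_i
    D = np.array([G.degree(v) for v in nodes], float)      # degrees d_i
    P = np.array([sum(1 for _, _, s in G.edges(v, data="sign") if s > 0)
                  for v in nodes], float)                  # positive degrees p_i
    seeds, transitions, prev_phi = [], [], phi(G, set())
    for t in range(K):
        X = torch.tensor(np.column_stack([H, S, D, P]), dtype=torch.float32)
        q = dqn(X).detach().numpy()
        q[S == 1] = -np.inf                                # never reselect a seed
        a = int(q.argmax())                                # Eq. (11)
        S[a], H[a], D[a], P[a] = 1, 0.0, 0.0, 0.0          # zero out the chosen node
        seeds.append(nodes[a])
        residual = G.subgraph([u for u in nodes if S[idx[u]] == 0])
        for u in residual.nodes:                           # recompute D and P
            D[idx[u]] = residual.degree(u)
            P[idx[u]] = sum(1 for _, _, s in residual.edges(u, data="sign") if s > 0)
        cur_phi = phi(G, set(seeds))
        transitions.append((a, prev_phi - cur_phi))        # reward r_t, Eq. (13)
        prev_phi = cur_phi
    return seeds, prev_phi, transitions
```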

5. The DSEDR Algorithm

In this section, we introduce the detailed procedures of the DSEDR algorithm designed to search for the optimal solution for the signed network dismantling problem in Equation (5). We aim to integrate the advantages of evolutionary computation and deep reinforcement learning to facilitate the optimization process.

5.1. Evolution of DQN Populations

5.1.1. Solution Representation and Evaluation

In DSEDR, all individuals in a population $P = \{P_i\}_{n_p}$ evolve at the same time, where each solution $P_i$ represents a DQN. It can be represented as follows:

$$P_i = (W_{i1}, W_{i2}), \quad W_{i1} \in [-1, 1]^{(d+3) \times l}, \quad W_{i2} \in [-1, 1]^{l \times 1}, \tag{15}$$

where the weight parameters $W_{i1}$ and $W_{i2}$ correspond to the first and second layers of the DQN, respectively. The numbers of neurons in the first and second layers are $d+3$ and $l$, respectively.
For each solution, DSEDR combines the network embedding, DQN, and Markov seed selection modules described above with the objective in Equation (5) to generate two outputs. One output is the set of seed nodes (denoted by $\hat{V}_i$) together with $\Phi_i$, which are used in the evolutionary step to evaluate the fitness of the solution $P_i$. The other output is a set of Markov decision processes, i.e., sequences of states, actions, rewards, and subsequent states, which are used to accelerate the DQN optimization in the deep reinforcement learning step.

5.1.2. Initialization Operations

DSEDR initially randomizes the weight sequences of the DQN population to obtain the initial solution set. This random initialization ensures the diversity of the initial DQN population. The specific procedure is outlined in Algorithm 2: a value is randomly generated in the range $[-1, 1]$ for each weight parameter of the DQN, serving as its initial value.
Algorithm 2 Initialization Algorithm
Input: Initial population size $n_I$.
Output: Initial solutions $P$.
1: Set $P = \{P_i(0)\}_{n_I}$, $P_i(0) = (W_{i1}, W_{i2})$, $i = 1, 2, \ldots, n_I$;
2: for $i = 1, 2, \ldots, n_I$ do
3:   for each element in $W_{i1}, W_{i2}$ do
4:     Randomly generate a value in the range $[-1, 1]$;
5:   end for
6: end for
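A minimal sketch of this initialization, with the weights stored as numpy arrays, might look as follows.

```python
import numpy as np

def init_population(n_I: int, d: int, l: int, rng=None):
    """Algorithm 2: each solution is a pair (W1, W2) of uniform
    random weights in [-1, 1], matching Eq. (15)."""
    rng = rng or np.random.default_rng()
    return [(rng.uniform(-1, 1, size=(d + 3, l)),   # W_i1
             rng.uniform(-1, 1, size=(l, 1)))       # W_i2
            for _ in range(n_I)]
```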

5.1.3. Evolutionary Operations

Following the initial acquisition of the DQN population, the DSEDR algorithm initiates an iterative process of evolution, including crossover, mutation, and selection operations on DQN. These are conducted with the objective function Φ as the fitness value. These operations are illustrated in the subsequent description.
The crossover operation acts on the parent solutions $P(q)$. It first orders the population by fitness value and then generates new offspring using two operations: solution pairing and single-point crossover. Solution pairing divides the parent solutions $P(q)$ into a set with “good genes”, $\{P_i^b(q)\}$, and a set with “average genes”, $\{P_j^w(q)\}$. For each pair of solutions $(P_i^b(q), P_j^w(q))$, single-point crossover first generates a random bit value of 0 or 1 for each gene position and then exchanges the gene values at positions with bit value 1 between the two solutions, generating the offspring $\{P_i(q)_C\}$ and $\{P_j(q)_C\}$. We denote all the offspring generated by the crossover operation as follows:

$$\{P(q)_C\} = \{P_i(q)_C\} \cup \{P_j(q)_C\}. \tag{16}$$
For the mutation operation, a random mutation is applied through the following function $N$ to each gene $y_i$ in the population to be mutated, which enhances the diversity of the population's gene sequences and yields the mutated progeny $\{P(q)_M\}$:

$$y_i = N(-1, 1),$$

where $N$ is a normal random function used to generate values from $-1$ to $1$, obeying the probability density function $\frac{1}{\sqrt{2\pi}} e^{-y^2/2}$.
The selection operation is based on a greedy strategy, whereby the $n_p$ solutions with the best fitness values, i.e., the lowest $\Phi(V)$, are selected from $P(q)_C \cup P(q)_M$ as the population $P(q)_{EA}$. The population $P(q)_{EA}$, which comprises the “superior genes”, is then carried forward as the population to be evolved in the next generation. Refer to Algorithm 3 for details of the evolution procedure.
Algorithm 3 Evolution Algorithm
Input: Population size $n_p$, parent solutions $P(q)$, crossover probability $p_{crossover}$, and mutation probability $p_{mutation}$.
Output: Offspring solutions $P(q)_{EA}$.
1: Randomly divide $P(q)$ into two populations: $P(q)_C$ and $P(q)_M$;
2: $\hat{V}, \Phi(\hat{V}) \leftarrow \mathrm{MSS}(P(q)_C, K, H)$;
3: Sort the solutions in $P(q)_C$ in ascending order of $\Phi(\hat{V})$;
4: Form pairs $(P_i^b(q), P_j^w(q))$ from the first 50% of the solutions $\{P_i^b(q)\}$ and the last 50% of the solutions $\{P_j^w(q)\}$;
5: for each pair of solutions $(P_i^b(q), P_j^w(q))$ do
6:   if a randomly generated value $p_1 \leq p_{crossover}$ then
7:     Randomly assign each pair of genes $(x_i, x_j)$ in $P_i^b(q)$ and $P_j^w(q)$ a value of 0 or 1, constituting the sequence $M$;
8:     for each $(x_i, x_j)$ in $(P_i^b(q), P_j^w(q))$ do
9:       if $(x_i, x_j)$ corresponds to a value of 1 in the sequence $M$ then
10:        Exchange the values of $x_i$ and $x_j$ to obtain new $P_i(q)_C$ and $P_j(q)_C$;
11:      end if
12:    end for
13:  end if
14: end for
15: Set $\{P(q)_C\} = \{P_i(q)_C\} \cup \{P_j(q)_C\}$;
16: for each solution $P_i(q)_M$ in $P(q)_M$ do
17:   for each weight parameter $y$ in $P_i(q)_M$ do
18:     if a randomly generated value $p_2 \leq p_{mutation}$ then
19:       Update $y$ to a value drawn from $N(-1, 1)$, obtaining $P_i(q)_M$;
20:     end if
21:   end for
22: end for
23: $\hat{V}, \Phi(\hat{V}) \leftarrow \mathrm{MSS}(P(q)_C \cup P(q)_M, K, H)$;
24: Select the $n_p$ solutions from $P(q)_C \cup P(q)_M$ with the lowest $\Phi(\hat{V})$ as $P(q)_{EA}$;
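The crossover and mutation steps can be sketched on flattened weight vectors as follows; clipping the normal samples to $[-1, 1]$ is one way of realizing the bounded sampler $N(-1, 1)$ described above, and is our assumption.

```python
import numpy as np

def crossover(good, avg, p_crossover, rng):
    """Gene exchange from Algorithm 3 on flattened weight vectors:
    positions where the random bit sequence M equals 1 are swapped."""
    child_a, child_b = good.copy(), avg.copy()
    if rng.random() <= p_crossover:
        M = rng.integers(0, 2, size=good.shape).astype(bool)
        child_a[M], child_b[M] = avg[M], good[M]
    return child_a, child_b

def mutate(solution, p_mutation, rng):
    """Per-gene mutation: affected genes are resampled from a standard
    normal clipped to [-1, 1] (our reading of the N(-1, 1) sampler)."""
    mutated = solution.copy()
    mask = rng.random(size=solution.shape) <= p_mutation
    mutated[mask] = np.clip(rng.standard_normal(mask.sum()), -1.0, 1.0)
    return mutated
```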

5.2. Reinforcement Learning Operation

In large-scale networks, the large number of weight parameters in the DQN makes it difficult to find an optimal solution within a limited number of iterations, resulting in slow convergence. To address this problem, DSEDR introduces an n-step Q-learning technique based on deep reinforcement learning concepts to accelerate DQN training and evolution.
DSEDR stores the historical transition data, i.e., the four-tuples $\{(S_t, a_t, r_t, S_{t+1})\}_{n_g}$, in the cache pool $\mathcal{B}$ throughout the iterative seed node selection process conducted according to the Markov decision process. After $n_g$ iterations of training, the DSEDR algorithm samples transitions in batches from the cache pool based on the $n_b$-step Q-learning technique. These data are then used to compute the loss function $L(W_1, W_2)$ of the DQN with weight parameters $(W_1, W_2)$, which evaluates the expected difference between the target Q-values and the predicted Q-values. The loss function is computed as follows:
$$L(W_1, W_2) = \mathbb{E}_{C \sim \mathcal{B}}\left[\left(\underbrace{\sum_{i=0}^{n_b - 1} r_{t+i} + \gamma \max_{a^*} Q(S_{t+n_b}, a^*; W_1, W_2)}_{\text{target Q-value}} - \underbrace{Q(S_t, a_t; W_1, W_2)}_{\text{predicted Q-value}}\right)^2\right], \tag{17}$$

where $\gamma$ denotes the hyperparameter determining the importance of the rewards $r_{t+i}$, and $\mathbb{E}$ denotes the expectation. $\mathcal{B}$ denotes the cache pool used to store the transition data, and $C = \{(S_t, a_t, r_t, S_{t+1})\}_{n_b}$ denotes a batch of size $n_b$ taken from the cache pool $\mathcal{B}$ for updating the DQN. Based on the loss function $L(W_1, W_2)$, the DSEDR algorithm further computes its gradient $\Delta(W_1, W_2)$ as follows:

$$\Delta(W_1, W_2) = \alpha \left(\sum_{i=0}^{n_b - 1} r_{t+i} + \gamma \max_{a^*} Q(S_{t+n_b}, a^*; W_1, W_2) - Q(S_t, a_t; W_1, W_2)\right) \cdot \nabla_{(W_1, W_2)} Q(S_t, a_t; W_1, W_2), \tag{18}$$

where $\alpha$ denotes the learning rate for the backward update of the DQN, and $\nabla_{(W_1, W_2)}$ denotes the gradient with respect to $(W_1, W_2)$.
DSEDR then adopts the following stochastic gradient update to adjust the network weights $(W_1, W_2)$ of the DQN:

$$(W_1, W_2) \leftarrow (W_1, W_2) + \Delta(W_1, W_2).$$
Following the aforementioned operation, the convergence of the network weights of DQN can be accelerated towards superior weights. Refer to Algorithm 4 for details of the DRL procedure.
Algorithm 4 DRL (Deep Reinforcement Learning) Algorithm
Input: Cache pool $\mathcal{B}$, batch size $n_b$, population $P(q)_{EA}$.
Output: Population $P(q)_{DRL}$.
1: Randomly retrieve $C = \{(S_t, a_t, r_t, S_{t+1})\}_{n_b}$ of size $n_b$ from the cache pool $\mathcal{B}$;
2: Calculate $L(W_1, W_2)$ by Equation (17);
3: Calculate the gradient $\Delta(W_1, W_2)$ of $L(W_1, W_2)$ by Equation (18);
4: Based on $\Delta(W_1, W_2)$, update the weights $(W_1, W_2)$ corresponding to the most favorable population $P(q)_{EA}$, thus obtaining a new population $P(q)_{DRL}$;
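A sketch of the n-step temporal-difference loss of Equation (17) for a single sampled slice is shown below; the batch layout and field names are assumptions of ours.

```python
import torch

def n_step_td_loss(dqn, batch, gamma: float):
    """Eq. (17) for one sampled slice of n_b consecutive transitions.
    `batch` = (X_t, a_t, R_n, X_tn): start-state features, chosen action,
    summed n-step rewards, and features of the state n_b steps later."""
    X_t, a_t, R_n, X_tn = batch
    q_pred = dqn(X_t)[a_t]                        # predicted Q(S_t, a_t)
    with torch.no_grad():                         # target term is not back-propagated
        q_target = R_n + gamma * dqn(X_tn).max()  # n-step return plus bootstrap
    return (q_target - q_pred) ** 2
```

Calling `loss.backward()` on this quantity followed by an SGD step realizes the weight update described above.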

5.3. DSEDR Algorithm

DSEDR combines evolutionary computation with deep reinforcement learning to enhance the evolution of DQNs. The evolutionary computation concurrently evolves a population of individuals, where each individual embodies a DQN, and ultimately derives a solution to the signed network dismantling problem through the previously outlined seed selection strategy. In contrast, deep reinforcement learning leverages the combined information and network-specific insights of the DQNs to speed up their evolutionary process.
More specifically, in each iteration the DSEDR algorithm employs a two-step approach to optimize the DQN weight parameters. First, it searches globally through the crossover and mutation operations of the genetic algorithm. Second, it searches locally through the stochastic gradient update of reinforcement learning. Additionally, it introduces the n-step Q-learning technique, which uses the transitions stored in the cache pool for backpropagation to update the DQN weights, accelerating the training convergence and evolution of the DQNs. Furthermore, the DQNs are evaluated and screened through the Markov decision process and the objective function $\Phi$ in Equation (4). The part of the population with the best fitness values is selected greedily, while the other part is selected at random for the next iteration. This combination ensures both the evolution of the population and its stochastic diversity.

5.4. Complexity Analysis

As previously stated, DSEDR (Algorithm 5) includes four different sub-algorithms: MSS (Algorithm 1), Initialization (Algorithm 2), Evolution (Algorithm 3), and DRL (Algorithm 4). The computational complexity of each sub-algorithm will be calculated separately and subsequently combined to derive the overall algorithmic complexity of DSEDR.
  • MSS: The time complexity of MSS can be computed as $O(K \times (d \times l + \bar{K} \times \bar{K}))$, where $K$ is the number of nodes to be removed, $\bar{K}$ is the average connectivity of nodes in the target network, and $d$ and $l$ denote the numbers of neurons in the first and second layers of the DQN, respectively.
  • Initialization: The time complexity of Initialization can be computed as O ( n I × d × l ) , where n I is the initial population size.
  • EA: The time complexity of EA can be computed as O ( n I × K × ( d × l + K ¯ × K ¯ ) ) .
  • DRL: The time complexity of DRL can be computed as O ( n b × ( d × l + K ¯ × K ¯ ) ) , where n b is the batch size in the DRL Algorithm.
Algorithm 5 DSEDR Algorithm
Input: Initial population size $n_I$, maximum number of iterations $n_g$, population size $n_p$, parent solutions $P(q)$, crossover probability $p_{crossover}$, and mutation probability $p_{mutation}$.
1: $\{P_i(0)\}_{n_I} \leftarrow$ Initialization (Algorithm 2) $(n_I)$;
2: for $q = 0$ to $n_g - 1$ do
3:   $P(q)_{EA} \leftarrow$ Evolution (Algorithm 3) $(n_p, P(q), p_{crossover}, p_{mutation})$;
4:   $P(q)_{DRL} \leftarrow$ DRL (Algorithm 4) $(P(q)_{EA})$;
5:   $V, \Phi(V) \leftarrow$ MSS (Algorithm 1) $(P(q) \cup P(q)_{EA} \cup P(q)_{DRL}, K, H)$;
6:   Select the population of size $n_p/2$ with the lowest $\Phi(V)$ from $P(q) \cup P(q)_{EA} \cup P(q)_{DRL}$ as $P(q)_{Opt}$;
7:   Randomly select a population of size $n_p/2$ from $(P(q) \cup P(q)_{EA} \cup P(q)_{DRL}) \setminus P(q)_{Opt}$ as $P(q)_{Random}$;
8:   $P(q+1) \leftarrow P(q)_{Random} \cup P(q)_{Opt}$;
9: end for
By summing up the complexities of the four sub-algorithms over a total of $n_g$ iterations, the total time complexity of DSEDR can be computed as follows:

$$O\!\left(n_g \times (n_I \times K + n_b) \times (d \times l + \bar{K} \times \bar{K})\right).$$

It is worth noting that in sparse networks, since the number of links is nearly linear in the number of nodes, we can further reduce this complexity to $O(n_g \times (n_I \times K + n_b) \times (d \times l + \theta))$, where $\theta$ is a constant.
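Putting the pieces together, the following high-level sketch mirrors the main loop of Algorithm 5; evolve, drl_update, and mss_phi are hypothetical wrappers around Algorithms 3, 4, and 1, respectively, and only the selection logic at the end of each iteration is spelled out.

```python
import numpy as np

def dsedr(G, H, K, n_I, n_p, n_g, p_cross, p_mut, seed=0):
    """Main loop of Algorithm 5, composing the sketches above.
    evolve(), drl_update(), and mss_phi() are hypothetical helpers."""
    rng = np.random.default_rng(seed)
    population = init_population(n_I, d=H.shape[1], l=32, rng=rng)
    cache = []                                                # transition pool B
    for q in range(n_g):
        offspring = evolve(population, p_cross, p_mut, rng)   # Algorithm 3
        refined = drl_update(offspring, cache)                # Algorithm 4
        candidates = population + offspring + refined
        scored = sorted(candidates,                           # Algorithm 1 rollouts
                        key=lambda w: mss_phi(w, G, H, K, cache))
        elite = scored[:n_p // 2]                             # greedy half, lowest Phi
        pool = scored[n_p // 2:]
        picks = rng.choice(len(pool), size=n_p // 2, replace=False)
        population = elite + [pool[i] for i in picks]         # random half
    return min(population, key=lambda w: mss_phi(w, G, H, K, cache))
```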

6. Experiments

In this section, we compare the performance of DSEDR with those of various baseline methods on multiple-type network datasets to verify its efficiency. Specifically, we first provide a brief description of the experimental settings and then show the experiment results on artificial networks and a real-world network, respectively. All experiments were conducted on a server running Linux, equipped with a 16-core AMD EPYC 9354 CPU, 60.1 GB of memory, an NVIDIA RTX 4090 GPU with 25.2 GB of graphics memory, and a storage disk with 751.6 GB capacity.

6.1. Experimental Settings

6.1.1. Baseline Algorithm

Because few methods have been proposed for signed network dismantling, the baselines are drawn primarily from classical centrality methods and well-established algorithms for complex network dismantling. In this paper, we select 10 classical centrality metrics and 2 conventional algorithms, MinSum [18] and FINDER [20], as baselines. The centrality metrics include ones that ignore edge signs, such as Degree [35], Betweenness [36], K-shell [37], and Closeness [38], as well as ones that take edge signs into account, such as positive degree (P-DEG), negative degree (N-DEG), net degree (Net-DEG), ratio degree (Ratio-DEG), Prestige [39], and PageRank [40]. A brief description of these baselines is provided below.
  • Degree: The degree of a node, i.e., the number of neighboring nodes directly connected to the node [35].
  • Betweenness: Betweenness centrality (BC) reflects how often a node appears on the shortest paths between pairs of other nodes. The BC of node $i$ is defined as follows:

    $$BC(i) = \sum_{s \neq i,\, t \neq i,\, s \neq t} \frac{g_{st}^{i}}{g_{st}},$$

    where $g_{st}$ is the number of shortest paths from node $V_s$ to $V_t$, and $g_{st}^{i}$ is the number of those shortest paths that pass through $V_i$ [36].
  • K-shell: K-shell centrality categorizes nodes based on their degrees to assess their importance in a network. Assuming there are no isolated nodes in the network, we eliminate nodes with one connection until no such nodes remain and assign them to the 1-shell. Similarly, we recursively eliminate nodes with degree 2 to form the 2-shell. This process ends when all nodes have been assigned to one of the shells [37].
  • Closeness: Closeness centrality reflects the distance between a node and all other nodes in the network, measuring the average shortest path length from the node to all others. A higher closeness value indicates a more central position within the network. It is computed as follows:

    $$d_i = \frac{N - 1}{\sum_{j \neq i} d_{ij}},$$

    where $d_{ij}$ is the length of the shortest path between node $V_i$ and node $V_j$ [38].
  • Positive degree (P-DEG): The number of positive edges connected to the $i$-th node, denoted as $k_i^+$.
  • Negative degree (N-DEG): The number of negative edges connected to the $i$-th node, denoted as $k_i^-$.
  • Net degree (Net-DEG): The difference between P-DEG and N-DEG, i.e., $k_i^+ - k_i^-$.
  • Ratio degree (Ratio-DEG): The proportion of positive edges connected to node $V_i$ relative to its total edges, i.e., $\frac{k_i^+}{k_i^+ + k_i^-}$.
  • Prestige: Prestige is determined by both the positive and negative incoming links of a node [39] (a sketch of these signed-degree baselines follows this list). The prestige of node $i$ is calculated as follows:

    $$Pr_i = \frac{k_i^+ - k_i^-}{k_i^+ + k_i^-}.$$
  • PageRank: PageRank, proposed by Larry Page at Google, is among the most prevalent ranking algorithms in use today [40]. Denoting the PageRank score of node $i$ as $PR_i$, the rank value is computed iteratively as follows:

    $$PR_i(t+1) = \alpha \sum_{j \in IN_i} \frac{PR_j(t)}{|OUT_j|} + (1 - \alpha)\frac{1}{N},$$

    where $\alpha$ is a damping factor, $N$ is the number of nodes in the network, $IN_i$ is the set of nodes with edges pointing to node $i$, and $|OUT_j|$ is the number of outgoing links of node $j$.
  • MinSum and FINDER: Two outstanding algorithms in complex network dismantling. For more information, please refer to [18,20].
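The signed-degree baselines above reduce to simple counts over an edge 'sign' attribute, as in the following sketch; the attribute encoding is an assumption of ours, and the unsigned baselines are available directly in networkx.

```python
import networkx as nx

def signed_degree_scores(G: nx.Graph):
    """P-DEG, N-DEG, Net-DEG, Ratio-DEG, and Prestige per node, assuming a
    'sign' edge attribute in {+1, -1}; nodes are then removed in descending
    order of the chosen score."""
    scores = {}
    for v in G.nodes:
        k_pos = sum(1 for _, _, s in G.edges(v, data="sign") if s > 0)
        k_neg = G.degree(v) - k_pos
        k = max(k_pos + k_neg, 1)        # guard against isolated nodes
        scores[v] = {"P-DEG": k_pos, "N-DEG": k_neg, "Net-DEG": k_pos - k_neg,
                     "Ratio-DEG": k_pos / k, "Prestige": (k_pos - k_neg) / k}
    return scores

# The unsigned baselines come directly from networkx, e.g.,
# nx.degree_centrality(G), nx.betweenness_centrality(G),
# nx.closeness_centrality(G), nx.core_number(G), and nx.pagerank(G).
```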

6.1.2. Parameter Setting

Next, we present the specific parameter settings for the evolutionary algorithm and the DQN used in the experiments, including the number of iterations, population size, training batch size, etc., as detailed in Table 2.
Among these parameters, we report the influence of the crossover and mutation probabilities on the DSEDR algorithm in the experiments below, demonstrating that DSEDR maintains a stable dismantling efficiency under different evolutionary probability settings.

6.2. Artificial Network

In our experiments with artificial networks, we choose the famous LFR network [41] and the BA network [42].
The LFR network is a model for generating complex networks with an intrinsic community structure. In LFR networks, the mixing parameter $\mu$ controls the proportion of cross-community edges among the total edges of nodes. A smaller mixing parameter indicates a clearer community structure with more connections within each community. In our experiments, we set the power-law exponent of the node degree distribution $\beta$ to 3 and the power-law exponent of the community size distribution $\gamma$ to 1.5, and construct 5 medium-sized LFR networks with 1000 nodes by setting 5 different mixing parameters $\mu$.
The BA network is a model for generating scale-free networks, characterized by a power-law degree distribution. In this model, a few nodes (high-degree nodes) have a large number of connections, while the majority of nodes have only a few connections. In the BA model, the parameter ω represents the number of connections each new node makes to existing nodes. A larger value for this parameter increases the network’s overall connectivity, making the power-law feature more significant. In our experiments, we construct 5 medium-sized BA networks with 1000 nodes by setting 5 different parameters ω .
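For reference, both benchmark families can be generated with networkx as sketched below. The paper does not specify how edge signs are assigned to the artificial networks, so here each edge is made positive with a fixed probability purely for illustration; the LFR degree and community exponents follow the settings above, while the remaining generator arguments are illustrative.

```python
import networkx as nx
import numpy as np

def make_signed_benchmark(kind: str, param: float, n: int = 1000,
                          p_positive: float = 0.5, seed: int = 0):
    """Generate an LFR or BA benchmark and attach random edge signs:
    each edge is positive with probability p_positive (an assignment of
    ours for illustration). average_degree and min_community below are
    likewise illustrative choices, not taken from the paper."""
    rng = np.random.default_rng(seed)
    if kind == "LFR":
        # tau1 = beta = 3 (degree exponent), tau2 = gamma = 1.5 (community sizes)
        G = nx.LFR_benchmark_graph(n, tau1=3, tau2=1.5, mu=param,
                                   average_degree=6, min_community=30, seed=seed)
        G.remove_edges_from(nx.selfloop_edges(G))
    else:
        G = nx.barabasi_albert_graph(n, m=int(param), seed=seed)  # omega = m
    for u, v in G.edges:
        G.edges[u, v]["sign"] = 1 if rng.random() < p_positive else -1
    return G
```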
Due to the inherent randomness of our algorithm, we conduct 20 independent experiments using DSEDR to dismantle 10% of the nodes on each network. Figure 3 presents the results of the 20 experiments in a box plot and compares the efficiency of DSEDR with the 12 baselines. Figure 4 illustrates the impact of $p_{crossover}$ on the performance of DSEDR across the artificial networks. For each value of $p_{crossover}$, we conduct 20 experiments and plot the mean value of the objective function $\Phi$ in a bar plot for comparison. The results show that our algorithm consistently outperforms the 12 baselines on the 10 artificial networks under different $\mu$ and $\omega$ settings, and remains stable under different $p_{crossover}$ settings.
To verify that DSEDR still maintains its high efficiency on large-scale networks, we also conducted the same experiments on the LFR and BA networks with 10,000 nodes. The results are shown in Figure 5, which demonstrate that even in large-scale networks, DSEDR still exhibits excellent efficiency.

6.3. Real-World Network

For the real-network dismantling experiment, we choose the war network. The war network is a data set of military relationships extracted from the Militarized Interstate Dispute project [43], reflecting the alliances and antagonisms among 166 countries between 2000 and 2010. A positive edge represents an alliance between two countries, while a negative edge represents an antagonistic relationship. The detailed information of this network is shown in Table 3. In this part, we present the dismantling efficiency comparison of the DSEDR algorithm with the 12 baselines on the war network and the influence of the parameter $p_{crossover}$ on DSEDR's dismantling performance. Furthermore, a visualization of the dismantling procedure and an analysis of the real-world meaning of the dismantling are given as well.

6.3.1. Efficiency and Parameter Analysis

Figure 6 shows the efficiency comparison of DSEDR and the 12 baselines. In the experiment, we select 10 different numbers of dismantled nodes. Considering the randomness in the initialization part of the algorithm, we conduct 20 experiments for each number of dismantled nodes. We then present the results as box plots for comparison with the 12 baselines. It is evident that the DSEDR algorithm is much more effective in optimizing the objective function $\Phi$ than the baselines. In particular, when the number of removed nodes reaches 20% of the total nodes (i.e., 33 nodes), the DSEDR algorithm exhibits an apparent advantage over the baselines.
Figure 7 shows the influence of the parameter $p_{crossover}$ on DSEDR's performance when 33 nodes are dismantled. To compare DSEDR with the baselines more concisely, we choose only the two baselines with the best dismantling results, Degree and N-DEG. We conduct 20 experiments with DSEDR for each $p_{crossover}$ value and plot the mean values of the obtained $\Phi$ as a bar plot against the 2 baselines. According to the figure, the performance of the DSEDR algorithm is relatively stable under different $p_{crossover}$ values and remains significantly more efficient than the baselines under all parameter settings.

6.3.2. Visualization and Real Meaning Analysis

To further demonstrate the effect and application value of the DSEDR algorithm in signed network dismantling, this part visualizes the dismantling process of the war network using the DSEDR algorithm in Figure 8. The figure comprises three snapshots, (b), (c), and (d), each displaying the war network after dismantling different numbers of nodes during the process shown in (a).
As we can see from Figure 8d, DSEDR dismantles the complex alliance-versus-confrontation relationships among the 166 countries by removing 33 of the nodes (which can be interpreted as isolating the diplomatic relations of these countries), breaking them down into clearly structured components of alliance nodes, antagonistic nodes, and a number of independent nodes. Unlike traditional complex network dismantling, DSEDR does not only seek to reduce the size of the giant connected component. When the structure of the node clusters is relatively clear, DSEDR further seeks to increase the proportion of negative edges in the network so as to reduce the stability of the alliance-and-confrontation network formed by these 166 countries, which can inform efforts to maintain international peace.

7. Conclusions

In this paper, we propose a new framework named DSEDR to explore the signed network dismantling problem, integrating the advantages of both evolutionary computation and deep reinforcement learning. First, we propose a new objective function that integrates sign information and network topology. Then, we transform the minimization of this objective function into the continuous parameter optimization of a deep Q-network. To obtain the optimal parameters, DSEDR utilizes evolutionary computation by treating different DQN parameter settings as different population individuals and searches for the optimal DQN parameters by combining the global search capability of evolutionary algorithms with the fast local search capability of deep reinforcement learning. Finally, to verify the efficiency of DSEDR, we apply it to multiple types of artificial and real network data sets. We compare its performance with 12 popular baseline methods, and the experimental results demonstrate that DSEDR achieves superior performance for signed network dismantling among all the algorithms. Given its demonstrated superiority on signed network dismantling problems, DSEDR has great application value, such as disrupting rumor propagation networks, finding critical components in sensor networks, and maintaining sensor network stability.
In future research, we will further extend the framework to larger-scale networks with millions of nodes. Moreover, with the development of higher-order network studies, we can apply DSEDR to more complex topologies, such as directed networks and temporal networks. Meanwhile, we can evolve the parameters of the encoder together with the parameters of the DQN, which may further improve the performance of our framework.

Author Contributions

Conceptualization, Y.O. and F.X.; methodology, Y.O. and H.Z.; formal analysis, F.X. and H.Z.; data analysis, H.L. and H.Z.; writing—original draft preparation, Y.O., F.X. and H.Z.; writing—review and editing, H.L. and H.Z.; visualization, Y.O. and H.L.; supervision, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant 71871233) and Beijing Natural Science Foundation (Grant 9182015).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to restrictions on privacy and confidentiality, which prevent open sharing of the dataset.

Acknowledgments

We are grateful to the anonymous reviewers for their valuable suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Akhtar, M.U.; Liu, J.; Liu, X.; Ahmed, S.; Cui, X. NRAND: An efficient and robust dismantling approach for infectious disease network. Inf. Process. Manag. 2023, 60, 103221. [Google Scholar] [CrossRef]
  2. Qi, M.; Chen, P.; Wu, J.; Liang, Y.; Duan, X. Robustness measurement of multiplex networks based on graph spectrum. Chaos 2023, 33, 021102. [Google Scholar] [CrossRef] [PubMed]
  3. Collins, B.; Hoang, D.T.; Nguyen, N.T.; Hwang, D. A new model for predicting and dismantling a complex terrorist network. IEEE Access 2022, 10, 126466–126478. [Google Scholar] [CrossRef]
  4. Duijn, P.A.; Kashirin, V.; Sloot, P.M. The relative ineffectiveness of criminal network disruption. Sci. Rep. 2014, 4, 4238. [Google Scholar] [CrossRef]
  5. Tripathy, R.M.; Bagchi, A.; Mehta, S. A study of rumor control strategies on social networks. In Proceedings of the ACM International Conference on Information & Knowledge Management, Toronto, ON, Canada, 26–30 October 2010. [Google Scholar]
  6. Zhan, X.-X.; Zhang, K.; Ge, L.; Huang, J.; Zhang, Z.; Wei, L.; Sun, G.-Q.; Liu, C.; Zhang, Z.-K. Exploring the effect of social media and spatial characteristics during the COVID-19 pandemic in china. IEEE Trans. Netw. Sci. Eng. 2022, 10, 553–564. [Google Scholar] [CrossRef]
  7. Rahman, K.C. A survey on sensor network. J. Comput. Inf. Technol. 2010, 1, 76–87. [Google Scholar]
  8. Bui, T.N.; Jones, C. Finding good approximate vertex and edge partitions is np-hard. Inf. Process. Lett. 1992, 42, 153–159. [Google Scholar] [CrossRef]
  9. Buldyrev, S.V.; Parshani, R.; Paul, G.; Stanley, H.E.; Havlin, S. Catastrophic cascade of failures in interdependent networks. Nature 2010, 464, 1025–1028. [Google Scholar] [CrossRef] [PubMed]
  10. Osat, S.; Faqeeh, A.; Radicchi, F. Optimal percolation on multiplex networks. Nat. Commun. 2017, 8, 1540. [Google Scholar] [CrossRef] [PubMed]
  11. Leskovec, J.; Huttenlocher, D.; Kleinberg, J. Signed networks in social media. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Atlanta, GA, USA, 10–15 April 2010; pp. 1361–1370. [Google Scholar]
  12. Li, H.-J.; Feng, Y.; Xia, C.; Cao, J. Overlapping graph clustering in attributed networks via generalized cluster potential games. ACM Trans. Knowl. Discov. Data 2024, 18, 1–26. [Google Scholar] [CrossRef]
  13. Tang, J.; Chang, Y.; Aggarwal, C.; Liu, H. A survey of signed network mining in social media. ACM Comput. Surv. 2016, 49, 1–37. [Google Scholar] [CrossRef]
  14. Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep Reinforcement Learning: A Brief Survey. IEEE Signal Process. Mag. 2017, 34, 26–38. [Google Scholar] [CrossRef]
  15. Ma, L.; Shao, Z.; Li, X.; Lin, Q.; Li, J.; Leung, V.C.; Nandi, A.K. Influence Maximization in Complex Networks by Using Evolutionary Deep Reinforcement Learning. IEEE Trans. Emerg. Top. Comput. Intell. 2023, 7, 995–1009. [Google Scholar] [CrossRef]
  16. Osat, S.; Papadopoulos, F.; Teixeira, A.S.; Radicchi, F. Embedding-aided network dismantling. Phys. Rev. Res. 2023, 5, 013076. [Google Scholar] [CrossRef]
  17. Wandelt, S.; Lin, W.; Sun, X.; Zanin, M. From random failures to targeted attacks in network dismantling. Reliab. Eng. Syst. Saf. 2022, 218, 108146. [Google Scholar] [CrossRef]
  18. Braunstein, A.; Dall’Asta, L.; Semerjian, G.; Zdeborová, L. Network dismantling. Proc. Natl. Acad. Sci. USA 2016, 113, 12368–12373. [Google Scholar] [CrossRef]
  19. Yan, D.; Xie, W.; Zhang, Y.; He, Q.; Yang, Y. Hypernetwork dismantling via deep reinforcement learning. IEEE Trans. Netw. Sci. Eng. 2022, 9, 3302–3315. [Google Scholar] [CrossRef]
  20. Fan, C.; Zeng, L.; Sun, Y.; Liu, Y.Y. Finding key players in complex networks through deep reinforcement learning. Nat. Mach. Intell. 2020, 2, 317–324. [Google Scholar] [CrossRef] [PubMed]
  21. Deepali, J.J.; Ishaan, K.; Sadanand, G.; Omkar, K.; Divya, P.; Shivkumar, P. Reinforcement Learning: A Survey. In Machine Learning and Information Processing; Springer: Berlin, Germany, 2021; pp. 297–308. [Google Scholar]
Figure 1. Flowchart of the DSEDR algorithm. The algorithm consists of three parts. First, the Network Embedding procedure generates an embedding of the target network as the input of the DQN, which transforms the problem into the optimization of the DQN weights $(W_1, W_2)$. Then, Evolutionary Deep Reinforcement Learning is employed to find the best weights $(W_{F1}, W_{F2})$. Finally, Signed Network Dismantling is conducted by the DQN with the output weights.
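To make the three stages of Figure 1 concrete, the following Python skeleton mirrors the control flow of the pipeline. It is a minimal sketch only: the embedding, fitness, and scoring functions are toy placeholders (random features, a dummy fitness), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def embed_network(n_nodes, dim):
    # Stage 1 (Network Embedding): in the paper this is learned from the
    # signed graph; random features here are a toy stand-in.
    return rng.normal(size=(n_nodes, dim))

def evolve_weights(H, n_p, n_g):
    # Stage 2 (Evolutionary Deep Reinforcement Learning): the DQN weights
    # play the role of individuals; the fitness used here is a dummy
    # placeholder, not the paper's reward.
    pop = [rng.normal(size=H.shape[1]) for _ in range(n_p)]
    for _ in range(n_g):
        pop.sort(key=lambda w: -float(np.mean(H @ w)))  # dummy fitness
        pop[-1] = pop[0] + rng.normal(scale=0.1, size=pop[0].shape)
    return pop[0]

def dismantle(H, w, K):
    # Stage 3 (Signed Network Dismantling): remove the K nodes that the
    # scoring network ranks highest.
    return list(np.argsort(-(H @ w))[:K])

H = embed_network(n_nodes=20, dim=8)
w_best = evolve_weights(H, n_p=10, n_g=5)
print(dismantle(H, w_best, K=3))
```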
Figure 2. Flowchart of the Markov seed selection procedure, where $H_t$, $S_t$, $D_t$, and $P_t$ denote the network embeddings, the seed selection state vector, the degree vector, and the positive out-degree vector at time $t$, respectively. $Q_t$ is the Q-value vector; $r_t$ and $a_t$ are the reward and action at time $t$. Dashed lines represent positive edges, and solid lines represent negative edges.
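To make the roles of $H_t$, $S_t$, $Q_t$, and $a_t$ concrete, here is a minimal numpy sketch of greedy seed selection, assuming a two-layer feed-forward DQN with weights $(W_1, W_2)$ as in the notation of Table 1. The embeddings, layer sizes, and ReLU activation are illustrative assumptions, and the reward and weight-update steps are omitted.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy dimensions: N nodes, d-dimensional embeddings, l hidden units.
N, d, l = 6, 4, 8
H_t = rng.normal(size=(N, d))   # node embeddings at time t (random stand-in)
W1 = rng.normal(size=(d, l))    # first DQN layer, W1
W2 = rng.normal(size=(l, 1))    # second DQN layer, W2
S_t = np.zeros(N, dtype=bool)   # seed selection state: True = already removed

def q_values(H, W1, W2):
    # Two-layer scoring network: one Q-value per node.
    return (np.maximum(H @ W1, 0.0) @ W2).ravel()

for t in range(3):              # select three seed nodes in sequence
    Q_t = q_values(H_t, W1, W2)
    Q_t[S_t] = -np.inf          # mask nodes that are already selected
    a_t = int(np.argmax(Q_t))   # action a_t: pick the highest-Q node
    S_t[a_t] = True             # update the seed selection state S_t
    print(f"step {t}: remove node {a_t}")
```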
Figure 3. Efficiency comparison on artificial networks. Among the 13 algorithms, DSEDR performs best on LFR networks under 5 different mixing parameters $\mu$ and on BA networks under 5 different parameters $\omega$.
Figure 4. Influence of $p_{crossover}$ on artificial networks. Specifically, (a) shows the results for different values of $p_{crossover}$ on LFR networks with different $\mu$, and (b) shows the results for different values of $p_{crossover}$ on BA networks with different $\omega$.
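The caption does not spell out the crossover operator itself. The sketch below shows one common choice, uniform crossover plus gene-wise Gaussian mutation applied to flattened DQN weight vectors, purely as an illustrative assumption of how $p_{crossover}$ and $p_{mutation}$ enter the evolutionary step.

```python
import numpy as np

rng = np.random.default_rng(0)
p_crossover, p_mutation = 0.8, 0.2    # default values from Table 2

def crossover(parent_a, parent_b):
    # With probability p_crossover, perform uniform crossover on the
    # flattened DQN weights; otherwise copy the first parent.
    if rng.random() < p_crossover:
        mask = rng.random(parent_a.shape) < 0.5
        return np.where(mask, parent_a, parent_b)
    return parent_a.copy()

def mutate(child, scale=0.1):
    # Gaussian mutation applied gene-wise with probability p_mutation.
    mask = rng.random(child.shape) < p_mutation
    return child + mask * rng.normal(scale=scale, size=child.shape)

parent_a = rng.normal(size=10)        # flattened (W1, W2) of one individual
parent_b = rng.normal(size=10)
print(mutate(crossover(parent_a, parent_b)))
```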
Figure 5. Efficiency comparison on 10 different networks with 10,000 nodes after dismantling 10% of the nodes.
Figure 6. Efficiency comparison on the war network. DSEDR outperforms all 12 baseline algorithms for 10 different numbers of removed nodes.
Figure 7. Influence of $p_{crossover}$ on the war network, where red bars denote the efficiency of DSEDR with different values of $p_{crossover}$. Here, we select the top 2 baselines, Degree and Net-DEG, for comparison.
Figure 8. The process of dismantling the war network using DSEDR. DSEDR seeks a node removal sequence that minimizes the objective function $\Phi$ in Equation (4). (a) shows the objective function curve on the war network, with the horizontal axis giving the number of dismantled nodes and the vertical axis giving the $\Phi$ of the residual graph after removing those nodes. (b–d) show snapshots after removing the 8 (b), 17 (c), and 33 (d) key nodes (cyan) determined by DSEDR at the time points marked on the curve in (a). Red lines denote positive edges, and green lines denote negative edges.
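Equation (4) itself is not reproduced in this back matter. As a purely hypothetical stand-in, the sketch below combines a topology term (the relative size $\sigma/N$ of the giant connected component, Table 1) with a sign term (the share of positive edges) weighted by $\lambda$ (Table 2); the actual functional form of $\Phi$ may differ, so this serves only to illustrate how an objective of the residual graph is evaluated after each removal.

```python
import networkx as nx

def phi(G, lam=1.0):
    # Hypothetical stand-in for Equation (4), which is not reproduced
    # here: a topology term (relative giant-component size, sigma / N)
    # plus a sign term (share of positive edges) weighted by lambda,
    # matching the roles of sigma, N, and lambda in Tables 1 and 2.
    if G.number_of_nodes() == 0:
        return 0.0
    sigma = max(len(c) for c in nx.connected_components(G))
    topo = sigma / G.number_of_nodes()
    pos = sum(1 for *_, s in G.edges(data="sign") if s > 0)
    sign = pos / G.number_of_edges() if G.number_of_edges() else 0.0
    return topo + lam * sign

G = nx.Graph()
G.add_edges_from([(0, 1, {"sign": +1}), (1, 2, {"sign": -1}),
                  (2, 3, {"sign": +1}), (3, 4, {"sign": +1})])
print(phi(G))       # objective of the intact toy network
G.remove_node(2)    # remove one key node, as DSEDR would
print(phi(G))       # objective of the residual graph drops
```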
Table 1. Summary of notation.

Notation | Description
$G$ | Target network
$V$ | Set of nodes
$E$ | Set of edges
$N$ | Number of nodes
$L$ | Number of edges
$\sigma$ | Size of the giant connected component
$C_i$ | The $i$-th connected component of $G$
$\delta_i$ | Size of $C_i$
$\Phi$ | Objective function
$\hat{V}$ | Set of nodes to be dismantled
$W_1, W_2$ | Weights of the DQN neural network
$a_t$ | Action of the DQN at time $t$
$S_t$ | Seed node selection state of the DRL at time $t$
$V_t$ | Set of nodes selected for removal at time $t$
$D_t$ | Degree vector of the nodes
$P_t$ | Positive out-degree vector
$r_t$ | Decision reward
$n_I$ | Initial population size
$n_b$ | Batch size in the DRL algorithm
$n_g$ | Maximum number of iterations
$n_p$ | Population size in the EA algorithm
$P$ | Population in evolution
$K$ | Number of nodes to be removed
$\bar{K}$ | Average connectivity of the nodes
$d$ | Number of neurons in the first layer of the DQN
$l$ | Number of neurons in the second layer of the DQN
Table 2. Experimental parameter settings for the proposed DSEDR algorithm.

Parameter | Value
Iteration number $n_g$ | 100
Evolutionary population size $n_p$ | 100
Crossover probability $p_{crossover}$ | 0.8
Mutation probability $p_{mutation}$ | 0.2
Network embedding dimension $d$ | 64
Training batch size $n_b$ | 512
Training discount rate $\gamma$ | 0.8
Training learning rate $\alpha$ | 0.001
Weight $\lambda$ of the positive edge share in $\Phi$ | 1
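For readers reimplementing the method, the settings in Table 2 can be gathered into a single configuration object. The class below is only a hypothetical convenience wrapper around the published values, not part of the authors' code.

```python
from dataclasses import dataclass

@dataclass
class DSEDRConfig:
    # Values taken directly from Table 2; the container itself is
    # an illustrative assumption.
    n_g: int = 100            # iteration number
    n_p: int = 100            # evolutionary population size
    p_crossover: float = 0.8  # crossover probability
    p_mutation: float = 0.2   # mutation probability
    d: int = 64               # network embedding dimension
    n_b: int = 512            # training batch size
    gamma: float = 0.8        # training discount rate
    alpha: float = 0.001      # training learning rate
    lam: float = 1.0          # weight of the positive edge share in Phi

print(DSEDRConfig())
```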
Table 3. Detailed information of the war network.

Network Parameter | Value
Number of nodes $n$ | 166
Number of edges $k$ | 1433
Number of positive edges $k^+$ | 1295
Number of negative edges $k^-$ | 138