1. Introduction
With the rapid development of the internet and cloud computing technology, the demand for integrating the physical world with the digital world is increasing [1]. The efficient, green, and intelligent characteristics of cyber-physical systems (CPSs) match the needs of the times, and CPSs are being rapidly applied in industrial production, national defense, smart grids, smart healthcare, and other fields [2,3,4,5]. In industrial production, CPS technology is deeply intertwined with Industry 4.0: based on embedded systems and information technology, it monitors the status of physical equipment in factories in real time to perceive the production process in depth, and its development is highly consistent with that of smart factories [6,7]. In recent years, with the rapid promotion of automatic driving, the highly integrated information technology of the car interacts with computer and network technology and governs vehicle operation in the physical world through functions such as perception, real-time data processing, and control algorithms [8,9].
However, integrating wireless communication technology into CPSs, while enabling convenient remote control of the system, also makes the communication between systems highly open and vulnerable to malicious attackers [10,11,12]. Such attacks can be categorized by their effects into data tampering attacks and denial of service (DoS) attacks; data tampering attacks can in turn take the form of data modification attacks, false data injection (FDI) attacks, and data replay attacks [13]. In CPSs, a data tampering attack hijacks and alters the normal data exchanged between the sender and the receiver, so that the data used by the receiver are not the true data of the system. The control link then issues wrong control commands, which leads to the malfunction of the whole system [14,15,16].
Studying data tampering attacks helps the defender make timely and appropriate decisions against the attack strategy, issue correct control instructions, and minimize the damage that such attacks cause to the CPS, and related studies already exist. Ref. [17] proposed a linear spoofing attack strategy for the false data injection attack and gave the corresponding feasibility constraints to ensure that the attacker could successfully inject false data without being detected. Ref. [18] investigated the design of FDI attacks for a class of CPSs with a state estimator and an attack detector, where the desired sequence of non-perfect attacks could be designed by analyzing the maximum eigenvalues of the auxiliary matrices and the corresponding eigenvectors in the absence of a priori knowledge of the estimator. Unlike previous attack strategies based on zero-mean Gaussian distributions, ref. [19] proposed an optimal attack strategy based on an arbitrary-mean Gaussian distribution. Building on an understanding of the attacker’s attack pattern, research on attack detection from the defender’s perspective focuses on how to identify and respond to ongoing or potential attacks. In terms of FDI attack detection, ref. [20] designed a nonlinear local joint estimator and proposed a learning-based fusion criterion for multi-sensor FDI attacks to simultaneously estimate the system state and the attack signal. For covert FDI attacks in train-ground communication systems, ref. [21] used an authentication mechanism to design an automatically generated multiplicative coding scheme to detect intrusions, dynamically updated the coding sequences online to encrypt and decrypt the raw train measurements, and then built a defense model based on the navigational projection algorithm to reconstruct the train location information corrupted by the attacker. For detecting FDI attacks in smart grids, ref. [22] proposed a CNN-LSTM method optimized by particle swarm optimization (PSO) to improve security and stability. Once an attack is present in the system, research on defenses focuses on how to mitigate its impact. Ref. [23] gave an optimal attack strategy for data tampering attacks in the framework of system identification based on binary observations, and then proposed and verified a compensation algorithm to defend against such attacks. For data tampering attacks in finite impulse response systems, ref. [24] implemented the optimal attack strategy and further proposed an online defense strategy from the perspective of the defender. Ref. [25] designed a robust adaptive sliding mode observer to estimate the state of the power system, used an attack reconstruction method to estimate the FDI attack signals, and finally proposed a reliable sliding mode control strategy to eliminate the effects of the FDI attack.
For the data tampering attack, the traditional way to achieve the optimal attack uses the estimation error as the index, which requires knowing the defender’s identification algorithm. How, then, should the attacker construct the optimization index without knowledge of the defender’s identification algorithm? After constructing the optimization index, how should the attacker solve for the optimal attack strategy? And since in an actual attack the data sequences are transmitted through the network channel in real time, how should the attacker meet the real-time requirement of an online attack? In this paper, under the framework of system identification with multiple binary observations, the research is carried out from the attacker’s perspective to pave the way for better defense. Regarding the construction of the index, the defender mainly uses the information available in the data for identification; if the attack makes the data contain less information, or makes the data distribution more concentrated, the defender’s identification will deviate more from the true value. Therefore, this paper introduces information entropy as an index of the attack effect and models the selection of the optimal data tampering attack under multiple binary observations as an information entropy-based optimization problem with energy constraints. Because multiple observations make this a multi-parameter optimization task, this paper gives the solution set of the optimal attack strategy based on the particle swarm optimization (PSO) algorithm. To overcome the lag caused by the computation of the PSO algorithm, this paper performs real-time optimization based on the back-propagation (BP) neural network, and finally verifies the conclusions through simulation.
In this paper, the attack strategy may differ in each communication network, the objective to be optimized is the set of optimal attack strategies, the defender’s identification algorithm is unknown to the attacker, and the online attack must meet a real-time requirement. The contributions of this paper are as follows:
For the problem of attack indicators in linear systems with multiple quantized observations, when the defender’s identification algorithm is unknown, this paper constructs the indicator based on the average entropy and turns the optimal data tampering attack problem into an optimization problem with energy constraints.
For the resulting optimization problem, in which the average entropy is a single objective with multiple parameters, this paper obtains the optimal solution based on the PSO algorithm and designs the estimation method for the unknown parameters.
Because the lag of the PSO search prevents the optimal tampering attack from being carried out in real time, this paper uses the optimal solutions of PSO to construct a dataset and trains a BP neural network on it, yielding a model that can quickly produce the solution set of the optimal data tampering attack.
The remaining sections of this paper are organized as follows.
Section 2 describes the model of the system and the strategies of data tampering attacks;
Section 3 constructs the model of the optimization problem under information entropy and obtains the optimal attack strategy based on the PSO algorithm;
Section 4 gives the estimation of the unknown parameters as well as the implementation of online attack based on the BP network;
Section 5 illustrates the reasonableness of the obtained conclusions through numerical simulations; and
Section 6 gives the summary and outlook of this paper.
2. Problem Formulation
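Given a system consisting of $m$ mutually independent binary observations, consider the $l$-th observing system, $l = 1, \ldots, m$:
$$y_k^l = \varphi_l^{T}\theta + d_k^l, \tag{1}$$
where $\theta$ is the system parameter; $\varphi_1, \ldots, \varphi_m$ are the input parameters of the observation systems, each of which is a fixed value that does not change with the moment $k$; and $d_k^l$ is the system noise. $y_k^l$ is the system output of the $l$-th observing system at the $k$-th moment, measured by a binary observation sensor with threshold $C_l$, which is expressed as:
$$s_k^l = \begin{cases} 1, & y_k^l \le C_l, \\ 0, & \text{otherwise}. \end{cases} \tag{2}$$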
As shown in Figure 1, $s_k^l$ denotes the data transmitted to the estimation center through the $l$-th network channel at moment $k$, which are subjected to a cyberattack during transmission and thereby tampered with. The data received by the estimation center are denoted as $\tilde{s}_k^l$, which are related to $s_k^l$ by:
$$\Pr\{\tilde{s}_k^l = 0 \mid s_k^l = 1\} = p_l, \qquad \Pr\{\tilde{s}_k^l = 1 \mid s_k^l = 0\} = q_l. \tag{3}$$
The data tampering attack strategy of the $l$-th observation system described in the above equation is abbreviated as $(p_l, q_l)$.
In the given system, the identification algorithm for the unknown parameter $\theta$ needs to be constructed by the defender, while the attacker aims to disrupt the effectiveness of the identification algorithm under multiple binary observations. Since there are multiple communication networks, the attacker can adopt different attack strategies in different networks, and the attacker’s set of tampering strategies is denoted as
$$\Omega = \{(p_1, q_1), (p_2, q_2), \ldots, (p_m, q_m)\}.$$
In this system, the following questions are posed and addressed in this paper:
- (1) From the perspective of the attacker, when the system parameters are known, how should the attack strategy be adjusted to achieve the maximum attack effect with the minimum energy under the energy constraint?
- (2) When the system parameters are unknown, how should the attacker’s identification algorithm be constructed to accomplish the parameter estimation?
- (3) How can the optimal data tampering attack be implemented while meeting the real-time requirement?
Assumption 1. The system noise is a sequence of independent and identically distributed Gaussian random variables whose probability distribution function and probability density function are denoted as $F(\cdot)$ and $f(\cdot)$, respectively.
Remark 1. When the attacker does not know the threshold and the system parameters, the distribution and the system parameters can be viewed jointly as uncertainties, and the information about the noise distribution can be extracted from the designed input and the corresponding output data of the system with a modified algorithm.
Remark 2. The system in (1) is a quantized linear system model with multiple observing systems; its structure is simple but easy to extend. In subsequent research, nonlinear modules can be combined with it to describe complex nonlinear systems: adding a static nonlinear function to this system yields a Hammerstein system or a Wiener system.
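The observation and tampering model of this section can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors’ code: it assumes a linear output plus Gaussian noise, a sensor that outputs 1 when the output does not exceed the threshold, and an attacker who flips a 1 to 0 with probability `p` and a 0 to 1 with probability `q`. The function name `simulate_channel` and all parameter names are hypothetical.

```python
import numpy as np

def simulate_channel(theta, phi, C, p, q, n_samples, noise_std=1.0, seed=0):
    """Simulate one binary observation channel under a tampering attack.

    Assumed model (illustrative): y_k = phi^T theta + d_k, s_k = 1 if y_k <= C,
    and the attacker flips 1 -> 0 with probability p and 0 -> 1 with probability q.
    Returns (s, s_tilde): the sensor output and the data the estimation center receives.
    """
    rng = np.random.default_rng(seed)
    # Linear output with i.i.d. zero-mean Gaussian noise
    y = phi @ theta + rng.normal(0.0, noise_std, size=n_samples)
    # Binary sensor with threshold C
    s = (y <= C).astype(int)
    # One uniform draw per sample decides whether that sample is flipped
    u = rng.random(n_samples)
    flip = np.where(s == 1, u < p, u < q)
    s_tilde = np.where(flip, 1 - s, s)
    return s, s_tilde
```

With `p = q = 0` the received data equal the sensor output, and with `p = q = 1` every bit is inverted, which gives two quick sanity checks for the sketch.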
3. Optimal Attack Strategy
Since the attacker has no knowledge of the defender’s identification algorithm, it cannot construct an explicit expression for the optimal attack strategy based on that algorithm. Information entropy is used in information theory to express the uncertainty contained in a system; it depends only on the probability distribution of the data in the channel, not on the defender’s identification algorithm, and is often used as a detection index for malicious attacks in CPSs. In this section, the average entropy is introduced to construct the attack effect metric of the system subjected to tampering attacks, from the perspective of reducing the amount of information acquired by the estimation center.
3.1. The Modeling of the Optimal Attack Strategy Problem
For a discrete random variable $X$ with possible values $x_1, x_2, \ldots, x_n$ and probability distribution $P(x_1), P(x_2), \ldots, P(x_n)$, the information entropy of $X$ is defined as:
$$H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i). \tag{4}$$
Theorem 1. Under the Assumption 1 and the tampering strategy (3)
, the overall average entropy of the system after tampering is where . Proof. For a sequence of binary data
in the channel after a data tampering attack, with possible values of
and probability distributions of
and
, respectively. When the input vector corresponding to
is
, according to (
2) and (
3), it is obtained that
From (
4), the information entropy of
is expressed as:
The overall information entropy of all the data in the system, which is the joint entropy of all the random variables of the system, is expressed according to the chain rule of entropy as:
where
is the conditional entropy, which is used to measure the uncertainty or information content of one random variable,
, given other random variables,
and
, provide mutual information to quantify the information reduced by
. Since
are independent of each other, at this point
provides no additional information for the elimination of uncertainty in
, i.e.,
, with
, which leads to
.
Combining (
6) and (
7), there is:
The expression for the average entropy of the system, , with respect to the attack strategy is obtained and the theorem is proved. □
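Because the displayed formulas of Theorem 1 and its proof are not reproduced in this text, the following is only one reading consistent with the surrounding statements, not the authors’ exact expressions. It assumes that the sensor outputs 1 when the output does not exceed the threshold $C_l$, that the attacker flips a 1 to 0 with probability $p_l$ and a 0 to 1 with probability $q_l$, that $F$ is the noise distribution function of Assumption 1, and that the average is taken over the $m$ channels. The symbols $\tilde{s}^l$, $\lambda_l$, $\varphi_l$, and $\bar{H}$ are notation chosen here.

```latex
\begin{align*}
\lambda_l &:= \Pr\{\tilde{s}^l = 1\}
  = (1 - p_l)\,F(C_l - \varphi_l^{T}\theta)
  + q_l\,\bigl(1 - F(C_l - \varphi_l^{T}\theta)\bigr), \\
H(\tilde{s}^l) &= -\lambda_l \log \lambda_l - (1 - \lambda_l)\log(1 - \lambda_l), \\
H(\tilde{s}^1, \ldots, \tilde{s}^m)
  &= \sum_{l=1}^{m} H(\tilde{s}^l \mid \tilde{s}^1, \ldots, \tilde{s}^{l-1})
  = \sum_{l=1}^{m} H(\tilde{s}^l), \\
\bar{H}(\Omega) &= \frac{1}{m}\sum_{l=1}^{m} H(\tilde{s}^l).
\end{align*}
```

Under this reading, the first line is the law of total probability over the untampered bit, and the third line is the chain rule followed by the independence of the channels, which is the step the proof describes in words.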
In the actual model, the attack energy is not infinite, so the data tampering attack in this paper needs to satisfy both the total energy constraint and the maximum energy constraint. The total energy constraint is the upper bound of the sum of attack strategies in the set of attack strategies, , denoted as . There are also restrictions in the attack strategy for a single group attack strategy, and and denote the maximum energy constraints for and , respectively.
Under the energy-constrained condition, combined with (
5), the question to be investigated in this paper is transformed into: How can the attacking policy reasonably allocate energy to the multi-observation system in order to minimize the average entropy of the system as a whole and obtain the optimal data tampering attack strategy? There are:
For data tampering attacks in such multiple binary observation systems, Ω* denotes the optimal attack strategy, that is, the solution of problem (8) under constraints (9)–(11).
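The objective and constraints of this optimization problem can be sketched numerically. The code below is an illustration under assumptions rather than the paper’s formulation: it takes the average entropy to be the mean binary entropy of the tampered bits, assumes zero-mean Gaussian noise, and assumes that the total energy constraint bounds the sum of all flipping probabilities while `p_max` and `q_max` bound each one. The function names and the base-2 logarithm are choices made here.

```python
import math

def normal_cdf(x, std=1.0):
    """Distribution function of a zero-mean Gaussian with standard deviation std."""
    return 0.5 * (1.0 + math.erf(x / (std * math.sqrt(2.0))))

def binary_entropy(lam):
    """Entropy (in bits) of a binary variable that equals 1 with probability lam."""
    if lam <= 0.0 or lam >= 1.0:
        return 0.0
    return -lam * math.log2(lam) - (1.0 - lam) * math.log2(1.0 - lam)

def average_entropy(strategy, theta, phis, thresholds, noise_std=1.0):
    """Average entropy over channels for strategy = [(p_1, q_1), ..., (p_m, q_m)]."""
    total = 0.0
    for (p, q), phi, C in zip(strategy, phis, thresholds):
        mean = sum(a * b for a, b in zip(phi, theta))
        F = normal_cdf(C - mean, noise_std)        # Pr{untampered bit = 1}
        lam = (1.0 - p) * F + q * (1.0 - F)        # Pr{tampered bit = 1}
        total += binary_entropy(lam)
    return total / len(strategy)

def is_feasible(strategy, total_budget, p_max, q_max):
    """Check the assumed per-probability bounds and the total energy budget."""
    within_box = all(0.0 <= p <= p_max and 0.0 <= q <= q_max for p, q in strategy)
    within_budget = sum(p + q for p, q in strategy) <= total_budget
    return within_box and within_budget
```

A lower value of `average_entropy` means the received data carry less information, so under these assumptions the attacker would search the feasible set for the strategy that minimizes it.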
3.2. Optimal Attack Strategy Solving Based on PSO
The attacker’s strategy differs for each observation system, so when the system contains multiple observations, the expression (5) for the average entropy of the system as a whole has multiple independent variables, while the only dependent variable is the average entropy itself. For this kind of single-objective optimization problem with multiple independent variables, it is difficult to find an explicit solution by ordinary computational methods, so an optimization algorithm is needed to solve the problem.
Intelligent optimization algorithms such as PSO and the genetic algorithm (GA) are common choices for solving this kind of optimization problem [26,27]. The PSO algorithm is chosen in this paper for the following reasons:
The PSO algorithm is driven by the particle-optimal and population-optimal positions, so it has memory, and particles with lower fitness are also retained, whereas GA has no memory and previous knowledge is destroyed as the population changes.
For problems with fewer parameters, the PSO algorithm can obtain the optimal solution quickly without losing performance.
The advantage of GA mainly lies in its global search ability; this paper introduces adaptive mutation into the PSO algorithm, which helps it jump out of local optima.
The parameters of the PSO algorithm are easy to adjust in order to select the optimal value.
Combining the above factors, this paper selects the PSO algorithm as the method for solving the multi-parameter optimization problem.
For the system in (1), if the attacker knows the system parameters, the thresholds, and the input vectors, this paper applies PSO to the optimization problem (8) to obtain the optimal attack strategy solution set Ω*.
Optimization design for PSO requires determining the population size, the particle dimensions, the positions and velocities of the particles, the learning factors, the inertia weights, the fitness function, and the maximum number of evolutionary generations.
Population size. Too large a population size will increase the complexity of the algorithm, resulting in a slow solution, while too small a population size will result in the algorithm failing to find an optimal solution.
Particle dimension. The particle dimension is set to since there are m observational systems, each of which the attacker can choose individually .
Learning factor. If the learning factor is too large, the particle easily skips over the optimal position, while if it is too small, particles easily fall into a local optimum.
Inertia weight. The inertia weight determines how much the particle is influenced by its velocity vector at the previous moment; an appropriate value should be chosen.
Fitness function. The fitness function is the objective function to be optimized. In this paper, the fitness function of a particle is the average entropy of the system in (5), and the population-optimal position in PSO, obtained after many iterations or once the termination conditions are satisfied, is the solution to the optimization problem (8).
Maximum number of evolutionary generations. Its effect on the optimization process is similar to that of the population size; too small a maximum number of generations causes the algorithm to terminate early and fail to obtain the optimal solution.
After suitable initialization parameters are selected, the optimal attack strategy, Ω*, can be obtained with the PSO algorithm, whose flow is given in Algorithm 1. In Algorithm 1, the parameters are first initialized. In each iteration, the velocity and position of each particle are updated according to the optimal and current positions, with randomly drawn coefficients. After that, adaptive mutation is performed according to a threshold to give the particle a chance to jump out of a local optimum. Since the optimization problem in this paper includes energy constraints, the current position of the particle is then corrected, and the particle-optimal and population-optimal positions are updated. Once the required iteration accuracy or the maximum number of iterations is reached, the loop terminates and the optimal attack strategy, Ω*, is returned. Provided that the attacker knows the thresholds, the input vectors, and the system parameters, Algorithm 1 satisfies the energy constraints and obtains Ω*.
Algorithm 1: The PSO algorithm to solve Ω*
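The listing of Algorithm 1 is not reproduced in this text, so the steps described above are sketched below. This is an illustrative PSO, not the authors’ implementation: the correction rule (clip each coordinate, then rescale if the total budget is exceeded), the mutation rule (re-draw one random coordinate when a uniform draw exceeds `mutation_threshold`), and every function and parameter name are assumptions made here. The default inertia weight and learning factors follow the values quoted in the simulation section.

```python
import numpy as np

def correct(x, upper, budget):
    """Map a particle back into the assumed feasible set: box bounds, then total budget."""
    x = np.clip(x, 0.0, upper)
    total = x.sum()
    if total > budget:
        x = x * (budget / total)
    return x

def pso_minimize(objective, dim, upper, budget, n_particles=50, n_iter=100,
                 w=0.8, c1=0.3, c2=0.3, mutation_threshold=0.9, seed=0):
    """Minimize objective over [0, upper]^dim subject to sum(x) <= budget."""
    rng = np.random.default_rng(seed)
    pos = np.array([correct(rng.uniform(0.0, upper, dim), upper, budget)
                    for _ in range(n_particles)])
    vel = rng.uniform(-0.1, 0.1, (n_particles, dim)) * upper
    pbest = pos.copy()                                  # particle-optimal positions
    pbest_val = np.array([objective(x) for x in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), float(pbest_val[g])  # population optimum
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(dim), rng.random(dim)
            vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = pos[i] + vel[i]
            # Adaptive mutation: occasionally re-draw one coordinate
            if rng.random() > mutation_threshold:
                pos[i, rng.integers(dim)] = rng.uniform(0.0, upper)
            # Correct the position so the energy constraints hold
            pos[i] = correct(pos[i], upper, budget)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i].copy(), val
                if val < gbest_val:
                    gbest, gbest_val = pos[i].copy(), float(val)
        history.append(gbest_val)
    return gbest, gbest_val, history
```

Passing an average-entropy function as `objective`, with `dim` equal to the number of attack probabilities, would give a search of the kind the text describes. The returned `history` is non-increasing by construction, which makes it convenient for drawing a convergence curve.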
4. Implementation of Optimal Attack Strategy
Given the system parameters, the PSO algorithm can obtain the optimal data tampering attack strategy of the system. In an actual model, however, the system parameters are encrypted and protected and cannot easily be uncovered, so this section gives an estimation algorithm for the system parameters when they are unknown. After the identification, the implementation of the online optimal attack is given.
4.1. Estimation of Unknown Parameter
For the
l-th observation system, the data
output by the binary observation in the structure shown in
Figure 1 is binomially distributed. Based on Assumption 1, the probability that
is 1 is given by the following equation
Denoting
, when
, there is
. Thus, for
m systems, we have the following
m equations
Noting that
and
, an estimate of the system parameter,
, can be obtained when the matrix,
, has full rank,
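The estimation idea of this subsection can be sketched as follows. The code is a hedged illustration, not the paper’s expression (13): it assumes zero-mean Gaussian noise with a known standard deviation, assumes that the probability of a 1 on channel l equals the noise distribution function evaluated at the threshold minus the noise-free output, replaces that probability with the empirical frequency of ones, and solves the resulting linear equations by least squares. The names `estimate_theta`, `Phi`, and `noise_std` are hypothetical.

```python
from statistics import NormalDist
import numpy as np

def estimate_theta(s, Phi, C, noise_std=1.0):
    """Estimate the system parameter from untampered binary outputs.

    s:   (m, N) array of 0/1 sensor outputs, one row per observation system
    Phi: (m, n) matrix whose l-th row is the input vector of system l
    C:   (m,) array of sensor thresholds
    """
    # Empirical frequency of ones on each channel
    eta = np.asarray(s, dtype=float).mean(axis=1)
    # Keep the inverse distribution function finite at 0 and 1
    eta = np.clip(eta, 1e-6, 1.0 - 1e-6)
    noise = NormalDist(0.0, noise_std)
    inv = np.array([noise.inv_cdf(float(e)) for e in eta])
    # Assumed relation: phi_l^T theta = C_l - F^{-1}(eta_l)
    rhs = np.asarray(C, dtype=float) - inv
    theta_hat, *_ = np.linalg.lstsq(np.asarray(Phi, dtype=float), rhs, rcond=None)
    return theta_hat
```

When `Phi` is square and has full rank, the least-squares solution coincides with multiplying by its inverse; `lstsq` is used here only so that the sketch also runs when there are more observation systems than parameters.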
4.2. Real-Time Improvement by BP Neural Network
In the case of unknown parameters, the previous subsection gives the expression (13) by which the attacker obtains the parameter estimate from the data output by the binary observer, so the attacker can estimate the parameters while receiving the data and use Algorithm 1 to implement the optimal data tampering attack. Although the PSO algorithm converges each time the global optimal position is calculated, it is still difficult to meet the real-time requirements of the data tampering attack, so this subsection introduces the BP neural network to improve the real-time performance.
Taking different parameter values as neural network inputs, the optimal attack strategy under each of them is obtained by PSO. The resulting input-to-strategy mapping data are used as a dataset, and the model obtained after BP neural network training is used to predict the optimal attack strategies in online attacks.
The first step is the selection of the dataset. The BP network is a data-driven machine learning algorithm, and training requires a large amount of sufficiently informative data, so the input is sampled uniformly in the sample space and the PSO algorithm is then used to obtain the optimal output for each input, that is, the optimal data tampering attack strategy. The input vector in the training set is sampled uniformly within its upper and lower bounds, so the input data are obtained as follows:
Assume an n-dimensional vector , where each dimension, , takes values ranging between and . Here is the lower bound and is the upper bound.
Initialization, for the i-th dimension, , the range of values is .
The value of
can be expressed as follows:
where
is an integer indicating the index of the sampling point in the
i-th dimension, and
is the interval between each step, that is, the step length of the uniform sampling, which is computed by the formula
, where
is the total number of samples in the
i-th dimension.
Eventually, after sampling each dimension uniformly in the n-dimensional space, all possible combinations of values of the vector are the combinations in each dimension , where each follows the uniform sampling rule described above.
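The uniform sampling procedure described above can be sketched as a grid over the input space. This is a minimal illustration: the function name and arguments are hypothetical, and because the step-length formula is not reproduced in this text, the sketch assumes that both bounds are included, so that the step in dimension i is the range divided by the number of samples minus one.

```python
import itertools
import numpy as np

def uniform_grid(lower, upper, counts):
    """Return every combination of uniformly spaced values in each dimension.

    lower, upper: per-dimension bounds; counts: number of samples per dimension.
    Both bounds are included in each dimension (an assumption of this sketch).
    """
    axes = [np.linspace(a, b, n) for a, b, n in zip(lower, upper, counts)]
    # Cartesian product of the per-dimension sample points
    return np.array(list(itertools.product(*axes)))
```

Each row of the returned array is one candidate input vector, for which the offline PSO search would then be run to produce the matching training target.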
After the dataset is obtained, the BP network is designed with two hidden layers. The BP neural network trained on the dataset has the following architecture:
Input: the input to the network is a 3-dimensional vector, which is the parameter estimate at moment k.
Output: the output of the network is a 6-dimensional vector, which is the optimal data tampering attack strategy.
Parameters:
- Input layer: three neurons, corresponding to the 3-dimensional input vector.
- Hidden layer 1: 20 neurons; the activation function is tansig.
- Hidden layer 2: 20 neurons; the activation function is tansig.
- Output layer: six neurons; the activation function is purelin.
- Other parameters: epochs: 1000; optimizer: Levenberg–Marquardt.
The training process of the BP network model is shown in
Figure 2.
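The 3-20-20-6 architecture listed above can be sketched as a plain forward pass. This is an illustration only: it uses the facts that tansig is mathematically the hyperbolic tangent and purelin is the identity, while the weight initialization and the names `init_network` and `forward` are assumptions made here. Training with Levenberg–Marquardt is not shown.

```python
import numpy as np

def init_network(sizes=(3, 20, 20, 6), seed=0):
    """Create (weights, biases) for each layer with small random weights (illustrative)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, (n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(params, x):
    """Map a 3-dimensional parameter estimate to a 6-dimensional strategy vector."""
    a = np.asarray(x, dtype=float)
    for W, b in params[:-1]:
        a = np.tanh(W @ a + b)      # hidden layers: tansig, i.e. tanh
    W, b = params[-1]
    return W @ a + b                # output layer: purelin, i.e. identity
```

Because the output layer is linear, predictions are not automatically valid probabilities; a practical implementation would probably need to clip or otherwise correct them against the energy constraints before use.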
4.3. Implementation of Online Attack
The model saved after training can be directly invoked in the online attack, and the corresponding attack strategy is generated in real time according to the current moment,
. Therefore, the flow of online attack is as follows
where the initially given value
is obtained from
by
at
k moments and
, and after the parameter estimation,
, is obtained, the estimated value is inputted into the trained model to obtain the current optimal attack strategy,
, and the optimal attack is implemented at the current moment.
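The online flow described above can be sketched as a loop over incoming data. This is a hedged illustration, not the authors’ procedure: `estimate_theta` and `model` are hypothetical callables standing in for the attacker’s estimator and the trained network, and the tampering rule assumes that a 1 is flipped with probability p and a 0 with probability q.

```python
import random

def tamper(bit, p, q, rng):
    """Flip a 1 to 0 with probability p and a 0 to 1 with probability q."""
    if bit == 1:
        return 0 if rng.random() < p else 1
    return 1 if rng.random() < q else 0

def online_attack(bit_stream, estimate_theta, model, seed=0):
    """Tamper with each incoming tuple of sensor bits using the current strategy.

    bit_stream:     iterable of tuples, one bit per observation system per moment
    estimate_theta: callable mapping the observed history to a parameter estimate
    model:          callable mapping that estimate to [(p_1, q_1), ..., (p_m, q_m)]
    """
    rng = random.Random(seed)
    history, sent = [], []
    for bits in bit_stream:
        history.append(bits)
        theta_hat = estimate_theta(history)    # update the estimate at this moment
        strategy = model(theta_hat)            # predicted strategy for this moment
        sent.append(tuple(tamper(b, p, q, rng)
                          for b, (p, q) in zip(bits, strategy)))
    return sent
```

The point of the structure is that each moment costs one estimator update and one model evaluation, with no optimization search inside the loop.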
5. Numerical Simulation
In this paper, we consider the system based on (
1)
where the sample length
; system parameters
; the threshold of binary sensor
; the attack energy constraint is
, the total energy constraint
; and the noise satisfies Assumption 1, and is a Gaussian distribution with mean 0 and standard deviation 10. The initial number of populations for PSO is set to 50, the spatial dimension is 6, the maximum number of iterations is 100, the inertia weight is set to 0.8, and the self-learning factor and population learning factor are both 0.3.
The BP neural network contains two hidden layers with 20 neurons each, and the numbers of nodes in the input and output layers match the dimensions of the input and output data. The PSO optimization algorithm is used to generate 1000 groups of inputs and their corresponding optimal attack strategies as the dataset, which is divided into training and testing sets at a ratio of 7:3.
With the system parameter given, we compared Algorithm 1 with GA. Both use a population of 50 and a maximum of 100 iterations, and the fitness function to be optimized is the average entropy in (5). As shown in Figure 3, Algorithm 1 converges earlier than GA, and the average entropy of the global optimum it converges to is better than that obtained by GA.
Figure 4 plots the average entropy under the optimal attack strategy based on Algorithm 1, compared with two other groups of attack strategies, {0.1, 0.3, 0.2, 0.6, 0.5, 0.2} and {0.1, 0.2, 0.3, 0.2, 0.4, 0.4}. These two groups are fixed offline and cannot be updated in real time as the system parameters change. The average entropy under Algorithm 1 is the smallest, which means that its attack is the most effective. In addition, Figure 5 shows the estimation error, measured by the 2-norm of the difference between the true and estimated values of the system parameter, under the different attack strategies; here the attack based on Algorithm 1 again outperforms the other two groups of attack strategies.
However, it is difficult for Algorithm 1 to meet the attacker’s real-time requirements in actual attacks, so the BP network is needed to improve the real-time performance of online attacks. In this simulation, 1000 sets of mapping data from different parameter values to the optimal attack strategy solution set, Ω*, are used as the training set, and the network model is obtained after iterative training of the BP network. During the online attack, the attacker uses the trained model, takes the real-time parameter estimate as the network input, and uses the network output as the attack strategy for the current attack. In the simulation experiments, the average entropy of the improved online attack based on the BP network is smaller than that of the other two sets of strategies, as shown in Figure 6. Under the same conditions, Figure 7 compares the estimation errors between the true and estimated values of the system parameters; the errors under the proposed network model are larger than those under the other two sets of strategies. In addition, on the same hardware platform and in the MATLAB environment, the time consumed by Algorithm 1 and by the BP model for 5000 sets of data is in a ratio of nearly 19:1, as shown in Table 1; the BP model is therefore faster than Algorithm 1 and meets the real-time requirements.