Advanced Bad Data Injection Attack and Its Mitigation in Cyber-Physical Systems

Abstract: False data injection (FDI) attacks are a hot topic in cyber-physical systems (CPSs). Attackers inject bad data into sensors or return false data to the controller to cause inaccurate state estimation. Although many detection approaches exist, such as the bad data detector (BDD), sequence pattern mining, and machine learning methods, a smart attacker can still inject perfectly crafted false data and go undetected. In this paper, we focus on the advanced false data injection (AFDI) attack and its detection. An AFDI attack is one in which a malicious entity accurately and successively changes sensory data so that the normal system state is continuously evaluated as other legal system states, causing the wrong outflow of commands from controllers. The attack can lead to automatic, long-term system failure or performance degradation. We first describe the AFDI attack model and analyze the limitations of existing detectors in identifying AFDI attacks. Second, we develop a machine learning-based approach that utilizes the k-Nearest Neighbor (KNN) technique and heterogeneous data, including sensory data and system commands, to implement a classifier for detecting AFDI attacks. Finally, simulation experiments demonstrate the impact of AFDI attacks and the effectiveness of the proposed detection method.


Introduction
Many critical infrastructures, such as smart grid and smart transportation systems, are fundamentally supported by underlying cyber systems [1]. Efficient and convenient management can be achieved by adopting cyber systems. However, many vulnerabilities exist in cyber systems, such as malformed message attacks [2][3][4] and denial of service attacks [5]. Therefore, for modern cyber-physical systems (CPSs), many new vulnerabilities originating in cyberspace have been exposed, and consequently, security has become a crucial factor for these modern CPSs. For example, attackers can delay the data transmission between cyberspace and physical space by intruding into the cyberspace, causing wrong control of the physical process [6]. Currently, a vast number of existing studies pay attention to attacks and their mitigations in CPSs [6][7][8][9][10][11][12]. In particular, FDI is a hot topic in CPSs. An FDI attack is one in which attackers modify the measurements of sensors or return false sensory data to the controller, causing wrong state estimation [11,12]. Attackers can launch FDI attacks by injecting bad data into sensors, intruding into communication systems to modify feedback data, or modifying the time stamps of sensory data.
Previous works [13][14][15][16][17][18][19] have extensively studied FDI attacks. However, previous FDI attacks mainly pay attention to masking a wrong system state or disturbing the state estimation by launching just one attack; they do not discuss how to keep going undetected for a long time after an FDI has caused a system fault. In particular, continuously falsifying legal states by injecting legal data into a normally running system, so as to cause automatic, long-term disruption or performance degradation, is a research area with limited work. Legal data are measurements that can pass the detectors and be estimated as a legal system state. A legal state is a state in which the controllers consider the system to be running normally.
On the other hand, many effective detectors of FDI attacks have been proposed. For example, detectors based on machine learning [20,21] and methods based on models [13,22] have been used to detect FDI attacks. These methods detect certain special FDI attacks well. However, an effective detection method still needs to be explored for the case where attackers elaborately falsify normal system states. In particular, the above detection approaches cannot identify FDI attacks that continuously falsify other legal states while the system runs normally.
In this paper, we focus on FDI attacks and their mitigations in large-scale CPSs. First, we describe a complex FDI attack called the advanced false data injection (AFDI) attack, where attackers infer system parameters by monitoring sensory data over a long term and continuously inject legal data to make the normal system state appear as another legal system state, leading to the automatic outflow of mismatched commands. Consequently, long-term harm or performance degradation of physical systems is caused. Different from previous FDI attacks, which only consider how to mask a system exception or disturb the system by launching one FDI attack, AFDI attacks focus on how to cause automatic and successive disruption by injecting only legal measurements under normal situations. Second, we prove that existing detection approaches, such as BDD, sequence pattern mining-based detectors, and machine learning-based detectors, cannot identify AFDI attacks. Third, bearing in mind that commands from the controllers cause changes in sensory data, and that attackers hope to inject false data to generate wrong commands, we propose a novel detection method based on machine learning. The proposed method utilizes the KNN technique and heterogeneous data, including commands from the controller and sensory data, to obtain a classifier for detecting AFDI attacks. Finally, our simulation experiments validate the impact of AFDI attacks and the effectiveness of the proposed detection approach.
Our contributions can be summarized as follows.
• We depict the AFDI attack model, which can directly and successively cause system failure/performance degradation merely by injecting legal data into the normal system.
• We discuss the limitations of fundamental detectors for detecting AFDI attacks.
• We propose a novel machine learning-based detector, the Heterogeneous Data Learning Detector (HeteD), which can effectively identify AFDI attacks.
The rest of this paper is organized as follows. In Section 2, we introduce the system model of CPS and the FDI attack model. We describe the attack model of AFDI in Section 3. In Section 4, we discuss the limitations of fundamental detectors and describe the proposed detection approach against AFDI attacks. Numerical results are given in Section 5. Related work on FDI attacks and their detection is reviewed in Section 6. In Section 7, we conclude this work and propose future work.

System Model of CPS
The model of a large-scale CPS is shown in Figure 1, which includes a cyber system, a communication system, and a physical system. The cyber system is composed of the central controller and the state estimator. The physical system includes a vast number of PLCs and sensors. The communication system is responsible for transmitting information between the physical system and the cyber system. Each cycle, sensors measure the physical process and send sensory data to the cyber system. The sensory data is first transmitted to the state estimator, which evaluates the system state. If no exceptions about the sensory data exist, the estimated system state is sent to the central controller. Then, the central controller issues commands, based on the evaluated state, to the physical domain through the communication system. When the commands reach the physical system, PLCs first receive them and disaggregate them into many subcommands to control the physical process. After that, the physical process changes and the sensory data may differ from previous measurements. The above process can be modeled as a 5-tuple {C, T, S, R, F}:
• C = {c_1, ..., c_m} is a finite set of commands from the central controller. c_k is the kth kind of command. C(k) = {c_i, ..., c_j} indicates the commands issued by the central controller at time k.
• T = {t_1, ..., t_{n_T}} is a finite set of time series. A time series is the measured values of one sensor over time. t_i = {t_i(1), ..., t_i(k)}^T is the time series from the ith sensor, t_i(k) denotes the measurement of the ith sensor at time instant k, and n_T is the number of sensors.
• S = {s_1, ..., s_n} is a finite set of physical system states. s_j = {a_1, ..., a_{n_S}}^T denotes the jth state, with a_i ∈ R. The estimator evaluates the system state at time k from the sensor values via the least-squares estimate

Ŝ(k) = (C_matrix^T C_matrix)^{-1} C_matrix^T T(k),   (1)

where C_matrix ∈ R^{n_T × n_S} is called the Jacobian matrix of the system topology.
Ŝ(k) ∈ S denotes the evaluated state at time k; under normal circumstances, Ŝ(k) = S(k) and T(k) = {t_1(k), ..., t_{n_T}(k)}^T. The system's physical dynamics are described by the following widely adopted discrete-time model [22]:

T(k + 1) = A T(k) + B g(u(k)),   (2)

where A ∈ R^{n_T × n_T} and B ∈ R^{n_T × m} are constant matrices.
u(k) = {u_1(k), ..., u_m(k)}^T denotes the control signals determined by the control commands at time k. The value of u_i(k) is determined by the corresponding command c_i, and g(·) ∈ R^{m×1} is a function of u(k).
• R = {r_1, ..., r_i, ..., r_{n_R}} is a finite set of relationships among states and commands. r_k = s_i → {c_i, ..., c_j} (r_k ∈ R) indicates that when the system state is s_i, the central controller will issue the set of commands {c_i, ..., c_j}. These commands may be activated by the corresponding state or be input by operators.
• F = {f_1, ..., f_{n_F}} is a subset of S. Any state in F is called an illegal state. A disruption or degradation of performance occurs when the system state is some f_i.
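As a concrete illustration of Equations (1) and (2), the following Python sketch implements the least-squares state estimate and one step of the discrete-time dynamics. The matrices, dimensions, and the choice g(·) = tanh are illustrative assumptions, not parameters of any real system.

```python
# Minimal sketch of the CPS model: least-squares state estimation (Eq. (1))
# and the discrete-time sensor dynamics (Eq. (2)). All matrices below are
# illustrative placeholders.
import numpy as np

n_T, n_S, m = 4, 2, 3                       # sensors, state dims, commands
rng = np.random.default_rng(0)
C_matrix = rng.standard_normal((n_T, n_S))  # Jacobian of the system topology

def estimate_state(T_k):
    """S_hat(k) = (C^T C)^{-1} C^T T(k): least-squares state estimate."""
    return np.linalg.solve(C_matrix.T @ C_matrix, C_matrix.T @ T_k)

A = 0.9 * np.eye(n_T)                       # placeholder dynamics matrices
B = rng.standard_normal((n_T, m))

def step(T_k, u_k, g=np.tanh):
    """T(k+1) = A T(k) + B g(u(k)): one step of the physical dynamics."""
    return A @ T_k + B @ g(u_k)

S_true = np.array([1.0, -2.0])
T_k = C_matrix @ S_true                     # noise-free measurements
S_hat = estimate_state(T_k)
print(np.allclose(S_hat, S_true))           # estimator recovers the state
```

With noise-free measurements the estimator recovers the true state exactly, which is the sense in which Ŝ(k) = S(k) under normal circumstances.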

FDI Attack Model
We describe an FDI attack from three aspects: attack goal, attacker's knowledge, and attack capability. Attacks with different goals, knowledge, and capabilities can obtain different levels of impact.
Attack goal. Injecting false data can make the evaluated state Ŝ(t) differ from the real state S(t). Based on different intents, the goal can be divided into two categories: (i) inject bad data to mask the wrong state S(t) = s_f (s_f ∈ F); (ii) inject data to cause the outflow of wrong commands C(t) (Ŝ(t) → C(t) ∉ R). Attacker's knowledge. An attacker can have different levels of knowledge of the target system, including (i) the sensory data, T; (ii) the Jacobian matrix, C_matrix; (iii) the system commands, C; (iv) the system parameters A, B, and g; and (v) knowledge of the states, R and F. Depending on the assumptions made on each of these components, we can envisage different attack scenarios. Typically, two main settings are considered, referred to as attacks with perfect and with limited knowledge.

• Perfect knowledge. In this case, everything about the targeted system is assumed to be known by the attacker. Components T and C can be obtained by intruding into the communication system. An attacker can learn F and C_matrix from public information and system theory. Combining historical data and C_matrix, R can be obtained. If the attacker is a designer of the system, {A, B, g} may be known.

• Limited knowledge. We assume that the attacker can gather information from public datasets and collect data by intruding into the CPS; thus, although knowledge is limited in many respects, {T, C_matrix, C, F, R} can still be obtained by the attacker. Considering that the CPS may be complex, {A, B, g} cannot be obtained in many situations.
Attack capability. An attacker can have different levels of attack capability: (i) the attacker can inject any value into a subset of sensors; (ii) the attacker can simultaneously modify the sensory data of all sensors, but the injected data is limited; (iii) the attacker can inject any sensory data into any sensor.
• Situation (i): because vulnerabilities exist in sensors and intruding into some sensors may be easy, a subset of sensors can be attacked and any bad data can be injected into them.
• Situation (ii): as described in the work by the authors of [18], attacks such as time synchronization attacks can be launched to modify time stamps. The values of all sensors can be modified simultaneously; however, the injected values are confined to a limited range.
• Situation (iii): a vast number of sensors exist and they are distributed, so intruding into all of them is impractical. However, an attacker may intrude into the communication system and modify the feedback data transmitted from the sensors to the state estimator. For example, in the work by the authors of [19], sensory data is fed back from PLCs to the state estimator, and attackers may manipulate the PLCs to modify all of the feedback data (e.g., by attacking the website for PLC firmware updates).

AFDI Attack Model
We first characterize the AFDI model from attack goal, attacker's knowledge, and attack capability. Then, we describe the attack process of AFDI.
AFDI Attack Goal. For AFDI attacks, the outflow of mismatched commands is needed to disrupt the physical system. Considering that long-term disruption or performance degradation is the primary target of AFDI attacks, continuously masking the fault state is also needed.
AFDI Attacker's Knowledge. To go undetected for a long time, an attacker should know what sensory data is proper for the current state, which suggests that knowledge {T, C_matrix, C, F, R, A, B, g} is needed. By analyzing historical data and public information, {C_matrix, C, F, T, R} can be obtained. Below, we present a method by which attackers without prior knowledge of {A, B, g} can perfectly falsify system states; thus, launching an AFDI attack requires only limited knowledge. AFDI Attack Capability. Considering that the attacks last for a long time and the measurements of any sensor may be influenced, an AFDI attack requires that the attacker can modify all sensory data. Therefore, the ability to inject any data into any sensor is needed by an attacker to launch AFDI attacks.
In the follow-up discussion, we assume that a highly skilled attacker possesses the limited knowledge and the required attack capability, which is feasible, as shown by attack events such as Stuxnet [23]. The attack process is described as follows.

• Collect information and compute parameter A. A malicious entity collects sensory data {T(1), ..., T(N)} and system commands {C(1), ..., C(N)}. Then, the attacker generates a collection V = ∅ and adds to V every element <T(i), T(j)> satisfying i < j, Ŝ(i) = Ŝ(j), and C(i) = C(j). For every such pair the commands coincide, so from Equation (2) we obtain

T(i + 1) − T(j + 1) = A(T(i) − T(j)).

By utilizing linear regression over the pairs in V, A can be estimated.
• Search state transition paths. By analyzing historical data, a malicious entity can find the existing relationships R among states and commands. Simultaneously, state transition paths are mined. A state transition path denotes a state sequence, such as {s_1, s_2, s_3}: the system state is s_1 at the beginning, then the state changes to s_2, and, with the outflow of commands, the system state becomes s_3.
In this case, an attacker needs to find two normal state transition paths:

Path 1: s_d --C_d--> G_s, with G_s: s_{i_1} --C_{i_1}--> ... --C_{i_{n−1}}--> s_{i_n},
Path 2: s_d --C_f--> s_{j_1} --C_{j_1}--> ... --C_{j_{n−1}}--> s_{j_n},

where C_d, C_f, C_{i_k}, and C_{j_k} denote different sets of commands, and G_s denotes a series of state transitions. s_k --C_{i_k}--> s_l denotes that when the system state is s_k and commands C_{i_k} are issued by the controller, the state becomes s_l.

• Inject bad data continuously. At time l, the system state is s_d, the issued commands are C_f (Path 2), and the attack is launched at time l + 1. We use T(k) to represent the historical sensory data at a time k when the system state was s_d and the commands were C_d (Path 1). The attackers then continuously modify the sensory data such that the controller considers the state path being executed to be Path 1. The injected bad data T(l + j)_bad at time l + j satisfies

T(l + j)_bad = T(k + j), 1 ≤ j ≤ t_bad,   (4)

where t_bad means the duration of bad data injection.

Theorem 1. When the injected bad data satisfies (4), the evaluated state at time l + j is the same as the evaluated state at time k + j.

Proof. Based on (1), the estimated state at time l + j is

Ŝ(l + j) = (C_matrix^T C_matrix)^{-1} C_matrix^T T(l + j)_bad = (C_matrix^T C_matrix)^{-1} C_matrix^T T(k + j) = Ŝ(k + j).

From Theorem 1, we know that, by utilizing the historical data, attackers can continuously falsify other legal states through bad data injection satisfying (4). During the continuous attack, the controller considers that Path 1 is being executed; however, the actual situation is shown in Path 3.
Path 3: the command sequence {C_{i_1}, ..., C_{i_n}} is issued while the real state sequence is {s_{h_1}, ..., s_{h_n}}; for a state s_{h_k}, commands C_{i_k} are not proper to manipulate the physical process, and the physical system may go into illegal states {..., s_{h_k}, ...} (s_{h_k} ∈ F) for a long time.
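The attack steps above can be sketched in Python as follows; the dynamics matrix, the synthetic pair data, and the replay buffer are all illustrative assumptions. The estimation of A exploits the fact that, for pairs with equal states and commands, T(i+1) − T(j+1) = A(T(i) − T(j)).

```python
# Hedged sketch of the AFDI attack steps: (a) estimate A by linear
# regression over pairs <T(i), T(j)> with equal states and commands;
# (b) replay historical Path-1 data, T_bad(l + j) = T(k + j) (Eq. (4)).
# All data here is synthetic.
import numpy as np

n_T = 3
A_true = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.8, 0.1],
                   [0.1, 0.0, 0.7]])

rng = np.random.default_rng(1)
# Synthetic pairs whose states/commands match, so only A drives the gap.
X = rng.standard_normal((n_T, 50))          # columns: T(i) - T(j)
Y = A_true @ X                              # columns: T(i+1) - T(j+1)

# Least-squares regression for A: solve A X = Y.
A_est = Y @ np.linalg.pinv(X)
print(np.allclose(A_est, A_true))

# Replay injection (Eq. (4)): feed Path-1 history instead of live data.
history = [rng.standard_normal(n_T) for _ in range(10)]   # T(k+1..k+10)
def injected(j):
    return history[j - 1]                   # T_bad(l + j) = T(k + j)
```

Since the command inputs cancel in the pair differences, the regression recovers A without knowing B or g, which is why {A, B, g} need not be known in advance.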

Limitation of Fundamental Detectors and Our Detection Approach
In this section, first, we discuss the limitations of existing fundamental detectors including BDD, detector based on machine learning, and sequence pattern mining-based detector. Then, we propose a novel and effective method to detect AFDI attacks.

Limitations of Fundamental Detectors
Reviewing the existing FDI detection approaches, we divide detectors into three categories, including residual based BDD, detectors utilizing classifiers based on machine learning, and detectors based on sequence pattern mining. Next, we discuss their limitations for detecting AFDI attacks, respectively.

Residual-Based BDD
Residual-based BDD utilizes the residual between the observed sensory data and the estimated sensory data to identify anomalies. Intuitively, normal measured sensory data is close to the actual values, and bad data may move the estimated variables away from their true values [13]. Equation (5) represents the detection rule, where l_BDD denotes the threshold:

||T(k) − C_matrix Ŝ(k)||_2 > l_BDD ⇒ alarm.   (5)

Although the residual-based BDD can identify many types of bad data, previous studies such as [13] have proved that attacks cannot be detected when the bad data satisfies

T(k)_bad = T(k) + β,   (6)

where β is a linear combination of the column vectors of C_matrix (i.e., β = C_matrix × h), and T(k)_bad and T(k) denote the bad data and the normal data at time k, respectively. Following the foregoing discussion, Theorem 2 shows that an AFDI attack goes undetected by the residual-based BDD.

Theorem 2. For the descriptor system (Equations (1) and (2)) and the residual-based BDD, AFDI attacks are always undetectable.

Proof. When the AFDI attack is launched at time t + 1 and the attacker hopes to falsify the state transition path from time j + 1 onward (Ŝ(t) = Ŝ(j)), the difference between the injected bad data and the actual data is

T(t + i)_bad − T(t + i) = T(j + i) − T(t + i) = C_matrix(S(j + i) − S(t + i)) = C_matrix × h.

Therefore, based on (6), the injected bad data can pass the residual-based BDD.
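To make Theorem 2 concrete, the following sketch computes the residual of Equation (5) for a genuine measurement and for an injection of the form of Equation (6); the topology matrix and states are synthetic assumptions. Because the injected offset lies in the column space of C_matrix, the two residuals coincide.

```python
# Sketch of the residual-based BDD (Eq. (5)) and why an AFDI-style
# injection passes it: T_bad - T = C_matrix (S(j) - S(t)) lies in the
# column space of C_matrix, so the residual is unchanged. Synthetic data.
import numpy as np

rng = np.random.default_rng(2)
n_T, n_S = 5, 2
C_matrix = rng.standard_normal((n_T, n_S))

def residual(T_k):
    """||T(k) - C_matrix S_hat(k)||_2 with the least-squares estimate."""
    S_hat = np.linalg.lstsq(C_matrix, T_k, rcond=None)[0]
    return np.linalg.norm(T_k - C_matrix @ S_hat)

S_t = np.array([1.0, 0.5])
noise = 0.01 * rng.standard_normal(n_T)
T_t = C_matrix @ S_t + noise                # genuine measurement
S_j = np.array([-0.7, 2.0])                 # another legal state
T_bad = T_t + C_matrix @ (S_j - S_t)        # injection of the form (6)

print(abs(residual(T_bad) - residual(T_t)) < 1e-9)  # identical residuals
```

Whatever threshold l_BDD the defender picks, the injected data triggers an alarm exactly when the genuine data would, so the BDD gains nothing.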

Machine Learning-Based Detectors
In previous studies, the detector based on machine learning can be seen as a binary classification problem. From the perspective of machine learning, there exist three modes including supervised learning, semi-supervised learning, and unsupervised learning. Because methods based on supervised learning provide better detection results [24], we mainly discuss their limitations.
Supervised machine learning techniques utilize a set of labeled training data (i.e., the samples that we have known whether they are abnormal) to generate a classifier. When a new sample without labeling is input into the classifier, the model will show its label. We mainly focus on two methods: Class I: classifier based on sensory data T(t) [21,25]. Class II: classifier based on the difference between two sensory data T(t) − T(t − l d ) [24] , where l d ∈ I is a constant and its value depends on the setting of defenders.
For Class II, labeled samples can be represented as <T(t) − T(t − l_d), y_t>, where y_t ∈ {0, 1} is the label. Although many machine learning techniques exist, such as KNN, random forest (RF), support vector machine (SVM), and naive Bayes (NB), they cannot provide effective detection results when the model is trained only on sensory data and attackers inject bad data as described in Section 3. Theorems 3 and 4 show that, under some conditions, AFDI attacks go undetected by supervised learning techniques. Assume that the task is to predict the label of the injected bad data T(t) at time t; the 1-nearest-neighbor learning technique is taken as an example.
Theorem 3. For the descriptor system (Equations (1) and (2)) and a Class I-based detector utilizing the 1-nearest-neighbor learning technique, i.e., ŷ_t = y_{k*} with k* = arg min_k ||T(t) − T(k)||_2, the AFDI attacks are always undetectable.
Proof. When bad data is injected at time t, we can obtain the following information: (1) y_t = 1; (2) ŷ_j = y_j = ŷ_{j−1} = y_{j−1} = ŷ_{t−1} = y_{t−1} = 0, where ŷ_i denotes the predicted label and y_i denotes the actual label. Then, the label of T(t)_bad is predicted as ŷ_t = y_{k*} with k* = arg min_k ||T(t)_bad − T(k)||_2. Because T(t)_bad = T(j) and thus Ŝ(t) = Ŝ(j), the minimum is attained at k* = j, so ŷ_t = y_j = 0. Therefore, the AFDI attacks can go undetected.
Theorem 4. For the descriptor system (Equations (1) and (2)) and a Class II-based detector utilizing the 1-nearest-neighbor learning technique on samples T(t) − T(t − l_d), when attackers can find two paths satisfying Path 1 and Path 2 in the historical data and the constant l_d is smaller than the duration of G_s, the AFDI attacks are undetectable.
Proof. Assume that bad data is injected at time t and T(t)_bad = T(j). Because l_d is smaller than the duration of G_s, the observed data at time t − l_d is also replayed, and the evaluated system state Ŝ(t − l_d) is the same as the evaluated system state Ŝ(j − l_d). Therefore, we can obtain

T(t)_bad − T(t − l_d)_bad = T(j) − T(j − l_d),

which equals a benign historical sample. Based on the above result, it is easy to obtain ŷ_t = y_j = 0, and AFDI attacks can go undetected.

Because most machine learning techniques assume that similar samples tend to have similar labels, and the attack data T(t)_bad can always find similar benign data T(j), different machine learning techniques have similar ability in detecting such FDI attacks [24]. Therefore, this type of approach is ineffective at identifying AFDI attacks.
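A minimal illustration of the argument of Theorems 3 and 4, with synthetic data: a Class I 1-NN detector assigns a replayed sample the label of its nearest historical neighbor, which is exactly the benign sample it copies.

```python
# Sketch of Theorem 3's argument: a Class I 1-NN detector labels a sample
# with its nearest historical neighbor. Because the injected data
# T_bad(t) equals benign history T(j), the nearest neighbor is T(j)
# itself (label 0) and the attack goes undetected. Synthetic data.
import numpy as np

rng = np.random.default_rng(3)
benign = rng.standard_normal((20, 4))       # historical T(k), all label 0
labels = np.zeros(20, dtype=int)

def one_nn(sample):
    """Predict the label of `sample` as that of its nearest neighbor."""
    d = np.linalg.norm(benign - sample, axis=1)
    return labels[np.argmin(d)]

T_bad = benign[7].copy()                    # replay of T(j), j = 7
print(one_nn(T_bad))                        # -> 0: classified as benign
```

The same logic applies to Class II detectors once the first differences of the replayed stream reproduce first differences of the benign history.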

Sequence Pattern Mining-Based Detector
The sequence pattern mining-based detector utilizes a series of state transitions to denote the normal system behaviors and abnormal system behaviors. An effective method has been proposed in the works by the authors of [26,27], where defenders compute the number of occurrences of every state transition path and use some methods to determine whether a path is normal. In general, the larger the number of occurrences is, the larger the possibility that it is a normal path. An exception is issued when the current path is not in the historical data. For example, Path 1 and Path 2 are normal, but Path 3 is abnormal.
Although this method can effectively detect many attacks, it cannot provide effective detection results when attackers inject bad data as described in Section 3. Theorem 5 shows that AFDI attacks are undetectable by the detector based on sequence pattern mining.
Theorem 5. For the descriptor system (Equations (1) and (2)) and sequence pattern mining-based detector, the AFDI attacks are always undetectable.
Proof. Assume that Path 1 and Path 2 often occur in the running process. When a malicious entity launches attacks and falsifies the executed Path 2 as Path 1, the real state transition path is Path 3. However, the controller and detector consider that Path 1 is executed. From the detector's point of view, Path 1 is normal and attacks do not occur. Therefore, AFDI cannot be detected by the sequence pattern mining-based detector.
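A minimal sketch of such a detector, with illustrative paths and a hypothetical support threshold: it alarms only on paths absent from the history, so the replayed Path 1 raises no alarm even though Path 3 is actually being executed.

```python
# Sketch of a sequence pattern mining-based detector (Theorem 5): it
# flags a state transition path only if the path never occurred in the
# history. Under the replay, the detector observes Path 1, a frequent
# normal path, so no alarm is raised. Paths here are illustrative.
from collections import Counter

history = [("s_d", "s_1", "s_2")] * 40 + [("s_d", "s_3", "s_4")] * 35
counts = Counter(history)

def is_abnormal(path, min_support=1):
    """Alarm when the observed path has insufficient historical support."""
    return counts[path] < min_support

path1 = ("s_d", "s_1", "s_2")               # what the fooled detector sees
path3 = ("s_d", "s_3", "s_9")               # what actually happens
print(is_abnormal(path1), is_abnormal(path3))   # -> False True
```

The detector would catch Path 3 if it could observe it, but the AFDI replay ensures it only ever sees Path 1.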

Our Methodology: Heterogeneous Data Learning Detector (HeteD) for Detecting AFDI Attacks
From the analysis in the previous subsection, it is clear that an AFDI attack can continuously inject bad data to falsify system states while going undetected by the existing detectors. These detectors mainly pay attention to sensory data, but an AFDI attack can modify that data perfectly so as to be undetectable. By analyzing the state transition path, we observe that although attackers can falsify the changes in states, the commands from the controller are not modified. The impact of the attack actually arises from the improper combination of the system state and the issued commands. Therefore, defenders can utilize the information carried by commands to strengthen detectors based on sensory data learning. By utilizing heterogeneous data, including commands and sensory data, we propose HeteD, a detector based on machine learning techniques, to identify AFDI attacks.
As shown in Figure 2, HeteD receives sensory data from the sensors and commands from the controller every unit time. The detector has three components: "Process Data", "Generate Sample", and "Classifier". Each unit time, commands are transmitted to "Process Data", which converts the commands into a series of signals and sends them to "Generate Sample". "Generate Sample" combines the signals and the sensor measurements into a detectable sample and transmits it to "Classifier". "Classifier" decides whether the sample is abnormal based on the KNN technique. Next, we describe the above process in detail.

Figure 2. The architecture of HeteD: commands from the controller and measurements from the sensors flow through the "Process Data", "Generate Sample", and "Classifier" components, which output alarms.

Process Heterogeneous Data
In this step, we convert every command into a control signal taking two values, "0" and "1". We divide commands into two classes: continuous commands and instant commands. A continuous command is one whose impact on the system state lasts for some time, or persists until the next related command occurs. An instant command is one whose impact on the system state occurs instantly but is not sustained. For example, the command "turn on or off a valve" is a continuous command, and the command "increase demands" in the smart grid case is an instant command. Data processing for the two types of commands is described as follows.

• Continuous Command Processing
When a continuous command occurs, the value of the signal becomes "1" and remains invariant for a time interval. The duration depends on the character of the corresponding command. For example, if the influence of the command lasts only a fixed time interval, the value of the signal becomes "0" when the duration of "1" equals that fixed time interval. If the impact of a command is stopped when another command occurs, the value of the signal becomes "0" once the corresponding command occurs. Figure 3a-c describes the two situations. In Figure 3a, a command that allocates resources to users with a limited effective time interval t_interval is issued at time 0; when the time reaches t_interval, the value of the signal becomes "0". In Figure 3b,c, we show the values of two signals for the commands "turn on the valve" and "turn off the valve". When "turn on the valve" occurs at time 0, the value of the corresponding signal becomes "1". When "turn off the valve" occurs at time t_interval, the value of the signal for "turn on the valve" becomes "0" and the value of the signal for "turn off the valve" becomes "1".

• Instant Command Processing
When an instant command occurs, the value of the signal becomes "1" from "0" or changes from "1" to "0". The value of the signal remains invariant until its next outflow. In Figure 3d, a command that increases demand into the smart grid is issued at time 0 and time t interval . The value of the corresponding signal changes from "0" to "1" at 0 and changes from "1" to "0" at t interval .
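The two processing rules can be sketched as follows; the command names, event times, and durations are illustrative.

```python
# Sketch of the "Process Data" step: converting continuous and instant
# commands into 0/1 control signals over discrete time.
def continuous_signal(on_times, off_times, horizon):
    """Signal is 1 from an 'on' event until the matching 'off' event."""
    sig, state = [], 0
    for t in range(horizon):
        if t in on_times:
            state = 1
        if t in off_times:
            state = 0
        sig.append(state)
    return sig

def instant_signal(event_times, horizon):
    """Signal toggles between 0 and 1 at each outflow of the command."""
    sig, state = [], 0
    for t in range(horizon):
        if t in event_times:
            state = 1 - state
        sig.append(state)
    return sig

# "turn on the valve" at t=0, "turn off the valve" at t=5 (Figure 3b,c)
print(continuous_signal({0}, {5}, 8))       # -> [1, 1, 1, 1, 1, 0, 0, 0]
# "increase demand" issued at t=0 and t=5 (Figure 3d)
print(instant_signal({0, 5}, 8))            # -> [1, 1, 1, 1, 1, 0, 0, 0]
```

Both signals happen to coincide on this example; they differ once a third event arrives, since the instant signal toggles while the continuous signal needs a matching "off".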

Sample Generation
Each unit time, HeteD combines signals and sensory data to generate a sample and sends the sample to "Classifier". Next, we describe the structure of a sample.
We utilize the variable V_S(t) = {V(t)^T, T(t)^T}^T to represent the combination of signals and time series at time t, where V(t) = {v_1(t), v_2(t), ..., v_m(t)}^T denotes the signal vector at time t and v_i(t) denotes the signal of command c_i at time t. To capture the temporal structure of the data sequence, we generate samples using the concept of the first difference (the work by the authors of [24] used this technique to implement the FDML detector, which belongs to Class II introduced in Section 4.1.2). Therefore, the new sample N_S(t) at time t can be described as

N_S(t) = V_S(t) − V_S(t − l_d),

where l_d is a parameter set by the detector.
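A minimal sketch of the sample construction, with illustrative dimensions (two signals, two sensors):

```python
# Sketch of the "Generate Sample" step: stack signals V(t) on top of
# measurements T(t) and take the first difference with lag l_d to form
# N_S(t) = V_S(t) - V_S(t - l_d). Dimensions are illustrative.
import numpy as np

def make_sample(V_t, T_t, V_prev, T_prev):
    vs_now = np.concatenate([V_t, T_t])         # V_S(t)
    vs_prev = np.concatenate([V_prev, T_prev])  # V_S(t - l_d)
    return vs_now - vs_prev                     # N_S(t)

V_t, T_t = np.array([1.0, 0.0]), np.array([50.1, 2400.0])
V_p, T_p = np.array([0.0, 0.0]), np.array([50.0, 2400.0])
print(make_sample(V_t, T_t, V_p, T_p))
```

The first two entries of the sample capture which commands changed over the lag window, and the rest capture how the measurements moved, so command/measurement mismatches become visible in one feature vector.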

Classifier
"Classifier" is implemented by utilizing the KNN technique. We assume that abundant historical data is available, including normal samples and abnormal samples; if abnormal samples are scarce, we can obtain them by simulating attacks. A label is annotated on every historical sample: "0" if the sample is normal and "1" otherwise. When a new sample N_S(t) is input into "Classifier", the component searches for the K_c historical samples N_S^i that minimize ||N_S(t) − N_S^i||_2, where K_c ∈ I is a parameter of the detector.
For K c selected samples, if the number of samples whose labels are equal to "1" is larger than the number of samples whose labels are equal to "0", the predicted label of N S (t) is "1", otherwise, the label is "0". When the predicted label is "1", an alarm is issued from the detector.
Finally, we discuss why HeteD achieves a better detection effect than FDML. Considering the process of an AFDI attack, when attackers inject bad data satisfying Equation (4) from time t onward, the bad sensory data T'(t) causes the outflow of wrong control commands. From HeteD's point of view, the sensory data is T'(t) and the control commands are changed from C(t) to C'(t). Consequently, when AFDI attacks satisfy Equation (4), the sensory component of the generated sample matches a benign historical sample, but the signal component V(t), which reflects the actually issued commands C'(t), no longer matches it; hence, N_S(t) differs from the corresponding normal samples. Accordingly, normal situations can be evaluated as normal samples while abnormal situations can be evaluated as abnormal samples. Therefore, compared with FDML (Theorem 4), HeteD achieves a better detection effect.
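A minimal sketch of the K_c-NN majority vote and of the intuition above, using hand-made samples: the replayed sample copies a benign sensor value but carries a mismatched signal vector, so its nearest neighbors are attack samples.

```python
# Sketch of HeteD's K_c-nearest-neighbor classifier with majority vote.
# Features and labels below are synthetic illustrations.
import numpy as np

def knn_predict(sample, X, y, K_c=3):
    """Majority vote over the K_c nearest historical samples."""
    d = np.linalg.norm(X - sample, axis=1)
    votes = y[np.argsort(d)[:K_c]]
    return int(votes.sum() * 2 > K_c)

# features: [signal v1, signal v2, sensor t1]; labels: 0 normal, 1 attack
X = np.array([[1., 0., 5.0], [1., 0., 5.1], [1., 0., 4.9],   # normal
              [0., 1., 5.0], [0., 1., 5.1], [0., 1., 4.9]])  # attack
y = np.array([0, 0, 0, 1, 1, 1])

replayed = np.array([0., 1., 5.0])  # benign sensor value, mismatched signal
print(knn_predict(replayed, X, y))  # -> 1: flagged as attack
```

A sensory-data-only detector would see only the last coordinate, 5.0, and find benign neighbors; the appended signal coordinates are what separate the replayed sample from the normal cluster.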

Numerical Results
In this section, several simulations are given to validate the impact of AFDI attacks and to evaluate the performance of HeteD. Considering that we have proved that the BDD, the sequence pattern mining-based detector, and the Class I-based detector utilizing machine learning cannot identify AFDI attacks, we only compare the effectiveness of our method with FDML, the approach using the first difference between two sensory data in the work by the authors of [24].

Scenario
The imbalance between demand and generation can lead to deviation of the frequency from its normal value. If the deviation exceeds a threshold for a long time, some generators may be disconnected from the grid, leading to cascading failure [28]. Direct load control is responsible for removing the deviation in frequency. The model of direct load control in the smart grid is shown in Figure 4. The direct load controller can turn electric appliances off or on to change the loads of the smart grid and decrease the frequency deviation. When the frequency is lower than the normal frequency, f_n = 50 Hz, the direct load controller decreases the use of direct load; when the current frequency is higher than the normal frequency, the direct load controller increases the use of direct load [29]. Users can also randomly turn appliances on or off to increase or decrease the loads of the smart grid. Under normal situations, when the demands of users change, the generation of the generators changes and the frequency oscillates until it settles. The generation G(t) follows a governor response that drives the frequency toward the set point f_sp and is bounded by the maximum power of the generators, G_MAX, with governor time constant M = 3 s and update interval u_t = 1 s.

Figure 4. The direct load control model: the direct load controller manages direct loads (data centers, household electric appliances, and electric vehicles) connected to the electric grid.

When the frequency becomes stable, it may not be equal to the normal frequency. Then, as described in the work by the authors of [30], the direct load controller is activated and computes an output load L_D from the frequency deviation, where the output load refers to the load that the direct load controller will turn on or off. L_terminal(t) denotes the sum of terminal loads at time t; the output load is bounded by L_U(t), the amount of load that can be turned off at time t, and by L_R(t), the amount of load that can be turned on at time t.
We use Matlab to implement the direct load control. We only pay attention to the sensors that measure the frequency of the smart grid and the demands of users. The frequency is computed from the relationship between demand and generation [30] via the swing dynamics, where ω_nor denotes the rotating frequency at the normal frequency and the inertia constant is h = 4. The system has 16 commands, shown in Table 1, where commands C1-C8 are automatically issued by the direct load controller and commands C9-C16 arise from changes in users' demands. The direct load controller issues commands based on the frequency and the current demands of users; users' demands can change randomly. When the frequency becomes higher than 50.2 Hz or lower than 49.8 Hz and this lasts for 40 s, generators or electrical appliances will be broken.

Table 1. Description of commands in the smart grid.

Command — Description
C 1 / C 16 — Turn on a load of X ∈ (0 MW, 120 MW) / Turn off a load of X ∈ (0 MW, 120 MW)

We randomly change users' demands, L U (t) and L R (t), to collect a series of normal historical data.

Attack Effect on the Smart Grid
In this subsection, we consider two normal situations. Situation 1: As shown in Figure 5, at time t = 0, demand changes from 2400 MW to 2220 MW and the frequency begins to oscillate. Once the frequency becomes stable at time t = 22, the direct load controller issues command C 3 and the frequency returns to the normal value.
Situation 2: As shown in Figure 5, at time t = 0, demand changes from 2400 MW to 2760 MW and the frequency begins to oscillate. Once the frequency becomes stable at time t = 22, the direct load controller issues command C 6 and the frequency returns to the normal value.
Based on these two situations, we study two AFDI attack cases. Case 1: When situation 1 occurs, attackers replace the real measurements of demand and frequency with the measurements recorded under situation 2, and the attack lasts for 100 s. Case 2: When situation 2 occurs, attackers replace the real measurements of demand and frequency with the measurements recorded under situation 1, and the attack lasts for 100 s. The real measurements under the two attack cases are shown in Figure 6. For case 1, comparing Figure 6a with Figure 5a, because the injected data come from situation 2, the direct load controller believes that the frequency is lower than normal; therefore, the command "turn off loads" is issued at time t = 22. However, the real demand is smaller than the generation, and this operation enlarges the frequency deviation, as shown in Figure 6b. Eventually, the frequency exceeds 50.2 Hz and remains there for a long time; by time t = 100, generators will be disconnected and electrical appliances may be damaged. For case 2, comparing Figure 6a with Figure 5a, because the injected data come from situation 1, the direct load controller believes that the frequency is higher than normal; therefore, the command "turn on loads" is issued at time t = 22. However, the real demand is larger than the generation, and this operation enlarges the frequency deviation, as shown in Figure 6b. Eventually, the frequency drops below 49.8 Hz and remains there for a long time; by time t = 100, generators will be disconnected and electrical appliances may be damaged.
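The replay mechanism behind both attack cases can be sketched as follows: during the attack window, the controller receives pre-recorded samples from the other situation instead of the real measurements. The function name, the trace representation, and the 1 s sampling are illustrative assumptions for this sketch.

```python
# Hedged sketch of the AFDI replay used in cases 1 and 2: the attacker
# substitutes a recorded demand/frequency trace for the real one.
def afdi_replay(real_trace, recorded_trace, start, duration=100):
    """Return the measurement stream seen by the controller under attack.

    real_trace / recorded_trace: lists of (demand_mw, frequency_hz)
    samples at 1 s resolution. Between `start` and `start + duration`
    the controller receives the recorded (false) samples.
    """
    seen = []
    for t, sample in enumerate(real_trace):
        if start <= t < start + duration and t < len(recorded_trace):
            seen.append(recorded_trace[t])   # injected false data
        else:
            seen.append(sample)              # genuine measurement
    return seen
```

Because the recorded trace is itself a legal system trajectory, each injected sample looks plausible in isolation, which is exactly why per-sample residual checks such as BDD fail here.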
These results show that AFDI attacks on the smart grid can cause severe disruption.

Performance of HeteD on the Smart Grid
We first introduce two performance metrics: the false positive ratio and the false negative ratio. The false positive ratio is the ratio of secure samples classified as anomalies to all secure samples. The false negative ratio is the ratio of anomalous samples classified as normal to all anomalous samples.
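These two metrics can be computed directly from the ground-truth labels and the detector's predictions; the function names and the 0/1 label encoding (0 = secure, 1 = anomaly) are conventions chosen for this sketch.

```python
def false_positive_ratio(labels, preds):
    """Fraction of secure (label 0) samples classified as anomalies."""
    secure = [p for l, p in zip(labels, preds) if l == 0]
    return sum(1 for p in secure if p == 1) / len(secure)

def false_negative_ratio(labels, preds):
    """Fraction of anomalous (label 1) samples classified as normal."""
    anomalous = [p for l, p in zip(labels, preds) if l == 1]
    return sum(1 for p in anomalous if p == 0) / len(anomalous)
```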
We treat commands C1-C16 as instant commands. To compare the detection performance of our approach with FDML, we launch 50,000 AFDI attacks at different time points to obtain 50,000 attack samples; every attack lasts 120 s. We use the same 50,000 situations without attacks as normal samples. 49,000 attack samples and 49,000 normal samples are used for training, and the remaining samples are used for testing.
We implement the two detectors, HeteD and FDML, in Python. The detection performance of the two detectors on the corresponding dataset is shown in Figure 7, where K c = 5. Across different values of l d , the false positive ratio of HeteD is lower than that of FDML (Figure 7a), and its false negative ratio also remains lower than that of FDML (Figure 7b). Comparing the detection results of HeteD and FDML on the smart grid, when we use HeteD with l d = 1, the false positive ratio is very small, the false negative ratio is below 20%, and a good detection effect is achieved.
Next, we analyze why the two detection methods have different detection effects. When FDML is used to detect exceptions, the samples are constructed from demand and frequency. As described in Equation (4), when l d is small there exists a vector T(k) satisfying Theorem 4. Therefore, most abnormal situations may be classified as normal samples and many normal situations as abnormal samples. For example, when case 1 in Figure 6 occurs, the bad data (frequency and demand under situation 2) can be seen as normal samples and FDML cannot detect the anomaly; the same conclusion holds for case 2 in Figure 6. In fact, malicious samples and normal samples have the same vectors, so the detection effect is very poor. When HeteD is used to detect exceptions, the samples are constructed from sensory data and control commands. The abrupt changes in control commands mean that most attacks cannot be mistaken for normal samples and most normal running situations are not mistaken for attack samples. For example, when case 1 in Figure 6 occurs, the bad data (frequency and demand under situation 2) cannot match the command sequence {C 13 , C 6 }, because {C 12 , C 6 } is issued under normal situations. When case 2 in Figure 6 occurs, the bad data (frequency and demand under situation 1) cannot match the command sequence {C 12 , C 3 }, because {C 13 , C 3 } is issued under normal situations. When l d = 1, both examples can be identified. Therefore, HeteD provides the better detection effect.
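Since HeteD classifies heterogeneous samples with a k-nearest-neighbor vote (k = K c = 5 above), the core of the classifier can be sketched in a few lines. This is a generic k-NN sketch, not the paper's implementation: the feature layout (sensory values concatenated with an encoding of the issued commands) and the Euclidean distance are assumptions for illustration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=5):
    """Label `query` by majority vote among its k nearest training samples.

    train: list of (feature_vector, label) pairs, where a feature vector
    concatenates sensory data with a numeric encoding of commands.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In practice, command identifiers would need a numeric encoding (e.g., one-hot) and the sensory values a normalization step so that no single feature dominates the distance.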

Scenario
A tank system [7,31] is simulated using Matlab/Simulink; its structure is shown in Figure 8. The controller of the tank system receives requests from users to produce liquids C, E, F, and G by neutralizing ingredient A, ingredient B, and ingredient D. Liquid C is produced when the ratio of ingredient A to ingredient B is 1; when the ratio is 3, liquid G is produced. Liquid E is produced when the ratio of ingredient A to ingredient D is 1; when the ratio is 3, liquid F is obtained. The tank system generates only one kind of product at a time. Every ingredient can be supplied by three tanks and flows out of its tank at 3 mL/s. Every tank has a sensor that measures the current amount of ingredient or product. Moreover, between the ingredient tanks and the neutralization tank, pumps control the input to the product tank. When a product has been neutralized, the corresponding output valves are opened and the liquid flows out of its tank at 6 mL/s. The tank system also provides a sensor that measures which service users request. The services are: service 1, 60 × 3 × 2 mL of liquid C; service 2, 60 × 3 × 4 mL of liquid C; service 3, 60 × 3 × 4 mL of liquid G; service 4, 60 × 3 × 2 mL of liquid E; service 5, 60 × 3 × 4 mL of liquid E; and service 6, 60 × 3 × 4 mL of liquid F.
Producing 60 × 3 × 4 mL of liquid C is described as an example to demonstrate the process. The initial state is s 0 . When service 2 is requested, the state becomes s 2 and the controller issues commands to turn on the two pumps that input ingredient A into tank P2 (i.e., p11 and p12). The state then changes from s 2 to s 8 . When the system state becomes s 14 , commands that close the pumps are issued. After 60 s, the system automatically issues commands to open valve V11, and the system state becomes s 19 . Tables 2 and 3 show the system commands and system states. We randomly request different services over a long period and obtain a set of normal data.

Figure 8. The structure of the tank system (tanks Tank11-Tank13, Tank21-Tank23, Tank31-Tank33, product tank TankP2, and pumps).

In this subsection, we use an attack case to illustrate the attack effect. By analyzing the historical data, we first obtain the parameter A = E 11×11 , where E denotes the identity matrix. Every time users request a service, the system state starts at s 0 . We select two state transition paths from the historical data to generate the attack case. From time t = 2580, the system state is s 0 and users need liquid G. At t = 2581, an AFDI attack is launched and false data are injected to tell the controller that service 5 is needed; the attackers hope the controller will execute Path 4. Figure 9a,b shows the production in TankP2 and TankP1 under the normal and attacked situations. In Figure 9a, we can clearly observe that liquid G and liquid F are successively needed; however, in Figure 9b, the actual production is liquid E (service 5). These results illustrate the AFDI attack impact on the tank system.
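The service-2 production walk described above (s 0 → s 2 → s 8 → s 14 → s 19 ) can be written down as a transition list and checked for well-formedness. The transition triggers below paraphrase the text; they are illustrative, not the full state table of Tables 2 and 3.

```python
# Illustrative state walk for service 2 (60 x 3 x 4 mL of liquid C).
# Triggers are paraphrased from the text; this is a sketch, not the
# paper's complete state machine.
TRANSITIONS = [
    ("s0",  "request service 2",          "s2"),
    ("s2",  "open pumps p11 and p12",     "s8"),
    ("s8",  "tank P2 filled",             "s14"),
    ("s14", "close pumps; neutralize 60 s", "s14"),
    ("s14", "open valve V11",             "s19"),
]

def valid_path(transitions):
    """Check that each transition starts where the previous one ended."""
    return all(transitions[i][2] == transitions[i + 1][0]
               for i in range(len(transitions) - 1))
```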

Performance of HeteD on the Tank System
Commands RS1-RS6 are treated as instant commands and the others as continuous commands. We randomly launch 50,000 AFDI attacks at different time points; every attack lasts 120 s. We also select the same data without attacks as normal samples. 49,000 attack samples and 49,000 normal samples are used as training samples, and the remaining 1000 attack samples and 1000 normal samples are used as testing samples.
Figure 10 shows the detection performance of the two detectors for different values of l d , where K c = 5. As l d increases, the false positive ratio and false negative ratio of HeteD remain invariant and all attacks can be identified; the detection results are very good. For FDML, the best detection results are achieved when l d = 1, but FDML is ineffective at detecting AFDI attacks because of its extremely high false positive and false negative ratios. Next, we analyze why HeteD has a better detection effect than FDML. When AFDI attacks are launched, the measurements of the tanks are modified and the bad data can be evaluated as normal system states. According to Theorem 4, normal samples and abnormal samples then have the same vectors, and a detection model based only on machine learning over sensory data provides poor results. For HeteD, taking the attack case as an example, when attackers modify the state from s 3 to s 5 , the control command RS3 is not modified to RS5; when the detection sample is constructed, the abnormal sample therefore differs from the normal sample, and HeteD provides better detection results. Based on the attack cases in the smart grid and the tank system, we can see that AFDI attacks can successively degrade system performance or disrupt the physical system for a long time. Comparing the detection effect of FDML and HeteD in the smart grid and the tank system, the proposed detection method is better than FDML at detecting AFDI attacks. Although our method effectively detects AFDI attacks in many CPSs, the following issues, not explored in this paper, should be considered in the case of real attacks. (1) A vast number of commands and sensors may decrease the performance of HeteD, which may be addressed by reducing the dimension of the sample.
(2) Measurement noise may affect the injected bad data and the detection performance. (3) External input of control commands may affect the detection results.
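The command/state mismatch that lets HeteD flag the tank attack (a falsified state s 5 paired with the genuine command RS3) can be sketched as a simple consistency check. The mapping below is an illustrative assumption, not the paper's full state table.

```python
# Hedged sketch of the command/state consistency idea: each service
# request command implies a particular system state, so a falsified
# state that disagrees with the genuine command is suspicious.
SERVICE_STATE = {"RS1": "s1", "RS2": "s2", "RS3": "s3",
                 "RS4": "s4", "RS5": "s5", "RS6": "s6"}

def command_matches_state(command, observed_state):
    """True if the observed state is the one this command implies."""
    return SERVICE_STATE.get(command) == observed_state
```

In the attack case, the injected data report s 5 while the issued command is still RS3, so the pair fails this check even though both the state and the command are individually legal.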

Related Work
In this section, we review previous works from two aspects: FDI attacks and detection methods.

FDI Attack
In the work by the authors of [13], the FDI attack was first introduced: attackers could inject suitably crafted bad data so as to be undetectable by the BDD of the smart grid. Attackers need only partial knowledge to launch FDI attacks that mask a system exception or disturb normal operation. In the work by the authors of [14], FDI attacks with perfect knowledge were proposed to disturb system operation. However, these two works discussed only a single attack and did not consider how to launch successive attacks. In the work by the authors of [15], attackers with perfect knowledge injected false data to successively mask transmission line outages, leading to a serious situation without awareness. With further research, attack strategies have improved; for example, optimal FDI attack actions are studied in the work by the authors of [16] to mask an exceptional frequency and cause the largest disruption of generators. In the work by the authors of [17], FDI attacks with limited knowledge are described to mask line outages. In the work by the authors of [18], the time synchronization attack was introduced: attackers falsify the GPS signal to generate sensory data with bad time stamps, masking system faults; unlike previous works, this method does not need system knowledge. In the work by the authors of [19], FDI attacks with perfect knowledge were used to cooperate with false command injection attacks: the false command injection caused the system failure, while the FDI delayed attack detection and amplified the attack impact. Based on the above discussion, the characteristics of FDI attacks [13][14][15][16][17][18][19] are summarized in Table 4. We can clearly see that FDI attacks have become more complex, hide their traces better, and cause greater disruption. However, previous FDI attacks mainly focus on masking system exceptions or causing a system exception by launching only one FDI.
Previous works do not discuss how an FDI attack can remain undetected for a long time after it causes an exception. In particular, continuously falsifying legal states by injecting legal data into a normal system to cause automatic and long-term disruption or performance degradation is a research area with limited work.

Detection of FDI Attacks
Many effective approaches have been proposed for FDI detection. In the work by the authors of [11], a method based on mining correlations between data was proposed; however, when attackers know the correlations among different types of data and can modify multiple types of sensory data, as in a time synchronization attack, the detection results are poor. In the works by the authors of [12,20,21,25], methods based on machine learning classification models were described. Although these approaches can detect some FDI attacks that escape the BDD, other complex FDI attacks, such as an elaborate time synchronization attack, cannot be detected. In the work by the authors of [24], a new machine learning method based on first differences was used to detect time synchronization attacks. In the work by the authors of [26], the authors built a system model and used sequential pattern mining to analyze state paths and detect anomalies; however, AFDI attacks cannot be detected by this method. The above methods detect some special FDI attacks well, but an effective detection method still needs to be explored for attackers who elaborately falsify normal system states; in particular, the above detection approaches are ineffective at identifying AFDI attacks. The characteristics of these detection methods are summarized in Table 5, where time cost refers to the duration from analyzing the data to obtaining the detection results. The difference between HeteD and previous works is that HeteD uses both control commands and sensory data to train the detection model.

Conclusions
We have described the AFDI attack model, which can directly cause long-term disruption of physical systems by continuously injecting data into a normal system. For example, in the smart grid, with the injection of bad sensory data, the frequency changes greatly and traditional detectors cannot identify the attack; the frequency deviation exceeds its threshold and the smart grid may suffer cascading failure. In the tank system, with the injection of bad sensory data, the production process is broken and the wrong product is obtained, yet traditional detectors cannot identify these situations. We also prove that traditional attack detectors are ineffective at detecting AFDI attacks. Most importantly, we propose a novel and effective machine-learning-based method that uses commands and sensory data to identify AFDI attacks. The simulation results show that our detector, HeteD, can effectively identify AFDI attacks with low false positive and false negative ratios. For example, in the smart grid scenario, our detector identifies most AFDI attacks with a 2.3% false positive ratio and a 13.8% false negative ratio, whereas FDML provides worse detection results with a 60% false positive ratio and a 50% false negative ratio. In the tank system scenario, our detector identifies all AFDI attacks with a 0% false negative ratio, whereas FDML provides worse detection results with a 70% false positive ratio and a 28% false negative ratio. These results show that the proposed detection method is very effective. In the future, we will provide a detailed analysis of detecting other attacks in real-world applications.

Conflicts of Interest:
The authors declare no conflicts of interest.