A Security Scheme Based on Intranal-Adding Links for Integrated Industrial Cyber-Physical Systems

With the advent of the Internet of Everything era, the Industrial Internet is becoming increasingly integrated. Its core framework, the industrial cyber-physical system (CPS), has received growing attention and in-depth research in recent years. These complex industrial CPSs are usually composed of multiple interdependent sub-networks (such as physical networks and control networks). Minor faults or failures in one sub-network may trigger serious cascading failures across the entire system. In this paper, we propose a security scheme based on intranal-adding links for the integrated and converged industrial CPS environment. Firstly, by calculating the size of the largest connected component in the entire system, we compare and analyze the security performance of industrial CPSs under random attacks. Secondly, we compare and analyze the risk of cascading failure between integrated industrial CPSs under different intranal-adding link strategies. Finally, the simulation results verify the effectiveness of the different strategies and identify a relatively better strategy for enhancing the system's security. In addition, this work can help to further optimize the topology of interdependent industrial CPSs to cope with the integrated and converged industrial CPS environment.


Introduction
Cyber-physical systems (CPSs) integrate computing components, networks, and physical processes into specific environments [1][2][3]. Many social systems can be abstracted as CPSs, which play an irreplaceable role in a broad range of applications. For instance, the power grid and vehicular networks are considered CPSs [2,4,5,6,7]. Preserving the robustness of these social systems is essential. A typical CPS can be abstracted as cyber networks and physical networks that are integrated into an interdependent network. An interdependent network that is built from several networks is more accessible than a single network [8][9][10]. As shown in Figure 1, the industrial internet of things is a part of the internet of things, and many standard systems fall within its range. Like a CPS, the industrial internet of things can be abstracted as cyber networks integrated with physical networks. Computers are used to control and monitor the physical networks, and the physical networks can exchange data with other systems [11].

Interdependent Networks
Social networks link nodes from several networks according to specific rules, so the scale of a social system grows from a single network to interdependent networks. Three interdependent network models have been established in existing studies, and all of them are widely used. To improve the fidelity of the connections between networks, the 'one-to-multiple correspondence' model was built [14][15][16][17]. In this model, node i in network A has exactly one dependency link, which connects it to a node j in network B, whereas j has dependency links to several nodes in A. This correspondence model is more complex than the 'one-to-one correspondence' model, and some real-world networks follow it.
The most complex correspondence model is 'multiple-to-multiple correspondence' [18,19]. Scholars observe that, in this model, node i in network A has at least two dependency links to nodes in B, and node j in B likewise has more than one dependency link to nodes in A. Thus, this model is larger in scale and more complicated.

Cascading Failure
The cascading failure process is discussed in detail in [12,20]. In [12], the authors build an interdependent network from two single networks and assume that random attacks trigger the failure. They derive the necessary and sufficient conditions for a node to work normally. In the following simulation, we follow the conclusion of [12] that a normally working node must satisfy two conditions at the same time: (i) the node has at least one dependency link to a normally working node; (ii) the node belongs to the giant component.
Nodes that do not satisfy both conditions are removed together with their links. The cascading failure then propagates recursively between the networks. When the failure stops, the system is in one of two states: (i) the system has collapsed completely; or (ii) some nodes are still working normally and the entire system is in a stable state.
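The two survival conditions and the recursive propagation can be sketched in a few lines. The Python sketch below is only illustrative (the simulations in this paper are written in C++), and it assumes the 'one-to-multiple' setting in which every A node depends on a single B node; the names `giant_component`, `cascade`, and `support_of_a` are hypothetical.

```python
from collections import deque


def giant_component(adj, alive):
    """Return the largest connected component among the alive nodes."""
    seen, best = set(), set()
    for start in alive:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    return best


def cascade(adj_a, adj_b, support_of_a, attacked):
    """Run the cascade until no further node fails.

    support_of_a[i] is the single B node that A node i depends on
    (hypothetical name). A B node keeps working while at least one
    of its dependent A nodes is alive.
    """
    alive_a = set(adj_a) - set(attacked)
    alive_b = set(adj_b)
    while True:
        # Condition (ii) for A: keep only the giant component.
        alive_a = giant_component(adj_a, alive_a)
        # Conditions (i) and (ii) for B.
        supported = {support_of_a[i] for i in alive_a}
        new_b = giant_component(adj_b, alive_b & supported)
        # Condition (i) for A: the supporting B node must still work.
        new_a = {i for i in alive_a if support_of_a[i] in new_b}
        if new_a == alive_a and new_b == alive_b:
            return alive_a, alive_b
        alive_a, alive_b = new_a, new_b
```

Both networks are passed as dictionaries that map a node to the set of its neighbours. The loop stops either with two empty sets (collapse) or with a non-empty stable core.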
Five approaches have been applied to enhance the robustness of homogeneous interdependent networks against cascading failure. Protecting crucial nodes strengthens the reliability of networks [21,22]. Increasing the autonomy of nodes enhances robustness but is costly [18,23]. Adjusting the allocation of dependency links and reconfiguring the network topology by rewiring [7,24,25,26,27] have also been applied; however, these two methods are only suitable at the network design stage. Adding links to systems is simulated in [28][29][30], where it is found that adding links can improve the reliability of a CPS. All of the above methods improve the reliability of an interdependent model significantly. Nevertheless, they have some limitations in practical applications.
In Section 2, we specify the CPS model that is simulated in Section 4. In Section 3, we describe seven strategies for adding intra-links and how each strategy adds links to the CPS. Section 4 presents the simulation results and figures. Section 5 gives the conclusions and the work we plan to explore in the future.

Mathematical Model
First, we describe in detail the CPS models that are simulated in Section 4. Next, we give the formulas for the number of nodes during the cascading failure process [12]. Finally, we show a simple cascading failure within the 'one-to-multiple correspondence' model in Figure 2. In [31], a classification for CPSs is proposed. Besides the characteristic behaviors of the CPS, different algorithms for modeling the intra-connections and inter-connections between networks are studied in [32,33].

Interdependent Model
In our simulation, we construct a CPS model composed of two complex networks, network A and network B. The intra-links follow the degree distribution of each complex network. All inter-links connect nodes of different networks at random [14,20]. All links in these CPS models are undirected. This means that, if node i has an inter-link with node j, then these two nodes depend on each other.
Both the Erdős-Rényi (ER) network and the scale-free (SF) network have been studied systematically [12,34]. In our interdependent network models, network A and network B are each either an ER network or an SF network. In an ER network, the node degrees follow a binomial distribution. In an SF network, the degree distribution follows a power law, $P(k) \propto k^{-\gamma}$, where $P(k)$ is the degree distribution and $\gamma$ is the power-law exponent.
As explained in Section 1, the 'one-to-multiple' model is preferred for modeling the interdependence between power stations and control equipment. To simulate the power grid system properly, we therefore use the 'one-to-multiple correspondence' to model the inter-link relationships. The coupling ratio of the inter-links is set to 3:1: each node i of network A relies on one node j of network B, while j is linked to three nodes of A. If j fails, i fails. However, if i fails, j does not necessarily fail; j fails only when all nodes linked to it by dependency links have failed.
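As a rough illustration of the two degree distributions, the Python sketch below generates an ER network with a target average degree and samples a power-law degree sequence. It is a sketch under stated assumptions, not the generator used in the simulations: the function names, the degree cut-offs `k_min` and `k_max`, and the choice $p = \langle k \rangle / (n - 1)$ for the ER link probability are assumptions made for the example.

```python
import random


def er_network(n, avg_degree, seed=None):
    """G(n, p) graph with p chosen so that the expected degree is avg_degree."""
    rng = random.Random(seed)
    p = avg_degree / (n - 1)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            # Each possible link is present independently with probability p,
            # which yields binomially distributed degrees.
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def power_law_degrees(n, gamma, k_min=1, k_max=100, seed=None):
    """Sample n degrees with P(k) proportional to k**(-gamma)."""
    rng = random.Random(seed)
    ks = list(range(k_min, k_max + 1))
    weights = [k ** (-gamma) for k in ks]
    return rng.choices(ks, weights=weights, k=n)
```

The quadratic loop in `er_network` is fine for small examples; a network with thousands of nodes would call for a faster edge-sampling scheme.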
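The 3:1 coupling and the asymmetric failure rule can be made concrete with a short Python sketch. The helper names `couple_three_to_one`, `support_of_a`, and `dependents_of_b` are hypothetical, and the random assignment by shuffling is an assumption about how "random" inter-links might be drawn.

```python
import random


def couple_three_to_one(nodes_a, nodes_b, ratio=3, seed=None):
    """Randomly attach `ratio` A nodes to every B node."""
    if len(nodes_a) != ratio * len(nodes_b):
        raise ValueError("need len(nodes_a) == ratio * len(nodes_b)")
    rng = random.Random(seed)
    shuffled = list(nodes_a)
    rng.shuffle(shuffled)
    support_of_a = {}                      # A node -> the B node it relies on
    dependents_of_b = {j: [] for j in nodes_b}
    for idx, i in enumerate(shuffled):
        j = nodes_b[idx // ratio]
        support_of_a[i] = j
        dependents_of_b[j].append(i)
    return support_of_a, dependents_of_b


def b_node_fails(j, dependents_of_b, alive_a):
    """j fails only when every A node linked to it has failed."""
    return not any(i in alive_a for i in dependents_of_b[j])
```

For example, `couple_three_to_one(list(range(9)), list(range(3)))` gives each of the three B nodes exactly three dependent A nodes.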

Mathematical Formulation
Based on the cascading failure setting introduced in Section 1, scholars have derived formulas for the number of nodes at every stage of the cascading failure process. The notations of this section are shown in Table 1. For a 'one-to-one correspondence' model, the number of nodes at a stable state in network A and network B is:
Table 1. Notations of this section: the fraction of nodes that have not failed after the initial attacks; $N_{Ai}$, $N_{Bi}$; the fraction of normal nodes of a network; the generating functions of networks A and B.
The simulation models below follow existing studies of cascading failure, in which the failure is triggered by the failure of a small fraction of the nodes. The most common assumption is that a random attack occurs in network A and that $(1 - p)N_A$ nodes fail. Following this assumption, we remove $(1 - p)N_A$ nodes at random from network A to represent the random attack. At the same time, all links of the failed nodes are deleted. The number of remaining nodes in network A is:
The fraction of nodes of $N_{A1}$ that belong to the giant component is:
Each node in network B relies on three nodes of network A. Under the above settings, a node in network B fails in the second stage if it has no remaining inter-links. The normally working nodes in network B are [20]:
As failed nodes and their intra-links are erased from the system, network B separates into several components. Nodes in the giant component are preserved while the others are deleted. The fraction of preserved network B nodes is:
The failed nodes in network B in turn cause further failures in network A. The failure propagates back and forth within the CPS until it stops. In the steady stage, the numbers of nodes satisfy:
The next stage of the cascading failure is shown in Table 2.
With these formulas, the fraction of working nodes at the steady state in networks A and B is [20]:
where Equation (11) changes into:
To illustrate the cascading failure in a 'one-to-multiple correspondence' model, part of the connections of the simulation model is shown in Figure 2. The initial stage of the figure shows the relationship between intra-links and inter-links. The connections of the system when the cascading failure ends are depicted in stage 4.

Methodology
In [28], different adding strategies are proposed to enhance the reliability of the 'one-to-one correspondence' model. This section gives the calculation formulas and the implementation of the seven adding strategies applied in the following simulation.
NONE means that no intra-links are added to the CPS model, so the structure of the model is not modified.
I. Random adding strategy (RA) RA randomly selects two nodes from network A or network B, checks whether an intra-link already exists between them, and, if not, adds one intra-link connecting them. In line with the basic requirements of undirected networks, parallel links and self-loops are forbidden. After one adding operation, there is one intra-link between the two randomly selected nodes. The inter-links of the CPS model are not altered by the RA strategy. In the simulation, RA serves as a control experiment for the other strategies.
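One RA operation might look like the following Python sketch. The function name `add_random_link` and the bounded retry loop are assumptions; the text only requires that the new link is neither a self-loop nor a parallel link.

```python
import random


def add_random_link(adj, rng=random):
    """Add one intra-link between two random nodes that are not yet linked.

    adj maps each node to the set of its neighbours in one network.
    Returns the new link, or None if no free pair was found.
    """
    nodes = list(adj)
    for _ in range(10 * len(nodes)):      # bounded retries (an assumption)
        u, v = rng.sample(nodes, 2)       # two distinct nodes: no self-loop
        if v not in adj[u]:               # reject parallel links
            adj[u].add(v)
            adj[v].add(u)
            return u, v
    return None
```

Only the adjacency of a single network is touched, so the inter-links stay unchanged, as required for RA.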
II. Low degree adding strategy (LD) Degree centrality is widely used to measure the importance of nodes [12,28,35]. In an undirected network, the degree of a node is the number of its intra-links [9,34].
To perform LD once, first obtain the degrees of all nodes. Then, check whether the two nodes with the lowest degrees in a single network are already connected. Finally, add one intra-link between these two selected nodes. Throughout the process of adding intra-links, parallel links and self-loops are forbidden.
III. High degree adding strategy (HD) HD first obtains the degrees of all nodes. Then, it checks whether the two nodes with the highest degrees in a single network are already connected. Finally, it inserts one intra-link between the two selected nodes. Throughout the process of adding intra-links, parallel links and self-loops are forbidden.
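LD and HD differ only in the direction of the ranking, so both can be sketched with one helper in Python. The names `add_link_by_score` and `degree` are hypothetical. The fallback to the next pair in rank order when the top pair is already linked is an assumption, since the text does not say how that case is handled.

```python
def degree(adj):
    """Degree of every node: the number of its intra-links."""
    return {u: len(nbrs) for u, nbrs in adj.items()}


def add_link_by_score(adj, score, lowest=True):
    """Link the two best-ranked nodes that are not yet connected.

    score maps node -> centrality value. lowest=True gives the
    'low' variant (e.g. LD), lowest=False the 'high' variant (e.g. HD).
    """
    ranked = sorted(adj, key=lambda u: score[u], reverse=not lowest)
    for idx, u in enumerate(ranked):
        for v in ranked[idx + 1:]:        # v != u, so no self-loop
            if v not in adj[u]:           # no parallel link
                adj[u].add(v)
                adj[v].add(u)
                return u, v
    return None                           # network is already complete
```

Calling `add_link_by_score(adj, degree(adj))` performs one LD step; passing `lowest=False` performs one HD step. The same helper accepts any other centrality dictionary as `score`.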
IV. Low betweenness adding strategy (LB) The intra-links of the nodes form paths in a single network, and betweenness centrality measures the importance of a node by these paths [28,36]. The betweenness centrality of node v is: $$B(v) = \sum_{i \neq v \neq j} \frac{\sigma_{ij}(v)}{\sigma_{ij}}$$ where $\sigma_{ij}$ is the number of shortest paths from node i to node j and $\sigma_{ij}(v)$ is the number of those shortest paths that pass through node v [28,34,35]. If a large number of shortest paths pass through a node, this node is a critical node in the network. To perform LB once, first obtain the betweenness centrality values of all nodes. Next, check whether the two nodes with the lowest betweenness values in a single network are already connected. Finally, add one intra-link between these two selected nodes. Throughout the process of adding intra-links, parallel links and self-loops are forbidden.
V. High betweenness adding strategy (HB) HB first obtains the betweenness values of all nodes. Then, it checks whether the two nodes with the highest betweenness values in a single network are already connected. Finally, it inserts one intra-link between these two selected nodes. Throughout the process of adding intra-links, parallel links and self-loops are forbidden.
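Betweenness values for an unweighted, undirected network can be computed with Brandes' algorithm. The Python sketch below is one possible implementation, not necessarily the one used in the simulations; it counts every unordered pair of end nodes once and applies no normalization, both of which are assumptions.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both of its end nodes.
    return {v: b / 2 for v, b in bc.items()}
```

On the path 0-1-2, only node 1 lies on a shortest path between two other nodes, so its value is 1 and the end nodes score 0.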
VI. Low eigenvector centrality adding strategy (LEC) Eigenvector centrality judges the importance of a node by considering its neighbors. This centrality has a broad range of applications in daily life, such as satellite cities around economically developed cities and satellite cities around cities with developed tourism. To express the neighbors of a node, scholars construct an adjacency matrix A that records the intra-link relationships of the nodes. An element $A_{ij}$ of the matrix denotes whether there is an intra-link between node i and node j: if $A_{ij} = 1$, node i and node j are linked by one intra-link; if $A_{ij} = 0$, there is no intra-link between i and j. Because the eigenvector centrality of a node changes with its neighborhood, the initial value of each node's eigenvector centrality is set to $x_i = 1$. The value of $x_i$ is then updated to [34,35]: $$x_i = \kappa_1^{-1} \sum_j A_{ij} x_j$$ where $\kappa_1$ is the largest eigenvalue of the matrix A. If a node is connected to multiple important nodes, the importance of the node increases. To perform LEC once, first obtain the eigenvector centrality values of all nodes. Then, check whether the two nodes with the smallest eigenvector centrality values in a single network are already connected. Finally, add one intra-link between these two selected nodes. Throughout the process of adding intra-links, parallel links and self-loops are forbidden.
VII. High eigenvector centrality adding strategy (HEC) HEC first obtains the eigenvector centrality values of all nodes and then checks whether the two nodes with the highest eigenvector centrality values in a single network are already connected. Finally, it adds one intra-link between these two selected nodes. Throughout the process of adding intra-links, parallel links and self-loops are forbidden.
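The eigenvector centrality values can be approximated by power iteration starting from $x_i = 1$. The Python sketch below is an illustration under stated assumptions: it iterates with $A + I$ instead of $A$ (which has the same leading eigenvector but avoids oscillation on bipartite networks) and rescales so that the largest value is 1; neither choice comes from the text.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Power iteration for the leading eigenvector of the adjacency matrix."""
    if not adj:
        return {}
    x = {v: 1.0 for v in adj}             # initial value x_i = 1
    for _ in range(iterations):
        # x'_i = x_i + sum_j A_ij x_j, i.e. one multiplication by (A + I).
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in adj}
        norm = max(new.values())          # rescale so the maximum is 1
        new = {v: val / norm for v, val in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x
```

For a star with three leaves, the centre converges to 1 and each leaf to $1/\sqrt{3} \approx 0.577$, reflecting that a node inherits importance from its neighbours.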

Results and Discussion
The first subsection explains the parameters of the simulation models. Next, the simulation process is described in detail. The results are presented at the end of this section.

Parameters
In the following simulation, the CPS models follow the 'one-to-multiple correspondence' model. Without loss of generality, the ratio of the inter-link relationships is set to 3:1 (the largest ratio that we can simulate). Based on this setting, the numbers of nodes in the two networks are $N_A = 9000$ and $N_B = 3000$. Following the settings of previous research, the average degree of both the ER and the SF network is k = 4, and $\gamma$ in the SF network is 3. The average degree of real-world networks is close to k = 4 [12,34]. If a social network follows a power-law distribution, $\gamma$ is usually between 2 and 3, and $\gamma = 3$ is the typical value of the BA model [34]. Both the intra-links and the inter-links are undirected in all models.
In [28], it is found that adding intra-links in both networks yields better performance than adding them in a single network. Thus, we add intra-links in both networks of the CPS model. First, it is essential to determine the number of added intra-links. The fraction of added links $f_L$ is: $$f_L = \frac{L}{L_A + L_B}$$ where $L_A$ and $L_B$ are the numbers of intra-links in the initial networks A and B, and L is the number of added links. As the numbers of nodes of the two networks and the average degree have been determined, the numbers of intra-links in these networks are 36,000 and 12,000. Therefore, the number of intra-links added to network A is $f_L \cdot 36{,}000$ and the number added to network B is $f_L \cdot 12{,}000$. We cannot add links indefinitely because of the cost of building the system. According to research on community interaction [37], the most popular node in a system has more than six links to other nodes. We therefore set 8 as the highest degree of a node in this paper, so $f_L$ cannot be larger than 50%. To clarify the meanings of these parameters, we list them in Table 3.
Table 3. Notations of parameters and metrics.

Symbol          Meaning
$N_A$, $N_B$    The number of nodes in networks A and B
k               The average degree of the network
$\gamma$        The parameter of the SF network
$f_L$           The fraction of added intra-links
$L_A$, $L_B$    The number of intra-links in networks A and B
G               The fraction of normally working nodes after cascading failures
$p_c$           The ability of the network to fight random attacks
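The number of links added for a given $f_L$ follows directly from the intra-link counts stated in this subsection. The short Python sketch below reproduces that arithmetic; the function name `added_links` is hypothetical, and the defaults are the counts of 36,000 and 12,000 given in the text.

```python
def added_links(f_l, links_a=36_000, links_b=12_000):
    """Number of intra-links added to networks A and B for a fraction f_L."""
    if not 0 <= f_l <= 0.5:
        raise ValueError("f_L is capped at 50% in this setting")
    return round(f_l * links_a), round(f_l * links_b)


# The four fractions used in the simulations.
for f_l in (0.15, 0.25, 0.35, 0.45):
    print(f_l, added_links(f_l))
```

For instance, $f_L = 25\%$ corresponds to 9000 added links in network A and 3000 in network B.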

Reliability Metrics
We apply random attacks to our models to simulate attacks on real-world networks. To assess the reliability of a CPS, we use the metric G, the fraction of normally working nodes after the cascading failure stops: $$G = \frac{N'_A + N'_B}{N_A + N_B}$$ where $N'_A$ ($N'_B$) is the number of working nodes of network A (B) in the stable state. $p_c$ reflects the ability of a CPS to withstand random attacks. Both metrics are listed in Table 3.
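As a small worked example, G can be computed as below. The Python sketch assumes that G is the number of working nodes in both networks at the stable state divided by the total number of nodes; the function name and the default network sizes (taken from the Parameters subsection) are choices made for the example.

```python
def reliability_g(working_a, working_b, n_a=9000, n_b=3000):
    """Fraction of normally working nodes after the cascade has stopped."""
    if not (0 <= working_a <= n_a and 0 <= working_b <= n_b):
        raise ValueError("working node counts must lie within the network sizes")
    return (working_a + working_b) / (n_a + n_b)
```

With 4500 surviving nodes in network A and 1500 in network B, `reliability_g(4500, 1500)` gives 0.5; a collapsed system gives 0.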

Simulation Setup
We wrote C++ programs to model the cascading process in an interdependent network. We increase 1 − p by 0.25 each time and obtain the corresponding G values. For each value of 1 − p, we run the simulation 20 times and take the average G as the final result to reduce the error. The simulation proceeds as follows:
i. First, we build two complex networks (named A and B) to represent an interdependent network. Each of them is either an ER network or an SF network, generated from a binomial distribution and a power-law distribution, respectively.
ii. Then, we couple these two networks according to the 'one-to-multiple correspondence' model. The inter-links are connected at random and the ratio is kept at 3:1.
iii. We apply one adding strategy to the model, which changes its intra-links.
iv. Next, $(1 - p)N_A$ nodes of network A are chosen at random as failed nodes, representing the attacked nodes.
v. The cascading failure propagates between networks A and B. We simulate each stage of the propagation and record the number of working nodes in the system at each step.
vi. Finally, we calculate the number of nodes of the entire interdependent network in the steady stage.
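The outer loop of this procedure (sweeping 1 − p and averaging over repeated runs) can be sketched as follows. This is a Python illustration rather than the C++ code used here. The callable `run_once`, which is expected to perform steps i to vi for one value of p and return G, is a hypothetical placeholder, as are the names `sweep` and `attacked_nodes`.

```python
import random
from statistics import mean


def attacked_nodes(nodes, p, rng):
    """Step iv: choose (1 - p) * N_A nodes of network A at random."""
    return rng.sample(nodes, round((1 - p) * len(nodes)))


def sweep(run_once, step=0.25, repeats=20, seed=0):
    """Average G over `repeats` runs for every attack fraction 1 - p.

    run_once(p, rng) is a hypothetical callable that builds, couples,
    attacks and cascades one model and returns its G value.
    """
    rng = random.Random(seed)
    results = {}
    for s in range(round(1 / step) + 1):
        attack = s * step                 # the value of 1 - p
        p = 1 - attack
        results[attack] = mean(run_once(p, rng) for _ in range(repeats))
    return results
```

Passing a single seeded generator to every run keeps the whole sweep reproducible while still giving each of the 20 repetitions a different random attack.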

Network Size and p c
Seven adding strategies are applied to the models to evaluate the performance of the CPS. In Figures 3-6, the values of $f_L$ are set to 15%, 25%, 35%, and 45%, respectively. In each subfigure, we plot the relationships between G, $p_c$, and 1 − p. We also plot the case without added links (NONE) as a baseline for the strategies detailed in Section 3. From Figures 3-6, we draw the following conclusions:

I. All strategies increase the robustness of the CPS model, and the model becomes more reliable as $f_L$ increases. All seven adding strategies yield higher values of G and $p_c$ than NONE, which means that increasing the number of intra-links can enhance the reliability of an interdependent network. When the number of intra-links increases, the structure of a network becomes more complex. The values of G and $p_c$ increase with $f_L$: $p_c$ is close to 0.7 under LEC in Figure 3d, and when $f_L$ reaches 45%, $p_c$ exceeds 0.8 (Figure 6d). This agrees with the conclusions of [28].
II. Under identical model settings, adding links between nodes with low centrality values increases G more than adding links between nodes with high centrality values, especially when $f_L$ is small (Figure 3). This finding and its explanation are given in [9].
We use the form X-Y-Z to denote the interdependent networks after adding intra-links [28]. X and Y denote the types of the two networks of the interdependent network, and Z denotes one single network of the CPS. We find that the betweenness values under all strategies are lower than those of the original networks (NONE). The gaps between the betweenness values of the ER-SF, SF-ER, and SF-SF interdependent networks become smaller as $f_L$ increases. The betweenness values of network A in the ER-ER model are the highest for all $f_L$. In Figure 8, we plot the eigenvector centrality values of the different strategies. The eigenvector centrality values of LEC are lower than those of the other strategies when $f_L$ = 15%. When $f_L$ = 25%, 35%, and 45%, HD and HEC tend to decay compared to their adjacent values. The eigenvector centrality values of all strategies move closer together as $f_L$ grows. The eigenvector centrality values of network A are smaller than those of its corresponding network B. The ER network has higher eigenvector centrality values.
In Figure 8, the eigenvector centrality values of all strategies are not lower than those of NONE. From Figures 7 and 8, we conclude that decreasing the betweenness of a network and increasing its eigenvector centrality could enhance network reliability. Nevertheless, a too high or too low cost cannot get the largest G and

Conclusions and Future Works
In this paper, our goal is to enhance the robustness of different CPS models by increasing the number of intra-links with several strategies. To enrich the simulation models, we simulate four kinds of heterogeneous interdependent networks. Then, we add intra-links to the interdependent models with different fractions of added links. Finally, we record G and $p_c$ when the cascading failure stops. Strategies for adding intra-links can enhance the reliability of the network, and the strategies that select nodes with low centrality values yield better system reliability. Our findings suggest that the low degree strategy performs best in increasing G in the ER-ER system, while the low eigenvector centrality strategy is the first choice for scenarios that include SF networks.
Our simulation has some limitations: more theoretical study of the reliability of interdependent networks is needed, and additional evaluation metrics should be proposed to reflect the reliability of the models. We are working towards addressing these limitations.