Ensuring SDN Resilience under the Influence of Cyber Attacks: Combining Methods of Topological Transformation of Stochastic Networks, Markov Processes, and Neural Networks

Kotenko, Igor; Saenko, Igor; Privalov, Andrey; Lauta, Oleg

doi:10.3390/bdcc7020066

¹ Laboratory of Computer Security Problems, St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 39, 14th Liniya, Saint Petersburg 199178, Russia

² Electrical Communication Department, Emperor Alexander I Saint-Petersburg State Transport University, 9 Moskovsky pr., Saint Petersburg 190031, Russia

³ Department of Integrated Information Security, Admiral Makarov State University of Maritime and Inland Shipping, 5/7 Dvinskaya St., Saint Petersburg 198035, Russia

* Author to whom correspondence should be addressed.

Big Data Cogn. Comput. 2023, 7(2), 66; https://doi.org/10.3390/bdcc7020066

Submission received: 15 January 2023 / Revised: 10 March 2023 / Accepted: 22 March 2023 / Published: 30 March 2023

(This article belongs to the Special Issue Cyber Security in Big Data Era)

Abstract:

The article proposes an approach to ensuring the functioning of Software-Defined Networks (SDN) in cyber attack conditions based on the analytical modeling of cyber attacks using the method of topological transformation of stochastic networks. Unlike other well-known approaches, the proposed approach combines the SDN resilience assessment based on analytical modeling and the SDN state monitoring based on a neural network. The mathematical foundations of this assessment are considered, which make it possible to calculate the resilience indicators of SDN using analytical expressions. As the main indicator, it is proposed to use the correct operation coefficient for the resilience of SDN. The approach under consideration involves the development of verbal models of cyber attacks, followed by the construction of their analytical models. In order to build analytical models of cyber attacks, the method of topological transformation of stochastic networks (TTSN) is used. To obtain initial data in the simulation, the SDN simulation bench was justified and deployed in the EVE-NG (Emulated Virtual Environment Next Generation) virtual environment. The result of the simulation is the time distribution function and the average time for the cyber attack implementation. These results are then used to evaluate the SDN resilience indicators, which are found by using the Markov processes theory. In order to ensure the resilience of the SDN functioning, the article substantiates an algorithm for monitoring the state of controllers and their automatic restructuring, built on the basis of a neural network. To choose the neural network, a comparative evaluation of a convolutional neural network and an LSTM neural network is carried out. The experimental results of analytical modeling and simulation are presented and compared; the comparison shows that the proposed approach provides sufficiently high accuracy and completeness of the obtained solutions and requires little time to obtain the result.

Keywords:

cyber attack; resilience; software-defined network; Markov process; correct action coefficient; method of topological transformation of stochastic networks

1. Introduction

The sharp increase in traffic volume and the change in services provided to a large number of users, the formation of high-performance clusters for processing big data and highly scalable virtualized environments for providing cloud services have seriously changed the structure and requirements for modern data transmission networks. One of the concepts for building a data transmission network of various corporate structures is a software-defined network that operates from the network layer.

Software-defined networks (SDN) help to create automated, programmable, flexible and cost-effective network infrastructures. They help to systematically solve most of the accumulated problems, including those related to network and information security [1].

In addition, in the near future, SDN technology will bring open-source principles to the network component of the cloud infrastructure, which is considered to be the most favorable basis for the development and implementation of a wide range of applications. It is based on the implementation of network devices and their functions on any network host using the Open vSwitch software switch. This approach allows one to use almost any computing device in the network as a switch/router.

Another important feature of SDN technology is centralized controller-based network management, implemented using the OpenFlow management protocol. It allows users not only to manage network devices but also to collect network statistics, which makes it possible to solve emerging network problems more effectively by reconfiguring all the network devices at the same time.

The OpenFlow control protocol offers a number of attributes that are particularly well suited to implement a highly reliable and manageable environment:

The flow paradigm is ideal for security because it offers an end-to-end, service-oriented approach that is not bound by traditional routing constraints;
Logically centralized management allows one to effectively control performance and threats throughout the network;
Granular policy control can be based on application, maintenance, organization and geographic criteria rather than physical configuration;
Resource-based security policies allow the consolidated management of multiple devices with different security risks, from highly secure firewalls and security devices to device access;
Dynamic and flexible configuration of the security policy is provided by software control;
Flexible traffic control provides the fast deterrence and isolation of intrusions without affecting other network users [2].

The conceptual structure of SDN includes the application layer, API, control layer, OpenFlow and data layer. The data layer, represented by software or hardware-software switches, performs the functions of L2 and L3 switches for processing and transmitting network traffic. Each switch receives a set of rules from the controller via the control channel and the OpenFlow control protocol. In turn, the OpenFlow control protocol provides the controller with the ability to use special routing tables and/or modify packets transmitted to switches. The rules transmitted using the OpenFlow protocol can apply either to a group of flows or to each flow separately. Packets arriving at the switch’s input buffer are first checked to see whether their headers match the rule templates in flow table 0. Packet headers are compared with rule templates in descending order of rule priority, and if the header matches a template, the instruction associated with the selected rule is executed. Such instructions include sending the packet to a certain port, dropping the packet or sending the packet to the controller for further analysis.
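The priority-ordered matching described above can be illustrated with a minimal sketch. It is not an OpenFlow implementation: the `FlowRule` class, the `lookup` function, the header field names and the action strings are all hypothetical and chosen only to show how the highest-priority matching rule determines the instruction that is executed.

```python
from dataclasses import dataclass


@dataclass
class FlowRule:
    priority: int   # higher value is checked first
    match: dict     # header field -> required value; absent field acts as a wildcard
    action: str     # hypothetical action label, e.g. "output:2", "drop", "controller"


def lookup(flow_table, packet_headers):
    """Return the action of the highest-priority rule whose template matches the headers.

    Returns None when no rule matches (a table miss).
    """
    for rule in sorted(flow_table, key=lambda r: r.priority, reverse=True):
        if all(packet_headers.get(name) == value for name, value in rule.match.items()):
            return rule.action
    return None


# Hypothetical contents of the first flow table of a switch.
table0 = [
    FlowRule(priority=10, match={}, action="controller"),  # catch-all rule
    FlowRule(priority=200, match={"ip_dst": "10.0.0.5", "tcp_dst": 23}, action="drop"),
    FlowRule(priority=100, match={"ip_dst": "10.0.0.5"}, action="output:2"),
]

print(lookup(table0, {"ip_dst": "10.0.0.5", "tcp_dst": 23}))
print(lookup(table0, {"ip_dst": "10.0.0.5", "tcp_dst": 80}))
print(lookup(table0, {"ip_dst": "10.0.0.9", "tcp_dst": 80}))
```

The three calls exercise the three instruction types named in the text: dropping the packet, sending it to a port and passing it to the controller for further analysis.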

Three main components can be distinguished in SDN (Figure 1): controller; control channel (OpenFlow); router.

As with the classical architecture, the main elements are routers/switches that process L2/L3 network traffic. However, in this case, network devices are entrusted with the function of forwarding traffic between end users, and all decisions related to filtering and route rebuilding, which in the classical implementation of the network are performed by dynamic routing protocols, in this case, are fulfilled by the controller.

The controller, in turn, has two interfaces: the OpenFlow server, which directly manages the network and checks the status of ports/devices using the OF-CONFIG protocol, and an API provided to network applications.

Understanding of functioning processes is determined by two levels of SDN technology (Table 1):

Control level (control plane);
Data transfer level (data plane).

Communication between the levels is carried out by the OpenFlow control protocol. An appropriate secure channel is implemented for the operation of the control protocol. It can be either a separate physical controller–switch connection or a logical channel passing through other SDN devices. Information is exchanged over the control channel in both directions: control commands are transmitted from the controller to the switch, and information about the status of logical switches and the communication channel between devices is transmitted from the switch to the controller. One of the main advantages of this solution is centralized management. Such centralization allows one to dynamically change the routes of traffic transmission in the network based on changing conditions. When new routes are created and new channels are connected, the controller responsible for a particular segment sends the necessary rules to each device. This distinguishes SDN from the classical approach, in which, when the network structure changes, the administrator is forced to prescribe the necessary rules element by element, either manually or using the SSH/TELNET control protocols [3].

However, like many new solutions, SDN has a number of drawbacks [4,5]:

A software solution entails thousands of lines of program code, which, in turn, entails the presence of unintentional errors;
A significant part of the vulnerabilities was inherited by the technology from the TCP/IP protocol stack;
The presence of a device that fully manages the network and owns all the information about the network requires additional protection means and mechanisms;
A new technology implies an intensive emergence of new vulnerabilities specific to this technology.

All of this allows us to highlight the main threat vectors for SDN:

SDN network users receiving network services;
Channel from user to network device;
OpenFlow network device;
OpenFlow control and monitoring channel;
SDN controller.

The controller, as a key component in managing the entire SDN infrastructure, is the most vulnerable element, an attack on which can lead to consequences that are critical for the entire infrastructure.

When any network application is able to change the flow tables of any switch managed by this controller, the situation does not meet modern information security requirements. Different kinds of applications require different levels of access, and the more detailed the limitations of each application are (according to the nature of the task being performed), the more reliable the network will be.

Thus, the possible results of the impact of a cyber attack (CA) on SDN are the blocking of the SDN controller and the OpenFlow control and monitoring channel, the introduction of false information about SDN users receiving network services, the violation of the established rules for collecting, processing and transmitting information in SDN, failures in the operation of SDN, and the compromise of transmitted or received information.

This suggests that CAs and the ability to counteract their implementation are key factors that determine the resilience of SDN. For this reason, in the article, we focus our attention on CAs, as they are, in our opinion, the least studied and also the most important group of destabilizing factors. Moreover, we will interpret the term SDN resilience as the ability of a data transmission network, in which the network control level is separated from data transmission devices and implemented in a software way, to implement its functions and processes under CAs.

Despite the fact that quite a lot of works are devoted to the issues of ensuring the resilience of telecommunication networks and their protection against cyber attacks (their analysis is given in Section 2), the problem of ensuring the SDN resilience remains unresolved. This is largely due to the fact that the known solutions for ensuring the resilience and security of SDN do not take into account the specifics of the construction and operation of these networks. This predetermined the need to develop the approach proposed in this article.

The approach under consideration involves the development of verbal models of CAs, and the construction of their analytical models. In order to build analytical models of CAs, the method of topological transformation of stochastic networks is used. To obtain initial data in the simulation, the SDN simulation bench was justified and deployed in the EVE-NG (Emulated Virtual Environment Next Generation) virtual environment (https://www.eve-ng.net/, London, UK, accessed on 15 January 2023). The result of the simulation is the time distribution function and the average time for CA implementation. These results are then used to evaluate the resilience indicators of SDN, which are found by using the methods of the theory of Markov processes [6]. This approach is distinguished by higher accuracy and resilience of the obtained solutions and has proven itself suitable for modeling multi-step stochastic processes of various natures.

The approach considered in this article passed the experimental assessment for some of the most well-known and popular types of attacks. Network topology spoofing attacks are typical examples of passive-type attacks that do not cause damage to the SDN, but reveal important information that an attacker can use to carry out more serious attacks. The controller hack/crash attack is a typical example of active attacks that significantly disrupt the performance of SDN. These types of attacks will be considered in this article as objects for analytical modeling.

The theoretical contribution of the results discussed in this article lies in the further development of methods for analytical modeling of cyber attacks against SDN and their application to assess resilience as a very important property of a data transmission network, in which the network control layer is separated from data transmission devices and implemented in software for network virtualization.

The novelty of the results obtained is determined both by the methods used and by the scope of their application. In methodological terms, the novelty of the results is determined by the integration of methods, each of which has been individually well studied by the authors in relation to cybersecurity problems. These methods include statistical methods, in particular, the methods of the theory of Markov processes [7,8,9], the method of topological transformation of stochastic networks (TTSN) [10,11,12,13,14] and methods of deep machine learning using neural networks of the Long Short-Term Memory (LSTM) type [7,15]. Markov models make it possible to obtain analytical expressions for assessing the stability of SDN under various cyber attacks. The TTSN method provides these models with input data. LSTM networks ensure the ability to monitor network traffic in order to effectively detect computer attacks. Thus, in terms of their scope, the results obtained cover almost all stages of the operation of the SDN computer security system, which is responsible for detecting, evaluating and eliminating the consequences of computer attacks.

The further structure of the article is as follows. Section 2 provides an overview of related works. Section 3 reveals the content of the approach to assessing SDN resilience based on the TTSN method. Section 4 contains the results of analytical modeling using the examples of the two most characteristic types of attacks. The results of the experimental assessment of SDN resilience are presented in Section 5. Section 6 contains the main conclusions and discussion about the directions for future research.

2. Related Work

The economic paradigm of the modern world has led to the fact that developers of network equipment save on their costs, which in turn affects the overall resilience of information and telecommunication networks. This should not be forgotten when one is considering SDN technologies. Thus, SDN, on the one hand, creates a certain risk, opening up new opportunities for attackers [4,5,6,16], and on the other hand, it provides new opportunities for creating alternative information security services.

To date, there are three main directions for ensuring the SDN resilience in the context of targeted information technology impacts [17].

The first way is the optimization of the route used to reduce the technological cycle of controlling the central controller [18]. In [19,20,21,22], variations in the improved Dijkstra algorithm based on the weighted graph model were considered. The error of the solution does not exceed 5–7% on average. The heuristic insertion method based on the “nearest neighbor” principle, as well as its offshoot, “taboo search”, were considered in [23,24]. However, despite the simplicity of the solution, these approaches are based on formally unfounded considerations, so it is difficult to prove that the heuristic algorithm for each set of initial data finds solutions that are optimal.

The second “structural” approach to SDN resiliency is based on taking into account the characteristics of the network architecture. Often, an important factor is the resiliency of not the entire network, but its main part—the control system. Cyber attacks on SDN in 89% of cases are aimed at the control subsystem, since a failure in its operation leads to a general “fall” in its level of function. In [25], the structural SDN resilience is assessed through four main indicators: robustness, redundancy, resource management flexibility and speed. This assessment allows one to separate the routing process from data forwarding, which is essential for networks of this kind.

A number of studies in the field of structural SDN resilience are aimed at redundant key controllers [26,27,28,29,30] with a large number of duplicated tributary channels and the use of alternative algebraic topologies (for example, “Fat Tree”) [31], as well as hybrids [32,33,34,35] of multilevel topologies, such as “Star” and “Double Ring”, with the organization of protection of access channels according to the “1 + 1” principle. All the considered types of the organization of structural SDN resilience are characterized by common disadvantages: incompatibility of the virtual configuration with network controllers and the high cost of consistent hardware of the same type [36].

The third option for increasing the network resilience includes combined methods that combine the characteristic features of the first two options [37,38,39,40,41]. Unfortunately, the “childhood illnesses” of the low SDN resilience under CAs [42] have been ignored in combined approaches.

In [43], a method similar to that proposed by us is considered for the preventive detection of the fact of impact on the central SDN controller, but unlike our approach, it is proposed to emulate the network protection system at the application level and use a complementary filter to eliminate anomalous requests to the control subsystem, which, as is known, has characteristic time delays during transient processes.

In [44], it was proposed to use recursive learning to detect a CA on SDN. In this work, methods of approximation theory are used to reduce the time complexity. The proposed approach has demonstrated a high efficiency in detecting DDoS attacks due to the fact that it dramatically reduces the number of transfers over the network. The ideas presented in this work contributed to our choice of the LSTM network, which is a type of recurrent neural network, to monitor the state of SDN.

A number of works, for example [45,46,47,48], are devoted to general issues of the quantitative assessment of the resilience of complex dynamic systems, which include SDN. They consider system resilience as its ability “to plan and prepare for, absorb, respond to, and recover from disasters and adapt to new conditions” [48]. In addition, these works suggest an approach to assessing the computer network resilience based on taking into account the critical functionality and features of external influences on the network elements. Critical functionality can be defined as the quality of a system [48], as well as a metric of system performance, which is introduced to derive an integrated measure of resilience. For example, critical functionality can be calculated as the percentage of nodes that are functioning [46].
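As a small illustration of the node-based metric mentioned above, the sketch below computes critical functionality as the fraction of functioning nodes at each observation. Averaging these values over time to obtain a single figure is an assumption made here for illustration, not the definition used in [46] or [48]; the function names and the observation data are hypothetical.

```python
def critical_functionality(node_up):
    """Fraction of nodes that are functioning at one observation."""
    return sum(1 for up in node_up if up) / len(node_up)


def integrated_resilience(history):
    """Time-averaged critical functionality over equally spaced observations.

    This aggregation is an illustrative assumption, not a formula from the cited works.
    """
    values = [critical_functionality(step) for step in history]
    return sum(values) / len(values)


# Hypothetical observations of a four-node network before, during and after an attack.
history = [
    [True, True, True, True],
    [True, False, False, True],
    [True, True, False, True],
    [True, True, True, True],
]

print([critical_functionality(step) for step in history])
print(integrated_resilience(history))
```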

Among the works on modeling cyber attacks and countermeasures, papers [49,50] should be noted, in which the discrete event modeling systems OPNET and COMNET, respectively, are considered. These systems use models of queuing networks, which assume that requests for service arrive according to a priori known distribution laws. However, these laws are not known in practice. In our approach, they are determined by the results of analytical modeling.

Another example of an attack modeling and analysis system is the CAMIAC tool (Cyber Attack Modeling and Impact Assessment Component) [51]. It generates attack graphs, analyzes the consequences of the implementation of attacks and predicts future actions of the attacker. However, this tool does not allow one to obtain the functions of distribution of the attack time and does not allocate communication directions and routes in the network. This makes evaluating the cyber resilience of a network with CAMIAC extremely difficult.

Thus, the following conclusions can be drawn from the related work analysis. First, stochastic analytical modeling and methods of the theory of Markov processes are of great importance for the development of countermeasures in modern cybersecurity systems. Secondly, stochastic models used to simulate attacks should be able to calculate the distribution functions of the random variables of interest to us (for example, the time of the attack and its individual stages) with minimal computational costs. Thirdly, stochastic models should provide high flexibility and be applicable for modeling attacks of any type. The approaches discussed above do not fully meet these requirements.

The methods of the theory of Markov processes and the TTSN, which underlie the approach described below to assessing the SDN resilience, can eliminate this drawback. The TTSN method is based on the Graphical Evaluation and Review Technique (GERT) developed by A. A. B. Pritsker [52]. GERT is a stochastic network project management method that has many advantages over traditional critical path methods and program evaluation and review methods. It combines circuit analysis, probability theory, simulation techniques and the signal flow graph [53]. There are works in which GERT-based approaches are successfully used to assess the reliability of complex systems [54], to study high-tech product project plan dynamics [55], to process fuzzy data in research project management [56], to manage construction projects [57], to handle remanufacturing process routing [58] and others. In our work, the GERT method was further developed through the wider use of analytical expressions of probability theory related to the procedures of topological graph transformation.

3. An Approach to Ensuring the SDN Resilience

3.1. Basic Expressions for Evaluating SDN Resilience

When one is assessing SDN resilience, it is necessary to determine the criteria under which it will cease to perform the functions assigned to it. Logically, the network can be divided into transport and local components (Figure 2). When one is considering the transport component, it is reasonable to assume that the network will cease to be operational under the following conditions:

Failure of the transport network controller or substitution of the controller so that an intruder can control the network in their own interests;
Failure of routers responsible for the transport component of the network;
Topology substitution, in which an intruder posing as a transport network router creates black holes for transmitted traffic;
Failure of the communication channels between network nodes.

Taking into account the conditions described above, let us evaluate the resilience of the software-defined network with redundant OpenFlow switches and a controller. To do this, it is necessary to represent the network in the form of a Markov process with discrete states in continuous time; the time spent in one state is distributed according to the exponential law [7].

Figure 3 shows a graph of the discrete states and conditional transitions.

Table 2 describes the conditional discrete states of a distributed corporate SDN under CAs.

The transitions between states are presented in Table 3.

Initial data for the task are as follows:

1. SDN aggregated resilient-state graph under CA conditions $G = (S, λ)$ (see Figure 3).
2. A set of SDN states under CA maintenance conditions:

S = (S_{1}, S_{2}, S_{3}, S_{4}, S_{5})

(1)

3. A set of event flows, when the SDN states change in the CA conditions:

Λ = (λ_{12}, λ_{21}, λ_{23}, \ldots, λ_{ij})

(2)

4. Characteristics of persistent SDN aggregated states when they are exposed to CAs (see Table 2).
5. Event flow intensities (see Table 3).
6. The probability vector of the initial states of the system: $p_{i} (0) = |1 \; 0 \; 0 \; 0 \; 0|$ (one entry for each of the five states).
7. Normalization condition:

\sum_{i = 0}^{4} p_{i} (t) = 1

(3)

The moments of probabilistic transitions of SDN from state to state when using the protection strategy are uncertain, random and occur under the action of event flows, which are characterized by intensities $λ_{ij}$ (see Table 3).

Intensities are an important characteristic of event flows and represent the average number of events arriving per unit of time. The numerical values of the intensities λ will be set in accordance with the simulation model. When solving the system of linear differential equations with constant coefficients (homogeneous Markov process), we use continuous time $t \to \infty$.

According to the labeled state graph (see Figure 3), we compose the system of Kolmogorov differential equations with unknown functions $p_{i} (t)$ [59]:

D (P, T) = \left\{\begin{array}{l} \frac{d p_{0} (t)}{d t} = λ_{51} p_{4} (t) + λ_{31} p_{2} (t) - λ_{12} p_{0} (t), \\ \frac{d p_{1} (t)}{d t} = λ_{12} p_{0} (t) + λ_{32} p_{2} (t) - λ_{23} p_{1} (t), \\ \frac{d p_{2} (t)}{d t} = λ_{23} p_{1} (t) + λ_{31} p_{0} (t) + λ_{35} p_{4} (t) + λ_{34} p_{3} (t) - (λ_{23} + λ_{34}) p_{2} (t), \\ \frac{d p_{3} (t)}{d t} = λ_{43} p_{2} (t) + λ_{45} p_{4} (t) - λ_{34} p_{3} (t), \\ \frac{d p_{4} (t)}{d t} = λ_{51} p_{0} (t) - (λ_{35} + λ_{45}) p_{4} (t), \\ \sum_{i = 0}^{4} p_{i} (t) = 1 . \end{array}\right.

(4)

Each equation in the system (4) describes the change in the probability of the SDN being in the i-th state, depending on the available transitions from this state to other states and vice versa.
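A system of this kind can be integrated numerically once the intensities are known. The following minimal sketch does so for a generic five-state continuous-time Markov chain using the classical fourth-order Runge–Kutta method. The transition structure and all numerical intensities in `HYPOTHETICAL_RATES` are assumptions made purely for illustration; they are not the values of Table 3 and do not reproduce the graph of Figure 3. The helper names `make_generator`, `rhs` and `solve` are likewise hypothetical.

```python
import math


def make_generator(rates, n):
    """Build the generator matrix Q from {(i, j): intensity}; each diagonal entry is minus the row sum."""
    Q = [[0.0] * n for _ in range(n)]
    for (i, j), lam in rates.items():
        Q[i][j] = lam
    for i in range(n):
        Q[i][i] = -math.fsum(Q[i][j] for j in range(n) if j != i)
    return Q


def rhs(p, Q):
    """Right-hand side of the Kolmogorov forward equations: dp_j/dt = sum_i p_i * Q[i][j]."""
    n = len(p)
    return [math.fsum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]


def solve(p0, Q, t_end, steps=10000):
    """Integrate dp/dt = p Q from 0 to t_end with a fixed-step RK4 scheme."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = rhs(p, Q)
        k2 = rhs([x + 0.5 * h * k for x, k in zip(p, k1)], Q)
        k3 = rhs([x + 0.5 * h * k for x, k in zip(p, k2)], Q)
        k4 = rhs([x + h * k for x, k in zip(p, k3)], Q)
        p = [x + h / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


# Hypothetical intensities (events per unit of time) between states indexed 0..4.
HYPOTHETICAL_RATES = {
    (0, 1): 0.2, (1, 0): 1.0, (1, 2): 0.5, (2, 0): 0.8,
    (2, 3): 0.3, (3, 2): 0.6, (3, 4): 0.1, (4, 0): 0.4,
}

Q = make_generator(HYPOTHETICAL_RATES, 5)
p_final = solve([1.0, 0.0, 0.0, 0.0, 0.0], Q, t_end=50.0)
print([round(x, 4) for x in p_final])
```

Because every row of the generator matrix sums to zero, the integration preserves the normalization condition, so the printed probabilities sum to one up to rounding error.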

SDN resilience is a fairly broad concept, even in terms of SDN resilience in the face of CAs, because, as mentioned above, the emergence of new information security threats is a constant process. Therefore, resilience should be considered as the ability of a distributed corporate network using SDN technology to withstand a certain class of CAs. S₄ is the state of SDN operation in case of successful CAs. That is, the probability of the transition of a distributed corporate network using SDN technology to this state is the inverse of network resilience.

Thus, the flowchart of the proposed technique for assessing the resilience of a distributed SDN under CAs is shown in Figure 4. The technique contains four main stages: (1) the formation of initial data; (2) obtaining the probability values of the system being in state S₄ for the generated data; (3) correction of set parameters; (4) assessment of the impact of the adjusted initial data on the stability of the system under study.

Once the most relevant attacks for a corporate SDN have been determined, the technique allows users to determine the degree of its resilience. The results obtained and the conclusions based on them make it possible to obtain an adequate assessment of the SDN resilience, under simulated conditions, to attacks characteristic of this network.

However, before evaluating the resilience of SDN, one must first assess the risk of CAs. To do this, it is initially necessary to build mathematical models of CAs based on the verbal model.

3.2. Examples of CA Reference Models

The SDN is an open network architecture that has been proposed in recent years to address some of the key weaknesses of traditional data networks. SDN proponents argue that network management logic and network functions are two separate concepts, and therefore should be separated into different layers. To this end, the concepts of control plane and data plane were introduced into SDN: a centralized control plane (otherwise called the controller) manages the logic of the network and controls traffic engineering functions from the data plane (called switches), which simply forward packets between networks.

Thus, SDN can be viewed as a physically distributed switching structure with logically centralized control. It is designed to provide highly dynamic management and QoS/security policies.

Consider the process of CAs implementation against SDN (Table 4).

To ensure the basic requirements for a model with a given level of accuracy and to obtain the most reliable probabilistic temporal characteristics, it is required to develop a complex model of the CA against the SDN, consisting of a verbal CA model of the SDN, a mathematical model and a simulation model.

To develop a mathematical model, it is proposed to use the reference CA models and the TTSN method.

3.2.1. Verbal Model of CAs against SDN

For SDN, there are a number of specific CAs aimed at the planes implemented within the framework of this network construction technology. The classification of SDN-specific attacks is presented in Table 5.

Most network attacks use techniques such as: spoofing; man in the middle; tampering; repudiation; information disclosure; DoS.

The approach implemented within the framework of SDN is already being implemented in many projects of both equipment manufacturers and telecom operators. In this regard, the operation of this approach revealed certain problems in ensuring the resilience of the SDN architecture (Table 6).

Cyber attacks against SDN elements are implemented in the form of targeted hardware and software impacts, leading to a violation or reduction of the efficiency of technological cycles.

3.2.2. Model of the “Substitution of Network Topology” Attack against SDN

To create a mathematical model of a CA of the “Substitution of network topology” type, let us imagine the scenario of its implementation in the form of a sequence of actions (Table 7).

The result of the “Substitution of network topology” cyber attack is the work of an attacker impersonating a trusted network device using identification data obtained by technical computer intelligence.

Let us represent the CA implementation process described above as a stochastic network (Figure 5). Each node of this network is denoted by a diamond and corresponds to the CA stage presented in Table 7. Transitions between nodes occur with the probabilities indicated in Table 7 [13].

To determine the equivalent function, the concept of a closed stochastic network is introduced, as well as loops of the first and k-th orders [10,11,12].

The equivalent function of the k-th order loop is defined as

Q_{k} (s) = \prod_{i = 1}^{k} Q_{i} (s),

(5)

where $Q_{i} (s)$ is the equivalent function of the i-th loop of the first order, defined as the product of the equivalent functions of the branches included in this loop.

Let us transform the original stochastic network into a closed one (Figure 6).

This allows us to use the topological Mason’s equation [60] for closed graphs to determine the equivalent function of the original network

H = 1 + \sum_{k = 1}^{K} (- 1)^{k} Q_{k} (s) = 0,

(6)

where K is the maximum order of loops included in the stochastic network.

After the stochastic network is closed by a fictitious branch with the equivalent function

Q_{a} (s) = 1 / h (s),

where h(s) is the desired equivalent function, we define all the loops.

Loops of the first order:

w (s) \cdot m (s) \cdot l (s) \cdot P_{Π} \cdot d (s) / h (s); (1 - P_{Π}) \cdot z (s) \cdot l (s) .

(7)

There are no loops of the second and higher orders.

Then, Equation (6) can be written as

1 - w (s) \cdot m (s) \cdot l (s) \cdot P_{Π} \cdot d (s) / h (s) - (1 - P_{Π}) \cdot z (s) \cdot l (s) = 0 .

(8)

The equivalent function in this case is

h (s) = \frac{w (s) \cdot m (s) \cdot l (s) \cdot P_{Π} \cdot d (s)}{1 - (1 - P_{Π}) \cdot z (s) \cdot l (s)} .

(9)
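The process encoded by Equation (9) can be cross-checked by direct simulation. The following is a minimal Monte Carlo sketch, not the authors' tooling. It assumes the branch structure read off from Equation (9): stages w, m and d are passed once, stage l is repeated until it succeeds with probability P_Π, and each failure adds one stage z before the retry. It also assumes the exponential stage durations introduced in Equation (10). The function name and the parameter values in the usage note are illustrative only.

```python
import numpy as np

def simulate_attack_times(w, m, l, d, z, p, n_runs=200_000, seed=0):
    """Sample total attack durations for the network implied by Equation (9).

    w, m, l, d, z are stage rates (1 / mean stage time); p is the success
    probability of stage l. All names are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Stages that are passed exactly once: w, m and d.
    total = (rng.exponential(1.0 / w, n_runs)
             + rng.exponential(1.0 / m, n_runs)
             + rng.exponential(1.0 / d, n_runs))
    # Number of passes through stage l until the first success.
    passes = rng.geometric(p, n_runs)
    # A sum of `passes` exponential durations is Gamma-distributed.
    total += rng.gamma(shape=passes, scale=1.0 / l)
    # Every failed pass adds one stage z; runs without failures add nothing.
    failures = passes - 1
    z_time = rng.gamma(shape=np.maximum(failures, 1), scale=1.0 / z)
    total += np.where(failures > 0, z_time, 0.0)
    return total
```

For example, `simulate_attack_times(1.0, 0.5, 0.25, 2.0, 0.1, 0.8).mean()` should be close to the first moment implied by Equation (9) for these rates, 1/w + 1/m + 1/d + (1/l + (1 - P)/z)/P = 11 time units, up to sampling noise.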

To determine the calculated ratio of the distribution function, we assume that

\left\{\begin{matrix} W (t) = 1 - \exp [- w t]; \\ M (t) = 1 - \exp [- m t]; \\ L (t) = 1 - \exp [- l t]; \\ D (t) = 1 - \exp [- d t]; \\ Z (t) = 1 - \exp [- z t], \end{matrix}\right.

(10)

where

w = 1 / {\bar{t}}_{mem}, \quad m = 1 / {\bar{t}}_{tac}, \quad l = 1 / {\bar{t}}_{tr}, \quad d = 1 / {\bar{t}}_{re}, \quad z = 1 / {\bar{t}}_{repl},

and

{\bar{t}}_{mem}, \; {\bar{t}}_{tac}, \; {\bar{t}}_{tr}, \; {\bar{t}}_{re}, \; {\bar{t}}_{repl}

are the average implementation times of the corresponding CA stages.

Using the Laplace transform, we find images of the distribution density functions of the execution time of the k-th CA process:

\begin{matrix} l (s) = \int_{0}^{\infty} \exp (- s t) d [L (t)] = \frac{l}{l + s}; \\ d (s) = \int_{0}^{\infty} \exp (- s t) d [D (t)] = \frac{d}{d + s}; \\ z (s) = \int_{0}^{\infty} \exp (- s t) d [Z (t)] = \frac{z}{z + s}; \\ w (s) = \int_{0}^{\infty} \exp (- s t) d [W (t)] = \frac{w}{w + s}; \\ m (s) = \int_{0}^{\infty} \exp (- s t) d [M (t)] = \frac{m}{m + s} . \end{matrix}

(11)

After substituting Expression (10) into Expression (11) and the results obtained into Expression (9), we obtain

h (s) = \frac{w \cdot m \cdot l \cdot P_{Π} \cdot d \cdot (z + s)}{(w + s) \cdot (d + s) \cdot (m + s) \cdot [(l + s) \cdot (z + s) - (1 - P_{Π}) \cdot z \cdot l]} .

(12)

To simplify calculations, we define:

\begin{array}{l} A = d + l + m + w + z, \\ B = C_{2} + w \cdot (d + l + m + z), \\ C = (w + d) \cdot C_{2} + l \cdot C_{1}, \\ D = w \cdot [d \cdot C_{1} + l \cdot [d \cdot (m + z) + C_{1}]] + d \cdot l \cdot C_{1}, \end{array}

where

C_{1} = m \cdot z - (1 - P_{Π}) \cdot m \cdot z

;

C_{2} = [l \cdot (d + m + z) + d \cdot (m + z) + C_{1}]

.

In order to determine the original of the equivalent Function (12), we use [61]:

h (s) = \sum_{k = 1}^{n} \frac{f (s_{k})}{φ^{'} (s_{k})} \cdot \frac{1}{s - s_{k}} = \sum_{k = 1}^{5} \frac{w \cdot m \cdot l \cdot P_{Π} \cdot d \cdot (z + s_{k})}{5 s_{k}^{4} + 4 A \cdot s_{k}^{3} + 3 B \cdot s_{k}^{2} + 2 C \cdot s_{k} + D} \cdot \frac{1}{s - s_{k}} .

(13)

Then

h (t) = L^{- 1} \{h (s)\} = \sum_{k = 1}^{5} \frac{w \cdot m \cdot l \cdot P_{Π} \cdot d \cdot (z + s_{k})}{5 s_{k}^{4} + 4 A \cdot s_{k}^{3} + 3 B \cdot s_{k}^{2} + 2 C \cdot s_{k} + D} \cdot \exp [s_{k} t] .

(14)

The resulting expression is the probability density function of the CA implementation time. Therefore, the desired integral probability distribution function is defined as follows

F (t) = \int_{0}^{t} h (t) d t = \sum_{k = 1}^{5} \frac{w \cdot m \cdot l \cdot P_{Π} \cdot d \cdot (z + s_{k})}{5 s_{k}^{4} + 4 A \cdot s_{k}^{3} + 3 B \cdot s_{k}^{2} + 2 C \cdot s_{k} + D} \cdot \frac{1 - \exp [s_{k} t]}{- s_{k}},

(15)

and the average time

{\bar{t}}_{CA}

, spent on the implementation of CA, is determined as follows

{\bar{t}}_{CA} = \int_{0}^{\infty} t \cdot h (t) d t = \sum_{k = 1}^{5} \frac{w \cdot m \cdot l \cdot P_{Π} \cdot d \cdot (z + s_{k})}{5 s_{k}^{4} + 4 A \cdot s_{k}^{3} + 3 B \cdot s_{k}^{2} + 2 C \cdot s_{k} + D} \cdot \frac{1}{(- s_{k})^{2}} .

(16)
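Equations (12) to (16) can be evaluated numerically once the stage rates are known. The sketch below is one possible way to do this with NumPy and is not taken from the article. It builds the numerator and denominator polynomials of Equation (12), finds the poles s_k as the roots of the denominator, and applies the residue formulas of Equations (13), (15) and (16). It assumes that all five poles are distinct, as the partial-fraction expansion requires. The function names and the rate values in the usage note are illustrative.

```python
import numpy as np

def ttsn_poles_and_residues(w, m, l, d, z, p):
    """Poles s_k and coefficients f(s_k) / phi'(s_k) of h(s) in Equation (12)."""
    # Numerator f(s) = w * m * l * P * d * (z + s).
    numerator = np.poly1d([1.0, z]) * (w * m * l * p * d)
    # Loop factor (l + s)(z + s) - (1 - P) * z * l.
    loop = np.poly1d([1.0, l]) * np.poly1d([1.0, z]) - (1.0 - p) * z * l
    # Denominator phi(s) = (w + s)(d + s)(m + s) * loop, a degree-5 polynomial.
    denominator = (np.poly1d([1.0, w]) * np.poly1d([1.0, d])
                   * np.poly1d([1.0, m]) * loop)
    poles = denominator.roots
    residues = numerator(poles) / denominator.deriv()(poles)
    return poles, residues

def distribution_function(t, poles, residues):
    """F(t) from Equation (15), evaluated for a scalar or array of times."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    terms = residues * (1.0 - np.exp(np.outer(t, poles))) / (-poles)
    return terms.sum(axis=1).real

def mean_attack_time(poles, residues):
    """Average CA time from Equation (16)."""
    return float(np.sum(residues / poles**2).real)
```

With the illustrative rates w = 1, m = 0.5, l = 0.25, d = 2, z = 0.1 and P_Π = 0.8, `mean_attack_time` should agree with the first moment obtained directly from Equation (9), 1/w + 1/m + 1/d + (1/l + (1 - P_Π)/z)/P_Π = 11 time units, and `distribution_function` should approach 1 for large t.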

3.2.3. Model of the “Hacking/Crashing Controller” Attack against SDN

Let us now consider the process of successive CAs against the transmission plane. An SDN transmission plane is a collection of edge routers.

Consider the CA in the case when the intruder aims to disrupt the information exchange between the local structures of the corporate network without knowing the features of its functioning. The CA process begins with sending packets of various protocols to the SDN router to determine the response of routers in time

{\bar{t}}_{r e}

with the distribution function M(t). Based on the reaction of the network device to requests, the vulnerabilities are determined in time

{\bar{t}}_{s e n}

with the distribution function B(t). Then, the decision-making system makes a decision to conduct the “Heavy Packet” CA in time

{\bar{t}}_{r}

with the function L(t). If the attack leads to disruption of the SDN operation and the probability of this event is equal to P_n, then the attacker proceeds to eliminate CA traces in time

{\bar{t}}_{s t}

with the distribution function N(t). With the complementary probability P_pr = 1 − P_n, the protection tools will detect an attempt to carry out an attack and start the process of network reconfiguration. In this case, the attack process will start again from the moment the decision is made with the choice of the next attack of the universally unique identifier (UUID) flooding type (spam sending by the universal number of the attacked switch) in time

{\bar{t}}_{r 2}

with the distribution function U(t). If the attack leads to disruption of the SDN, and the probability of this event is equal to P_n2, then the attacker proceeds to eliminate the CA traces in time

{\bar{t}}_{s t}

with the distribution function N(t).

The probability that the protection tools will detect an attack attempt and start the network reconfiguration process is P_pr = 1 − P_n2. In this case, the attack process will start again from the moment the decision is made, and the next attack of the “Sending a large number of packets” type will be chosen in time

{\bar{t}}_{r 3}

with the distribution function G(t). If the attack leads to disruption of the SDN and the probability of this event is equal to P_n3, then the attacker proceeds to eliminate traces of the CA in time

{\bar{t}}_{s t}

with the distribution function N(t).

The probability that the protection tools will detect an attack attempt and start the network reconfiguration process is P_pr = 1 − P_n3. In this case, the system will start the attack process again from the moment the decision was made.

The process of conducting the CA to the data transmission plane in the SDN described above can be represented as a stochastic network shown in Figure 7.

To use the method of topological transformation and obtain an equivalent function that preserves in its structure the distribution parameters and the logic of interaction of elementary random processes [13,14], we transform the resulting stochastic network into a closed one (Figure 8).

Using the Laplace transform [62], we define images of the density functions for the distribution of the execution time of the k-th CA process:

\begin{matrix} l (s) = \int_{0}^{\infty} \exp (- s t) d [L (t)] = \frac{l}{l + s}; \\ b (s) = \int_{0}^{\infty} \exp (- s t) d [B (t)] = \frac{b}{b + s}; \\ u (s) = \int_{0}^{\infty} \exp (- s t) d [U (t)] = \frac{u}{u + s}; \\ g (s) = \int_{0}^{\infty} \exp (- s t) d [G (t)] = \frac{g}{g + s}; \\ n (s) = \int_{0}^{\infty} \exp (- s t) d [N (t)] = \frac{n}{n + s}; \\ m (s) = \int_{0}^{\infty} \exp (- s t) d [M (t)] = \frac{m}{m + s} . \end{matrix}

(17)

By substituting the obtained expressions into the equivalent function of the stochastic network of the CAs against the SDN control plane and passing to the original space, we obtain:

\begin{matrix} F (t) = [\sum_{k = 1}^{5} \frac{m \cdot b \cdot l \cdot P_{n} \cdot n (z + s 1_{k})}{φ^{'} (s 1_{k})} \cdot \frac{1 - \exp [s 1_{k} t]}{- s 1_{k}}] + [\sum_{k = 1}^{5} \frac{m \cdot b \cdot u \cdot P_{n 2} \cdot n (z + s 2_{k})}{φ^{'} (s 2_{k})} \cdot \frac{1 - \exp [s 2_{k} t]}{- s 2_{k}}] + \\ + [\sum_{k = 1}^{5} \frac{m \cdot b \cdot g \cdot P_{n 3} \cdot n (z + s 3_{k})}{φ^{'} (s 3_{k})} \cdot \frac{1 - \exp [s 3_{k} t]}{- s 3_{k}}] - [\sum_{k = 1}^{5} \frac{m \cdot b \cdot l \cdot P_{n} \cdot n (z + s 1_{k})}{φ^{'} (s 1_{k})} \cdot \frac{1 - \exp [s 1_{k} t]}{- s 1_{k}}] \times \\ \times [\sum_{k = 1}^{5} \frac{m \cdot b \cdot u \cdot P_{n 2} \cdot n (z + s 2_{k})}{φ^{'} (s 2_{k})} \cdot \frac{1 - \exp [s 2_{k} t]}{- s 2_{k}}] \cdot [\sum_{k = 1}^{5} \frac{m \cdot b \cdot g \cdot P_{n 3} \cdot n (z + s 3_{k})}{φ^{'} (s 3_{k})} \cdot \frac{1 - \exp [s 3_{k} t]}{- s 3_{k}}] . \end{matrix}

(18)

\begin{matrix} {\bar{t}}_{CA} = [\sum_{k = 1}^{5} \frac{m \cdot b \cdot l \cdot P_{n} \cdot n (z + s 1_{k})}{φ' (s 1_{k})} \cdot \frac{1}{(- s 1_{k})^{2}}] + [\sum_{k = 1}^{5} \frac{m \cdot b \cdot u \cdot P_{n 2} \cdot n (z + s 2_{k})}{φ' (s 2_{k})} \cdot \frac{1}{(- s 2_{k})^{2}}] + \\ + [\sum_{k = 1}^{5} \frac{m \cdot b \cdot g \cdot P_{n 3} \cdot n (z + s 3_{k})}{φ' (s 3_{k})} \cdot \frac{1}{(- s 3_{k})^{2}}] - [\sum_{k = 1}^{5} \frac{m \cdot b \cdot l \cdot P_{n} \cdot n (z + s 1_{k})}{φ^{'} (s 1_{k})} \cdot \frac{1}{(- s 1_{k})^{2}}] \times \\ \times [\sum_{k = 1}^{5} \frac{m \cdot b \cdot u \cdot P_{n 2} \cdot n (z + s 2_{k})}{φ' (s 2_{k})} \cdot \frac{1}{(- s 2_{k})^{2}}] \cdot [\sum_{k = 1}^{5} \frac{m \cdot b \cdot g \cdot P_{n 3} \cdot n (z + s 3_{k})}{φ^{'} (s 3_{k})} \cdot \frac{1}{(- s 3_{k})^{2}}] . \end{matrix}

(19)

Thus, the integral distribution functions and the average time

{\bar{t}}_{CA}

of the implementation of the CA are determined. It is clear that these calculations are made for each CA stage. In order to obtain normalized values for the probabilistic temporal characteristics of CAs, an SDN simulation model has been developed.

4. Experimental Results

4.1. Description of the Simulation Stand

In order to obtain initial data about the implementation of the CAs, an integrated computer simulation model of a data transmission network using SDN was developed (Figure 9). This model covers a network connecting the main office (Main Office) with four subordinate offices (from Office 1 to Office 4). Routers (from rtk 1001 to rtk 1007) provide VPN services to the base stations (from base station 1 to base station 4) that provide communication channels between offices. SDN technology network devices are designated as OvS. The place through which a computer attack is made is base station 4 (marked as AttackEnemy).

The model was developed in order to study the resilience of SDN under CA conditions, taking into account the features of its functioning (information exchange, network equipment settings, the use of monitoring tools, etc.). The distinguishing feature of this model is its comprehensiveness. Firstly, to reproduce SDN technology in a virtual environment, images of real network devices of telecom operators were used, as well as images of OpenFlow switches and the Runos 2 controller. Secondly, to simulate the load, programs designed for information exchange with real equipment were used instead of packet generators. Thirdly, several software products were used in parallel to collect and analyze traffic, which makes it possible to assess the reliability of the obtained network statistics. Finally, during the planning and conducting of the CA stages, the average time for their implementation was taken into account, which was calculated over a given number of experiments.

The model was developed on the basis of the EVE-NG virtual network laboratory. Unlike the well-known Mininet virtual network environment simulation program [63], the EVE-NG system allows you to create models using real hardware and software to simulate the real functioning of SDN. For this reason, images of real equipment devices used in corporate data transmission networks were added to the model: Dionis-NX, Juniper, Cisco, Huawei and others. Linux images (Ubuntu) with installed software routers OpenvSwitch were used for SDN modeling, and the Runos 2.0 controller model was used as the controller image.

Table 8 lists the software products, device images and real equipment used in the simulation computer model of the data transmission network applying SDN technology.

To collect and process network statistics, Nfsen2 (Netflow Sensor) [64] and Wireshark [65] software products, as well as the Zabbix [66] network monitoring tool, were installed on a separate computer.

External devices are connected to the virtual model of the data transmission network. External devices are virtual or real traffic generation devices. The model uses workstations running the Ubuntu Linux operating system, with the help of which videoconferencing is organized, as well as the image of the Proteus-SP [67] automatic telephone exchange, which simulates the exchange of information between users via IP telephony and messaging through a mail server.

4.2. An Example of an Attack Simulation Model against SDN

After the simulation stand has been set up and the logical structure of the SDN and the CAs has been described, it is necessary to proceed to the formation of the simulation model of CAs. Consider an example of a simulation model of the CA against SDN. To do this, each stage of the attack implementation process is characterized by an average execution time. To conduct experiments and obtain the average times for the CA stages, it is proposed to use the EVE-NG tools for virtual modeling of computer networks.

To carry out a remote CA, an attacker can be located at any point that has a connection to a public communication system. The CA stages can be summarized in the diagram shown in Figure 10.

The choice of the CA type is influenced by two main factors (Figure 11):

The goals and objectives of the upcoming CA;
Information about the data transmission network on which the CA implementation is based.

Thus, the first factor for creating a CA simulation model on SDN is the attack goals and objectives. Let us say the goal of the CA is to obtain control or partial control of the data network to collect and analyze the transmitted traffic. One type of CA in SDN to achieve this kind of goal is a network topology spoofing attack.

The second factor is the parameters that characterize the operation of SDN devices. These include standard IP/MAC addresses of network devices, unique numbers (UUID) of a software switch/controller, an SSL key and others. Figure 12 shows the main parameters of OpenvSwitch v. 2.9.8.

Having received the necessary data, the attacker can conduct a CA by making all the necessary settings for the device/devices from which it is supposed to carry out CAs such as “Substitution of network topology” (Figure 13). The attack implementation steps shown in this figure correspond to those in Table 9.

During the experiments, network traffic data were obtained. The result of the CA was the work of an attacker impersonating a trusted network device using identification data obtained by technical computer intelligence.

To obtain adequate data in the simulation model, the model time and number of experiments should be determined. When one is conducting experiments using a simulation model, the model time will be the time spent conducting the studied CA against the SDN. It is logical that with each i-th experiment, its implementation time will be shorter than the previous one, until it reaches the value of the minimum execution time.

We define the number of experiments as necessary to fulfill the inequality

T_{i - 1} - T_{i} \leq 5 s

, i.e., the execution time of the CA process in the next experiment is shorter than that in the previous one by no more than 5 s. We also exclude the times of the first and last experiments from the formula for determining the average time of the CA: the first experiment is the longest because it is carried out in stages with parallel testing of all elements of the model, and the last one differs from the previous one by no more than 5 s. The average time of each CA stage is determined as follows (where N is the number of experiments):

T_{a v, i} = \frac{\sum_{j = 2}^{i - 1} T_{j}}{N}

(20)
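The stopping rule and the averaging of Equation (20) can be sketched in a few lines of Python. This is an illustrative reading, not the authors' procedure: `measure` is a hypothetical callable that runs one experiment and returns its duration in seconds, `max_runs` is an added safety limit, and the divisor is taken to be the number of retained experiments (those that remain after dropping the first and the last), which is an assumption about how N is counted.

```python
def run_until_stable(measure, tolerance_s=5.0, max_runs=50):
    """Repeat an experiment until the latest run is shorter than the previous
    one by no more than `tolerance_s` seconds (or `max_runs` is reached)."""
    durations = [measure()]
    while len(durations) < max_runs:
        durations.append(measure())
        if durations[-2] - durations[-1] <= tolerance_s:
            break
    return durations

def average_stage_time(durations):
    """Average over experiments 2..N-1, i.e., without the first and last run.

    Assumption: the sum is divided by the number of retained runs.
    """
    retained = durations[1:-1]
    if not retained:
        raise ValueError("at least three experiments are needed")
    return sum(retained) / len(retained)
```

For a series of measured durations 120, 90, 80, 74 and 71 s, the rule stops after the fifth run (74 − 71 ≤ 5), and the average is taken over 90, 80 and 74 s.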

By analogy, all the typical SDN attacks listed in Table 5 were implemented.

4.3. Probabilistic and Temporal Characteristics of Attacks against SDN

The simulation results are used for calculating the probabilistic temporal characteristics of attacks using TTSN.

Using the obtained values as initial data, the dependences F(t) and

{\bar{t}}_{CA}

were obtained; they are presented in Figure 14, Figure 15, Figure 16, Figure 17, Figure 18 and Figure 19. The graphs indicated in these figures as “a” show the dependence of the average time

{\bar{t}}_{CA}

on the probability of the CA implementation P_n. Graphs marked as “b” show the dependence of the integral probability distribution function F(t) on the time t spent on CA implementation. The following values of time and probability are used as initial data, corresponding to the profile model of the CAs; the average time periods for each stage of the CA are the values from Table 9: P_res = 0.2; 0.6; 0.8.

The presented dependences make it possible to determine the probability P_n(t ≤ T₃) for the implementation of each CA for a time not exceeding the specified T₃, the average time of their implementation, as well as the time corresponding to the given level of threat of their implementation. So, for example, if the probability of network disruption is P_n = 0.8, which essentially determines its accessibility to the means of the attacker’s CA, after 17 min of operation, the network will be inoperable with a probability of F(t = 17) = 0.8, while the average time of CA implementation is about 10 min (Figure 14).

4.4. Assessing the SDN Resilience under CAs Conditions

To assess the SDN resilience under CAs conditions, three network structures were designed and implemented:

Structure 1—SDN structure consisting of three elements with one controller (see Figure 2);
Structure 2—SDN structure with two controllers with separation of the control function and interception of each other’s control functions according to a given algorithm (Figure 20);
Structure 3—SDN structure with two controllers, where one controller is the main controller and performs control functions, and the second controller is in hot standby mode (Figure 21).

The calculation results are presented in the form of graphs (Figure 22, Figure 23 and Figure 24).

Analysis of the results showed that the SDN structures under the influence of Flooding attacks and “Hacking/Crashing of the controller” attacks do not meet the requirements for resilience. In order to ensure the resilience of the SDN functioning in the face of attacks, it is necessary to develop an algorithm for monitoring the state of controllers and their automatic restructuring, since after 18 min of successful CA implementation, the probability of stable SDN operation begins to tend to 0.

4.5. Creating the Fault-Tolerant SDN in CAs Environment

Considering the above, it is possible to formulate requirements for a fault-tolerant SDN control loop, which includes three main levels:

The transmission level control loop;
Infrastructure for monitoring and managing OpenFlow;
Inter controller communication infrastructure.

The complexity of the functioning of this circuit lies in the formation of conditions for initiating SDN recovery processes when various CAs are implemented by an attacker.

To increase the SDN resilience in a CA environment, a protection system is needed (Figure 25), which is based on a network recovery algorithm that distinguishes two levels: the transmission level and the management level. The main components of the proposed protection system are a database of SDN configurations and decision blocks built using a neural network.

When conducting a CA, an attacker performs a series of sequential actions, upon detection of which the protection system should alert the network administrator and also recommend a counteraction scenario. At the same time, the characteristics of the CA will differ between the SDN levels. For example, the “Hacking/Crashing of the controller” attack aimed at a software switch will be characterized by an increase in traffic passing through the controller and an increase in the delay or no response from the controller, while for the controller it will be characterized by an increase in the processor load, the number of requests and so on.

In this case, it is obvious that the response conditions for software switches and controllers should be differentiated according to CA features.

To create the SDN resilient system in a CAs environment, a client–server structure has been developed, which consists of agents operating on software routers and a server operating on a controller (Figure 26).

The OvS Agent runs on the software router, and the server of the failover system is deployed on the controller; the two interact via the OpenFlow protocol.

The OvS agent (Figure 27) is a complete SDN anomaly analyzer based on the LSTM neural network for software routers. At the same time, having determined the most probable attack scenario, the OvS Agent offers the option for the network administrator to launch a countermeasure (recovery) scenario.

Further, after the decision has been made, the selected scenario is executed by the official. The procedure is loaded from the configuration database, automatic settings are made and a connection to the backup controller is made when the complete reconfiguration mode is selected.

The server of the failover system is launched in two versions: server–master and server–slave versions. For their joint work, a synchronization service is implemented, which is responsible for switching controllers among themselves, as well as mirroring service information necessary for making decisions in a recovery scenario.
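The role of the synchronization service can be illustrated with a small sketch. It is purely hypothetical and does not reproduce the implementation described in the article: the class names, the heartbeat mechanism and the `timeout_s` parameter are assumptions introduced only to show how mirroring of service information and switching between the master and the slave could fit together.

```python
from dataclasses import dataclass, field

@dataclass
class Controller:
    name: str
    role: str                      # "master" or "slave"
    last_heartbeat: float = 0.0
    mirrored_state: dict = field(default_factory=dict)

class SyncService:
    """Hypothetical synchronization service for a master/slave controller pair."""

    def __init__(self, master, slave, timeout_s=3.0):
        self.master = master
        self.slave = slave
        self.timeout_s = timeout_s

    def heartbeat(self, now, state):
        """Record that the master is alive and mirror its service data to the slave."""
        self.master.last_heartbeat = now
        self.slave.mirrored_state = dict(state)

    def check(self, now):
        """Promote the slave if the master has been silent longer than the timeout."""
        if now - self.master.last_heartbeat > self.timeout_s:
            self.master.role, self.slave.role = "slave", "master"
            self.master, self.slave = self.slave, self.master
            self.master.last_heartbeat = now
            return True
        return False
```

In this sketch the slave always holds the last mirrored copy of the service data, so it can take decisions in a recovery scenario immediately after promotion.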

To detect the CA, the server, like the OvS Agent, uses an LSTM network, but its training data set differs from that used by the agent. Conducted CAs can be directed at various elements of the network and cause different consequences. Thus, an important factor is the absence of contradictions between the counteraction scenarios executed by the agent and by the system server, even if there is no OpenFlow control channel between them.

The server block diagram of the proposed SDN resiliency system is shown in Figure 28.

Thus, the architecture of the SDN resiliency system consists of two parts: OvS Agents, which are launched together with software routers at the SDN attachment level and are responsible for reconfiguring the network when a CA is detected using the LSTM neural network, and the system server, which is responsible for taking measures at the control level, up to the commissioning of the backup controller (slave).

4.6. Implementation of a Neural Network for CA Detection

To detect anomalies in SDN network traffic, a neural network architecture has been developed, which is based on LSTM cells (Figure 29). The neural network should return 1 if anomalies are detected in the network packet and 0 if there are no anomalies. In other words, the neural network solves the binary classification problem.

LSTM neural networks have been chosen to avoid the problem of long-term dependency. The key components of the LSTM are the state cell and the filters (gates) that protect and control the state of the cell. Filters pass information selectively, based on certain conditions. They consist of a sigmoidal neural network layer and a pointwise multiplication operation.
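The interplay of the cell state and the filters can be made concrete with a single LSTM time step written in plain NumPy. This is the standard textbook formulation, given here only as an illustration. The weight layout (forget, input, output and candidate blocks stacked in that order) and all names are assumptions and are unrelated to the trained model in the article.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    x: input vector (n_in,); h_prev, c_prev: previous hidden and cell state
    (n_hidden,); W: (4 * n_hidden, n_in); U: (4 * n_hidden, n_hidden);
    b: (4 * n_hidden,).
    """
    n = h_prev.shape[0]
    pre = W @ x + U @ h_prev + b
    f = sigmoid(pre[0:n])            # forget filter: what to keep in the cell
    i = sigmoid(pre[n:2 * n])        # input filter: what to write to the cell
    o = sigmoid(pre[2 * n:3 * n])    # output filter: what to expose as h
    g = np.tanh(pre[3 * n:4 * n])    # candidate cell content
    c = f * c_prev + i * g           # pointwise multiplications gate the state
    h = o * np.tanh(c)
    return h, c
```

Each filter is a sigmoid layer whose output in (0, 1) multiplies another vector element by element, which is exactly the "sigmoidal layer plus pointwise multiplication" structure mentioned above.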

Keras and TensorFlow libraries were used to write the classifier. Plots were built with the Matplotlib module. All calculations were made in the Jupyter Notebook environment. The neural network architecture consists of seven layers.

The input layer takes three-dimensional tensors with a length of 179 characters as the input. The output layer of the classifier has only one neuron, which determines the probabilistic belonging of the data to the class of anomalies.

The classifier is trained on pre-labeled data that includes both classes. The training lasted 200 epochs with the size of the portions (batch) of data being 128 sequences (lines) long. Figure 30 shows a graph of increasing accuracy and decreasing loss function in the process of neural network training.

To train the LSTM network, the Adam (Adaptive moment estimation) algorithm was chosen, which is considered one of the most effective optimization algorithms used in training neural networks. Unlike other well-known optimization algorithms, for example, the Root Mean Square Propagation (RMSProp) algorithm, which adapts the learning rate of each parameter using only a moving average of the squared gradients (the second moment), Adam additionally maintains a moving average of the gradients themselves (the first moment) and corrects both estimates for bias. This provides a high learning efficiency with low memory consumption. To train the LSTM model, 30 epochs were enough, with the step size at each iteration being equal to 0.001. Binary cross entropy was used as a loss function, which penalizes not only erroneous predictions, but also uncertain (noisy) predictions. In order to reduce errors of the first kind (i.e., False Positive), 10-fold cross-validation was used, the essence of which was as follows. The data set was randomly split into 10 parts. The model was trained on nine parts of the data, and the remaining part was used for testing. In total, partition, training and testing were repeated 10 times.
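The three ingredients named above (binary cross entropy, an Adam update with step size 0.001, and a 10-fold split) can be written out in NumPy as follows. This is a library-independent sketch of the standard definitions, not the Keras code used in the experiments. The default values of `beta1`, `beta2` and `eps` are the commonly used ones and are assumptions here, as are all function names.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross entropy; confident wrong predictions cost the most."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)))

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based iteration counter."""
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad**2     # second-moment estimate
    m_hat = m / (1.0 - beta1**t)                # bias correction
    v_hat = v / (1.0 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a random k-fold split."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]
```

With these defaults the very first Adam step moves each parameter by approximately the step size (0.001) regardless of the gradient magnitude, which follows from the bias-corrected ratio of the two moment estimates.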

To assess possible overfitting of the model, a graph of accuracy and recall was created (Figure 31). The intersection of the accuracy and recall curves shows the optimum in anomaly detection, at which legitimate requests are not blocked.

It can be seen from the figure that accuracy and recall tend to unity when the model is trained. This means that the number of anomaly detection errors tends to zero.

The analysis showed that the proposed algorithm copes well with ensuring SDN resilience by rearranging the operating modes of the controllers, while the threshold value can be adjusted to increase the detection accuracy.
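Adjusting the threshold can be sketched as a simple sweep over candidate values. The example below is illustrative only. It assumes that the classifier outputs a score in [0, 1] per packet and that the first curve in the comparison is read as precision (the share of raised alarms that are true anomalies), which is an interpretation rather than something the text states. The function names and the toy data in the usage note are hypothetical.

```python
import numpy as np

def precision_recall(y_true, scores, threshold):
    """Precision and recall of the rule `score >= threshold` means anomaly."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(scores) >= threshold
    tp = np.sum(y_pred & y_true)      # anomalies correctly flagged
    fp = np.sum(y_pred & ~y_true)     # legitimate traffic flagged by mistake
    fn = np.sum(~y_pred & y_true)     # anomalies that were missed
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return float(precision), float(recall)

def balanced_threshold(y_true, scores, thresholds):
    """Return the candidate threshold where precision and recall are closest."""
    best, best_gap = None, float("inf")
    for t in thresholds:
        p, r = precision_recall(y_true, scores, t)
        if abs(p - r) < best_gap:
            best, best_gap = float(t), abs(p - r)
    return best
```

For labels `[0, 0, 0, 1, 0, 1, 1, 1]` and scores `[0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.8, 0.9]`, a threshold of 0.5 gives precision and recall of 0.75 each, whereas 0.75 raises precision to 1.0 at the cost of recall dropping to 0.5.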

5. Discussion

The presented experimental data confirm the reliability and validity of the proposed approach and the possibility of using it to assess and ensure SDN resilience.

At the same time, a number of questions remain, which we will try to answer in this section. Such questions include the following:

What are the advantages and disadvantages of the proposed method?
How can one use this approach in practice in terms of intrusion detection?

5.1. Advantages and Disadvantages of the Approach

The advantages of the approach consist, first of all, of the high accuracy of the estimates obtained and their high robustness, that is, the ability to keep the result of the assessment close to the real value when the initial data vary over a fairly wide range. This advantage is determined by the nature of the mathematical calculations, which are based on stochastic networks and focus on obtaining the distribution function of a random variable, namely the time of the attack. Approaches to stochastic modeling based on Markov models have the same accuracy. However, Markov models, as a rule, require forming a state graph and subsequently solving a system of linear equations. With a large number of nodes in the network, the dimension of such a graph, and consequently of such an equation system, will be extremely large, which will significantly complicate their application to large computer networks. By contrast, the proposed approach has fairly good scalability.

Considering the possible disadvantages of the proposed approach, it should first of all be said that the need to perform a certain number of calculations, which may seem unnecessarily large, may prompt users to abandon this method in favor of other, easier mathematical approaches. Such simpler approaches include a number of approaches based on the formation of attack graphs and the use of security metrics that take into account expert knowledge [68,69]. However, the accuracy of estimates with these approaches depends on the expert’s qualifications. In the proposed method, dependence on expert knowledge is minimized. On the other hand, the need to perform a certain number of mathematical calculations in the proposed approach may deter users due to the lack of appropriate software and tools (frameworks) at a professional or commercial level. The creation of such frameworks is currently not a big problem. If we assume that such tools have appeared on the security software market, then the proposed approach significantly outperforms approaches based on expert knowledge.

5.2. Using the Approach in Practice

We initially analyzed existing distributed SDN management platforms, such as HyperFlow [70] application, Onix [71], Kandoo controller [72], OpenFlow protocol [73], ElastiCon [74], ONOS controller [75], High Availability Controller [76] and others.

The analysis showed that none of the above platforms implements SDN recovery after CAs. When control is transferred from the main controller to the backup controller after the former fails as a result of a large CA, the backup controller will also be attacked. Thus, approaches to ensuring SDN resilience are required that are able not only to ensure uninterrupted network management by providing redundancy, but also to detect and eliminate the causes of a controller failure.

The proposed approach helps to choose the most reasonable countermeasure. Knowing the place where the attack was detected and its type, the administrator, in accordance with the proposed approach, evaluates the SDN resilience under the conditions of a possible attack implementation, and the justified LSTM neural network, through the OvS Agents, monitors the state of the controllers and automatically rebuilds them.

In this case, this estimate can be refined. By weighing, on the one hand, the results of calculating the resilience and possible losses caused by a CA, and, on the other hand, the costs of implementing countermeasures, the administrator can either select the most appropriate countermeasure manually or have it selected automatically by the proposed SDN protection system under CA conditions. The trained LSTM determines the most probable attack scenario in real time and prompts the network administrator to launch a countermeasure (recovery) scenario. Further, after the decision is made, the selected scenario is executed by the official. The procedure is loaded from the configuration database, automatic settings are made and connection to the backup controller is performed when the full reconfiguration mode is selected.

In addition, in order to reduce the time for performing mathematical calculations during a detected attack, it is possible to prepare stochastic networks in advance for the most known types of attacks and build families of dependencies of the average time of their implementation on the parameters characterizing the attack scenario and SDN configuration. This will reduce the time to develop and justify a countermeasure and, ultimately, will contribute to a higher level of SDN security in general.

5.3. Evaluation of the Effectiveness and Validity of the Proposed Approach

To evaluate the effectiveness of the proposed LSTM structure, a comparative evaluation was carried out with a convolutional neural network (CNN) (Figure 32). A CNN can reduce the data preprocessing time through dimensionality reduction operations. In order to take advantage of CNNs, it is necessary to transform the network data into 2D spatially correlated data. This is done by representing each network packet as a vector. Thus, instead of convolving in 2D, convolution is performed in 1D along the time dimension, but with a 2D kernel. The training of the CNN model is shown in Figure 33 and Figure 34.
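The idea of a 1D convolution along time with a 2D kernel can be illustrated in NumPy. In this sketch, a sequence of packets is a matrix with one row per packet and one column per feature, and the kernel covers several consecutive packets and all features at once, so it can only slide along the time axis. The function name and array shapes are illustrative assumptions and do not describe the CNN evaluated in the article.

```python
import numpy as np

def temporal_conv(packets, kernel):
    """Slide a (k, n_features) kernel along the time axis of a
    (T, n_features) packet matrix and return T - k + 1 responses.

    As in most deep learning libraries, the kernel is not flipped
    (the operation is a cross-correlation).
    """
    T, n_features = packets.shape
    k, kernel_features = kernel.shape
    if kernel_features != n_features:
        raise ValueError("kernel must span all packet features")
    out = np.empty(T - k + 1)
    for t in range(T - k + 1):
        # One response per window of k consecutive packet vectors.
        out[t] = np.sum(packets[t:t + k] * kernel)
    return out
```

For a toy input of four packets with three features each and a kernel covering two packets, the function returns three responses, one per window position in time.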

As can be seen from the figures, the CNN is inferior to the LSTM. The curves are non-uniform and saw-toothed, and the training accuracy reached 0.92 and stopped growing after the 25th epoch.

Figure 35 and Figure 36 compare the CNN with the LSTM in terms of accuracy and of the loss function during neural network training.

It can be seen from the figures that training the CNN takes more time because it has to memorize rather large sequences. In contrast, the learning mechanisms of the LSTM network are more optimized and retain less information.

The validity of the proposed approach is ensured as follows. Firstly, the approach uses well-proven methods: the TTSN method, the theory of Markov processes, and deep learning based on LSTM networks. This supports the construct validity of the approach. Secondly, as mentioned above, each experiment on the simulation bench was carried out several times, so the probabilistic temporal characteristics and resilience probabilities presented in the graphs should be regarded as average values. This provides internal validity. At the same time, the proposed approach has some limitations that affect its external validity. They mainly concern the nature of the impact of cyber attacks on the SDN. The approach assumes that the attacks are, firstly, single and, secondly, have a known implementation algorithm. If the SDN is affected by multiple or combined attacks, the approach should be slightly modified by building two-level stochastic networks: the bottom level represents the stages of the individual attacks, and the top level reflects the relationships between the attacks. If the attack is unknown, the proposed approach still allows it to be detected using the LSTM network. However, applying the stochastic network and the Markov attack model to such an attack requires additional expert judgment.

6. Conclusions

The article proposed a new approach to the analytical modeling of CAs based on the TTSN method. The essence of this method is to replace the set of elementary branches of the stochastic network with one equivalent branch, followed by the determination of the equivalent function of the network, as well as the initial moments and the distribution function of the CA implementation time. The proposed approach was verified by modeling the “Substitution of network topology” and “Hacking/Crashing controller” CAs, which are among the most common and dangerous ones for SDN.

The proposed method for assessing SDN resilience under CA conditions makes it possible to determine the indicators that characterize it and substantiate its most resilient structure. The use of CA reference models makes it possible to determine the probabilistic temporal characteristics of known attacks used as initial data necessary for assessing threats and substantiating the requirements for protecting the network from CAs.

In order to ensure the resilience of SDN, a multi-agent security system is proposed, in which software routers act as agents. Using the results of CA detection and classification obtained with the LSTM neural network, this protection system makes decisions on countering the CA by reconfiguring the SDN parameters. Experimental evaluation has shown that the use of the proposed protection system can increase the resilience of SDN under CA conditions by 10 to 45 percent.

Further research directions are associated with the extension of the proposed approach to multiple and combined impacts on SDN, as well as with the construction of analytical models for the implementation of countermeasures and their integration with CA analytical models.

Author Contributions

I.K. was responsible for conceptualization and methodology; I.S. and A.P. analyzed the data; O.L. conceived and designed the experiment; all authors wrote the article. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by RSF grant #21-71-20078 at SPC RAS.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.


Figure 1. Structural diagram of a software-defined network.

Figure 2. SDN network.

Figure 3. Graph of discrete states and conditional transitions in SDN.

Figure 4. The flowchart of SDN resiliency assessment in the conditions of CAs.

Figure 5. Stochastic network for the “Substitution of network topology” attack.

Figure 6. Closed stochastic network for the “Substitution of network topology” attack.

Figure 7. Stochastic network of the CA to transmission plane in SDN.

Figure 8. Closed-loop stochastic network of the cyber attack on the transmission plane in SDN.

Figure 9. Computer simulation model of a data transmission network using SDN.

Figure 10. Generalized scheme of CA stages.

Figure 11. Main factors affecting the type of implemented CAs against SDN.

Figure 12. Basic parameters of soft switch, OpenvSwitch v. 2.9.8.

Figure 13. Carrying out an attack such as “Substitution of network topology” on devices.

Figure 14. Probabilistic temporal characteristics of the “man-in-the-middle” attack. (a) dependence of the average time on the probability of the implementation of the CA; (b) dependence of the integral probability distribution function on the time of the implementation of the CA.

Figure 15. Probabilistic temporal characteristics of the “Hacking/Crashing controller” attack. (a) dependence of the average time on the probability of the implementation of the CA; (b) dependence of the integral probability distribution function on the time of the implementation of the CA.

Figure 16. Probabilistic temporal characteristics of the “Service Chain Interference” attack. (a) dependence of the average time on the probability of the implementation of the CA; (b) dependence of the integral probability distribution function on the time of the implementation of the CA.

Figure 17. Probabilistic temporal characteristics of the “Internal Storage Abuse” attack. (a) dependence of the average time on the probability of the implementation of the CA; (b) dependence of the integral probability distribution function on the time of the implementation of the CA.

Figure 18. Probabilistic temporal characteristics of the DoS attack. (a) dependence of the average time on the probability of the implementation of the CA; (b) dependence of the integral probability distribution function on the time of the implementation of the CA.

Figure 19. Probabilistic temporal characteristics of the “Scalability and availability” attack. (a) dependence of the average time on the probability of the implementation of the CA; (b) dependence of the integral probability distribution function on the time of the implementation of the CA.

Figure 20. SDN structure with two controllers with separation of control functions and interception of each other’s control functions according to a given algorithm.

Figure 21. SDN structure with two controllers, when one controller is the main one and performs all control functions.

Figure 22. Dependence of the SDN resilience probability on the CA implementation time for Structure 1.

Figure 23. Dependence of the SDN resilience probability on the CA implementation time for Structure 2.

Figure 24. Dependence of the SDN resilience probability on the CA implementation time for Structure 3.

Figure 25. A variant of creating the SDN protection system in the CA conditions.

Figure 26. Generalized structure of the system for ensuring resilience of the SDN segment.

Figure 27. Flowchart of the OvS Agent operation when CA features are detected.

Figure 28. Flowchart of the server operation of the SDN resiliency system.

Figure 29. Classifier tree structure.

Figure 30. Graph of increasing accuracy and decreasing loss function in neural network training.

Figure 31. Graph of accuracy and recall of model training.

Figure 32. Convolutional neural network model.

Figure 33. Graph of increasing accuracy and decreasing loss function in the CNN training.

Figure 34. Graph of precision and recall of CNN training.

Figure 35. Comparative evaluation of CNN and LSTM accuracy.

Figure 36. Loss function in training CNN and LSTM models.

Table 1. SDN levels.

Processing Level	Where Does It Start	Performance Indicators	Types of Processes and Tasks
Control Plane	CPU of the controller	Thousands of packets per second	Routing protocols (e.g., OSPF, IS-IS and BGP), Spanning Tree, SYSLOG, AAA (Authentication, Authorization, Accounting), NDE (NetFlow Data Export), CLI (Command Line Interface) and SNMP
Data Plane	Dedicated hardware ASIC	Millions or billions of packets per second	L2 and L3 switching (IPv4/IPv6), MPLS forwarding, VRF forwarding, QoS (Quality of Service) marking, policing, NetFlow collection and ACL (Access Control Lists)

Table 2. Conditional discrete states of a corporate SDN in the conditions of cyber attacks.

State Symbol	Description of the Conditional Discrete State
S₁	Stable resilient operation without failures
S₂	Functioning under the conditions of technical computer intelligence (the malefactor collects information on the CA target)
S₃	Functioning under the conditions of CAs against SDN
S₄	Functioning in case of a successful attack (successful connection to the attacked network and gaining access to the attacked controller)
S₅	Anomaly detection in the network, CA detection and elimination of the consequences of a successful attack

Table 3. SDN state transitions.

Designation	Description
λ₁₂	The presence of conditions for connecting an external intruder (for example, using a public network)
λ₂₃	Obtaining sufficient necessary information to carry out a CA
λ₃₂	Failed CA without detection of the attacker’s actions by the network security administrator
λ₃₄	Successful completion of the attack
λ₄₃	Loss of the access obtained in a successful CA, caused by preventive actions, without detection of the attacker
λ₃₁	Unsuccessful CA with detection of the intruder’s actions by the network security administrator
λ₄₅	Detection of anomalies in the behavior of network devices, in network traffic and on the basis of other parameters that indicate a CA
λ₅₁	Rebooting network devices using new parameters unknown to the malefactor
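
The states of Table 2 and the transitions of Table 3 define a continuous-time Markov chain, and its stationary state probabilities can be estimated as in the following Python sketch. The numerical intensities are hypothetical placeholders, since this table does not give their values, and the solution technique (uniformization followed by power iteration) is a simple substitute chosen for illustration rather than the article's own calculation.

```python
# Hypothetical transition intensities (per unit time), keyed by (from, to) state.
RATES = {
    (1, 2): 0.5,  # lambda_12
    (2, 3): 1.0,  # lambda_23
    (3, 2): 0.3,  # lambda_32
    (3, 4): 0.4,  # lambda_34
    (4, 3): 0.2,  # lambda_43
    (3, 1): 0.3,  # lambda_31
    (4, 5): 0.8,  # lambda_45
    (5, 1): 2.0,  # lambda_51
}
N_STATES = 5


def generator_matrix(rates, n):
    """Build the generator: off-diagonal intensities, diagonal = minus row sum."""
    q = [[0.0] * n for _ in range(n)]
    for (i, j), lam in rates.items():
        q[i - 1][j - 1] = lam
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q


def stationary_distribution(q, tol=1e-12, max_iter=100_000):
    """Stationary probabilities via uniformization and power iteration."""
    n = len(q)
    big = 1.1 * max(-q[i][i] for i in range(n))  # uniformization rate
    p = [[(1.0 if i == j else 0.0) + q[i][j] / big for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


Q = generator_matrix(RATES, N_STATES)
pi = stationary_distribution(Q)
for state, prob in enumerate(pi, start=1):
    print(f"P(S{state}) = {prob:.4f}")
```

The probability of S₁ obtained this way indicates how much of the time the network spends in stable operation for the assumed intensities; it is an illustrative quantity and should not be read as the article's resilience indicator.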

Table 4. The process of CAs on SDN.

Stages of Impact Implementation	Basic Execution Methods
Collection of information	The first stage of the attack implementation is the collection of information about the attacked system or node. It includes such actions as determining the network topology, the type and version of the operating system of the attacked node, as well as available network and other services, and so on. These actions are implemented in various ways.
Exploring the environment	At this stage, the attacker explores the network environment around the intended target of the attack. Such areas, for example, include the hosts of the “victim’s” Internet provider or the hosts of the remote office of the attacked company. At this stage, the attacker may be trying to determine the addresses of “trusted” systems (for example, the partner’s network) and nodes that are directly connected to the target of attack (for example, the ISP router), etc. Such actions are quite difficult to detect, since they are performed over a sufficiently long period of time and outside the area controlled by security measures (firewalls, intrusion detection systems, etc.)
Network topology identification	There are two main methods for determining the network topology used by attackers: (1) TTL modulation; (2) recording the route.
Node identification	Host identification is usually achieved by sending the ICMP ECHO_REQUEST command using the ping utility. The ECHO_REPLY response message indicates that the node is available. There are free programs that automate and speed up the process of identifying a large number of nodes in parallel, such as fping or nmap. The danger of this method is that ECHO_REQUEST requests are not logged by the standard means of the node. To detect them, one needs to use traffic analysis tools, firewalls or Intrusion Detection Systems (IDS).
Service identification or port scanning	Identification of services, as a rule, is carried out by detecting open ports (port scanning). These ports are very often associated with services based on the TCP or UDP protocols. For example, open port 80 implies a web server; port 25, an SMTP mail server; port 31337, the server part of the Back Orifice Trojan horse; and port 12345 or 12346, the server part of the NetBus Trojan horse.
Operating system identification	The main mechanism for remote OS determination is the analysis of responses to requests, taking into account different implementations of the TCP/IP stack in various operating systems. Each OS implements the TCP/IP protocol stack in its own way, which makes it possible to determine which OS is installed on a remote host using special requests and responses. Another, less effective and extremely limited, way to identify the OS of a node is to analyze the network services found in the previous step. For example, open port 139 allows one to conclude that the remote host is most likely running an OS of the Windows family. Various programs, for example, nmap or queso, can be used to determine the OS.
Determining the role of a host	The penultimate step at the stage of collecting information about the attacked host is to determine its role, for example, performing the functions of a firewall or a Web server. This step is performed on the basis of already collected information about active services, host names, network topology and so on. For example, an open port 80 may indicate the presence of a Web server, blocking an ICMP packet indicates a potential presence of a firewall, and the DNS host name proxy.domain.ru or fw.domain.ru is self-explanatory.
Identify host vulnerabilities	The last step is to look for vulnerabilities. At this step, the attacker either manually determines the vulnerabilities that can be used to implement an attack or uses various automated tools. Shadow Security Scanner, nmap, Retina and others can be used as automated tools.
Implementation of the attack	From this moment, an attempt to access the attacked node begins. Access can be either direct, i.e., penetration into the host, or indirect, for example, when implementing a denial-of-service (DoS) attack. In the case of direct access, the implementation of an attack can be further divided into two stages: penetration and establishing control.
Targets of attacks	It should be noted that the attacker at the second stage can pursue two goals. The first is obtaining unauthorized access to the node itself and the information contained on it. The second is gaining unauthorized access to a node in order to carry out further attacks on other nodes. The first goal, as a rule, can be achieved only after the second one. That is, the attacker first creates a base for further attacks, and only after that can they penetrate other nodes. This is necessary in order to hide the source of the attack or make it significantly harder to find.
Completion of the attack	The stage of completion of the attack is “covering up the tracks” on the part of the attacker. This is usually achieved by deleting relevant entries from the node’s logs and other actions that return the attacked system to its original, “pre-attacked” state.
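The node identification row notes that ECHO_REQUEST probes are only visible to traffic analysis tools or an IDS. The following is a minimal sketch of such a check, not a method taken from the article: it flags a source that sends echo requests to many distinct destinations within a short time window. The function name, the record format `(timestamp, src, dst)`, the window length, the threshold, and the sample addresses are all assumptions made for illustration.

```python
from collections import defaultdict


def detect_ping_sweeps(events, window=10.0, threshold=5):
    """Return sources that sent ICMP echo requests to at least `threshold`
    distinct destinations within any `window`-second interval.

    `events` is an iterable of (timestamp, src, dst) tuples sorted by time.
    The record format is a hypothetical one chosen for this sketch.
    """
    recent = defaultdict(list)  # src -> [(timestamp, dst), ...] inside the window
    flagged = set()
    for ts, src, dst in events:
        entries = recent[src]
        entries.append((ts, dst))
        # Drop records that have fallen out of the sliding window.
        while entries and ts - entries[0][0] > window:
            entries.pop(0)
        if len({d for _, d in entries}) >= threshold:
            flagged.add(src)
    return flagged


# Hypothetical capture: one host probes six addresses in under three seconds,
# another host pings a single gateway repeatedly.
events = [(0.5 * i, "10.0.0.99", f"10.0.0.{i + 1}") for i in range(6)]
events += [(float(i), "10.0.0.5", "10.0.0.1") for i in range(6)]
events.sort()

print(detect_ping_sweeps(events))
```

Counting distinct destinations rather than raw packets keeps ordinary repeated pings to one host from being flagged. A slow sweep spread over a long period, as described for the environment exploration stage, would still evade a short window.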

Table 5. Classification of attacks specific to SDN.

SDN Plane	Threat/Attack	Description
1. Data	1.1. Flooding attacks	Switch flow tables contain only a limited number of flow rules
	1.2. “Man-in-the-middle” attacks	Active eavesdropping, in which the attacker establishes independent connections with the victims; possible because TLS is an optional add-on rather than a mandatory part of the standard
	1.3. Hacking/crashing of the controller	Compromising the controller increases the risk to the data plane
2. Management	2.1. Service chain intervention	This attack can lead to two consequences: (1) A malicious application can participate in the chain and delete the control message before other applications receive the necessary information; (2) A malicious application can become trapped in an endless loop to stop the chain execution of applications.
	2.2. Internal Storage Abuse	Using the internal memory of the controller
	2.3. Control Message Manipulation	Manipulation of control messages
	2.4. Northbound API Abuse	An SDN application can manipulate the behavior of other applications using a poorly designed Northbound API
	2.5. System Variable Manipulation	Manipulation of system variables
	2.6. Network Topology Poisoning	Changing the network topology
	2.7. DoS attacks	No significant authentication
	2.8. Unauthorized access to the controller	There are no valid user access rights
	2.9. Scalability and availability	Increasing the size and scale of the network creates problems
3. Applications	3.1. Lack of authentication/authorization	Applications do not use any means of authentication
	3.2. Inserting fraudulent flow rules	Connected malicious applications can insert false rules into flow tables
	3.3. Lack of access control	Difficult to implement access control
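Row 1.1 of Table 5 attributes flooding attacks to the limited number of rules a switch flow table can hold. The toy model below sketches that effect under stated assumptions: the `FlowTable` class, its capacity, the string match keys, and the oldest-first eviction policy are all hypothetical, and a real switch may instead refuse new rules when its table is full.

```python
from collections import OrderedDict


class FlowTable:
    """Toy flow table with a fixed capacity and oldest-first eviction.

    The eviction policy is an assumption of this sketch, not a description
    of any particular switch.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.rules = OrderedDict()  # match -> action, in insertion order

    def install(self, match, action):
        if match in self.rules:
            self.rules.move_to_end(match)
        elif len(self.rules) >= self.capacity:
            self.rules.popitem(last=False)  # evict the oldest rule
        self.rules[match] = action

    def lookup(self, match):
        return self.rules.get(match)  # None models a table miss


table = FlowTable(capacity=4)
table.install("10.0.0.1->10.0.0.2", "forward:port2")  # legitimate flow

# Flood: every spoofed source forces the installation of a new rule.
for i in range(4):
    table.install(f"spoofed-{i}->10.0.0.2", "forward:port2")

# The legitimate rule has been pushed out, so its next packet is a table miss.
print(table.lookup("10.0.0.1->10.0.0.2"))
```

In this model each table miss would be forwarded to the controller, which is why data plane flooding also loads the control plane.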

Table 6. Challenges in SDN architecture resilience.

Region	Objects	Problems	Existing Solutions	Disadvantages
External level	Services. Switches.	Flow table memory limit. Vulnerability to synchronous attacks.	Intermediate safety devices	Fails to integrate into a virtualized environment
			Software-defined security	Cost
			Machine learning classification methods	Poor performance against massive attacks
Inner level	Controller. NBI/SBI. SDN applications.	Single point of failure (controller compromise). Network manipulation (controller interception). Lack of authorization and authentication. Lack of encryption. Performance degradation. Susceptible to spoofing attacks.	Encrypted channel	Does not support all SDN controllers and switches. Does not encrypt all transmission data.
			Access Control List (ACL)	Hard to manage and use

Table 7. Scenario of a CA of the “Substitution of network topology” type.

No.	Description of the CA stage	Stage Symbol
1	Checking the connection channel to the attacked network	w(s)
2	Exchange with the network controller via the OpenFlow control protocol; passing off the attacker’s device as a legitimate network device	m(s)
3	Sending network statistics data to the controller via the control protocol; checking the response of the controller	l(s)
4	Creating a network topology by sending false network statistics	z(s)
5	Network management by tricking the network controller with false OpenFlow protocol messages	d(s)

Table 8. List of used software products.

No.	Device (Software Product) Name	Note
Switches and routers
1	JuniperSRX-240 (QEMU)	Network device acting as a border router
2	Dionis-NX (QEMU)	Network device acting as a firewall
3	Cisco3845 (QEMU)	Simulating the public communication system operation
4	OpenvSwitch (Linux Ubuntu)	Software SDN Router
Tools for simulating information exchange
5	Linux Ubuntu	Operating system
6	Lifesize	Video conferencing
7	Proteus-SP	IP telephone exchange
8	SIP-T22R	IP telephone
9	Runos 2.0	SDN controller
Modeling environment and auxiliary tools
10	EVE-NG	Data network simulation environment
11	NFsen	A tool for collecting information from network devices about information flows
12	Zabbix 3.4	Monitoring tool
13	Wireshark	Means of intercepting traffic in the data transmission network
14	VMware	Virtualization environment for running guest operating systems

Table 9. Stages of “Substitution of network topology” CA.

No.	Description of the CA Stage	Stage Symbol
1	Checking the connection channel to the attacked network	w(s)
2	Exchange with the network controller via the OpenFlow control protocol; passing off the attacker’s device as a legitimate network device	m(s)
3	Sending network statistics data to the controller via the control protocol; checking the response of the controller	l(s)
4	Creating a network topology by sending false network statistics	z(s)
5	Network management by tricking the network controller with false OpenFlow protocol messages	q(s)
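The stage symbols in Table 9 can be read as Laplace transforms of the stage time distributions. The sketch below illustrates how a time distribution function and an average attack time follow from them in the simplest case. It assumes, for illustration only, that the five stages run strictly one after another with no retries or branches, and that each stage time is exponential, so that a stage with rate r has transform r / (r + s). The article's actual stochastic network may contain loops and branch probabilities, which this sketch does not model. The rate values are hypothetical and are not measurements from the simulation bench.

```python
import math


def mean_attack_time(rates):
    """Mean of a sum of independent exponential stage times."""
    return sum(1.0 / r for r in rates)


def attack_time_cdf(t, rates):
    """Probability that all stages finish by time t (rates must be distinct).

    Closed-form inverse Laplace transform of prod(r / (r + s)) / s,
    obtained by partial fractions over the simple poles s = -r.
    """
    total = 0.0
    for i, ri in enumerate(rates):
        coeff = 1.0
        for j, rj in enumerate(rates):
            if j != i:
                coeff *= rj / (rj - ri)
        total += coeff * math.exp(-ri * t)
    return 1.0 - total


# Hypothetical stage rates (events per minute) for w, m, l, z, q.
rates = [0.5, 0.25, 0.2, 0.1, 0.4]

print(mean_attack_time(rates))       # sum of the mean stage times
print(attack_time_cdf(30.0, rates))  # chance the attack completes within 30 min
```

With these assumed rates the mean is the sum of the five stage means, 2 + 4 + 5 + 10 + 2.5 minutes. If two stage rates coincide, the partial-fraction formula no longer applies and the repeated pole has to be handled separately.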

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kotenko, I.; Saenko, I.; Privalov, A.; Lauta, O. Ensuring SDN Resilience under the Influence of Cyber Attacks: Combining Methods of Topological Transformation of Stochastic Networks, Markov Processes, and Neural Networks. Big Data Cogn. Comput. 2023, 7, 66. https://doi.org/10.3390/bdcc7020066

