Artificial Intelligence Control Logic in Next-Generation Programmable Networks

Abstract: The new generation of programmable networks allows mechanisms to be deployed for the efficient control of dynamic bandwidth allocation and ensures Quality of Service (QoS), in terms of Key Performance Indicators (KPIs), for delay- or loss-sensitive Internet of Things (IoT) services. To achieve flexible, dynamic and automated network resource management in Software-Defined Networking (SDN), Artificial Intelligence (AI) algorithms can provide an effective solution. In this paper, we propose a solution for network resource allocation in which an AI algorithm is responsible for controlling intent-based routing in SDN. The paper focuses on the problem of the optimal switching of intents between two designated paths using a Deep-Q-Learning approach based on an artificial neural network. The proposed algorithm is the main novelty of this paper. The developed Networked Application Emulation System (NAPES) allows the AI solution to be tested with different traffic patterns to evaluate its performance. The AI algorithm was trained to maximize the total throughput in the network and effective network utilization. The presented results confirm the validity of the applied AI approach to the problem of improving network performance in next-generation networks, as well as the usefulness of the NAPES traffic generator for the efficient, economical and technical evaluation of IoT networking systems.


Introduction
This paper presents a solution developed in the FlexNet (www.celticnext.eu/projectflexnet) project related to the management of network resources using Software-Defined Networking (SDN) technology. SDN is a new networking solution in which a central server, called a controller, oversees all processes and controls network behaviour, ensuring the best possible network quality. This new paradigm of network design has benefits over the traditional method. It is much easier to adapt to new network policies, as adding or modifying controller-based network-level rules in software is simpler than manually applying a limited set of commands to individual devices. The SDN model allows control functions found in network devices to be moved centrally to the SDN controller level, allowing network devices to communicate with each other in an efficient manner.
The presented approach is in accordance with the FlexNet project objective of building up a new paradigm of flexible network communications to foster IoT value creation. The Flexible IoT Network provides IoT value creators with the ability to consume network communications on demand, in real time, and automatically, to fulfil their specific needs.
This new network paradigm is fully aligned with the efforts currently ongoing in the implementation of 5G technology, providing high-quality and consistent connectivity for people and objects, creating the perception of infinite capacity. The FlexNet project proposes flexible resource management using SDN with the support of an AI solution for different IoT use cases.
Usually, a network is managed manually using commands, scripts or special tools, without automation for the efficient allocation of network resources. For several years, new network solutions with improved management capabilities have been emerging, of which the most advanced is SDN [1][2][3]. In [4], the basic mechanisms and techniques of SDN for the dynamic and scalable network control envisioned for 5G technology are defined [5]. One of the main elements is the SDN controller, which allows the adaptive dynamic provisioning of resources by applying management rules to the traffic flow in the network [6]. On the other hand, the traffic generated in the network is also becoming more complex, especially in IoT applications [7][8][9][10][11], where the management of large data volumes requires more flexibility and scalability [12,13].
This flexibility and efficiency approach is often supported by artificial intelligence (AI) algorithms. In our proposed approach, the AI algorithm supports the intent routing control in SDN and is responsible for the resource allocation mechanism and network parameters such as throughput and network losses. This approach is also used in our ongoing FlexNet project [14][15][16]. The FlexNet project covers different use cases related to IoT technology, where different Quality of Service (QoS) requirements have to be met.
In addition, an IoT traffic generator called Networked Application Emulation System (NAPES) was developed and used for solution validation purposes. New applications and end-user devices generate different network traffic patterns; therefore, for validating new network solutions in complex application scenarios, the aforementioned IoT traffic generator is a very useful, cost-effective and fast-to-implement tool.

Motivation
The evolution of next-generation telecommunication networks towards programmable, flexible resource control solutions is driven by growing application requirements in terms of service quality and networking resources [17]. This is especially challenging in the IoT domain, where multiple vertical use cases with different types of requirements need to be implemented effectively. In the FlexNet project, we focused on different types of IoT use cases, such as: • Emergency and public safety, with low latency requests; • Video surveillance applications, where high bandwidth is required on demand.
In these situations, the SDN approach allows for the flexible, on-demand creation of new services using a network controller handling virtualized resources across the network. The automated management of network resources in relation, on one side, to application demands and, on the other side, to their optimal, effective utilization is a complex task for which AI mechanisms are promising solutions.
AI has seen a surge of interest in the networking community [18]. Recent contributions include data-driven flow control for wide-area networks, job scheduling, and network congestion control [19]. A particularly promising domain is network management. Researchers have used Machine Learning (ML) to automate the management of network resources in relation to routing or network traffic optimization [20][21][22][23][24]. In IoT networks, we expect network conditions to vary over time and space. Time-varying conditions may be long term (seasonal) or short term, resulting in a significant impact on network performance [25]. ML techniques will be developed in order to detect such changes and signal them to the SDN layer so that timely action can be taken to improve the overall network performance. Another solution is the Knowledge-Defined Network (KDN), which is also a next step on the path towards an implementation of a self-driving network [26]. KDN is a complementary solution to SDN that brings reasoning processes and ML techniques into the network control plane to enable autonomous and fast operation and the minimization of operational costs.

Article Organisation
The article is organized as follows: in Section 2, we briefly describe the problem and the application of AI in this contribution. Section 3 presents a description of the IoT traffic generator called NAPES that we developed and used in this work. In Section 4, the results are presented. The contributions of this work are summarized and future work directions are discussed in Section 5.

Problem Description
The work has been done in the scope of the FlexNet project, where the Open Network Operating System (ONOS) controller has been used to control the SDN. In ONOS, the concept of intents is used. An intent expresses a willingness to send a specific amount of data through a network [27].
In the considered architecture, for each registered intent, a pair of paths is computed. The paths are computed immediately after an intent is registered, and the process can take up to a couple of seconds. Then, when a network is in operational state, the AI module described in this paper selects one path from these precomputed paths for each intent. The selection itself has to be fast and prone to the unexpected behaviour of network links and intents. In other words, we face the problem of optimal switching of intents between two previously computed paths. The switching is performed based on an output of an artificial neural network that has been trained to recognize situations and states of the network requiring switching from one path to another.
Consider the following sets:
• E: edges;
• I: intents;
• P_i: paths for intent i ∈ I;
• E_p: edges used on path p ∈ P_i.
Each intent has its volume defined for each moment in time in the considered time horizon <0; T>. However, the volume is not known in advance. Intents have to be assigned to available paths in a way that minimizes the congestion resulting from the limited capacity of the edges. The volume and the capacities are expressed using the following constants:
• v_it: volume of intent i ∈ I at time t ∈ <0; T>;
• c_e: capacity of edge e ∈ E.
The assignment and the congestion are expressed using the following variables:
• x_it ∈ P_i: the path intent i ∈ I uses at time t ∈ <0; T>;
• y_it: the volume intent i ∈ I is actually sending at time t ∈ <0; T> due to congestion.
We express congestion as y_it = f(x, v), where f is a non-trivial function that depends on the network, its current state, and the utilized congestion avoidance algorithm. The actual optimization problem we are solving is to maximize the total volume actually sent, max Σ_{i ∈ I} Σ_{t ∈ <0;T>} y_it, subject to x_it ∈ P_i for all i and t. We notice that even this formally defined problem, with an unrealistically constant and predictable behaviour of intents and links, is NP-hard, because the satisfiability problem (SAT) [28] can be reduced to it. Therefore, we decided, first, to decompose the problem to alleviate the computational complexity and, second, to use artificial neural networks to cope with the unexpected behaviour of intents and links. The decomposition consists of considering each intent independently. The considered artificial neural network knows neither the exact topology of the SDN nor the volumes and paths of all intents. Instead, it is fed with a view of the SDN from the point of view of a single intent, consisting of detailed information about the considered intent and aggregated information about other intents in its vicinity.
We developed a system whose conceptual model is presented in Figure 1. First, it collects information about intents registered in ONOS using the Intent Monitor and Reroute (IMR) service and the standard ONOS API. Then, it collects information about traffic in the network using ifstat, an open-source interface statistics collection tool that is available on almost all Linux machines. In the next step, it combines this knowledge and creates observations for a randomly selected group of registered intents, each considered independently. The observations computed for each selected intent enter a neural network that returns two values associated with the expected values of the q-function for the two possible actions: doing nothing or switching to the other path. If the latter value is greater than the former, the switch is performed. In this way, in each step, only a part of the registered intents is considered, and usually only a small fraction of this part is issued a command to change its current path.
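The per-step control loop described above can be sketched as follows. This is an illustrative outline, not the actual FlexNet code: `q_values()` stands in for the trained DQN, `observe()` for the ONOS/ifstat observation pipeline, and the decision rule is exactly the one in the text (switch if the q-value for switching exceeds the q-value for staying).

```python
import random

ACTION_STAY, ACTION_SWITCH = 0, 1

def q_values(observation):
    # Placeholder for the trained network: returns expected q-values
    # for (stay, switch). This toy version prefers switching when the
    # inactive path's efficiency ratio exceeds the active path's.
    eff_active, eff_inactive, occ_active, occ_inactive = observation
    return (eff_active, eff_inactive)

def control_step(intents, observe, sample_size=4):
    """One control step: consider a random subset of registered intents
    and decide, per intent, whether to switch to the alternate path."""
    decisions = {}
    for intent in random.sample(intents, min(sample_size, len(intents))):
        q_stay, q_switch = q_values(observe(intent))
        decisions[intent] = ACTION_SWITCH if q_switch > q_stay else ACTION_STAY
    return decisions
```

In each step only the sampled intents are evaluated, mirroring the paper's design of acting on a small fraction of intents at a time.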
The artificial neural network used in this research is a Deep-Q-Network (DQN) [29] consisting of four inputs (observations), two outputs, and four hidden layers of ten, five, five, and three neurons. We used a DQN because it gave the best results in our internal studies. The learning process was implemented using the Ray framework [30], and its goal was to maximize the total throughput in the network. The neural network takes the state of the network from the viewpoint of a single intent as its input and produces as its output the expected q-function value for two cases: (a) when nothing happens and (b) when the path is switched.
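A minimal sketch of the forward pass with the stated layout (4 inputs, hidden layers of 10, 5, 5, and 3 neurons, 2 outputs) is shown below. The random weights and the ReLU activation are stand-ins, since the paper does not specify activations or the trained parameters; the real network was trained with Ray/RLlib.

```python
import numpy as np

# Layer sizes from the paper: 4 inputs -> 10 -> 5 -> 5 -> 3 -> 2 outputs.
rng = np.random.default_rng(0)
sizes = [4, 10, 5, 5, 3, 2]
layers = [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def q_forward(observation):
    """Return the two q-values (stay, switch) for a 4-element observation."""
    h = np.asarray(observation, dtype=float)
    for i, (w, b) in enumerate(layers):
        h = h @ w + b
        if i < len(layers) - 1:   # assumed ReLU on hidden layers, linear output
            h = np.maximum(h, 0.0)
    return h
```

The action is then chosen by comparing the two outputs, as described in the system overview above.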
The observations provided to the artificial neural network depend on a considered intent and are computed using data collected from a network and from ONOS. The observations are as follows: • Ratio of efficiency on active path; • Ratio of potential efficiency on inactive path; • Edge occupancy percentage on active path; • Edge occupancy percentage on inactive path.
The first observation is the ratio of the obtained momentary throughput for the intent to the maximum throughput declared for this intent. The maximum declared throughput is provided by the owner of the intent when creating the intent through the ONOS interface. The second observation is the ratio of the minimum available capacity over all edges of the inactive path to the intent's declared maximum throughput. The third observation is the minimum, over all edges of the active path, of the ratio of the obtained momentary throughput for the intent to the total traffic on the edge. Similarly, the fourth observation is the minimum, over all edges of the inactive path, of the ratio of the obtained momentary throughput for the intent to the total potential traffic on the edge (the current traffic increased by the intent's current momentary throughput). All these observations are computed independently for each intent using data collected from ONOS and ifstat.
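The four observation definitions above translate directly into code. The sketch below follows the textual definitions; all input values (the intent's current throughput, declared maximum, per-edge free capacity and traffic) are illustrative placeholders for data that would come from ONOS and ifstat.

```python
def observations(current_tp, declared_max_tp,
                 inactive_edge_free_caps,
                 active_edge_traffic, inactive_edge_traffic):
    """Compute the four per-intent observations fed to the DQN."""
    # 1. Ratio of efficiency on the active path.
    obs1 = current_tp / declared_max_tp
    # 2. Ratio of potential efficiency on the inactive path:
    #    minimum free capacity over its edges vs. declared maximum.
    obs2 = min(inactive_edge_free_caps) / declared_max_tp
    # 3. Minimum share of the intent in total edge traffic, active path.
    obs3 = min(current_tp / t for t in active_edge_traffic)
    # 4. Same for the inactive path, with the intent's traffic added on top.
    obs4 = min(current_tp / (t + current_tp) for t in inactive_edge_traffic)
    return (obs1, obs2, obs3, obs4)
```

For example, an intent sending 10 Mbps of a declared 20 Mbps, over an active path whose edges carry 40 and 20 Mbps in total, yields an efficiency ratio of 0.5 and an active-path occupancy observation of 0.25.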
The artificial neural network that is fed with the above observations was trained using a heavily modified Iroko framework [31], which is based on OpenAI Gym [32] and was originally implemented to optimize traffic in data centers. Episodes were run with the Mininet framework using a modified version of goben [33] to generate various traffic patterns. The neural network was trained to maximize the total throughput in the network. The modifications to the Iroko framework are as follows: • Procedures to collect traffic from a network using ifstat; • Procedures to compute the previously described observations from the traffic; • An objective function fed to Ray that now expresses the total throughput in the network; • Dynamically modified network topologies that can change between episodes during training; • Dynamically selected currently considered intents that also change between episodes; • Support for dynamic traffic that can vary between iterations.
The network was trained in a series of episodes. Each episode was defined by an SDN network topology and a set of active traffic demands. Each demand was described by its source, destination, and an active time period. Note that the proposed neural network was trained using various topologies and traffic patterns; thus, it is topology independent. Because of the limitations of Mininet and the utilized machines, the considered topologies consisted of ten switches at most. In the training process, we used three different topologies with eight different traffic patterns for each topology. The episodes were repeated 36 times, totalling 864 episodes in the training process. Each episode lasted for 100 s in real time, resulting in 100 training steps. In each step, an active demand was selected at random and only this demand was considered in that particular step. The observations were computed for the selected demand and, depending on the output of the current neural network, an action was taken.
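The episode schedule above (3 topologies × 8 traffic patterns, repeated 36 times, 100 steps each) can be outlined as follows. The topology names and the `run_step` placeholder are illustrative; only the counts come from the paper.

```python
import random

TOPOLOGIES = ["topo-A", "topo-B", "topo-C"]   # assumed names
PATTERNS_PER_TOPOLOGY = 8
REPEATS = 36
STEPS_PER_EPISODE = 100

def episode_schedule():
    """All (topology, pattern) episodes in training order: 24 x 36 = 864."""
    base = [(t, p) for t in TOPOLOGIES for p in range(PATTERNS_PER_TOPOLOGY)]
    return base * REPEATS

def run_episode(demands):
    """Placeholder training loop: one random active demand per step."""
    for _ in range(STEPS_PER_EPISODE):
        demand = random.choice(demands)   # only this demand is considered
        # observe(demand); act(demand); store transition; train DQN ...
```

The schedule length works out to 864 episodes, matching the total stated in the text.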
We set the learning rate of RLlib's Adam optimizer [34] to 5 × 10^-4. The replay buffer size was set to 5000, which is approximately twice the number of steps in one loop over all utilized topologies and traffic patterns. The exploration rate decreased linearly from 1.0 to 0.02 over the first 5000 steps and then remained constant. Finally, the target network was updated every 200 steps, and the training batch size was set to 8.
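The hyperparameters and the exploration schedule described above can be written out explicitly. The dictionary keys are for reference only (they loosely follow RLlib's DQN option names but are not guaranteed to match any particular RLlib version); the `epsilon` function implements the stated linear decay.

```python
# Training hyperparameters as stated in the text.
CONFIG = {
    "lr": 5e-4,                        # Adam learning rate
    "buffer_size": 5000,               # replay buffer
    "target_network_update_freq": 200, # target net update interval (steps)
    "train_batch_size": 8,
}

def epsilon(step, start=1.0, end=0.02, horizon=5000):
    """Exploration rate: linear decay from `start` to `end` over
    the first `horizon` steps, constant afterwards."""
    if step >= horizon:
        return end
    return start + (end - start) * step / horizon
```

For instance, halfway through the decay (step 2500) the exploration rate is 0.51.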

Generating Network Traffic
To test our AI solution, we used the Networked Application Emulation System (NAPES), a traffic generator we developed within the FlexNet project. The motivation behind the development of NAPES is to enable the rapid deployment of setups that exhibit complex, time-varying traffic patterns which closely resemble those of actual applications. The main innovative idea behind NAPES is that it allows distributed, traffic-generating "applications" to be defined with elements of application logic. Specifically, a NAPES application consists of communicating application components, each of which can have several state machines representing the component's logic. State machines are meant to approximate the logic of an actual application. The interacting state machines of an application's components jointly give rise to complex traffic patterns. A NAPES developer specifies the state machines (i.e., states and state transitions) for each of an application's components.
State transitions occur in response to events, which may originate locally (timer events) or may arrive from other components of the application. Thus, the events serve the purpose of intra- and inter-component coordination.
Besides lightweight event-based coordination, components communicate via flows, which form the traffic stress-testing the network under investigation. A component may generate a flow addressed to another component. A flow is a sequence of packets described with some parameters, e.g., ones that describe a distribution of inter-packet times and a distribution of packet sizes. Flows are sent and received by means of components' ports. A client port generates the packets of a flow, while a server port receives the packets.
The generation of flows is governed by the states of the state machines inside the components comprising a NAPES application. Specifically, for each client port, a NAPES user specifies a flow to be generated in each state of the state machine assigned to control the client port (in some states there may be no flow at all). Thus, when an event (a local one or one received from another component) causes a state transition, the flows generated by the component's client ports controlled by the state machine in question may change. As a result of state changes of the components' state machines, a NAPES application may generate traffic of different patterns. Overall, the above mechanisms allow a NAPES application to approximate some actual IoT application, whose components are also likely "to be in different states" reflecting the state of the environment.
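The state-machine-driven flow generation described above can be sketched as a small class. This is a hypothetical illustration of the concept, not the NAPES Runtime itself: event names, the flow specification format, and the transition table are all invented for the example, and the real system exchanges events over MQTT and emits actual packet flows.

```python
class Component:
    """Toy NAPES-style component: the current state of its state machine
    selects the flow (if any) generated on a client port."""

    def __init__(self, transitions, flows, initial):
        self.transitions = transitions  # (state, event) -> next state
        self.flows = flows              # state -> flow spec or None
        self.state = initial

    def on_event(self, event):
        # Unknown events leave the state unchanged.
        self.state = self.transitions.get((self.state, event), self.state)
        return self.flows.get(self.state)  # flow to generate now, if any

# A sensor-like component: a stimulus switches it from standby (no
# traffic) to alert (a 20 Mbps UDP flow), and a timeout switches it back.
sensor = Component(
    transitions={("standby", "stimulus"): "alert",
                 ("alert", "timeout"): "standby"},
    flows={"alert": {"proto": "udp", "rate_mbps": 20}, "standby": None},
    initial="standby",
)
```

A real component would attach such a machine to each client port and translate the returned flow specification into generated packets.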
In terms of implementation, a component is specified in a data structure to be interpreted by the so-called NAPES Runtime. A NAPES Runtime should be installed on a computing device to make it a NAPES node. We intend to implement a NAPES Runtime on different platforms, from resource-constrained IoT nodes to cloud servers. We currently have a Linux implementation; runtimes for Android and Arduino are being developed. As the accepted component data structure is the same, no matter what the underlying platform, NAPES components are portable and can be run on different nodes. Inter-component events are exchanged as short Message Queuing Telemetry Transport (MQTT) messages.
A general approach to stress-test a network with NAPES is to (a) develop a multicomponent NAPES application with an application logic close to that of an actual application, (b) connect to the network under test a number of NAPES nodes (computing devices with NAPES Runtimes), (c) distribute the data structure representations of the application's components to the NAPES nodes, (d) run the application and collect logs. Notably, NAPES applications are agnostic as to an underlying stress-tested network. There is no application-to-network signaling, which calls for some network intelligence and adaptivity.

Experiments and Results
The AI algorithm was tested with the network topology shown in Figure 2. The test network consists of thirteen NAPES hosts on which a simple client-server application is deployed. All network links have a capacity of 50 Mbps, while the access links (between hosts and network switches) have a capacity of 1 Gbps. The hosts at the top of Figure 2 represent sensor nodes (e.g., equipped with cameras) and the host at the bottom represents a cloud server. The network topology was set up in such a way that it becomes easily congested around the cloud server node when shortest path routing is used, while the network still has enough capacity to handle the traffic generated by the active sensors. The application emulates the case when a space (e.g., the streets of a smart city) is instrumented with multiple sensor nodes. A sensor node may be in the standby or alert mode. When in the standby mode, no traffic is generated; when in the alert mode, a node generates a simple User Datagram Protocol (UDP) flow with a specific bit rate of 20 Mbps. Note that the bit rates have been chosen to allow convenient experimentation with the AI algorithm; they need not be realistic from the application domain point of view. Imagine that some stimulus (e.g., noise from a street sweeper vehicle at night time) triggers a change from the usual standby mode to the alert mode. As the sweeper moves down the streets, different sensor nodes are exposed to the noise and enter the alert mode (and start generating traffic).
We assumed that two sensor nodes enter the alert mode at a time (imagine two cameras deployed at each intersection). Moreover, the pairs of sensors are alerted according to a regular pattern. Specifically, the NAPES components running on the nodes Host-01 and Host-02 start generating traffic at time 0 s (one flow per node), the components on the nodes Host-03 and Host-04 start at time 20 s, the components on the nodes Host-05 and Host-06 start at time 40 s, etc. Each component remains in the alert mode for 40 s and afterwards returns to the standby mode (with no traffic generated), which lasts for 100 s. After that, the component enters the alert mode again.
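The alert schedule above reduces to a simple periodic formula: pair k (hosts Host-(2k+1) and Host-(2k+2), 0-based k) first alerts at 20·k seconds, stays alert for 40 s, then stands by for 100 s, repeating with a 140 s period. A sketch of this schedule:

```python
ALERT_S = 40      # duration of the alert mode
STANDBY_S = 100   # duration of the standby mode
CYCLE_S = ALERT_S + STANDBY_S   # full period: 140 s

def is_alert(pair_index, t):
    """True if sensor pair `pair_index` (0-based) is in the alert mode
    at time t (seconds). Pair k first alerts at 20*k s."""
    offset = t - 20 * pair_index
    return offset >= 0 and offset % CYCLE_S < ALERT_S
```

For example, in the 20-40 s window discussed below, pairs 0 and 1 (Host-01 through Host-04) are alerted simultaneously, offering 4 × 20 = 80 Mbps of traffic.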
To conduct the experiments with the AI algorithm, we emulated the above test network using the Mininet tool [35]. The network nodes were represented by Open vSwitch (OVS) instances. The test flows were generated using the NAPES traffic generator. In all cases, the test instances were run on a hardware platform running Ubuntu 20.04.
In the following experiments, we compared two network parameters, i.e., the total throughput obtained by the node representing the cloud server and the temporary network loss rate (averaged over all flows), to show the advantage of the proposed AI algorithm. Figure 3 shows the total throughput without AI and Figure 4 shows the total throughput with the AI algorithm. It is noticeable that the AI algorithm allows for better network resource utilization by rerouting traffic to alternative paths and, as a result, increases the overall network throughput. The hosts Host-01 to Host-12 are arranged in pairs that periodically switch between active and inactive states. When four hosts become active at the same time, the network becomes congested if all flows are routed over the same link (this may happen with shortest path routing). With the AI algorithm, the flows can be routed over link-disjoint paths; thus, network congestion can be avoided. This can be observed, for example, in the time period between 20 and 40 s. Without AI, the server throughput is only 50 Mbps, as all active flows are routed over a single link with a capacity of 50 Mbps. When the AI algorithm is enabled, two flows can be rerouted to alternative disjoint paths and the throughput increases to 80 Mbps (which is equal to the offered traffic).
The advantage of using AI is even clearer for the loss rate parameter. With the AI algorithm, network losses are significantly lower than without AI (compare Figures 5 and 6). However, traffic losses cannot be completely avoided, as the AI algorithm needs some time to "observe" the traffic prior to the path switch-over decision.
It can be seen in Figure 5 that the network suffers from six congestion periods due to the naive shortest path routing (the same can also be seen in Figure 3). This happens when the four hosts nearest to the same aggregation switch (sw01, sw02, or sw03) are all operating. The situation occurs twice for each aggregation switch during the 280 s experiment.
The first congestion period (from 20 to 40 s) happens when the hosts Host-01, Host-02, Host-03, and Host-04 are operating and all the traffic is routed on the shortest path. In such a case, the link sw00-sw01 becomes the bottleneck, because all the traffic is using it. However, in the presented AI framework, each intent can choose between two paths that are as link-disjoint as possible. It is clearly seen in Figure 2 that, for each host (Host-01 to Host-12) in the considered network, there exists a pair of paths to the server (Host-00) that share only two links. These links are directly connected either to the host or to the server. Therefore, there is always a combination of possible paths, which can be assigned to intents, that prevents the described congestion situations on the links connecting the server with the aggregation switches. As displayed in Figure 4, the presented AI framework usually finds this perfect assignment and allows for considerable loss reductions.
The results of the above experiments, obtained for five different test cases, are presented in Tables 1 and 2. Table 1 shows the per-flow average throughput obtained for experiments performed with and without the AI algorithm. Table 2 shows similar results for the average flow loss rate. We can see that the traffic losses (and the resulting throughput reductions) are not evenly distributed between flows. Without the AI algorithm, the per-flow loss rate can be as high as 40% (implying the same level of throughput decrease). The AI algorithm reduces the average per-flow loss rate by more than a factor of three, to an average level of 6% (from an average of 18% for the case with shortest path routing).

Conclusions
In this paper, the management of SDN network resources with AI support for IoT applications was presented. In particular, the studied problem focuses on the optimal switching of intents between two available paths in an SDN network using an AI algorithm. In the developed architecture, the switching is performed based on a Deep-Q-Network AI mechanism whose main purpose is to maximize the total throughput in the SDN network with a minimal average total data loss rate for the served IoT applications. The presented approach is an intelligent SDN management system with AI support, especially applicable in programmable next-generation networks (5G and beyond).
Moreover, network resource allocation in the context of QoS assurance is a growing problem, especially for IoT services. The designed solution is suitable for fast and optimal resource allocation, especially in cases of emergency and delay-sensitive IoT services. To test the proposed AI solution, we used an IoT traffic generator called NAPES, developed within the FlexNet project. We compared the performance and quality parameters, i.e., the total throughput in the network and the average data loss rate, to evaluate the usefulness of AI in the designed solution. The presented results confirm the effectiveness of the proposed AI approach, which significantly improves the overall Quality of Service and network performance. In particular, it maximizes the total network throughput as well as significantly reduces the average total data loss rate. The obtained results confirm the validity of applying AI to the presented problem.
Further research will focus on more comprehensive experiments that include real scenarios from IoT use cases with more extensive network topologies, different QoS parameters (losses and latency), and real traffic scenarios from loss- and delay-sensitive IoT vertical services.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: