Handling Efficient VNF Placement with Graph-Based Reinforcement Learning for SFC Fault Tolerance

Ros, Seyha; Tam, Prohim; Song, Inseok; Kang, Seungwoo; Kim, Seokhoon

doi:10.3390/electronics13132552


by Seyha Ros ¹, Prohim Tam ¹, Inseok Song ¹, Seungwoo Kang ¹ and Seokhoon Kim ¹,²,*

¹ Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea

² Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Republic of Korea

* Author to whom correspondence should be addressed.

Electronics 2024, 13(13), 2552; https://doi.org/10.3390/electronics13132552

Submission received: 20 May 2024 / Revised: 14 June 2024 / Accepted: 21 June 2024 / Published: 28 June 2024

(This article belongs to the Special Issue Recent Advances of Cloud, Edge, and Parallel Computing)


Abstract

Network functions virtualization (NFV) has become the platform for decomposing services into sequences of virtual network functions (VNFs), which can be grouped as a forwarding graph of service function chaining (SFC) to serve multi-service slice requirements. NFV-enabled SFC faces several challenges in meeting reliability and efficiency key performance indicators (KPIs) in management and orchestration (MANO) decision-making control. SFC fault tolerance is one of the most critical challenges in provisioning service requests, and it depends on resource availability. In this article, we propose graph neural network (GNN)-based deep reinforcement learning (DRL) to enhance SFC fault tolerance (GRL-SFT), which targets chain graph representation, long-term approximation, and self-organizing service orchestration for future massive Internet of Everything applications. We formulate the problem as a Markov decision process (MDP). DRL seeks to maximize the cumulative reward by maximizing the service request acceptance ratio and minimizing the average completion delay. The proposed model solves the VNF management problem in a short time and configures the node allocation reliably for real-time restoration. Our simulation results demonstrate the effectiveness of the proposed scheme and indicate better performance in terms of total rewards, delays, acceptances, failures, and restoration ratios in different network topologies compared to reference schemes.

Keywords:

deep reinforcement learning; fault tolerance; graph neural network; network functions virtualization; service function chaining

1. Introduction

Fifth generation (5G) and beyond 5G (B5G) networks rely on some of the most complex techniques of the current generation to provide a new architecture for handling massive numbers of intelligent IoT devices [1,2,3]. Moreover, network functions virtualization (NFV) [4] provides the flexibility and scalability needed to deploy next-generation homogeneous network resources and network applications in a shared infrastructure. In addition, NFV instantiates network services on general-purpose virtualized resources such as storage and servers and leverages interconnected virtual network functions (VNFs) [5]. NFV technologies are used in particular to implement network functions (NFs) (e.g., load balancing, firewall, deep packet inspection) [6,7,8,9]. While deploying VNFs [10] improves resource capacity, managing resource consumption still requires software-defined networking (SDN) [11], which offers a programmable, global view of flow rules and network traffic management. Moreover, SDN provides network flexibility by separating the data plane and control plane via the OpenFlow protocol, which allows centralized controllers to be programmed [12,13,14]. By adopting both SDN and NFV, the controller is able to leverage and facilitate computing resource tasks [15,16]. Service handling over an ordered sequence of VNFs can be implemented via service function chaining (SFC), which defines and manages the flow of network services across multiple VNFs [17,18,19]. SFC allows network operators to define and enforce specific paths for traffic flows through various network services. By chaining these functions together, operators can create customized network service paths for different types of traffic flows based on their requirements.

On the other hand, multi-access edge computing (MEC) [20,21,22,23] provides computation and storage resources at the network edge, offering devices and users model training/inference capability. MEC aims to minimize latency and resource consumption, meet high quality of service (high-QoS) requirements [24], and improve resource utilization, and it works alongside the SDN/NFV controller to cope with the heterogeneity of generated IoT data [25]. Deep reinforcement learning (DRL) is an efficient method for routing network traffic, enabling autonomous configuration of dynamic resources in cooperation with the SDN/NFV controller. In addition, graph neural networks (GNNs) represent the network environment by vectorizing the attributes of network topologies and updating hidden states for flow/node-level prediction. Researchers have used graph convolutional network (GCN) variants [26,27] to extract the features of nodes and links of the physical network, which enables adaptive SFC placement and maximizes long-term rewards.

Despite these advantages, the migration from dedicated, high-grade hardware to virtualized software often sacrifices reliability and availability. A VNF can fail because of either software bugs or hardware faults, for example when a new service appliance is created. As the chain length of VNFs increases, the overall availability of the SFC diminishes significantly, since the SFC relies on all of its VNFs.
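The effect of chain length on availability can be illustrated with a small calculation. The sketch below assumes that VNF failures are independent and that the chain is strictly serial, so the chain is available only when every VNF is available; the 99.9% per-VNF availability is a hypothetical value chosen for illustration, not a figure from this article.

```python
from math import prod

def chain_availability(vnf_availabilities):
    """Availability of a serial SFC under the assumption of independent VNF failures.

    The chain works only if every VNF works, so the availabilities multiply.
    """
    return prod(vnf_availabilities)

if __name__ == "__main__":
    # Hypothetical per-VNF availability of 99.9%; longer chains are less available.
    for length in (1, 3, 5, 10):
        print(length, chain_availability([0.999] * length))
```

Under these assumptions, each additional VNF multiplies the chain availability by a factor below one, which is why redundancy or restoration is needed for longer chains.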

To address these issues, we propose the deployment of SFC in SDN/NFV based on GNNs with DRL, which aims to optimize the resource consumption of services, overhead-intensive computation, and dynamic service adjustments in MEC. We consider the optimization of core networks to tolerate faults during SFC execution. Moreover, a GNN-DRL-based approach needs to generalize efficiently from its training experience across different network topologies, including those with congested service requests and high failure ratios that require restoration and resource migration. The GCN variant gathers the features of nodes in a graph in which nodes represent the VNFs and edges represent relationships between these SFC cluster entities. Our main contributions are summarized as follows:

We formulated physical resources in MEC to obtain the virtual resource capability by leveraging a GCN to gather the states. The node cluster of VNF management and orchestration in user requests interacts with the GCN, which represents all the VNF deployments as nodes and applies message passing over node and link features.
We used a Markov decision process (MDP) formulation and a DRL algorithm to adaptively learn an optimal network resource adjustment policy.
We proposed GRL-SFT, which combines a GNN with DQN for SFC deployment within the NFV architecture. The GRL-SFT framework was implemented in a simulation environment to evaluate its performance in terms of cumulative reward, delay, acceptance ratio, failure, and restoration ratio when processing different chain sizes and user request rates in different topology sizes.

The rest of the paper is organized as follows. Section 2 reviews the preliminary network background of VNF placement. Section 3 formulates the problem addressed by our proposed approach in handling network failure tolerance, in terms of providing VNF resources and minimizing the delay of VNF installation and instantiation. Section 4 presents the proposed GNN-DRL-based SFC for addressing multi-service scenarios. Section 5 uses the MDP formulation and GNN-based DRL to enhance the optimal fault-tolerant SFC placement. To evaluate the scheme’s performance, Section 6 describes the network setting and conducts experiments on several vital metrics (e.g., reward, delay per SFC, SFCR acceptance ratio, SFCR failure ratio, and SFCR restoration ratio), comparing our proposed approach with reference schemes. The last section presents the conclusion and future work. Table 1 lists the acronyms used.

2. Background Studies

This section gathers several complementary studies of the enabling paradigms in our paper’s domain. The SFC task requires the consideration of network topology, VNF deployment, and user requests. Figure 1 presents how SFC defines and enforces specific dynamic paths for traffic flows through various network services, abstracting from the physical resources to virtual resource instances. VNF resources are allocated along the chain VNF1, VNF2, and VNF3, and each VNF is allocated a different resource capacity. NFV includes a resource-configuration module for network function selection that recovers failed nodes under varying requirements. Mapping and redundancy techniques are often used to ensure the SFC network’s availability; hence, when a VNF on a physical server fails, the affected network traffic is redirected to the backup VNF nodes.

2.1. VNF Placement Solution

NFV, MEC, and SDN are softwarization and virtualization paradigms that provide efficient resource utilization and enhance the resource capability of network service providers. Given the network configuration, the network controller and management entities determine how reliably and efficiently resources are offered through VNF placement. Several approaches have been explored to minimize operational costs, including deployment, energy, traffic forecasting and forwarding, and resource fragmentation costs. Aiming at traffic prediction and adaptive scaling of VNFs, reference [28] predicts the traffic congestion of SFC deployments in order to scale the VNFs accordingly. That paper proposed a k-shortest path algorithm to calculate the routing path between VNF instances and complete the dynamic deployment of SFC, which aims to accommodate fluctuations in the network traffic patterns of service demand. In reference [29], the SFC placement problem is addressed with the aim of minimizing energy and traffic costs.

Furthermore, reference [30] considers the SFC placement problem to reduce the number of physical servers and links required for service provisioning. Pei et al. [31] focused on the SFC embedding problem (SFC-EP) in distributed cloud environments. The authors formulate the problem as binary integer programming (BIP), developed using dynamically configured VNF placement. The authors leveraged two proposals, namely the SFC embedding approach (SFC-MAP) and the VNF dynamic release algorithm (VNF-DRA), which can provide an efficient resource capability placement scheme by applying the shortest path algorithm in the multi-layer graph while optimizing the VNF instance to reduce deployment time, respectively.

2.2. Graph-Based Reinforcement Learning for SFC

Several DL methods have been proposed to normalize and control the network data-centric policy by interacting with VNFs in core network paradigms. However, approaches that optimize network performance under resource constraints while minimizing the cost of resource allocation and VNF instance pooling remain insufficient. Several surveys have compared model utilization [27,32,33]. GNNs can extract node features and solve the problem by sharing correlated network information among neighboring nodes. In reference [34], a variational graph autoencoder (VGA) method was proposed to accelerate virtual network embedding. Their proposed model leverages the adjacency matrix and the resource feature matrix of physical networks to cluster the physical nodes and embeds the virtual network by choosing the servers in each cluster. Sun et al. [35] implemented a graph network (GN) to delineate the nodes and connections within a network topology. They enhanced this model using RL to devise a VNF deployment approach that minimizes deployment costs. In reference [36], energy-efficient VNF placement is proposed by utilizing a GCN to encode the graph structure representing the physical network and its nodes.

2.3. SFC Fault Tolerance Awareness

SFC fault tolerance awareness is a critical area that mainly concerns the NFVO and the deployment and adjustment of VNF instances. Chantre and da Fonseca [37] investigated the redundant stateless VNF placement problem in LTE networks. In the case of malfunctions, states generated via stateful VNFs during traffic processing need to be transferred to standby instances to ensure seamless request redirection. However, Kothandaraman et al. [38] studied the joint active and standby stateless VNF placement problem without considering VNF state transfers. Yang et al. [39] and Yuan et al. [40] jointly optimized the objectives, but several challenges remain. Their work could not ensure a reliable service and would waste abundant resources. In addition, its inability to deal with real-time requests leads to poor performance. Thus, an effective solution to schedule both active and standby instances concurrently and flexibly is urgently needed.

Previous solutions for optimizing SFC placement have faced the problem of overspending when dealing with complex network states. Another essential paradigm is to leverage machine learning mechanisms. For instance, the authors in [41] proposed DRL-based approaches to handle the large network state space in the SFC placement problem. The authors in [42] formulated the problem as a virtual network function orchestration (VNFO) problem and handled it with a DRL-based approach. However, existing efforts either neglected fault tolerance or addressed the fault tolerance problem offline. Our model solves the fault-tolerant SFC placement for the first time, which automatically learns and acts in real-time. In particular, we decide the placements of VNF resources with different levels of resource guarantees, which can discriminate customized SFC availability.

3. Problem and Formulation

3.1. Substrate Network Model

In this section, we describe the physical infrastructure resources that are mapped into several network functions hosting multiple service functions, in order to capture the requirements and monitor the network state underlying a single shared infrastructure. MEC should keep the QoS requirements of the mapping reachable regardless of MEC node/link failure or overload. We model the physical resources as a directed graph with node set $V = \{1, 2, 3, \dots, |V|\}$, where each node $v \in V$ has a finite total resource capacity (CPU, memory, and disk), denoted by $V \in \mathbb{R}^{+}$. Each physical link $e$ in $E = \{1, 2, \dots, |E|\}$ has a bandwidth resource available to the MEC instances, denoted as $bw_{v}^{t}$. Table 2 lists the notation of the network system models used in this paper.

3.2. Service Function Chaining Request Models

In NFV, services comprise several applications that are deployed as VNFs running as virtualized software (e.g., VPNs, firewalls, load balancing, and mobile networks). We model network functionality as an undirected network graph, $\bar{G} = (\bar{V}, \bar{E})$, where $\bar{V}$ and $\bar{E}$ denote the set of VNFs and the set of virtual links (VLs) interconnecting the VNF nodes, respectively. $\bar{E}$ is used to compose and chain the VNFs. Each VNF has time-varying resource requirements for CPU, memory, and disk, denoted by $c_{\bar{v}}^{f, t}$, $m_{\bar{v}}^{f, t}$, and $d_{\bar{v}}^{f, t}$, respectively.

3.3. Problem Formulation

In this section, we focus on solving the SFC placement problem of the network system model in MEC. In the definition of SFC, $F_{r}$ is a set of VNFs, and $\bar{V}$ refers to the $\bar{v}$-th VNF in $F_{r}$ when $\bar{V}$ is deployed onto the VNF servers. It needs computing resources (CPU, memory, and disk) and bandwidth, which are defined as $S_{g}$.

Decision variable: the binary variable $\emptyset_{v}^{f_{k}}$ equals 1 if ${\bar{v}}_{f} \in V$ is mapped onto the physical node and 0 otherwise. As in Equation (1), $X_{\bar{v}} = 1$ indicates the acceptance of SFC $F_{r}$, while $X_{\bar{v}} = 0$ indicates its rejection; that is, a request is accepted only when every VNF in the chain has been mapped. This can be expressed as follows:

$$X_{r, v}^{s} = \begin{cases} 1, & |\bar{V}| = \sum_{v = 1}^{M} \sum_{\bar{v} = 1}^{|\bar{V}|} \emptyset_{v, \bar{v}}^{F_{r}} \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

The deployment of VNF mapping is constrained by physical resource requirements and aims to optimize the resource capacity of the physical network nodes. The orchestration is an interrelated two-step process:

(1): VNF placement: place the VNFs in the SFC request and properly install them onto the physical node.
(2): VL selection: ensure the mapping of the links is consistent and proper between virtual and physical links for ordering the VNFs.

$$X_{r, v}^{s} = \begin{cases} 1, & \text{VNF } r \text{ in SFC } s \text{ is mapped to MEC node } v \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

$$Z_{v, e}^{E} = \begin{cases} 1, & {\bar{E}}^{r} \text{ is mapped to } E_{v, e} \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

To ensure the user’s service quality, delays in data traffic should be reduced as much as possible in SFC orchestration. These delays include two parts:

(1): The processing delay of VNF instances for hosting on the physical node.
(2): The propagation delay of data transportation over a physical link.

$$D_{\mathrm{Total}}^{\bar{E}} = \sum_{v \in V} \sum_{r \in \bar{V}} X_{r, v}^{s} D_{r}^{s} + \sum_{E_{v, e}} \sum_{\bar{e} \in \bar{E}} Z_{v, e}^{E} D_{\rho} \tag{4}$$

where $D_{r}^{s}$ denotes the processing delay of VNF $r$ in SFC request $s$, and $D_{\rho}$ denotes the propagation delay of data transmission over the physical link $E_{v, e}$.
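As an illustration of Equation (4), the sketch below sums the processing delays of the placed VNFs and the propagation delays of the physical links that carry the virtual links. The dictionary-based data layout, the function name, and the per-link propagation delays are assumptions made for this example; they are not taken from the article's implementation.

```python
def total_delay(placement, link_mapping, proc_delay, prop_delay):
    """Total delay in the spirit of Equation (4).

    placement:    dict VNF id -> physical node hosting it (pairs with X = 1)
    link_mapping: dict virtual link -> list of physical links used (pairs with Z = 1)
    proc_delay:   dict VNF id -> processing delay of that VNF
    prop_delay:   dict physical link -> propagation delay of that link (assumed per link)
    """
    processing = sum(proc_delay[vnf] for vnf in placement)
    propagation = sum(prop_delay[link]
                      for path in link_mapping.values()
                      for link in path)
    return processing + propagation

if __name__ == "__main__":
    # Hypothetical two-VNF chain placed on nodes 0 and 2, routed over links (0,1) and (1,2).
    placement = {"firewall": 0, "load_balancer": 2}
    link_mapping = {("firewall", "load_balancer"): [(0, 1), (1, 2)]}
    proc_delay = {"firewall": 2.0, "load_balancer": 1.5}
    prop_delay = {(0, 1): 0.5, (1, 2): 0.25}
    print(total_delay(placement, link_mapping, proc_delay, prop_delay))
```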

The delay of each SFC must not exceed its maximum tolerable delay:

$$\sum_{v \in V} \sum_{\bar{v} \in \bar{V}} X_{r, v}^{s} D_{r}^{s} + \sum_{E_{v, e}} \sum_{\bar{e} \in \bar{E}} Z_{v, e}^{E} D_{\rho} \leq D_{\max} \tag{5}$$

During the orchestration process, the physical resources available must align with the ongoing resource demands. Therefore, the following requirements must be satisfied:

$$\sum_{v \in V} \sum_{\bar{v} \in \bar{V}} X_{r, v}^{s} \, CP_{r}^{s} (\cdot \mid c_{\bar{v}}^{f, t}, m_{\bar{v}}^{f, t}, d_{\bar{v}}^{f, t}) < \sigma_{k}^{t} (C_{v}^{\max}, m_{v}^{\max}, d_{v}^{\max}) \tag{6}$$

$$\sum_{E_{v, e}} \sum_{\bar{e} \in \bar{E}} Z_{v, e}^{E} \, bw_{\bar{E}}^{t} \leq bw_{E}^{t} \tag{7}$$

Equation (6) states that the total computing resources consumed by the VNFs deployed on node $v$ should be less than the node's total computing resources. Equation (7) states that the total bandwidth used by the VLs mapped onto a physical link should not exceed the total bandwidth of that link.
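A feasibility check in the spirit of constraints (5) to (7) can be sketched as follows. This is a simplified illustration: the function name, the dictionary layout, and the scalar `threshold` that stands in for the utilization factor on the right-hand side of Equation (6) are all assumptions for the example.

```python
RESOURCES = ("cpu", "mem", "disk")

def is_feasible(node_demand, node_capacity, link_demand, link_capacity,
                delay, max_delay, threshold=1.0):
    """Return True if a candidate placement respects constraints (5) to (7).

    node_demand / node_capacity: dict node -> {"cpu": ..., "mem": ..., "disk": ...}
    link_demand / link_capacity: dict physical link -> bandwidth
    threshold: assumed scalar stand-in for the utilization factor in Equation (6)
    """
    # Equation (6): demand on each node must stay strictly below its (scaled) capacity.
    for node, used in node_demand.items():
        for res in RESOURCES:
            if not used[res] < threshold * node_capacity[node][res]:
                return False
    # Equation (7): bandwidth used on each physical link must not exceed its capacity.
    for link, bw in link_demand.items():
        if bw > link_capacity[link]:
            return False
    # Equation (5): the end-to-end delay must not exceed the maximum tolerable delay.
    return delay <= max_delay
```

A placement that fails any of the three checks would be rejected, which corresponds to the rejection case of the acceptance indicator.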

Service requests compete for resources when resources are limited. We aim to optimize the delay of each SFC request (SFCR). Therefore, we minimize the average completion delay over the physical resources with the following optimization objective:

$$\min_{\{X_{r, v}^{s}, Z_{v, e}^{E}\}} \sum_{E_{v, e}} \sum_{\bar{e} \in \bar{E}} \left( D_{\mathrm{Total}}^{\bar{E}} \right) \tag{8}$$

s.t. (5), (6), (7)

3.4. Markov Decision Process Models

In this model, we formulate resource utilization and the placement problem as an MDP. In this phase, the DRL approach can be a value-based approach (DQN). We formally present the MDP model, which is typically defined as a tuple $\langle S, A, P, R, \gamma \rangle$, where $S$ and $A$ are the sets of continuous states and discrete actions, respectively; $P : S \times A \times S$ is the transition probability distribution; $R : S \times A$ is the reward function; and $\gamma \in [0, 1]$ is the discount factor for future rewards.

The MDP state transition is defined as $(s_{t}, a_{t}, r_{t}, s_{t + 1})$, where $s_{t}$ is the current network state of resource utilization in the MEC server, $a_{t}$ is the action taken to configure the resources of the SFC embedding, $r_{t}$ is the reward, computed from evaluation metrics that follow the minimization objective, and $s_{t + 1}$ is the new network state at the next iteration.

4. Proposed Network Scenario

4.1. Configuring SFC Execution Based on GNN-DRL Approach

In this section, we aim to provide the SFC techniques with opportunities to instantiate orchestrated and managed service priorities for achieving SFC fault tolerance with the feasibility of mapping the real-time communication aspects. Hence, the VNF module can be instantiated on the specific servers when the SFC request arrives in the network. SFC becomes a significant resource responsibility of network mapping in terms of sequential VNF placement in the formation of management and orchestration (MANO) for mobile networks and edge devices. When SFC is enabled in SDN-NFV, an edge computing infrastructure context, it allows for the composing of services and the handling of the effortless customization service to guarantee the policy aspect.

Figure 2 delves into the resource complexity of constraints, particularly in terms of compromised resource utilization. This is a crucial area where the proposed use of the GNN-based SFC comes into play. The GNN-based SFC is designed to handle the network configuration of components (VNFO and VNFM) in a way that minimizes resource utilization. This is a significant step in our proposed solution.

(1): The initialization of all the network parameters is the first communication step of an SFC request, in which the SFC inspects the packets in the control layer and either instantiates the SFC request or checks the available SFC requests. In this processing flow, the controller checks the resources based on the service criteria, which contain information that was utilized or applied previously. Besides optimizing the resource allocation of the requested SFC list, a GNN performs feature extraction of nodes and the correlation of links, in which the state transition multiplies the adjacency matrix by the hidden state. The hidden state is the annotation matrix in which node label vectors are set, including network topologies and ingress messages.
(2): The NFVO is used to conduct the extraction, initialize VNFFG, and define the specific order of the SFs. After the NFVO gathers information averaging the degree to which the CPU load, memory, and capacity utilization increases or decreases from the threshold value, virtual network function management (VNFM) defines the procedure of instantiating or deleting the VNFs due to SFC requests and acknowledges the NFVO.
(3): Along with the GNN embedding process, which realizes a distributed identified information input representation vector that may represent many independent factors, the embedding process is applied with a VNF vector and the node encoding/decoding (considering the final representation of the encoder and accomplishing the final step to process VNFs either on node selection or removal) of input VNF types of vectors.
(4): The NFV creates a new record of the chain identifier for operating with additional resource migration, update, and deletions; then, sends it to the SFC orchestrator (SFCO). The SFCO creates a service function path (SFP) flow rule for a new SDN controller routing table. Finally, the SDN adds flow entry, acknowledges updates into the flow table of layer 3, and makes the SFCO conscious of the successful operation.

4.2. Selecting SFC-Enabled GNNs in Control Entities

In our network scenario, we integrate the system architecture with core SDN/NFV controllers to handle the mission-critical services with lower time constraints.

GNN-based SFC aims to overcome obstacles in data usage (data-centric) and high availability (HA) to proactively predict resources and reward time. Therefore, the pillar of steering resource mapping is the primary concern in MANO. Sequentially, the approximate time series of the configuration of resource parameters is data-centric. Despite this, the MANO of NFVI is instantiated via the resource allocation of the forefront technique to support functionality in terms of diversity resource consumption and compromise resource adjustments. All components involved in deploying and managing SFC mapping are investigated to maximize resource utilization while meeting time constraints and ensuring flexibility. Figure 3 presents the interaction between end devices and the controller using GNN-DRL. This approach focuses on network resource optimization within an adaptive environment. It includes controlling routing policies in SDN/NFV entities and migrating the execution of resource-intensive tasks to edge paradigms. The GNN-based SDN/NFV controller is customized to simultaneously manage three key components: end devices, gNB, and the SDN/NFV controller.

(1): Paging initializes the IoT device’s status to establish the corresponding feedback on the state. The connection setup request of radio resource control (RRC) is connected and initialized to the network function for broadcasting and connecting in mobility.
(2): Access points gain the message from the initialized IoT device and utilize AMF in 5G. After that, AMF acknowledges and immediately establishes the configuration connection process.
(3): After negotiating, AMF and IoT devices proceed with the connection establishment process.
(4): With service-level information, negotiation is completed on which the access point carries the information of IoT devices. Then, SDN/NFV gathers and creates the flow table, and NFV determines the properties and capabilities of the resource.
(5): NFV sets up the SFC to serve differentiated requests. Meanwhile, VNF-FG pooling enables the consolidation and sharing of VNF resources across multiple tenants or services, promoting resource efficiency and flexibility. Each VNF-FG can be represented as a graph, where the network functions are represented as nodes and the connections between them are represented as edges. The GNN extracts the node features of the graph structure and propagates and aggregates them along the network edges. It also allows for classification and link prediction tasks with VNF-FG pooling. The output is analyzed to calculate the shortest path from one node to another. It can define the optimal path of VNFs and identify performance-initialized resources.
(6): Through efficiency classification, the SDN controller interacts with the GNN proposed for gathering the state observations and applies the action value for network policy orchestration.
(7): Notifications with user-level status are addressed to orchestrate resources on demanding IoT devices.
(8): After applying the rule of policy from the SDN/NFV controller, IoT devices update the user-level status of real-time information to the controller again.
(9): The controller gathers users on demand and checks with previous states’ observations to modify the real-time orchestration resource.
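Step (5) above mentions computing a short path between nodes. The article does not state which algorithm is used for this; the sketch below uses Dijkstra's algorithm as one common choice, with a hypothetical adjacency-dictionary graph whose edge weights could represent link delays.

```python
import heapq

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a dict-of-dicts graph with non-negative edge weights.

    graph: dict node -> {neighbor: weight}
    Returns (path as a list of nodes, total weight), or (None, inf) if unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]
```

In an SFC setting, such a routine could be called once per pair of consecutive VNFs in the chain to obtain the physical route for each virtual link.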

5. Proposed VNF Placement Solution

5.1. Proposed Methods-Based Solutions

In this section, we formulate VNF placement as a problem whose interactions are modeled as a Markov decision process (MDP) and use GNN-based DRL to enhance the optimal fault-tolerant SFC placement. We begin with the architecture. Then, we present an algorithm based on a graph convolutional network (GCN) and deep Q-network (DQN) to show how this adaptive online DRL approach works during deployment. After that, we introduce GCN-based DRL, which captures the nodes and links of the network through the network adjacency matrix for prediction. After the GCN outputs node-level representations, feature matrices, and links, our proposal proceeds in two steps. We use a GCN to create node- and graph-level encodings using an attention layer. Figure 4 presents our proposed network architecture, which uses message passing to predict node-level performance. A general overview of our proposed GNN-based DRL approach, covering the sample collection process at the node level, link predictions, and training procedures for the policy and critic networks, is given in the following steps:

NFVO and SDNC manage flow tables and resource availability in a data-centric manner, providing a global view of the current state ($s_{t}$) of the MEC network, which is represented as a graph fed into the GCN. Subsequently, a network matrix $N_{t}$, output after multiple convolution layers, represents the node-level encoding of the MEC network. The model employs an attention mechanism to weigh the contributions of the MEC node representations. These weighted representations are then dynamically aggregated to form a single real-valued vector ($h^{t}$) encoding the state of the entire MEC network. This vector ($h^{t}$) serves as the input to the policy network, effectively capturing the structured information fed into the current policy. After obtaining the output of the different node-level predictions, we set the edge weights. After that, DQN is performed hierarchically, as in the previous procedure. In this way, the output of the GCN encodes the graph structure at the node level and the link aggregation metric. The GCN thus provides real-time state information as a real-valued vector that is passed to the target network and Q-network.

5.1.1. Graph Neural Network-Based Approach

A GCN is a semi-supervised learning approach. Compared with ordinary GNNs, GCNs introduce convolutional learning. In addition, GCNs can automatically extract spatial features (i.e., node labels, topology representation) from arbitrary graph-structured data. For a graph $G$ with nodes $v \in V$, where each of the $N$ nodes has an $F$-dimensional input feature vector, the graph structure can be represented by the adjacency matrix $A \in \mathbb{R}^{N \times N}$, and the node features can be expressed as $X \in \mathbb{R}^{N \times F}$. With a single GCN filter, the spatial features of the nodes are determined by their first-order neighbors, as expressed in the following equation.

$$Z^{(k)} = GCN(X, A) \leftarrow \sigma (\hat{A} {\hat{D}}^{- \frac{1}{2}} Z^{(k - 1)} \Theta^{(k - 1)}) \tag{9}$$

Here, $\sigma(\cdot)$ is a nonlinear activation function, $\hat{A} \leftarrow A + I_{N}$ is the adjacency matrix, which accounts for network propagation delay, with self-loops added, $I_{N}$ is the identity matrix of the same size as $A$, $\hat{D}$ is the diagonal degree matrix of $\hat{A}$, and $\Theta$ is the layer-specific trainable weight matrix. $Z^{(k - 1)}$ is the node representation from the previous convolutional layer, which layer $k$ transforms into $Z^{(k)}$. The applications of GNNs mainly include node classification, edge classification, link prediction, and whole-graph classification.
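To make the propagation rule concrete, the following dependency-free sketch implements one graph convolution layer. It is an illustration under stated assumptions: it uses the widely used symmetric normalization $\hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2}$ from the standard GCN formulation, which may differ in detail from the form printed in Equation (9), and it uses ReLU as the activation $\sigma$. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weights):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 Z Theta).

    adj:     N x N adjacency matrix (list of lists)
    feats:   N x F node representation Z^(k-1)
    weights: F x F' trainable weight matrix Theta^(k-1)
    """
    n = len(adj)
    # A_hat = A + I adds a self-loop to every node.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degrees of A_hat; every degree is at least 1 because of the self-loop.
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    norm = [[d_inv_sqrt[i] * a_hat[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]
    pre_activation = matmul(matmul(norm, feats), weights)
    return [[max(0.0, value) for value in row] for row in pre_activation]
```

Stacking several such layers lets each node's representation depend on neighbors that are several hops away.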

In addition, graph convolutional layers are mapped to each spatiotemporal layer to aggregate neighbor information. In our application, the GCN is used for VNF classification and link-connection prediction rather than for generic node and link prediction. This approach aims to devise the deployment strategy for VNF processing by leveraging the current SFC graph’s topological data, resource requirements, and the deployment statuses of all VNF nodes. When employing GNNs, graph classification requires extracting a graph representation from the node features. This involves aggregating information across the whole graph, a process referred to as readout. We adopt a straightforward technique to aggregate and read out node features.
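The article does not specify which readout is used. As a hedged illustration, the sketch below shows two simple options: mean pooling over node embeddings, and a softmax-weighted sum in which per-node attention scores (assumed here to be supplied by some scoring function that is not shown) determine each node's contribution.

```python
import math

def mean_readout(node_embeddings):
    """Average the node embeddings column-wise into one graph-level vector."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]

def attention_readout(node_embeddings, scores):
    """Softmax-weighted sum of node embeddings.

    scores: one unnormalized attention score per node (hypothetical input).
    """
    top = max(scores)
    exp_scores = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exp_scores)
    alphas = [e / total for e in exp_scores]
    return [sum(a * x for a, x in zip(alphas, col)) for col in zip(*node_embeddings)]
```

With equal scores, the attention readout reduces to mean pooling; unequal scores let some nodes dominate the graph-level vector.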

5.1.2. DRL-Based Approaches

SFC resource mapping is used in the hierarchical scenario in terms of VNF placement and VNF instance to handle service needs. The SFC embedding problem is an MDP. The state, action, and reward of the MDP are described below:

The MDP consists of a state space $S$, an action space $A$, a transition dynamic $P$, and a reward function $R$. At each time slot $t$, the agent observes a network state $s_{t}$, takes an action $a_{t}$, and receives a reward $r_{t}$. The environment then moves to the next state $s_{t + 1}$. In special cases, the agent reaches a special state called a terminal state. The expected cumulative discounted reward is given by:

R_{t} = E [\sum_{t} γ^{t} r_{t}], γ \in [0, 1]

(10)

The discount factor γ plays a crucial role in guaranteeing the convergence of the cumulative return in setups with infinite horizons. The agent’s actions are governed by a policy π, which maps a given state $s \in S$ to a distribution over the available actions in $A$. For each state s, there exists a corresponding value function $V^{π}(s)$, which maps it to a scalar: the expected return the agent anticipates receiving when starting in that state and following the policy π:

V^{π} (s) = E_{π} [R_{t} | s_{t} = s]

(11)
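To make Equations (10) and (11) concrete, the sketch below computes the discounted return of a finite reward trajectory and a simple Monte Carlo estimate of the state value as the average return over sampled trajectories. The function names and the sample rewards are illustrative assumptions, and the value estimate is a generic textbook estimator rather than the learning rule used by GRL-SFT.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma^t * r_t for one finite reward trajectory."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Accumulate backwards: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def monte_carlo_value(trajectories, gamma):
    """Estimate V^pi(s) as the mean return over trajectories started in s."""
    returns = [discounted_return(tr, gamma) for tr in trajectories]
    return sum(returns) / len(returns)


# Two made-up reward sequences collected from the same start state.
value_estimate = monte_carlo_value([[1, 1, 1], [0, 0, 4]], gamma=0.5)
```

With γ = 0.5, the first trajectory returns 1 + 0.5 + 0.25 = 1.75 and the second 0 + 0 + 0.25 · 4 = 1.0, so the estimate is their mean.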

5.2. Designed GRL-SFT Framework

In this subsection, we combine the physical and logical infrastructure with the GCN and DRL components as GRL-SFT, which operates the SFCO controller to ensure reliability and dynamic mapping while avoiding redundancy and supporting fault tolerance. Algorithm 1 presents GRL-SFT, in which the Q-value is approximated and the optimal action is selected as VNF node and link predictions for SFC placement.

In the initialization step, the GCN model initializes its weights and architecture, which includes the number and type of layers, the activation function, and the parameters. The algorithm then alternates between exploration and exploitation using an $ε$-greedy scheme. During exploration, the factor ε is less than 0.5 and the algorithm explores the environment by randomly selecting an action. The GCN’s view of the environment consists of the graph topology and state information. The GCN constructs the input graph representation from the current state $s_{t}$, which captures the relevant information on the MEC environment in the form of a graph structure. The GCN’s primary function is then to process this graph to generate node embeddings, which encode the information within the graph. For each possible action (which includes configurations for bandwidth allocation, resource allocation, resource instance, and VNF placement), the Q-value is calculated using a DNN as a function approximator.

Finally, an exploration policy is employed to select the action. We calculate the Q-value (expected future reward) for the state $s_{t}$ resulting from taking each possible action at that VNF. The Q-value is computed by considering the next state $s_{t + 1}$ and the Q-values obtained from previous iterations. The algorithm selects the action $a_{t}$ to be performed based on an epsilon-greedy policy: it chooses the action with the highest Q-value with a probability of $(1 - ε)$, where ε is the exploration factor, and otherwise selects a random action with a probability of ε. The graph representation is then updated to reflect the current changes: the input graph of the GCN is reconstructed from the state $s_{t + 1}$ at time $t + 1$, and the updated input graph is passed through the main GCN to refresh the node embeddings. The exploration factor is then reduced by $ε_{decay}$, which encourages the algorithm to gradually exploit the learned knowledge more as the number of episodes increases.
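The selection rule described above (greedy with probability 1 − ε, random otherwise, followed by a decay of ε) can be sketched as follows. This is a generic epsilon-greedy helper, not the authors' code; the function names, the decay step of 0.001, and the floor value are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick the argmax action with probability 1 - epsilon, else a random one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def decay(epsilon, step=0.001, floor=0.0):
    """Linear decay, mirroring epsilon <- epsilon - epsilon_decay."""
    return max(floor, epsilon - step)


# Q-values for three candidate placements (made-up numbers).
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

Passing a seeded `random.Random` instance as `rng` makes the exploration reproducible across runs.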

If the exploration factor $(ε)$ is greater than 0.5, the algorithm exploits the learned knowledge by selecting the action $a_{t}$ that has the maximum Q-value for the current state. Additionally, the state is updated by transitioning from the current state $s_{t}$ to the next state $s_{t + 1}$ based on the selected action $a_{t}$. The agent also receives a reward $r_{t}$ from the environment, which is added to the cumulative reward. The action is then applied to the environment within the GRL-SFT framework. Overall, this algorithm utilizes GCNs to capture relationships between resources and VNFs and employs DRL to learn optimal placement strategies through exploration and exploitation.

Algorithm 1: GNN-DRL for optimal policy and action in SFC placement
Input: GCN representation of SFC request with the requirement of VNFs and VLs
Output: Optimal $a_{t}$ deployment and placement of locations for each VNF instance
Initialize replay memory $D$
for episode $i = 0$ to $1000$ do
	Observe $G = (V, E)$, env. selection
	if exploration ($ε < 0.5$) then
		Construct GCN from state $s_{t}$ of the network environment at time $t$, env.select_property.resource
		GCN passes the prediction to the value function
		Approximate $q_{value}$ for $\forall a_{t}$ using the value function
		for each different $a_{t}$ do
			$q_{value} \leftarrow$ compute.$q_{value}(s_{t}, r_{t})$
			$q_{value} \leftarrow$ epsilon_greedy($q_{value}$, $ε$)
		end
		Update GCN with $s_{t + 1}$
		Construct GCN from the state at $t + 1$
		GCN passes state $(t + 1)$ to value function $(t + 1)$
		$ε \leftarrow ε - ε_{decay}$
	else if exploitation ($ε > 0.5$) then
		Select action $a_{t} = \arg\max_{a} Q(s_{t}, a)$
		Update $a_{t} \to s_{t}$
		reward $\leftarrow$ reward $+ r_{t}$
	end
end
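Algorithm 1 initializes a replay memory and approximates Q-values from the next state, but it does not spell out the update rule. The sketch below assumes a standard fixed-capacity replay buffer and the usual one-step Q-learning target $r + γ \max_{a'} Q(s', a')$. The class and function names are hypothetical, and the tiny capacity in the example is for illustration only.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of (state, action, reward, next_state, done)."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Never ask for more transitions than are stored.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done or not next_q_values:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_values)


memory = ReplayMemory(capacity=2)
for step in range(3):
    memory.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], gamma=0.5, done=False)
```

In a full agent, each sampled transition would supply the `reward`, `next_state`, and `done` arguments, and the targets would be regressed against the network's current Q-value predictions.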

6. Performance Evaluation

6.1. Simulation Setup

In this section, we illustrate the performance evaluation of our proposed scheme. First, we describe the simulation used to evaluate the proposed algorithm. Then, we configure several (hyper)parameters to illustrate network performance using the proposed GRL-SFT method. Next, based on the simulation results, we compare our proposed scheme with existing algorithms and evaluate their performance in different scenarios.

Table 3 presents our network scheme. The simulation is implemented in Mininet and mini-NFV [43,44], a widely used tool in the SDN/NFV context for creating network hosts and controllers and embedding them with the Ryu (SDN) controller. The PyTorch 2.3 environment was chosen as the framework due to its flexibility in handling computational graphs. The network graph consists of three different topologies, N1, N2, and N3, with sizes (75V, 150E), (100V, 200E), and (125V, 250E), respectively.

In these network topologies, we randomly select VNFs that can be placed in MEC nodes. With GRL-SFT, we set the maximum tolerable delay between 50 ms and 100 ms to ensure the processing of specific VNFs. This guarantees a response within a reasonable timeframe, considering the CPU capacities of the nodes (2000 MIPS). A higher CPU capacity allows a single CPU to handle more VNF instructions on the MEC server nodes. The simulation traffic rate ranges from 1000 packets/s to 5000 packets/s, and the delay on each link is at most 2 ms. PyTorch is used for building the GNN model, and the remaining hyperparameters of the GNN and DRL components, namely the number of episodes, exploration rate, discount factor, replay memory capacity, learning rate, batch size, and hidden layer dimensions, are set to 2000; 0.5; 0.95; 4000; 0.001; (32,128); and (32,64), (64,128), (128,256), respectively.
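For readers who want to reproduce a comparable setup, the stated parameters can be collected in a small configuration object. The sketch below is one possible layout: the class name `SimConfig`, the field names, and the additive delay model in `within_delay_budget` (hop count times the maximum link delay plus processing time) are assumptions, while the numeric values come from the text and Table 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimConfig:
    nodes: int
    links: int
    max_delay_ms: tuple = (50, 100)    # range of maximum tolerable delay
    cpu_mips: int = 2000               # CPU capacity per node
    traffic_pps: tuple = (1000, 5000)  # traffic rate range, packets/s
    link_delay_ms: float = 2.0         # upper bound on per-link delay
    episodes: int = 2000
    epsilon: float = 0.5               # exploration rate
    gamma: float = 0.95                # discount factor
    replay_capacity: int = 4000
    learning_rate: float = 0.001


TOPOLOGIES = {
    "N1": SimConfig(nodes=75, links=150),
    "N2": SimConfig(nodes=100, links=200),
    "N3": SimConfig(nodes=125, links=250),
}


def within_delay_budget(path_hops, processing_ms, cfg, budget_ms):
    """Check a worst-case end-to-end delay against a tolerance in ms."""
    total = path_hops * cfg.link_delay_ms + processing_ms
    return total <= budget_ms


ok = within_delay_budget(10, 25.0, TOPOLOGIES["N1"], budget_ms=50)
```

Freezing the dataclass keeps each topology's settings immutable, so one run cannot silently alter the parameters of another.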

We compare the proposed scheme with the following two reference schemes and discuss the evaluation metrics for each:

Resource greed for SFC (RGD-SFC) is a conventional approach to designing, managing, and optimizing SFC in a network that relies on making decisions based on locally optimal selection at each step.
Deep Q-learning for SFC (DQL-SFC) is an approach that has only one estimator, which attempts to learn both the behavior and target policies to achieve an optimal solution.
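The locally optimal, step-by-step selection that characterizes a greedy baseline such as RGD-SFC can be sketched as below. The source does not specify the selection criterion, so this example assumes "place each VNF on the node with the most free CPU"; the function name, node labels, and demand values are hypothetical.

```python
def greedy_place(chain_cpu_demands, node_cpu_free):
    """Place each VNF of a chain on the node with the most free CPU."""
    free = dict(node_cpu_free)  # copy so the caller's view is untouched
    placement = []
    for demand in chain_cpu_demands:
        node = max(free, key=free.get)  # locally optimal choice at this step
        if free[node] < demand:
            return None  # request rejected: no node can host this VNF
        free[node] -= demand
        placement.append(node)
    return placement


# A three-VNF chain placed on two nodes (CPU in MIPS, made-up demands).
placement = greedy_place([500, 500, 1500], {"a": 2000, "b": 1800})
```

Because each step only looks at the current free capacity, the rule never revisits earlier choices, which is the main limitation a learned policy is meant to overcome.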

6.2. Results and Discussion

We capture key performance metrics such as total reward, delay per SFC, SFC request acceptance ratio, SFC request failure ratio, and SFC restoration ratio. Figure 5a shows the total rewards achieved using our proposed approach. The steady improvement at each stage reflects efficient learning. In the initial stage, GRL-SFT has a negative cumulative reward of −37.5 by 200 episodes, as it explores the action space to learn the network environment. After that, GRL-SFT quickly adapts and optimizes its strategy and achieves a final cumulative reward of approximately 37.5 by 10,000 episodes.

Moreover, GRL-SFT clearly separates from the reference schemes, while RGD-SFC and DQL-SFC reach similar cumulative rewards. RGD-SFC shows poorer initial performance and slower learning, while DQL-SFC, despite better initial performance, fails to optimize as effectively as GRL-SFT. This indicates that the deployer has gradually formed a long-term reward-optimal deployment scheme. Figure 5b compares the SFC delays of GRL-SFT across the physical topologies with those of the other two algorithms. GRL-SFT consistently achieved the lowest average delays across all three network topologies (N1, N2, and N3), keeping average delays below 35 ms, while RGD-SFC and DQL-SFC show higher delays and noticeable sensitivity to network complexity.

Figure 6a shows the SFCR acceptance ratios of the three algorithms. Our algorithm achieves a 100 percent SFCR acceptance ratio, significantly outperforming the other algorithms. With GRL-SFT, the remaining resources are used to set the cost on nodes, links, and VNF instances, which in turn is used to achieve load balancing. Therefore, GRL-SFT can eliminate resource bottlenecks and enhance the SFCR acceptance ratio. Compared with our algorithm, RGD-SFC is less reliable in more complex or resource-constrained topologies (N2 and N3); it might be more suitable for simpler network configurations where resource constraints are less tight. DQL-SFC balances complexity and performance, offering high acceptance ratios across different topologies. However, GRL-SFT still outperforms the other algorithms in terms of acceptance ratios.

On the other hand, Figure 7 shows that GRL-SFT is a reliable choice for handling SFCRs. It consistently shows the lowest failure ratios across all topologies. The low number of SFCR failures is achieved by employing an agent that analyzes network conditions and historical data to choose optimal VNF configurations and request strategies. RGD-SFC and DQL-SFC perform worse than our algorithm.
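The ratios discussed above can be derived from per-request outcomes. The sketch below assumes that the acceptance ratio is the share of accepted requests, that the failure ratio is its complement, and that the average delay is taken over accepted requests only. These definitions, the function name `sfc_metrics`, and the sample records are assumptions, since the exact formulas are not given in this section.

```python
def sfc_metrics(outcomes):
    """Summarize per-request outcomes into ratios and a mean delay.

    Each outcome is a dict with key 'accepted' (bool) and, when accepted,
    'delay_ms' (float).
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("at least one SFC request is required")
    accepted = [o for o in outcomes if o["accepted"]]
    acceptance_ratio = len(accepted) / total
    avg_delay = (sum(o["delay_ms"] for o in accepted) / len(accepted)
                 if accepted else float("nan"))
    return {
        "acceptance_ratio": acceptance_ratio,
        "failure_ratio": 1.0 - acceptance_ratio,
        "avg_delay_ms": avg_delay,
    }


# Four made-up requests: three accepted, one rejected.
metrics = sfc_metrics([
    {"accepted": True, "delay_ms": 30.0},
    {"accepted": True, "delay_ms": 34.0},
    {"accepted": False},
    {"accepted": True, "delay_ms": 32.0},
])
```

A restoration ratio could be added in the same way once each record carries a flag for whether a failed request was later restored.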

GRL-SFT has demonstrated its effectiveness in latency, resource efficiency, and reliability by leveraging its two main components, the GNN and the DQN, and by proactively adjusting and managing resource orchestration. In simulation, this approach improved the total reward, delay, acceptance ratio, failure ratio, and restoration ratio relative to the reference schemes.

7. Conclusions

In this article, we proposed GRL-SFT to address the SFC placement problem in NFV-enabled MEC computing paradigms, aiming to enable fault tolerance during service chaining and VNF mapping. We presented the algorithm for implementing GNN-DRL-based SFC deployments and controlling resources to ensure VNF scaling. The use of GCN variants structures the input and output nodes of the varying dimensions of GNNs to provide more optimized routing while the network topology changes. Our experiments show that GRL-SFT performs better than the reference algorithms in terms of total reward, delay, acceptance ratio, failure ratio, and restoration ratio across three different network topologies and varying resource values. Hence, our proposal offers convergence and flexibility for network scenarios with standard, medium, and heavy conditions. In future work, we aim to further enhance the efficiency of the GRL-SFT algorithm, simplify its implementation, and consider online and offline resource orchestration with auto-scaling to improve SFC deployment flexibility.

Author Contributions

Conceptualization, S.R. and P.T.; methodology, S.R. and P.T.; software, P.T. and S.R.; validation, I.S. and S.K. (Seungwoo Kang); formal analysis, I.S. and S.R.; investigation, S.K. (Seokhoon Kim); resources, S.K. (Seokhoon Kim); data curation, P.T.; writing—original draft preparation, S.R.; writing—review and editing, S.R., I.S., S.K. (Seungwoo Kang) and P.T.; visualization, S.K. (Seungwoo Kang); supervision, S.K. (Seokhoon Kim); project administration, S.K. (Seokhoon Kim); funding acquisition, S.K. (Seokhoon Kim). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. RS-2022-00167197, Development of Intelligent 5G/6G Infrastructure Technology for The Smart City); in part by the National Research Foundation of Korea (NRF), Ministry of Education, through Basic Science Research Program under Grant NRF-2020R1I1A3066543; in part by BK21 FOUR (Fostering Outstanding Universities for Research) under Grant 5199990914048; and in part by the Soonchunhyang University Research Fund.

Data Availability Statement

Derived data supporting the findings of this study are available from the corresponding author on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Gupta, A.; Jha, R.K. A Survey of 5G Network: Architecture and Emerging Technologies. IEEE Access 2015, 3, 1206–1232.
2. Dogra, A.; Jha, R.K.; Jain, S. A Survey on beyond 5G Network with the Advent of 6G: Architecture and Emerging Technologies. IEEE Access 2020, 9, 67512–67547.
3. Aboubakar, M.; Kellil, M.; Roux, P. A Review of IoT Network Management: Current Status and Perspectives. J. King Saud Univ. Comput. Inf. Sci. 2021, 34, 4163–4176.
4. Mijumbi, R.; Serrat, J.; Gorricho, J.-L.; Bouten, N.; De Turck, F.; Boutaba, R. Network Function Virtualization: State-of-The-Art and Research Challenges. IEEE Commun. Surv. Tutor. 2016, 18, 236–262.
5. Dung; Lien, Y.-H.; Liu, B.-H.; Chu, S.-I.; Nguyen, T.N. Virtual Network Function Placement for Serving Weighted Services in NFV-Enabled Networks. IEEE Syst. J. 2023, 17, 5648–5659.
6. Umrao, B.K.; Yadav, D.K. Placement of Virtual Network Functions for Network Services. Int. J. Netw. Manag. 2023, 33, e2232.
7. Kim, Y.; Kwak, J.; Lee, H.-W.; Chong, S. Dynamic Computation and Network Chaining in Integrated SDN/NFV Cloud Infrastructure. IEEE Trans. Cloud Comput. 2021, 11, 367–382.
8. Troia, S.; Savi, M.; Nava, G.; Maria, L.; Schneider, T.; Maier, G. Performance Characterization and Profiling of Chained CPU-Bound Virtual Network Functions. Comput. Netw. 2023, 231, 109815.
9. Zhu, M.; Oki, E. Robust Function Deployment against Uncertain Recovery Time in Different Protection Types with Workload-Dependent Failure Probability. Comput. Netw. 2023, 231, 109826.
10. Nine, Z.; Kosar, T.; Bulut, M.F.; Hwang, J. GreenNFV: Energy-Efficient Network Function Virtualization with Service Level Agreement Constraints. In Proceedings of the SC ’23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, New York, NY, USA, 12–17 November 2023.
11. Bizanis, N.; Kuipers, F.A. SDN and Virtualization Solutions for the Internet of Things: A Survey. IEEE Access 2016, 4, 5591–5606.
12. Gelberger, A.; Yemini, N.; Giladi, R. Performance Analysis of Software-Defined Networking (SDN). In Proceedings of the 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, San Francisco, CA, USA, 14–16 August 2013.
13. Maleh, Y.; Qasmaoui, Y.; El Gholami, K.; Sadqi, Y.; Mounir, S. A Comprehensive Survey on SDN Security: Threats, Mitigations, and Future Directions. J. Reliab. Intell. Environ. 2022, 9, 201–239.
14. Ali, J.; Jhaveri, R.H.; Alswailim, M.; Roh, B. ESCALB: An Effective Slave Controller Allocation-Based Load Balancing Scheme for Multi-Domain SDN-Enabled-IoT Networks. J. King Saud Univ. Comput. Inf. Sci. 2023, 35, 101566.
15. Karakoç, E.; Çeken, C. Secure SLA Management Using Smart Contracts for SDN-Enabled WSN. KSII Trans. Internet Inf. Syst. 2023, 17, 3003–3029.
16. Abid, A.; Manzoor, M.F.; Farooq, M.S.; Farooq, U.; Hussain, M. Challenges and Issues of Resource Allocation Techniques in Cloud Computing. KSII Trans. Internet Inf. Syst. 2020, 14, 2815–2839.
17. Li, P.; Liu, G.; Guo, S.; Zeng, Y. Traffic-Aware Efficient Consistency Update in NFV-Enabled Software Defined Networking. Comput. Netw. 2023, 228, 109755.
18. Chen, L.; Tang, H.; Zhao, Y.; You, W.; Wang, K. A Privacy-Preserving and Energy-Efficient Offloading Algorithm Based on Lyapunov Optimization. KSII Trans. Internet Inf. Syst. 2022, 16, 2490–2506.
19. Singh, R.; Sukapuram, R.; Chakraborty, S. A Survey of Mobility-Aware Multi-Access Edge Computing: Challenges, Use Cases and Future Directions. Ad Hoc Netw. 2023, 140, 103044.
20. Ren, Y.; Guo, A.; Song, C. Multi-Slice Joint Task Offloading and Resource Allocation Scheme for Massive MIMO Enabled Network. KSII Trans. Internet Inf. Syst. 2023, 17, 794–815.
21. Kim, D.-Y.; Lee, S.; Kim, M.; Kim, S. Edge Cloud Selection in Mobile Edge Computing (MEC)-Aided Applications for Industrial Internet of Things (IIoT) Services. Comput. Syst. Sci. Eng. 2023, 47, 2049–2060.
22. Eang, C.; Ros, S.; Kang, S.; Song, I.; Tam, P.; Math, S.; Kim, S. Offloading Decision and Resource Allocation in Mobile Edge Computing for Cost and Latency Efficiencies in Real-Time IoT. Electronics 2024, 13, 1218.
23. Kang, S.; Ros, S.; Song, I.; Tam, P.; Math, S.; Kim, S. Real-Time Prediction Algorithm for Intelligent Edge Networks with Federated Learning-Based Modeling. Comput. Mater. Contin. 2023, 77, 1967–1983.
24. Huang, H.; Tian, J.; Yin, H.; Min, G.; Wu, D.; Miao, W. RQAP: Resource and QoS Aware Placement of Service Function Chains in NFV-Enabled Networks. IEEE Trans. Serv. Comput. 2023, 16, 4526–4539.
25. Ros, S.; Tam, P.; Kang, S.; Song, I.; Kim, S. A survey on state-of-the-art experimental simulations for privacy-preserving federated learning in intelligent networking. Electron. Res. Arch. 2024, 32, 1333–1364.
26. Nie, M.; Chen, D.; Wang, D. Reinforcement Learning on Graphs: A Survey. IEEE Trans. Emerg. Top. Comput. Intell. 2023, 7, 1065–1082.
27. Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph Neural Networks: A Review of Methods and Applications. AI Open 2020, 1, 57–81.
28. Hu, H.; Kang, Q.; Zhao, S.; Wang, J.; Fu, Y. Service Function Chain Deployment Method Based on Traffic Prediction and Adaptive Virtual Network Function Scaling. Electronics 2022, 11, 2625.
29. Rankothge, W.; Le, F.; Russo, A.; Lobo, J. Optimizing Resource Allocation for Virtualized Network Functions in a Cloud Center Using Genetic Algorithms. IEEE Trans. Netw. Serv. Manag. 2017, 14, 343–356.
30. Thiruvasagam, P.K.; Chakraborty, A.; Mathew, A.; Murthy, C.S.R. Reliable Placement of Service Function Chains and Virtual Monitoring Functions with Minimal Cost in Softwarized 5G Networks. IEEE Trans. Netw. Serv. Manag. 2021, 18, 1491–1507.
31. Pei, J.; Hong, P.; Xue, K.; Li, D. Efficiently Embedding Service Function Chains with Dynamic Virtual Network Function Placement in Geo-Distributed Cloud System. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 2179–2192.
32. Chen, J.; Chen, J.; Zhang, H. DRLEC: Multi-Agent DRL Based Elasticity Control for VNF Migration in SDN/NFV Networks. In Proceedings of the 2021 26th IEEE Asia-Pacific Conference on Communications (APCC), Kuala Lumpur, Malaysia, 11–13 October 2021.
33. Tam, P.; Ros, S.; Song, I.; Kang, S.; Kim, S. A Survey of Intelligent End-To-End Networking Solutions: Integrating Graph Neural Networks and Deep Reinforcement Learning Approaches. Electronics 2024, 13, 994.
34. Habibi, F.; Mahdi, D.; Khonsari, A.; Ghaderi, M. Accelerating Virtual Network Embedding with Graph Neural Networks. In Proceedings of the 16th International Conference on Network and Service Management (CNSM), Izmir, Turkey, 2–6 November 2020.
35. Sun, P.; Lan, J.; Li, J.; Guo, Z.; Hu, Y. Combining Deep Reinforcement Learning with Graph Neural Networks for Optimal VNF Placement. IEEE Commun. Lett. 2020, 25, 176–180.
36. Rkhami, A.; Quang Pham, T.A.; Hadjadj-Aoul, Y.; Outtagarts, A.; Rubino, G. On the Use of Graph Neural Networks for Virtual Network Embedding. In Proceedings of the 2020 International Symposium on Networks, Computers and Communications (ISNCC), Montreal, QC, Canada, 20–22 October 2020.
37. Chantre, H.; da Fonseca, N.L.S. Reliable Broadcasting in 5G NFV-Based Networks. IEEE Commun. Mag. 2018, 56, 218–224.
38. Kothandaraman, B.; Du, M.; Sköldström, P. Centrally Controlled Distributed VNF State Management. In Proceedings of the 2015 ACM SIGCOMM Workshop on Hot Topics in Middleboxes and Network Function Virtualization, New York, NY, USA, 21 August 2015.
39. Yang, B.; Xu, Z.; Chai, W.K.; Liang, W.; Tuncer, D.; Galis, A.; Pavlou, G. Algorithms for Fault-Tolerant Placement of Stateful Virtualized Network Functions. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018.
40. Yuan, G.; Xu, Z.; Yang, B.; Liang, W.; Chai, W.K.; Tuncer, D.; Galis, A.; Pavlou, G.; Wu, G. Fault Tolerant Placement of Stateful VNFs and Dynamic Fault Recovery in Cloud Networks. Comput. Netw. 2020, 166, 106953.
41. Pei, J.; Hong, P.; Pan, M.; Liu, J.; Zhou, J. Optimal VNF Placement via Deep Reinforcement Learning in SDN/NFV-Enabled Networks. IEEE J. Sel. Areas Commun. 2020, 38, 263–278.
42. Roig, J.S.P.; Gutierrez-Estevez, D.M.; Gunduz, D. Management and Orchestration of Virtual Network Functions via Deep Reinforcement Learning. IEEE J. Sel. Areas Commun. 2019, 38, 304–317.
43. Lantz, B.; O’Connor, B. A Mininet-Based Virtual Testbed for Distributed SDN Development. In Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication—SIGCOMM ’15, London, UK, 17–21 August 2015.
44. Castillo-Lema, J.; Neto, A.V.; de Oliveira, F.; Kofuji, S.T. Mininet-NFV: Evolving Mininet with OASIS TOSCA NVF Profiles towards Reproducible NFV Prototyping. In Proceedings of the 2019 IEEE Conference on Network Softwarization (NetSoft), Paris, France, 24–28 June 2019.

Figure 1. SFC placement and optimal network node selection based on ETSI reference scheme.

Figure 2. SFC instantiates resource utilization-based GRL-SFT optimally orchestrated controller.

Figure 3. Flowchart of SFC interaction with GNN/DRL to instantiate creating services orchestration.

Figure 4. Proposed GNN-based DRL approach to illustrate the network performance of node predictions and links in terms of the approximate training of current policy conditions.

Figure 5. Results on (a) cumulative reward evaluation and (b) delay per SFC between proposed and reference algorithms.

Figure 6. Results of (a) SFCR acceptance ratio and (b) SFCR failure ratio.

Figure 7. Results of SFCR restoration ratio.

Table 1. Primary acronyms and descriptions.

Acronym	Description	Acronym	Description
5G	Fifth generation	QoS	Quality of service
B5G	Beyond the fifth generation	SDN	Software-defined networking
CAPEX	Capital expense	SF	Service function
DRL	Deep reinforcement learning	SFC	Service function chaining
GCN	Graph convolutional network	SFCO	Service function chaining orchestration
GNN	Graph neural network	VIM	Virtual infrastructure management
MANO	Management and orchestration	VM	Virtual machine
MPNN	Message-passing neural network	VL	Virtual link
NFV	Network functions virtualization	VNF	Virtual network function
NFVI	Network functions virtualization infrastructure	VNFM	Virtual network function management
OPEX	Operating expense	VNFFG	Virtual network function forwarding graph

Table 2. Notation of system models within SFC processing and physical resources to minimize fault tolerance.

Notation	Description
$G$	The set of the physical network
$V$	The set of MEC nodes $\{1, 2, \dots, v\}, v \in V$
$E$	The set of the links $\{1, 2, \dots, e\}, e \in E$
$T$	Number of timeslots, $t \in T$
$CP_{v}$	Physical computing resources on node $v$
$C_{v}^{max}, m_{v}^{max}, d_{v}^{max}$	Upper-bound capacities of physical node $v$ (CPU, memory, disk)
$σ_{k}^{t}$	The resource capacity remaining at timeslot $t$ (CPU, memory, disk) of node $v$
$C_{\bar{v}}^{f, t}, m_{\bar{v}}^{f, t}, d_{\bar{v}}^{f, t}$	The requested resource capacity at timeslot $t$ (CPU, memory, disk) of node $v$
$bw_{e}^{t}$	The set of bandwidth resources at timeslot $t$ of $E$, with $e \in E$
$\bar{V}$	The set of VNFs $\{1, 2, \dots, \bar{v}\}, \bar{v} \in \bar{V}$
$\bar{E}$	The set of virtual links $\{1, 2, \dots, \bar{e}\}, \bar{e} \in \bar{E}$
$R_{s}$	The number of VNFs in SFC requests for services
$D_{ρ, e}$	Propagation delay of the processing time of node

Table 3. Key simulation parameters.

Parameters	Specifications
Physical topology size (N1, N2, N3)	(75V, 150E), (100V, 200E), (125V, 250E)
Maximum tolerable delay	50 ms–100 ms
VNF instance randomly	(0,10)
CPU capacity of each node	2000 MIPS
Traffic rate	1000–5000 packets/s
Delay on link	≤2 ms
Simulation time	1000 s
NFVO and SDNC	Ryu 4.34
Network topology platform	Mininet and Mini-NFV
Number of episodes	2000
Exploration rate	0.5
Discount factor	0.95
Replay memory capacity	4000
Learning rate	0.001
Batch size	(32,128)
Hidden layer dimensions	(32,64), (64,128), (128,256)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
