Latency-Aware NFV Slicing Orchestration for Time-Sensitive 6G Applications

Alnaim, Abdulrahman K.; Albarrak, Khalied M.

doi:10.3390/systems13110957


by Abdulrahman K. Alnaim * and Khalied M. Albarrak

Department of Management Information Systems, School of Business, King Faisal University, Hofuf 31982, Saudi Arabia

* Author to whom correspondence should be addressed.

Systems 2025, 13(11), 957; https://doi.org/10.3390/systems13110957

Submission received: 5 September 2025 / Revised: 16 October 2025 / Accepted: 24 October 2025 / Published: 27 October 2025

(This article belongs to the Special Issue Cybersecurity and Secure Information Systems: Challenges and Solutions in Digital Environment)


Abstract

Ensuring ultra-low latency and high reliability in 6G network slices remains a significant challenge, as current NFV orchestration approaches are largely reactive and not designed to anticipate performance degradation. The advent of 6G networks brings forth stringent requirements for ultra-reliable low-latency communication (URLLC), necessitating advanced orchestration mechanisms that go beyond reactive policies in traditional NFV environments. In this paper, we propose a latency-aware, AI-driven NFV slice orchestration framework aligned with ETSI MANO architecture to address the needs of time-sensitive 6G applications. Our framework integrates a predictive AI engine into the NFV Orchestrator (NFVO) to forecast latency violations based on real-time telemetry and historical trends. It enables dynamic scaling, intelligent VNF migration, and infrastructure-level isolation to maintain stringent end-to-end (E2E) latency targets. Experimental results indicate up to 30% reduction in average latency, a 42% improvement in SLA compliance, and 25% lower migration overhead compared to traditional reactive orchestration. The framework provides a scalable and intelligent orchestration solution adaptable to future 6G deployments.

Keywords: cloud computing; network function virtualization; network slicing; 6G networks; NFV MANO

1. Introduction

The rapid evolution toward sixth-generation (6G) networks is fundamentally driven by the increasing demand for communication services that require ultra-reliable low-latency communication (URLLC). Mission-critical applications such as remote robotic surgery [1], autonomous vehicle coordination [2], smart manufacturing [3], and the tactile internet require latencies of 1 ms or less, combined with reliability levels of up to 99% [2]. While fifth-generation (5G) wireless systems have introduced support for URLLC, they are increasingly constrained by the architectural limitations of centralized orchestration, shared infrastructures, and traffic unpredictability in multi-slice deployments [3].

To address the scalability, flexibility, and diversity of 6G services, network slicing has emerged as a core architectural principle, enabling operators to create logically isolated, application-specific virtual networks on shared physical infrastructure [4]. Each slice may have its own service-level agreement (SLA), resource allocation policy, and performance goals. However, while this concept provides theoretical flexibility, the operational realization of slicing, particularly in dynamic heterogeneous environments, faces significant challenges, especially for time-sensitive URLLC services that coexist with best-effort enhanced mobile broadband (eMBB) or massive machine-type communication (mMTC) traffic.

In parallel, network function virtualization (NFV) has been proposed to decouple network functions from proprietary hardware, allowing operators to deploy virtualized network functions (VNFs) as software modules in data centers, cloud infrastructure, or at the edge [5]. The ETSI NFV MANO (Management and Orchestration) framework formalizes this architecture through the NFV Orchestrator (NFVO), VNF Manager (VNFM), and Virtualized Infrastructure Manager (VIM), providing a reference model for lifecycle management, resource provisioning, and policy enforcement [6,7]. While ETSI NFV has gained industry acceptance, its standard orchestration policies are not inherently latency-aware, and orchestration actions are often reactive, initiated only when thresholds are violated or alarms are triggered [8].

Prior research efforts have attempted to improve URLLC performance using heuristic-based or optimization-based approaches to VNF placement. For instance, latency-aware VNF placement using integer linear programming (ILP) has shown promise [9,10]; however, ILP-based solutions typically lack scalability and adaptability to real-time network dynamics, making them less practical for fast-changing 6G environments. Others have explored service function chaining (SFC) frameworks combined with cost-aware migration strategies. For example, an SFC migration survey highlights that many approaches use static system snapshots or instantaneous network metrics rather than anticipating future network state changes [11,12]. Additionally, the isolation of URLLC traffic from eMBB flows has been studied in 5G slicing contexts [13], but concrete strategies that enforce such isolation dynamically in shared 6G infrastructures remain sparse and often lack integration with standards-based orchestration platforms.

Meanwhile, recent advances in machine learning (ML) and artificial intelligence (AI) have opened new possibilities for proactive and intelligent network management. AI techniques are increasingly capable of analyzing vast amounts of telemetry data in real time, enabling traffic prediction, anomaly detection, and optimized resource allocation [14]. In the context of 6G, several studies have proposed AI-native orchestration frameworks that embed learning capabilities directly into slice management, mobility prediction, and dynamic resource scaling. For example, the REASON architecture outlines a modular, AI-integrated controller designed for end-to-end orchestration in future networks [15], while Moreira et al. present a distributed AI-native orchestration approach incorporating ML agents throughout network slice lifecycles [16]. Additionally, ML has been applied to classify slice types and forecast slice handovers in simulated 6G scenarios [17]. However, most of these solutions remain conceptual or simulation-based and do not provide full end-to-end orchestration integrated with ETSI NFV MANO, which limits their applicability in operational carrier-grade environments.

This work distinguishes itself through three main innovations. First, it integrates AI-based short-term latency forecasting into the orchestration decision loop, enabling proactive VNF migrations and preemptive slice adjustments before SLA violations occur, moving beyond threshold-based reactive policies. Second, the design is fully compliant with the ETSI NFV MANO reference model, embedding the predictive logic within the NFVO without requiring architectural deviations. Third, in the absence of public 6G telemetry datasets, we develop and use synthetic workload generators that emulate bursty URLLC and eMBB behaviors, allowing comprehensive evaluation of forecasting accuracy and orchestration performance under diverse traffic conditions.

To address this need, we propose a latency-aware NFV slice orchestrator for 6G that incorporates an AI-based predictive module within the NFVO. The orchestrator uses real-time telemetry (e.g., latency trends, traffic volume, VNF resource usage) to forecast short-term latency spikes using machine learning models such as ARIMA or LSTM neural networks. Based on the predicted latency profile, the orchestrator proactively migrates or scales VNFs, selects optimized host placements, and applies traffic isolation policies. These include mechanisms such as CPU pinning, NUMA-aware scheduling, or SR-IOV-based virtual interfaces to reduce inter-slice interference. Our design follows the ETSI NFV MANO reference architecture [6,7], ensuring compatibility with existing NFV deployments and providing a path toward standard-compliant 6G slicing orchestration.

A major novelty of this paper lies in its integration of AI-based latency prediction into the orchestration loop, driven by a forecast horizon that allows decisions to be made ahead of latency violations. In contrast to static placement or threshold-triggered actions, our approach enables proactive VNF migration and early slice adjustment, which is essential for time-sensitive applications. Another distinctive aspect is our focus on synthetic data-driven evaluation. Given the lack of public URLLC telemetry datasets for 6G, we simulate traffic with realistic properties (bursty arrivals, contention, noisy neighbors) and use it to evaluate the accuracy and impact of latency prediction models. This methodology, though artificial, offers valuable insight into the behavior of predictive orchestration systems.

Therefore, in this paper, we aim to achieve four primary objectives. First, we formalize mathematical models that describe end-to-end latency, VNF migration costs, and SLA constraints in the context of slice orchestration. Second, we design an AI-based prediction module for latency trends using time-series forecasting. Third, we embed this module into the ETSI NFVO, enabling proactive migration, scaling, and slice policy enforcement. Finally, we evaluate the performance of the orchestrator on synthetically generated URLLC/eMBB workloads, comparing it with static and reactive orchestration baselines.

Our contributions are as follows:

Design of a standards-aligned, AI-driven orchestration framework that integrates latency forecasting into ETSI NFV MANO.
Implementation of a prediction-aware orchestration loop with real-time migration and isolation mechanisms.
Empirical evaluation using synthetic 6G workloads to demonstrate reduced SLA violations, better latency stability, and effective eMBB/URLLC traffic separation.

The remainder of this paper is structured as follows: Section 2 provides the necessary background and a comprehensive review of related work in NFV slicing, ETSI MANO architecture, AI-based orchestration, and latency-sensitive service delivery. Section 3 presents the proposed system architecture, detailing the integration of AI-enhanced orchestration within the ETSI NFV framework. Section 4 formulates the mathematical modeling for latency prediction, cost functions, and orchestration constraints. Section 5 describes the implementation of the AI-based forecasting logic, optimization algorithms, and the synthetic data generation process. Section 6 presents the evaluation results across key performance metrics such as end-to-end latency, SLA violations, migration frequency, and forecasting accuracy, including analysis of resource usage and orchestration cost. Section 7 offers a discussion of trade-offs, scalability, and practical implications of the proposed approach. We end with conclusions and future work in Section 8.

2. Literature Review

2.1. Overview

NFV, standardized by the European Telecommunications Standards Institute (ETSI), provides a foundational architecture for software-based deployment of network functions as VNFs. The ETSI NFV MANO framework organizes orchestration into three critical components: the NFV Orchestrator (NFVO), which manages slice lifecycles and high-level policies; the VNF Manager (VNFM), which handles VNF instantiation and VNF-specific configurations; and the Virtualized Infrastructure Manager (VIM), which abstracts compute, storage, and network resources [6,18]. This layered design supports lifecycle management and scaling; however, current MANO policies are predominantly reactive, responding to threshold violations in CPU, memory, or bandwidth. For time-sensitive URLLC applications, this reactive approach can be insufficient, as latency violations may occur before corrective actions are taken.

Simultaneously, 6G networks are expected to push latency requirements further, targeting sub-millisecond latency and ultra-high reliability. Achieving these goals requires end-to-end network slicing, an architectural pillar that partitions physical infrastructure into logical slices tailored to specific service needs [19]. Each slice, URLLC, eMBB, or mMTC, has distinct performance demands, and orchestration across RAN, transport, and core domains is paramount. While NFV MANO aligns well with slicing requirements, it does not natively support slice-level intelligence or time-aware orchestration, which are critical for enforcing URLLC SLAs [20,21].

Further, service function chaining (SFC) enables dynamic composition of multiple VNFs to realize network services [22]. These chains introduce concatenated processing and queuing delays, and migrating VNFs during runtime can significantly impact end-to-end latency. Traditional SFC migration techniques prioritize minimizing service disruption but fail to incorporate preemptive latency-based triggers. Emerging methods such as digital twin–driven migration modeling offer cost analysis and state-awareness for relocations but do not yet link forecasting models with orchestrator logic [21].

Architectural perspectives increasingly explore intelligent Radio Access Network (RAN) orchestration. For example, comprehensive surveys on AI-native RAN stress the importance of embedding machine learning directly within the RAN to enable real-time control and adaptive slice management, such as predictive resource scheduling and traffic forecasting within O-RAN’s Near-RT RAN Intelligent Controller (RIC) ecosystem [23]. Specific solutions, like the AdaSlicing framework, employ AI agents in open RAN environments to enable dynamic and continual adaptation of radio slices, yet they do not extend to ETSI NFV MANO-compliant orchestration or integrate end-to-end latency forecasting with lifecycle management [24,25].

Furthermore, isolation techniques aimed at mitigating interference between URLLC and eMBB traffic, utilizing strategies such as CPU pinning, NUMA-aware scheduling, and SR-IOV, are typically confined to RAN or hypervisor configurations and are not directly controlled through NFVO policies. As a result, actionable mechanisms for slice-level isolation in the orchestration layer are underexplored.

While NFV, slicing, SFC, and AI form the backbone of 6G architectural research, their integration, especially for latency forecasting and proactive orchestration, has not yet been fully realized. Existing solutions either lack AI-driven foresight, focus on isolated components of the problem, or remain detached from standardized orchestration frameworks.

2.2. Related Works

Improving latency performance in NFV environments, particularly for URLLC in 6G, has become a significant research focus. Several prior efforts have explored heuristic and optimization-based strategies for VNF placement with the aim of reducing end-to-end (E2E) latency. For instance, the researchers in [9,10] utilize integer linear programming (ILP) models to formulate the VNF placement problem in latency-sensitive service function chains (SFCs). These models yield optimal or near-optimal placements under static conditions but are computationally intensive and thus impractical for dynamic, large-scale environments where latency conditions fluctuate rapidly. Consequently, these solutions often lack the adaptability and real-time responsiveness needed in practical 6G deployments.

Other studies have examined migration strategies in SFCs to mitigate delay penalties and optimize runtime resource allocation. As reviewed in [11,12], most existing migration mechanisms rely on snapshot-based metrics and threshold-based decision triggers. These models aim to reduce service disruption but do not proactively anticipate latency spikes or traffic surges. Such reactivity limits their usefulness in enforcing URLLC service-level agreements (SLAs), especially in shared infrastructure scenarios where contention with eMBB or mMTC traffic may arise.

In parallel, Alomari et al. [26] investigated the issue of synchronization delay and consistency among stateful VNF instances, proposing an optimization-based model that minimizes synchronization cost and ensures bounded update latency. While this work focuses primarily on intra-VNF consistency rather than end-to-end orchestration, its consideration of synchronization overhead is highly relevant to runtime NFV performance and complements broader efforts to reduce latency variability in distributed service chains.

Further, the researchers in [13] address URLLC-eMBB isolation in network slicing, proposing policies to minimize resource interference between latency-sensitive and throughput-oriented slices. While isolation is critical for maintaining predictable QoS, many of these mechanisms are static or based on pre-defined slice templates, lacking dynamic reconfiguration logic. More importantly, such studies typically do not integrate with standard NFV MANO frameworks, making them difficult to implement in multi-domain 6G architectures where orchestration coordination is vital.

A promising direction emerges from the integration of AI and ML into network orchestration processes. As highlighted in [14], AI techniques are increasingly used to analyze telemetry data, predict traffic trends, detect anomalies, and make real-time orchestration decisions. These tools offer a proactive alternative to threshold-based mechanisms, with the potential to support predictive orchestration that aligns better with 6G’s stringent performance requirements.

In this context, several AI-native orchestration strategies have been proposed. The REASON architecture presents a modular, AI-augmented control framework capable of optimizing end-to-end network slice orchestration [15]. It incorporates learning agents that process telemetry to make policy decisions, though it remains largely conceptual and does not extend into operational MANO platforms. Similarly, the researchers in [16] propose a distributed AI-native orchestration design with embedded ML agents across the network slice lifecycle. Their solution demonstrates scalability and adaptive behavior but again lacks direct integration with ETSI NFV standards, limiting its deployability.

Other AI-based approaches, such as that in [17], leverage supervised learning to classify slice types and predict handovers in a simulated 6G environment. While demonstrating the potential of predictive models in managing slice mobility and SLA compliance, such methods are usually validated only in simulation, and practical implementation details, especially around orchestration interface design, are often unspecified.

In summary, existing works reveal a growing interest in latency-aware VNF placement, intelligent SFC migration, AI-driven orchestration, and slice isolation. However, current strategies tend to fall short in three critical areas:

Scalability and real-time adaptability of optimization models.
Proactive integration of latency forecasting into migration/orchestration logic.
Alignment with standards-compliant frameworks such as ETSI NFV MANO.

These gaps highlight the need for a unified, predictive orchestration approach that combines AI-based forecasting, latency-aware service chaining, and standards-based orchestration for URLLC in 6G networks. We show in Table 1 a comparison of previous state-of-the-art studies in terms of the focus area, the methodology, and their limitations.

3. Methodology

To ensure low-latency and reliable delivery of time-sensitive services in 6G, this work proposes a predictive, latency-aware orchestration framework for Network Function Virtualization (NFV). The methodology is grounded in the ETSI MANO architecture [7] and enhanced with intelligent modules for forecasting latency violations, proactively migrating VNFs, and isolating critical traffic. This section describes the components of the orchestration framework, the prediction-based control logic, and its integration within the NFVO’s decision pipeline.

3.1. System Architecture and Components

The framework shown in Figure 1 adheres to ETSI NFV MANO [7] and adds the components listed below, each described briefly.

NFVO: The central intelligence of the framework, extended with an AI-driven control module.
VNFM: Responsible for lifecycle management and monitoring of individual VNFs.
VIM: Manages the physical and virtual resources where VNFs are deployed.
AI Engine: Embeds a learning model (e.g., NFVLearn [27]) into the NFVO for real-time latency prediction.
Policy Engine: Defines the rules for VNF migration, scaling, and isolation based on predicted violations.
Latency Monitor: A distributed module at the VIM layer for collecting real-time metrics on per-VNF latency, queueing delay, and resource contention.
VNF Placement and Migration Controller: Executes preemptive migration and resource scaling for VNFs based on AI insights and policy constraints.

Figure 1 illustrates the proposed system architecture for latency-aware predictive NFV orchestration, structured into three horizontal tiers to align logically with the ETSI MANO framework while incorporating AI-enhanced extensions.

At the top tier, representing standard ETSI NFV MANO components, the architecture includes the NFV Orchestrator (NFVO), which is decomposed internally into three submodules: the AI Engine on the left, responsible for latency prediction using time-series models; the Policy Engine in the center, which enforces SLA rules and placement policies; and the VNF Placement & Migration Controller on the right, which executes optimized orchestration decisions. To the right of the NFVO lies the Virtual Network Function Manager (VNFM), which handles the life-cycle management of individual VNFs. The Virtualized Infrastructure Manager (VIM) is placed in the top tier just below the VNFM, reflecting its intermediary role between orchestration logic and physical infrastructure.

The middle tier comprises predictive control extensions. A key component is the Latency Monitor, which collects real-time metrics (e.g., delay, jitter, resource utilization) from the underlying infrastructure and provides this telemetry to the AI Engine in the NFVO via a feedback data flow. On the left side of this tier, a Monitoring Database/Telemetry Input is depicted as a cylindrical storage element. This module stores historical telemetry and orchestrator decisions and provides input to the AI Engine, supporting model training and long-term trend detection.

At the bottom tier, the architecture shows the physical and virtual infrastructure. It includes three compute nodes labeled according to their slicing function: Server A for URLLC VNFs, Server B for eMBB VNFs, and Server C for shared best-effort VNFs. Each server hosts multiple colored VNF boxes representing different service chains. Below the compute nodes, a Data Plane layer includes a network switch and SR-IOV (Single Root I/O Virtualization) interfaces, allowing direct, low-latency connections from latency-sensitive VNFs to the physical network, bypassing virtualization bottlenecks.

The framework also shows key interconnections: a standard Or-Vnfm interface between the NFVO and VNFM, and a Vi-Vnfm interface between the VNFM and the VIM, ensuring compatibility with ETSI MANO. A feedback loop connects the NFVO to the Latency Monitor, enabling prediction-based decision-making. Policy arrows indicate that SLA constraints flow from the Policy Engine to the Placement Controller, guiding orchestration actions. Finally, migration arrows visualize the live relocation of VNFs across servers when forecasted SLA violations trigger proactive mitigation.

This modular design combines standards-compliant orchestration with AI-native extensions and real-time telemetry integration, supporting latency-aware, dynamic service management for 6G time-sensitive applications.

3.2. Predictive Orchestration Model

The key novelty lies in the predictive orchestration mechanism. Rather than responding after latency thresholds are breached, the AI engine learns patterns of latency evolution and preemptively signals potential SLA violations. The process begins with data collection, where real-time metrics such as VNF processing delay, network delay, CPU utilization, memory usage, and slice-specific performance indicators are gathered. Next, in the feature processing stage, these raw inputs are converted into temporal sequences using sliding windows and enriched with contextual information such as time-of-day, traffic class, or prior migration activity.

The resulting data stream feeds into the prediction model, typically an LSTM, GRU, or ensemble-based time-series regressor trained on synthetic telemetry to forecast latency spikes one or two steps ahead.

The synthetic telemetry is generated to emulate realistic 6G slice behavior in the absence of publicly available URLLC datasets. We model traffic arrival using a Poisson process for normal load conditions and introduce bursty episodes using a compound Poisson model to simulate congestion or scheduling delays. Latency values are sampled from a base signal (e.g., sinusoidal or trend-based curve) with superimposed Gaussian noise (μ = 0, σ = 0.3 ms) to simulate jitter and random delay variations. VNFs are labeled by slice type (URLLC or eMBB), each with distinct SLA thresholds and service rates. Additionally, context features such as time-of-day and recent migration activity are encoded to capture temporal dependencies. This synthetic telemetry stream is used to train the AI model (e.g., LSTM) to recognize early patterns of SLA violations before they occur.

When the predicted latency exceeds a predefined URLLC SLA threshold (e.g., 1 ms), an orchestration trigger is activated. This trigger is first routed to the Policy Engine, which validates whether migration or scaling actions are permissible under current slice constraints, migration limits, and load balancing policies. If approved, the system proceeds to perform VNF reallocation or initiate proactive migration before actual service degradation occurs. This preemptive behavior reduces service disruption, avoids reactive handovers, and improves resource utilization. Conversely, if the predicted latency remains within SLA bounds, no orchestration is triggered, and the system continues monitoring the telemetry data in real time to reassess conditions in the next evaluation cycle. This end-to-end flow is visually illustrated in Figure 2.
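The telemetry generation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' generator: the function names, the base-signal parameters (`base_ms`, `amp_ms`, `period`), the arrival rates, and the per-packet burst penalty are all hypothetical choices. Only the Poisson and compound Poisson arrival structure, the sinusoidal base signal, and the Gaussian noise with σ = 0.3 ms come from the text.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) sample with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def synth_trace(steps=1000, seed=0, base_ms=0.6, amp_ms=0.2, period=200,
                sigma_ms=0.3, normal_rate=5.0, burst_rate=0.05,
                burst_mean_size=20.0, penalty_ms_per_pkt=0.01):
    """Generate a synthetic per-step telemetry trace (all defaults are assumptions)."""
    rng = random.Random(seed)
    trace = []
    for t in range(steps):
        # Normal load: Poisson-distributed packet arrivals per step.
        arrivals = poisson(rng, normal_rate)
        # Bursty episodes: compound Poisson, i.e. a Poisson number of bursts,
        # each contributing a random (here Poisson-sized) batch of packets.
        burst_pkts = sum(poisson(rng, burst_mean_size)
                         for _ in range(poisson(rng, burst_rate)))
        # Base signal: sinusoid; jitter: zero-mean Gaussian noise.
        base = base_ms + amp_ms * math.sin(2 * math.pi * t / period)
        jitter = rng.gauss(0.0, sigma_ms)
        # Assumption: each burst packet adds a small fixed delay penalty.
        latency = max(0.0, base + jitter + penalty_ms_per_pkt * burst_pkts)
        trace.append({"t": t, "arrivals": arrivals + burst_pkts,
                      "latency_ms": latency})
    return trace


trace = synth_trace()
violations = sum(r["latency_ms"] > 1.0 for r in trace)  # steps above a 1 ms SLA
```

Seeding the generator keeps runs reproducible, which is useful when comparing orchestration policies on identical traces.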

3.3. VNF Isolation Strategy

To shield URLLC VNFs from interference by background or eMBB traffic, the orchestrator enforces infrastructure-level isolation:

CPU Pinning: URLLC VNFs are bound to isolated physical cores, avoiding shared scheduling queues [28].
NUMA-Aware Scheduling: Memory affinity is respected so that latency-critical VNFs avoid cross-NUMA memory access delays [29].
SR-IOV Interfaces: Direct hardware I/O channels reduce virtualization overhead and jitter.
Slice-Aware Host Scheduling: Compute nodes are segmented by slice type. URLLC VNFs are never colocated with best-effort VNFs.

This strategy, based on established isolation mechanisms, reduces latency variance and ensures deterministic performance, especially in multi-tenant edge environments.

3.4. Migration Management

Live migration is invoked when the AI engine forecasts a sustained SLA breach. The migration controller evaluates candidates based on predicted delay, current load, and prior migration cost history.

State-Aware Migration: The VNF state is serialized and checkpointed before migration begins.
Warm Transfer Protocol: The state is transferred while keeping the old instance active, minimizing downtime.
Restore and Resume: The VNF is restored and traffic is rerouted via SDN-based control plane updates.

This process is governed by the Policy Engine and occurs transparently to the service consumer. By forecasting before overload occurs, it avoids emergency handovers and degraded QoS.

3.5. ETSI Compliance and Integration

The entire orchestration logic is designed for compatibility with ETSI NFV. The AI and policy modules are pluggable extensions within the NFVO logic. No changes are required to standard interfaces such as Or-Vnfm or Vi-Vnfm. Network service descriptors (NSDs) are augmented with latency thresholds and migration constraints but retain compliance with existing standards.

This ensures that the system can be deployed on top of ETSI-aligned orchestrators such as Open Network Automation Platform (ONAP), OSM, or commercial NFV stacks, while providing proactive intelligence beyond traditional rule-based models.

4. Mathematical Modelling

To support intelligent and latency-compliant orchestration decisions, we define a set of models that capture the system’s temporal behavior, resource dynamics, and forecasting structure.

4.1. End-to-End Latency Model

Let a network slice $S_{k}$ for a given service $k \in \{1, \dots, K\}$ be composed of a sequence of VNFs $V_{k} = \{v_{1}, v_{2}, v_{3}, \dots, v_{n}\}$ forming a service function chain (SFC). Each VNF instance $v_{i}$ is deployed on a compute node $c_{j} \in C$, with each compute node characterized by its processing capacity and queuing behavior.

The total end-to-end latency $L_{k}^{E2E}$ for a packet traversing the SFC is given by:

$$L_{k}^{E2E} = \sum_{i=1}^{n} \left( L_{v_{i}}^{proc} + L_{v_{i}}^{queue} + L_{v_{i}}^{net} \right) \tag{1}$$

where:

$L_{v_{i}}^{proc}$ : Processing delay at VNF $v_{i}$
$L_{v_{i}}^{queue}$ : Queuing delay at compute node hosting $v_{i}$
$L_{v_{i}}^{net}$ : Network delay between $v_{i-1}$ and $v_{i}$

Each component is time-varying and resource-dependent. The queuing delay can be modeled via M/M/1 or M/G/1 queue approximation, depending on traffic variability.

$$L_{v_{i}}^{queue} \approx \frac{\rho_{v_{i}}}{\mu_{v_{i}} (1 - \rho_{v_{i}})} \tag{2}$$

where:

$\mu_{v_{i}}$ : Service rate
$\rho_{v_{i}} = \lambda_{v_{i}} / \mu_{v_{i}}$ : Utilization factor
$\lambda_{v_{i}}$ : Arrival rate of packets to VNF $v_{i}$

A URLLC slice must maintain:

$$L_{k}^{E2E} \leq \delta_{k}^{max} \tag{3}$$

where $\delta_{k}^{max}$ is the maximum tolerable latency (e.g., 1 ms).
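As a worked illustration of Equations (1) to (3), the sketch below evaluates the M/M/1 form of the queuing delay and sums the per-VNF components along a chain. The function names, the dictionary keys, and the example numbers are assumptions made for illustration; rates are taken in packets per millisecond so that all delays come out in milliseconds.

```python
def mm1_queue_delay(arrival_rate, service_rate):
    """Equation (2): rho / (mu * (1 - rho)), with rho = lambda / mu."""
    if service_rate <= 0 or arrival_rate < 0:
        raise ValueError("rates must satisfy service_rate > 0, arrival_rate >= 0")
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        return float("inf")  # unstable queue: delay grows without bound
    return rho / (service_rate * (1.0 - rho))


def e2e_latency(chain):
    """Equation (1): sum of processing, queuing, and network delay per VNF."""
    return sum(v["proc_ms"]
               + mm1_queue_delay(v["arrival_rate"], v["service_rate"])
               + v["net_ms"]
               for v in chain)


def meets_sla(chain, delta_max_ms=1.0):
    """Equation (3): the slice complies if E2E latency <= delta_max."""
    return e2e_latency(chain) <= delta_max_ms


# Hypothetical three-VNF chain at 50% utilization per VNF.
chain = [{"proc_ms": 0.1, "net_ms": 0.05,
          "arrival_rate": 5.0, "service_rate": 10.0} for _ in range(3)]
latency = e2e_latency(chain)   # 3 * (0.1 + 0.1 + 0.05) = 0.75 ms
compliant = meets_sla(chain)   # within a 1 ms budget
```

Because the queuing term grows as $1/(1-\rho)$, raising the arrival rate in this example from 5 to 8 packets/ms quadruples the per-VNF queuing delay from 0.1 ms to 0.4 ms, which alone would push the chain past a 1 ms budget.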

4.2. Migration Overhead Model

When the orchestrator decides to migrate a VNF $v_{i}$ from node $c_{s}$ (source) to $c_{d}$ (destination), the migration latency includes state transfer delay, restart time, and potential traffic rerouting:

$$L_{v_{i}}^{mig} = L_{v_{i}}^{trans} + L_{v_{i}}^{restore} + L_{v_{i}}^{reroute} \tag{4}$$

where:

$L_{v_{i}}^{trans} = \frac{S_{v_{i}}}{B_{s,d}}$ : Time to transfer state $S_{v_{i}}$ over bandwidth $B_{s,d}$
$L_{v_{i}}^{restore}$ : Time to reinitialize VNF at the new location
$L_{v_{i}}^{reroute}$ : Time for SDN controller to update the forwarding rules

The migration feasibility constraint ensures that the benefit of migration outweighs the latency overhead:

$$L_{k}^{E2E,new} + L_{v_{i}}^{mig} \leq L_{k}^{E2E,old} + \epsilon \tag{5}$$

where $\epsilon$ is a configurable slack margin to tolerate minor jitter caused by forecast uncertainty or orchestration delay. In our experiments, $\epsilon$ is set to 0.1 ms.
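A minimal sketch of Equations (4) and (5) follows, assuming state size in megabytes and bandwidth in megabytes per millisecond so that the transfer term is in milliseconds. The function names and example values are hypothetical; only the three-term structure of the migration latency and the slack of 0.1 ms come from the text.

```python
def migration_latency(state_mb, bandwidth_mb_per_ms, restore_ms, reroute_ms):
    """Equation (4): state transfer + restore + SDN reroute time."""
    if bandwidth_mb_per_ms <= 0:
        raise ValueError("bandwidth must be positive")
    return state_mb / bandwidth_mb_per_ms + restore_ms + reroute_ms


def migration_feasible(new_e2e_ms, mig_ms, old_e2e_ms, eps_ms=0.1):
    """Equation (5): migrate only if new latency plus overhead stays
    within the old latency plus the slack margin."""
    return new_e2e_ms + mig_ms <= old_e2e_ms + eps_ms


# Hypothetical numbers: 0.5 MB of state over a 10 MB/ms link.
mig = migration_latency(0.5, 10.0, restore_ms=0.05, reroute_ms=0.02)  # 0.12 ms
ok = migration_feasible(new_e2e_ms=0.6, mig_ms=mig, old_e2e_ms=0.9)
```

With these values the left-hand side of Equation (5) is 0.72 ms against a right-hand side of 1.0 ms, so the move is admitted; a larger state or slower link raises the transfer term and can flip the decision.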

4.3. Latency Forecasting Model

Let $x_{t} \in \mathbb{R}^{d}$ be the observed system metrics at time $t$, including:

Instantaneous latency $l_{t}$
CPU utilization
Queue length
Packet drop rate

We construct a time-series input sequence:

$$X_{t} = \{x_{t-T+1}, x_{t-T+2}, \dots, x_{t}\} \tag{6}$$

In this formulation, $t$ is the current time step, and $T$ is the input window size, that is, the number of prior observations used as model input. The sequence $X_{t} = \{x_{t-T+1}, \dots, x_{t}\}$ captures temporal dynamics in recent telemetry. In our setup, $T = 10$ and $t \in [T, 1000]$, aligned with the 1000-step simulation duration.
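The windowing in Equation (6) can be sketched as below. The helper name `make_windows` and the pairing of each window with the next-step latency as a training target are illustrative assumptions. Indices are 0-based here, whereas the text counts time steps from 1.

```python
def make_windows(metrics, latencies, T=10):
    """Build (X_t, l_{t+1}) pairs: each window holds the T most recent
    metric vectors, and the target is the latency one step ahead."""
    if len(metrics) != len(latencies):
        raise ValueError("metrics and latencies must have equal length")
    samples = []
    for t in range(T - 1, len(metrics) - 1):
        window = metrics[t - T + 1 : t + 1]   # x_{t-T+1}, ..., x_t
        samples.append((window, latencies[t + 1]))
    return samples


# Toy telemetry: one metric per step, 12 steps, so two windows of length 10.
metrics = [[float(i)] for i in range(12)]
latencies = [float(i) for i in range(12)]
samples = make_windows(metrics, latencies, T=10)
```

A trace of N steps yields N − T training pairs, since the first T − 1 steps lack a full history and the last step has no next-step target.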

Using this, a supervised learning model $F(\cdot)$ is trained to predict the next-step latency:

$$\hat{l}_{t+1} = F(X_{t}; \theta) \tag{7}$$

where:

$F$ is a deep model (e.g., LSTM, GRU, or Temporal CNN)
$θ$ represents model parameters trained using synthetic datasets

The loss function used in training is Mean Squared Error (MSE):

L (θ) = \frac{1}{N} \sum_{t = 1}^{N} ({\hat{l}}_{t + 1} - l_{t + 1})^{2}

(8)

The orchestrator uses the forecast $\hat{l}_{t + 1}$ to trigger migration or scaling decisions before latency violates SLA bounds.
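
For concreteness, the MSE training loss of Equation (8) is the mean squared difference between forecast and observed next-step latency. The helper below is a plain-Python sketch with made-up sample values; an actual training loop would normally rely on the loss provided by the deep learning framework in use.

```python
def mse_loss(predicted, observed):
    """Mean squared error between forecast and observed latencies (ms)."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical forecasts and ground-truth latencies in ms
forecasts = [0.8, 1.1, 0.9]
observed = [0.7, 1.0, 1.1]
print(round(mse_loss(forecasts, observed), 4))  # 0.02
```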

To implement the forecast-driven orchestration in practice, we design a decision loop that continuously monitors the system metrics, predicts the next-step latency, and compares the forecast with the SLA constraint. If a potential violation is detected, the orchestration logic evaluates migration candidates and triggers proactive reallocation if a better node can be found. This decision-making logic is captured in Algorithm 1, which shows the steps for AI-enhanced latency prediction and policy enforcement.

Algorithm 1 outlines a latency-aware control loop where a time-series model (e.g., LSTM) forecasts future delay based on recent system history. If the predicted latency $\hat{l}_{t + 1}$ exceeds the service-level agreement (SLA) threshold $δ_{k}^{max}$, the orchestrator computes a migration plan that maintains latency compliance. Forecast-driven decision-making of this kind lets the orchestrator act before a violation occurs and reduces unnecessary migrations.

Algorithm 1. Predictive Orchestration Logic for Latency-Sensitive Slices

Input:
- Sliding time window of system metrics: $X_{t} = \{x_{t - T + 1}, \dots, x_{t}\}$
- SLA threshold for service k: $δ_{k}^{max}$
- Forecast model: $F(\cdot)$ with parameters θ
Output:
- Orchestration action: {migrate, scale, monitor}
1: $\hat{l}_{t + 1}$ ← $F(X_{t}; θ)$ ▷ Forecast next-step latency
2: if $\hat{l}_{t + 1} > δ_{k}^{max}$ then
3:   candidate_nodes ← GetAvailableTargets(v_i)
4:   for each c_j in candidate_nodes do
5:     Estimate migration delay $L_{v_{i}}^{mig}(j)$ using Equation (4)
6:     Estimate new path latency $L_{k}^{E2E, new}(j)$ using Equation (3)
7:     if $L_{k}^{E2E, new}(j) + L_{v_{i}}^{mig}(j) \leq δ_{k}^{max}$ then
8:       Execute Migrate(v_i, c_j)
9:       break
10:     end if
11:   end for
12: else
13:   Continue monitoring
14: end if

Computational Complexity of Algorithm 1

Algorithm 1 executes a single-step latency prediction followed by a loop over available target nodes for migration, yielding a complexity of O(C), where C is the number of candidate nodes. Since inference from the forecasting model takes constant time for a fixed window size and model, the algorithm is suitable for real-time decision-making in URLLC contexts.
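
The control flow of Algorithm 1 can be sketched in Python as below. This is an illustration under stated assumptions rather than the simulator's code: the function names, the callback-based interface, the linear-extrapolation stand-in for the LSTM, and the per-node estimates are all hypothetical.

```python
from typing import Callable, Hashable, Iterable, Optional, Sequence, Tuple


def predictive_step(
    window: Sequence[float],
    forecast: Callable[[Sequence[float]], float],
    sla_max_ms: float,
    candidates: Iterable[Hashable],
    migration_delay: Callable[[Hashable], float],
    new_path_latency: Callable[[Hashable], float],
) -> Tuple[str, Optional[Hashable]]:
    """Run one pass of the forecast-then-migrate loop; returns (action, target)."""
    predicted = forecast(window)                 # step 1: forecast next-step latency
    if predicted <= sla_max_ms:                  # step 2: no violation expected
        return "monitor", None
    for node in candidates:                      # steps 4-11: first feasible target wins
        total = new_path_latency(node) + migration_delay(node)
        if total <= sla_max_ms:                  # step 7
            return "migrate", node
    # The pseudocode does not say what happens when no target qualifies;
    # falling back to monitoring is an assumption of this sketch.
    return "monitor", None


def linear_extrapolation(window):
    """Toy stand-in for the LSTM: extend the last observed slope by one step."""
    return window[-1] + (window[-1] - window[-2])


# Hypothetical per-node estimates in ms: (new path latency, migration delay)
estimates = {"c1": (0.90, 0.30), "c2": (0.60, 0.25)}
action = predictive_step(
    window=[0.60, 0.70, 0.80, 0.90, 0.97],
    forecast=linear_extrapolation,
    sla_max_ms=1.0,
    candidates=estimates,                        # dicts iterate in insertion order
    migration_delay=lambda c: estimates[c][1],
    new_path_latency=lambda c: estimates[c][0],
)
print(action)  # ('migrate', 'c2')
```

The loop visits each candidate at most once, which matches the O(C) bound stated above. Node c1 is rejected because 0.90 + 0.30 ms exceeds the 1 ms threshold, while c2 fits at 0.85 ms.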

4.4. Optimization Formulation

An additional decision layer is modeled as an optimization problem:

\min_{A, M} \sum_{k = 1}^{K} (L_{k}^{E2E} + α \cdot L_{k}^{mig})

(9)

Subject to:

L_{k}^{E2E} \leq δ_{k}^{max}, \forall k \in \{1, \dots, K\}

(10)

Resource constraints on CPU, memory, and bandwidth also apply. The number of migrations is bounded by the migration budget:

\sum_{i, j} M_{i, j} \leq M_{max}

(11)

where:

A: VNF-to-node assignment matrix
M: Migration indicator matrix
$α$ : Cost weight for migration overhead

The framework allows the orchestrator to evaluate multiple options (migrate, scale, delay) and select the optimal one. In scenarios where multiple VNFs or service chains require adjustment, individual migrations may conflict or overload shared resources. To optimize the system-wide response, we formulate an offline placement problem that jointly considers end-to-end latency and migration overhead. While exact solutions to this problem are computationally intensive, we employ a heuristic routine that approximates a near-optimal solution in polynomial time.
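
To make the formulation concrete, the sketch below scores two candidate plans against the objective in Equation (9) and the constraints in Equations (10) and (11). The function names, the value α = 0.5, and all latency figures are assumptions chosen for illustration. Resource constraints are left out for brevity.

```python
def plan_cost(e2e_ms, mig_ms, alpha):
    """Objective of Equation (9): sum over slices of L_E2E + alpha * L_mig."""
    return sum(e2e_ms[k] + alpha * mig_ms.get(k, 0.0) for k in e2e_ms)


def plan_is_feasible(e2e_ms, sla_ms, num_migrations, migration_budget):
    """Latency bound per slice (10) and migration budget (11)."""
    within_sla = all(e2e_ms[k] <= sla_ms[k] for k in e2e_ms)
    return within_sla and num_migrations <= migration_budget


sla = {"urllc": 1.0, "embb": 15.0}

# Plan A keeps the current placement; plan B migrates one URLLC VNF.
stay = {"e2e": {"urllc": 1.3, "embb": 12.0}, "mig": {}, "moves": 0}
move = {"e2e": {"urllc": 0.8, "embb": 12.0}, "mig": {"urllc": 0.35}, "moves": 1}

for name, plan in (("stay", stay), ("move", move)):
    cost = plan_cost(plan["e2e"], plan["mig"], alpha=0.5)
    ok = plan_is_feasible(plan["e2e"], sla, plan["moves"], migration_budget=1)
    print(name, round(cost, 3), ok)
# stay 13.3 False
# move 12.975 True
```

Under these assumed numbers, migrating is both cheaper and the only plan that satisfies the URLLC bound.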

Algorithm 2 describes a resource-aware orchestration process that scans each service slice $S_{k}$, estimates its current latency, and, if required, iterates through candidate nodes to find a feasible migration with minimal added cost. The algorithm minimizes a combined cost metric $L_{k}^{E2E} + α \cdot L_{k}^{mig}$, in alignment with the optimization objective in Equation (9).

Algorithm 2. Heuristic Optimization for VNF Placement with Latency Guarantees

Input:
- Set of service chains $S_{k}$ and their latency bounds $δ_{k}^{max}$
- Current VNF-to-node assignments: $U_{0}$
- Migration budget: $M_{max}$
- Resource availability matrix R
Output:
- Updated assignment matrix: U
- Migration plan: ℳ
1: Initialize: ℳ ← ∅, U ← $U_{0}$
2: for each service k in S do
3:   Compute current latency $L_{k}^{E2E}$ from Equation (2)
4:   if $L_{k}^{E2E} > δ_{k}^{max}$ then
5:     for each v_i in chain V_k do
6:       candidate_nodes ← FeasibleTargets(v_i, R)
7:       for each c_j in candidate_nodes do
8:         Estimate total cost: cost ← $L_{k}^{E2E, new} + α \cdot L_{v_{i}}^{mig}$
9:         if cost ≤ $L_{k}^{E2E}$ + ε then
10:           Add (v_i, c_j) to ℳ and update U
11:           break
12:         end if
13:       end for
14:     end for
15:   end if
16: end for
17: Return U, ℳ

Computational Complexity of Algorithm 2

Algorithm 2 evaluates each VNF in service chains that exceed their SLA latency. For each VNF, it iterates over feasible nodes, estimating a placement cost. Thus, the total complexity is O(K × V × C), where K is the number of services, V the average number of VNFs per service, and C the number of placement options. Despite being heuristic-based, the algorithm maintains a practical runtime for moderately sized deployments, supporting scalability in 6G environments.
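
A compact Python rendering of Algorithm 2 is sketched below. The additive per-node latency model, the constant migration delay, and all names and values are hypothetical. The pseudocode lists the migration budget as an input without showing where it is checked; stopping once the budget is used up is our reading, not something the algorithm states.

```python
def heuristic_placement(chains, sla_ms, assignment, feasible_targets,
                        chain_latency, migration_delay,
                        alpha=0.5, epsilon=0.1, budget=1):
    """Greedy pass over violating chains; returns (placement, plan)."""
    placement = dict(assignment)          # U <- U0
    plan = []                             # migration plan, initially empty
    for k, vnfs in chains.items():
        current = chain_latency(k, placement)
        if current <= sla_ms[k]:
            continue                      # chain already meets its bound
        for vnf in vnfs:
            if len(plan) >= budget:       # assumed budget check (see lead-in)
                return placement, plan
            for node in feasible_targets(vnf, placement):
                trial = {**placement, vnf: node}
                cost = (chain_latency(k, trial)
                        + alpha * migration_delay(vnf, placement[vnf], node))
                if cost <= current + epsilon:
                    plan.append((vnf, node))
                    placement = trial
                    break                 # first acceptable target wins
    return placement, plan


# Toy model: chain latency is the sum of per-node delays (ms) of its VNFs.
node_delay = {"n1": 0.6, "n2": 0.2, "n3": 0.3}
chains = {"urllc": ["v1", "v2"], "embb": ["v3"]}
sla_ms = {"urllc": 1.0, "embb": 15.0}
start = {"v1": "n1", "v2": "n1", "v3": "n2"}


def chain_latency(k, placement):
    return sum(node_delay[placement[v]] for v in chains[k])


def feasible_targets(vnf, placement):
    return [n for n in sorted(node_delay) if n != placement[vnf]]


placement, plan = heuristic_placement(
    chains, sla_ms, start, feasible_targets, chain_latency,
    migration_delay=lambda vnf, src, dst: 0.3,
)
print(plan)  # [('v1', 'n2')]
```

As in the pseudocode, the acceptance test compares against the latency measured before any move, so with a larger budget the sketch would keep relocating VNFs of a chain even after it complies. Re-evaluating the chain latency after each accepted move would be a natural refinement.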

5. Experimental Design and Setup

To evaluate the proposed framework, a controlled simulation environment was designed to emulate 6G network slicing scenarios with URLLC and eMBB services. Since no publicly available dataset captures real-world end-to-end latency in 6G-era NFV infrastructures with slicing orchestration feedback loops, we rely on synthetic data generation aligned with performance trends from literature [30,31,32].

5.1. Simulation Environment

The simulation platform is developed in Python 3.10 and emulates the behavior of an ETSI-compliant NFV MANO stack with added predictive orchestration. The simulated environment models:

Compute nodes representing virtualized infrastructure with CPU, memory, and bandwidth constraints.
VNF instances configured with distinct processing capacities and deployed over virtual nodes.
SFCs for both URLLC and eMBB services with variable latency constraints.
Latency measurement modules capturing queuing, processing, and inter-VNF communication delays.
Orchestrator logic, supporting both baseline (reactive) and AI-enhanced predictive control.

It is worth mentioning that the environment supports modular plug-ins for AI models and orchestration policies, making it extensible for benchmarking multiple strategies.

Furthermore, the workload generation follows a hybrid model that combines Poisson arrivals for baseline request generation with bursty traffic spikes that emulate the highly variable nature of URLLC workloads under congestion or demand surges. This approach aligns with recent 6G traffic modeling studies [12,25], which report that traffic for time-sensitive services often exhibits self-similar, heavy-tailed, and bursty characteristics. These models are particularly relevant for URLLC slices, where latency violations often correlate with sudden traffic bursts or resource contention.

5.2. Slicing and Traffic Configuration

Two primary slice types are emulated:

URLLC slices: highly delay-sensitive chains with SLA $δ_{k}^{max} = 1$ ms.
eMBB slices: throughput-focused chains with relaxed latency bounds (e.g., 10–20 ms).

Each slice type has an associated SFC with 3 to 5 VNFs per chain. The simulation runs with 10 to 20 service chains at once, introducing congestion, cross-traffic, and resource contention.

The traffic patterns are generated using:

Poisson arrivals for URLLC flows.
Bursty background traffic to model real-world cross-slice interference.
Time-varying CPU and bandwidth usage patterns.
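
One way to realize such a hybrid workload is to draw per-step packet counts from a Poisson distribution whose rate is temporarily multiplied during randomly started bursts. The sketch below uses only the Python standard library. The burst probability, burst factor, and burst length are assumed values, not the paper's settings.

```python
import math
import random


def poisson_sample(rng: random.Random, mean: float) -> int:
    """Draw one Poisson variate with Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def hybrid_arrivals(steps, base_rate, burst_prob, burst_factor, burst_len, seed=0):
    """Per-step arrival counts: Poisson baseline with occasional rate bursts."""
    rng = random.Random(seed)
    arrivals, burst_left = [], 0
    for _ in range(steps):
        if burst_left == 0 and rng.random() < burst_prob:
            burst_left = burst_len            # start a new burst
        rate = base_rate * (burst_factor if burst_left > 0 else 1.0)
        arrivals.append(poisson_sample(rng, rate))
        burst_left = max(0, burst_left - 1)
    return arrivals


plain = hybrid_arrivals(1000, base_rate=5.0, burst_prob=0.0,
                        burst_factor=4.0, burst_len=20, seed=1)
bursty = hybrid_arrivals(1000, base_rate=5.0, burst_prob=0.02,
                         burst_factor=4.0, burst_len=20, seed=1)
print(sum(plain) / len(plain), sum(bursty) / len(bursty), max(bursty))
```

Fixing the seed keeps runs repeatable, which matters when the same traffic trace has to be replayed against several orchestration strategies.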

5.3. AI Forecasting Model

The latency prediction model is implemented using an LSTM neural network. The model forecasts the end-to-end latency for each slice using a sliding window of the system features listed below. It is trained offline and then embedded in the orchestrator for online inference during experiments.

The system features are:

Queuing delay.
CPU utilization.
Inter-VNF transmission delay.
Slice type indicator.

The model configuration is as follows:

Input window size: 10 time steps.
Hidden layers: 2 LSTM layers with 64 units each.
Optimizer: Adam.
Loss function: Mean Squared Error (MSE).
Training data: 5000 synthetic samples generated from traffic simulations.
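
To give a sense of model size, the sketch below counts the trainable parameters implied by this configuration. It assumes four input features (the ones listed above), one bias vector per gate, and a single-unit dense output head; the output layer is not described in the text, so that part is an assumption. Frameworks that keep two bias vectors per gate, as PyTorch's `nn.LSTM` does, report a slightly larger number.

```python
def lstm_layer_params(input_dim: int, hidden_units: int) -> int:
    """Four gates, each with input weights, recurrent weights, and one bias vector."""
    return 4 * (hidden_units * (input_dim + hidden_units) + hidden_units)


def model_params(num_features: int = 4, hidden_units: int = 64,
                 num_layers: int = 2) -> int:
    """Stacked LSTM followed by an assumed dense head with one output."""
    total, in_dim = 0, num_features
    for _ in range(num_layers):
        total += lstm_layer_params(in_dim, hidden_units)
        in_dim = hidden_units             # next layer consumes hidden states
    total += hidden_units + 1             # dense head: weights plus bias
    return total


print(lstm_layer_params(4, 64))   # 17664
print(lstm_layer_params(64, 64))  # 33024
print(model_params())             # 50753
```

At roughly fifty thousand parameters the network is small, which is consistent with running inference inside the orchestration loop.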

5.4. Orchestration Baselines

To validate the effectiveness of the predictive orchestrator, two baseline strategies are implemented:

Reactive orchestrator: monitors real-time latency and migrates VNFs only after an SLA violation occurs (no forecasting).
Static orchestrator: performs one-time initial placement and does not support migration or scaling during execution.

These are compared against the proposed predictive orchestrator, which uses Algorithm 1 (forecast-driven migration) and Algorithm 2 (cost-aware placement optimization).
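
The difference between the reactive and predictive triggers can be illustrated on a rising latency trace. In the sketch below the forecaster is a simple linear extrapolation standing in for the LSTM, and the trace values are invented. The static baseline never triggers, so it is omitted.

```python
def first_reactive_trigger(trace, sla_ms):
    """Index of the first step whose measured latency already violates the SLA."""
    for t, latency in enumerate(trace):
        if latency > sla_ms:
            return t
    return None


def first_predictive_trigger(trace, sla_ms, forecast):
    """Index of the first step at which the next-step forecast violates the SLA."""
    for t in range(1, len(trace)):
        if forecast(trace[: t + 1]) > sla_ms:
            return t
    return None


def extrapolate(history):
    """Toy forecaster: continue the most recent slope for one more step."""
    return 2 * history[-1] - history[-2]


trace_ms = [0.70, 0.78, 0.86, 0.94, 1.02, 1.10]
print(first_reactive_trigger(trace_ms, 1.0))                  # 4
print(first_predictive_trigger(trace_ms, 1.0, extrapolate))   # 3
```

On this trace the predictive trigger fires one step before the violation is measured, which is the window the orchestrator uses to migrate proactively.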

5.5. Experiment Parameters

The core experimental setup is summarized in Table 2:

The LSTM model is configured with a 10-step input window and two hidden layers of 64 units each. These hyperparameters were selected empirically through preliminary tuning to balance accuracy and responsiveness. Larger window sizes or model depths yielded marginal accuracy gains at the cost of higher inference delays, while smaller configurations led to underfitting and reduced prediction precision. The chosen setup consistently achieved sub-2 ms inference latency while maintaining high predictive reliability, making it suitable for proactive orchestration in latency-sensitive environments.

5.6. Evaluation Metrics

The following performance metrics are used to assess the system:

End-to-end latency (average and 95th percentile).
SLA violation count.
Migration frequency.
Prediction error (RMSE).
Overall system resource utilization.

These metrics allow fair comparison between predictive, reactive, and static orchestration strategies under identical traffic loads.
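
Several of these metrics can be computed directly from a latency trace and the matching forecasts. The sketch below uses the nearest-rank definition of the 95th percentile, which is one common choice and an assumption here; the function names and sample values are illustrative.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_report(latencies_ms, forecasts_ms, sla_ms):
    """Average and p95 latency, SLA violation count, and forecast RMSE."""
    n = len(latencies_ms)
    squared_error = sum((f - l) ** 2 for f, l in zip(forecasts_ms, latencies_ms))
    return {
        "avg_ms": sum(latencies_ms) / n,
        "p95_ms": percentile(latencies_ms, 95),
        "sla_violations": sum(1 for l in latencies_ms if l > sla_ms),
        "rmse_ms": math.sqrt(squared_error / n),
    }


# Hypothetical measurements and forecasts in ms
report = latency_report(
    latencies_ms=[0.6, 0.7, 0.8, 0.9, 1.2],
    forecasts_ms=[0.7, 0.7, 0.9, 0.9, 1.0],
    sla_ms=1.0,
)
print(report)
```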

As we indicated earlier, because real-world 6G NFV testbeds and end-to-end latency datasets are not yet publicly accessible, we employ synthetic data derived from performance patterns observed in 5G/6G-oriented research [30,31,32]. The simulation environment enables controlled and repeatable experimentation across various orchestration strategies under dynamic load conditions. While this provides a strong foundation for evaluating architectural behaviors and decision models, future work will aim to validate the framework in real or emulated testbeds (such as OpenAirInterface, Mininet, or ETSI NFV PoCs).

6. Results

In this section, we evaluate the performance of the proposed latency-aware NFV slicing orchestrator under a simulated 6G scenario with URLLC and eMBB services. The predictive orchestrator is compared against reactive and static baselines across multiple metrics including latency, SLA violations, migration frequency, and forecasting accuracy. Where relevant, extended evaluations are also provided for CPU usage, latency distribution, orchestration cost, and VNF-node dynamics.

6.1. Latency Performance

Figure 3 shows the average end-to-end latency over time for URLLC service chains. The predictive orchestrator keeps latency consistently below the SLA threshold of 1 ms, as it proactively triggers migrations based on forecasted trends. In contrast, the reactive approach exceeds the threshold intermittently, while the static setup suffers chronic violations due to lack of adaptability.

6.2. SLA Violation

Figure 4 reports the number of SLA violations recorded for each orchestration strategy. Predictive orchestration leads to the fewest violations (3 total), while reactive and static policies show higher breach rates. These results validate the effectiveness of using forecast-based triggers.

6.3. Migration Frequency

Figure 5 compares the number of VNF migrations initiated under each approach. Predictive orchestration results in fewer migrations than reactive orchestration (8 vs. 22), indicating more efficient control. The static method performs no migrations by design.

6.4. Forecasting Accuracy

Figure 6 presents the prediction error using RMSE and MAE. The predictive orchestrator, equipped with an LSTM forecasting model, achieves the lowest error scores, demonstrating the robustness of the AI prediction pipeline integrated into the orchestration logic.

6.5. CPU Utilization Analysis

To assess system efficiency, Figure 7 reports average CPU utilization across nodes. The predictive orchestrator uses resources more effectively, maintaining balanced utilization and reducing variance, while reactive and static orchestration cause CPU hotspots due to delayed decisions.

6.6. CDF of Latency Distribution

Figure 8 shows the cumulative distribution function (CDF) of E2E latency for URLLC slices. The predictive curve is steepest, indicating tighter and more consistent latency control. The static orchestrator shows a broader distribution with higher tail latencies.
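
An empirical CDF of this kind is obtained by sorting the latency samples and assigning each one its cumulative fraction. The helper names and the five sample values below are illustrative assumptions, not data from Figure 8.

```python
def empirical_cdf(samples):
    """Return (value, cumulative fraction) pairs sorted by value."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


def fraction_within(samples, bound):
    """Share of samples at or below `bound`, i.e., the CDF evaluated at the bound."""
    return sum(1 for s in samples if s <= bound) / len(samples)


latencies_ms = [0.7, 0.9, 0.8, 1.3, 0.6]  # hypothetical URLLC samples
print(empirical_cdf(latencies_ms)[0])      # (0.6, 0.2)
print(fraction_within(latencies_ms, 1.0))  # 0.8
```

A steeper curve means most samples fall in a narrow range, and the CDF value at the 1 ms bound gives the fraction of samples that meet the SLA.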

6.7. VNF Mapping Dynamics

Figure 9 visualizes the evolution of VNF placements across compute nodes over time. The predictive orchestrator exhibits fewer migrations and a more stable mapping pattern, reducing system disruption and cache reload costs. To quantify this observed stability, we measured the total number of VNF-node reassignment events during the orchestration horizon. The predictive strategy resulted in approximately 40% fewer reassignments compared to the baseline reactive method. This metric supports the visual trend shown in the heatmap, indicating enhanced placement stability and lower orchestration churn.
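
Counting reassignment events amounts to comparing consecutive snapshots of the VNF-to-node mapping. The sketch below shows one way to do this; the snapshot format (one dictionary per time step) and the toy counts are assumptions for illustration and do not reproduce the experimental data.

```python
def count_reassignments(mapping_history):
    """Count VNFs whose host node changes between consecutive snapshots."""
    events = 0
    for prev, curr in zip(mapping_history, mapping_history[1:]):
        events += sum(
            1 for vnf, node in curr.items()
            if vnf in prev and prev[vnf] != node
        )
    return events


def relative_reduction(baseline_events, improved_events):
    """Fractional reduction of `improved_events` relative to `baseline_events`."""
    return (baseline_events - improved_events) / baseline_events


history = [
    {"v1": "n1", "v2": "n1"},
    {"v1": "n2", "v2": "n1"},   # v1 moves
    {"v1": "n2", "v2": "n3"},   # v2 moves
    {"v1": "n2", "v2": "n3"},   # no change
]
print(count_reassignments(history))   # 2
print(relative_reduction(5, 3))       # 0.4
```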

6.8. Orchestration Cost Summary

Table 3 aggregates total latency and migration overhead to derive an overall cost metric. The predictive orchestrator yields the lowest total cost (186 ms), outperforming reactive (265 ms) and static (310 ms) designs. This confirms the utility of combining latency forecasts with optimization-based decisions.

6.9. Summary of Evaluation

Table 4 consolidates key evaluation results for the three orchestration strategies across five performance dimensions: average latency, SLA violations, VNF migrations, forecasting error (RMSE and MAE), and total orchestration cost. The predictive orchestrator demonstrates the best performance overall, maintaining sub-millisecond latency with minimal SLA breaches and fewer migrations while also achieving the highest prediction accuracy and lowest total orchestration cost. This comparative summary supports the conclusion that predictive orchestration is well suited for meeting the strict requirements of 6G time-sensitive applications.

6.10. AI Inference Overhead

To evaluate the scalability of the proposed predictive orchestrator, we measured the CPU utilization specifically attributed to the LSTM-based forecasting module during simulation runtime. The results show that inference operations incurred an average of 2.5% CPU usage per node, with peak usage not exceeding 5%, even under bursty traffic conditions. These overhead levels are minimal and do not interfere with the normal orchestration or VNF execution tasks. This confirms that integrating AI-driven forecasting into the orchestration loop introduces negligible computational overhead, making it viable for deployment in 6G edge environments where resource constraints are more prominent.

7. Discussion

The experimental results demonstrate that predictive orchestration significantly improves latency compliance and operational efficiency in 6G-enabled NFV infrastructures. In this section, we interpret these findings, highlight trade-offs and limitations, and discuss practical implications for real-world deployment.

7.1. Trade-Offs in Predictive Orchestration

The integration of AI-based forecasting into the orchestration loop offers clear benefits in reducing SLA violations and minimizing unnecessary migrations. However, this gain is achieved at the cost of the following:

Increased computational overhead: running LSTM-based predictions and solving placement optimization (Algorithm 2) introduces a control-plane processing delay, albeit minimal (~2 ms in our setup).
Model retraining: forecast models must be periodically retrained to adapt to shifting traffic patterns, especially in highly dynamic environments like 6G RAN-core slicing scenarios.
Forecast uncertainty: even with low RMSE/MAE, prediction errors can occasionally cause suboptimal migration decisions or false positives. These must be balanced against the migration budget (Equation (11)).

Nevertheless, these trade-offs are acceptable in exchange for the improved SLA assurance for URLLC services, as shown in Figures 3–6.

7.2. Scalability and Multi-Domain Slicing

While the presented framework is tested in a single-domain NFVI with 20 service chains, 6G networks will likely involve multi-domain, multi-operator environments. Predictive orchestration across federated domains introduces challenges such as latency prediction across trust boundaries, inter-domain migration policies, and standardized telemetry sharing (ETSI IFA032) [7]. The scalability can be addressed using distributed inference pipelines, edge-hosted predictors, or hierarchical MANO systems, which should be investigated in future work.

7.3. Practical Implications for 6G Networks

The predictive orchestration framework aligns well with three key 6G principles. The first is slicing dynamicity: adjusting VNF placement on the fly supports slice elasticity and reliability. The second is intent-based networking, in which latency forecasting acts as a proactive intent translator. The third is RAN-core coordination, where forecast-driven migration enables tighter synchronization between RAN slicing and core VNFs.

Further, the architecture proposed in Figure 1 in Section 3 is compatible with the ETSI NFV MANO reference model [7], facilitating integration into real-world PoCs or testbeds like OpenAirInterface or 5G-VINNI.

7.4. Limitations and Assumptions

Several simplifications were made for tractability. First, the simulation assumes known traffic arrival patterns and resource profiles. Also, realistic packet loss, jitter, and control signaling overheads are abstracted. In real deployments, such factors could introduce noise and uncertainty into the telemetry stream. For instance, latency jitter may distort temporal patterns used by the LSTM predictor, leading to decreased forecasting accuracy or higher false-positive rates in SLA violation detection. Similarly, packet loss may cause missing data points, and control signaling delays can introduce latency in executing orchestration actions, partially offsetting the gains of proactive migration. These dynamics are important and will be considered in future testbed validations. Lastly, the LSTM model is pre-trained and embedded; real-time online learning is not explored. Although the simulation emulates realistic URLLC/eMBB latency constraints, validation on a real-world testbed would provide stronger evidence of practical viability.

Another operational consideration concerns the retraining frequency and data privacy aspects of the predictive model. In our current setup, the LSTM predictor is pre-trained offline and embedded within the orchestration loop. However, in real-world deployments, periodic retraining may be required, for example, weekly or upon traffic pattern drift, to maintain forecasting accuracy as network dynamics evolve. This retraining process should be designed to minimize service interruption and avoid additional control-plane overhead. Furthermore, since predictive orchestration relies on collecting detailed telemetry from VNFs and user slices, data privacy and security must be ensured. Privacy-preserving approaches such as federated learning, where model updates rather than raw data are shared across domains, can mitigate privacy risks while maintaining predictive capability. These aspects will be explored in future iterations of the framework to ensure compliance with 6G data governance and regulatory requirements.

7.5. Future Exploration Areas

Future enhancements may include:

Integration of multi-agent RL for decentralized orchestration.
Use of federated learning for cross-domain predictive orchestration.
Incorporating network energy efficiency into the optimization cost function.
Support for non-IP traffic slices and deterministic networking (DetNet).

These directions will help evolve the architecture into a robust, AI-native orchestrator suitable for the full range of 6G use cases.

7.6. Comparison with Existing Orchestration Platforms

While existing orchestration platforms such as ETSI OSM, ONAP, and Intel OpenNESS support NFV/MANO-compliant VNF lifecycle management, their orchestration logic is primarily reactive, relying on alarms or static policies for triggering actions. For example:

OSM supports VNF placement and auto-scaling based on threshold alerts but lacks integrated forecasting.
ONAP offers policy-based orchestration and closed-loop automation, but its AI capabilities remain modular and loosely integrated.
OpenNESS targets edge orchestration and resource discovery but does not natively include latency forecasting.

In contrast, our framework embeds AI-native orchestration logic directly within the NFVO. It enables forecast-based decisions for VNF migration and scaling, making it better suited for URLLC-aware 6G slicing where timing guarantees are critical.

8. Conclusions and Future Works

We proposed a latency-aware, AI-driven NFV slicing orchestration framework designed to meet the stringent service-level demands of time-sensitive 6G applications, particularly those governed by URLLC requirements. By integrating a predictive LSTM-based latency forecasting model into the orchestration logic, combined with a cost-aware optimization formulation for VNF placement and migration, the proposed framework proactively anticipates SLA violations and dynamically reconfigures slice resources across the NFVI to preserve end-to-end latency guarantees. Built upon the ETSI NFV MANO architecture, the framework ensures compatibility with existing standards while offering intelligent automation suited for 6G networks.

The framework was formally modeled using a suite of latency decomposition and forecasting equations, and orchestrator behavior was structured through algorithms that enable adaptive and predictive control. Extensive simulations demonstrated that the predictive orchestrator significantly outperforms reactive and static baselines, reducing SLA violations by over 80%, maintaining sub-millisecond latency across dynamic loads, and minimizing unnecessary migrations. Additionally, the framework consistently delivered high forecasting accuracy, validating the suitability of the LSTM predictor for proactive service management under ultra-low latency constraints. These results affirm that embedding AI within the orchestration layer enhances both responsiveness and resource efficiency in NFV-based slice deployments.

While the evaluation was conducted using synthetic data due to the current lack of publicly available 6G NFV testbeds, the simulation was grounded in realistic traffic patterns and system parameters based on established 5G/6G research. This allowed for a controlled, reproducible assessment of orchestration strategies. Nonetheless, the absence of real-world latency traces and platform constraints represents a limitation that will be addressed in future work. Empirical validation on testbed environments such as OpenAirInterface, ETSI NFV PoC platforms, or integrated RAN-core emulators is envisioned as a next step, enabling evaluation under practical deployment conditions.

Future extensions of this work will focus on enhancing orchestration scalability to multi-domain and federated environments, where latency prediction and migration decisions must operate across administrative boundaries. Additional optimization objectives such as energy efficiency, mobility support, and jitter control will also be incorporated to accommodate diverse 6G service classes. Furthermore, integrating reinforcement learning agents and distributed forecasting architectures may offer enhanced adaptability and real-time responsiveness under volatile workloads. As 6G evolves toward AI-native network management, the proposed orchestration framework represents a foundational step toward intelligent, latency-assured service delivery in virtualized network infrastructures.

Author Contributions

Conceptualization, A.K.A. and K.M.A.; methodology, A.K.A. and K.M.A.; software, A.K.A.; validation, A.K.A.; formal analysis, A.K.A. and K.M.A.; investigation, A.K.A. and K.M.A.; resources, A.K.A. and K.M.A.; data curation, A.K.A.; writing—original draft preparation, A.K.A.; writing—review and editing, A.K.A. and K.M.A.; visualization, A.K.A. and K.M.A.; supervision, A.K.A.; project administration, A.K.A.; funding acquisition, A.K.A. and K.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia (Project No. KFU253617).

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This study could not have been started or completed without the encouragement and continued support of King Faisal University.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Ali, R.; Zikria, Y.B.; Bashir, A.K.; Garg, S.; Kim, H.S. URLLC for 5G and beyond: Requirements, enabling incumbent technologies and network intelligence. IEEE Access 2021, 9, 67064–67095.
2. Promwongsa, N.; Ebrahimzadeh, A.; Naboulsi, D.; Kianpisheh, S.; Belqasmi, F.; Glitho, R.; Alfandi, O. A comprehensive survey of the tactile internet: State-of-the-art and research directions. IEEE Commun. Surv. Tutor. 2020, 23, 472–523.
3. Alves, H.; Jo, G.D.; Shin, J.; Yeh, C.; Mahmood, N.H.; Lima, C.; Kim, S. Beyond 5G URLLC Evolution: New Service Modes and Practical Considerations. arXiv 2021, arXiv:2106.11825.
4. TR 23.700-90 V17.0.0; Study on Enhanced Support of URLLC. 3GPP: Sophia Antipolis, France, 2022.
5. Alnaim, A.K.; Alwakeel, A.M.; Fernandez, E.B. Towards a security reference architecture for NFV. Sensors 2022, 22, 3750.
6. Alwakeel, A.M.; Alnaim, A.K.; Fernandez, E.B. A pattern for NFV management and orchestration (MANO). In Proceedings of the 8th Asian Conference on Pattern Languages of Programs, Tokyo, Japan, 20–22 March 2019.
7. ETSI GS NFV-MAN 001 V1.1.1; Network Functions Virtualisation (NFV); Management and Orchestration. ETSI: Sophia Antipolis, France, 2014.
8. Badmus, I.; Laghrissi, A.; Matinmikko-Blue, M.; Pouttu, A. Identifying requirements affecting latency in a softwarized network for future 5G and beyond. In Proceedings of the 2nd 6G Wireless Summit (6G SUMMIT), Levi, Finland, 17–20 March 2020; pp. 1–6.
9. Emu, M.; Yan, P.; Choudhury, S. Latency aware VNF deployment at edge devices for IoT services: An artificial neural network based approach. In Proceedings of the IEEE International Conference on Communications Workshops (ICC Workshops), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
10. Adoga, H.U.; Pezaros, D.P. Towards latency-aware vNF placement on heterogeneous hosts at the network edge. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 6383–6388.
11. Shang, X.; Liu, Z.; Yang, Y. Online service function chain placement for cost-effectiveness and network congestion control. IEEE Trans. Comput. 2022, 71, 27–39.
12. Zhang, Z.; Wang, C. Service Function Chain Migration: A Survey. Computers 2025, 14, 203.
13. Taleb, T.; Ksentini, A.; Jantti, R. Anything as a Service for 5G Mobile Systems. IEEE Netw. 2016, 30, 84–91.
14. Yang, H.; Alphones, A.; Xiong, Z.; Niyato, D.; Zhao, J.; Wu, K. Artificial-intelligence-enabled intelligent 6G networks. IEEE Netw. 2020, 34, 272–280.
15. Katsaros, K.; Mavromatis, I.; Antonakoglou, K.; Ghosh, S.; Kaleshi, D.; Mahmoodi, T.; Simeonidou, D. AI-native multi-access future networks—The REASON architecture. IEEE Access 2024, 12, 178586–178622.
16. Moreira, R.; Martins, J.S.; Carvalho, T.C.; Silva, F.D.O. On enhancing network slicing life-cycle through an AI-native orchestration architecture. In Proceedings of the International Conference on Advanced Information Networking and Applications (AINA), Fukuoka, Japan, 29–31 March 2023; pp. 124–136.
17. Rzym, G.; Masny, A.; Chołda, P. Dynamic Telemetry and Deep Neural Networks for Anomaly Detection in 6G Software-Defined Networks. Electronics 2024, 13, 382.
18. Alnaim, A.K.; Alwakeel, A.M.; Fernandez, E.B. A pattern for an NFV virtual machine environment. In Proceedings of the IEEE International Systems Conference (SysCon), Orlando, FL, USA, 8–11 April 2019; pp. 1–6.
19. Abbas, K.; Afaq, M.; Ahmed Khan, T.; Rafiq, A.; Song, W.-C. Slicing the Core Network and Radio Access Network Domains through Intent-Based Networking for 5G Networks. Electronics 2020, 9, 1710.
20. Ammar, S.; Lau, C.P.; Shihada, B. An in-depth survey on virtualization technologies in 6G integrated terrestrial and non-terrestrial networks. IEEE Open J. Commun. Soc. 2024, 5, 3690–3734.
21. Hu, Y.; Min, G.; Li, J.; Li, Z.; Cai, Z.; Zhang, J. VNF migration in digital twin network for NFV environment. Electronics 2023, 12, 4324.
22. Erbati, M.M.; Tajiki, M.M.; Schiele, G. Service Function Chaining to Support Ultra-Low Latency Communication in NFV. Electronics 2023, 12, 3843.
23. Alam, K.; Habibi, M.A.; Tammen, M.; Krummacker, D.; Saad, W.; Di Renzo, M.; Schotten, H.D. A comprehensive tutorial and survey of O-RAN: Exploring slicing-aware architecture, deployment options, use cases, and challenges. arXiv 2024, arXiv:2405.03555.
24. Zhao, M.; Zhang, Y.; Liu, Q.; Kak, A.; Choi, N. AdaSlicing: Adaptive Online Network Slicing Under Continual Network Dynamics in Open Radio Access Networks. In Proceedings of the IEEE INFOCOM 2025–IEEE Conference on Computer Communications, Vancouver, BC, Canada, 19–22 May 2025; pp. 1–10.
25. Mhatre, S.; Adelantado, F.; Ramantas, K.; Verikoukis, C. AIaaS for ORAN-based 6G Networks: Multi-time scale slice resource management with DRL. In Proceedings of the ICC 2024–IEEE International Conference on Communications, Denver, CO, USA, 9–13 June 2024; pp. 5407–5412.
26. Alomari, Z.; Zhani, M.F.; Aloqaily, M.; Bouachir, O. On Minimizing Synchronization Cost in NFV-Based Environments. In Proceedings of the 2020 16th International Conference on Network and Service Management (CNSM), Izmir, Turkey, 2–6 November 2020; pp. 1–9.
27. St-Onge, C.; Kara, N.; Edstrom, C. NFVLearn: A multi-resource, long short-term memory-based virtual network function resource usage prediction architecture. Softw. Pract. Exp. 2023, 53, 555–578.
28. Gallenmüller, S.; Naab, J.; Adam, I.; Carle, G. 5G URLLC: A case study on low-latency intrusion prevention. IEEE Commun. Mag. 2020, 58, 35–41.
29. Sameer, S.; Chintapalli, V.R. Nurps: Numa- and reliability-aware parallelized SFC deployment in multi-core servers. In Proceedings of the 17th International Conference on COMmunication Systems and NETworks (COMSNETS), Bengaluru, India, 6–10 January 2025; pp. 25–30.
30. She, C.; Dong, R.; Gu, Z.; Hou, Z.; Li, Y.; Hardjawana, W.; Vucetic, B. Deep learning for ultra-reliable and low-latency communications in 6G networks. IEEE Netw. 2020, 34, 219–225.
31. Esmaeily, A.; Kralevska, K. Orchestrating isolated network slices in 5G networks. Electronics 2024, 13, 1548.
32. Talaat Fahim, M.; Ibrahim, M.Z.; Elshennawy, N. Efficient Resource Allocation of Latency Aware Slices for 5G Networks. J. Eng. Res. 2023, 7, 94–101.

Figure 1. Latency-Aware Predictive NFV Orchestration Framework Aligned with ETSI MANO. Solid arrows represent control signaling or orchestration actions (e.g., NFVO to VNFM), while dashed arrows represent data or telemetry flow (e.g., monitoring input to AI Engine). The diagram prioritizes conceptual clarity over protocol-level detail.

Figure 2. Predictive Orchestration Workflow for Latency-Aware VNF Management.

Figure 3. End-to-End Latency Over Time for URLLC Slices under Different Orchestration Strategies.

Figure 4. SLA Violations Count for Predictive, Reactive, and Static Orchestration.

Figure 5. VNF Migration Frequency per Orchestration Type, showing proactive vs. reactive behaviors.

Figure 6. Prediction Error Metrics (RMSE and MAE) across Orchestration Strategies.

Figure 7. Average CPU Utilization and Variance across Compute Nodes for Different Orchestrators.

Figure 8. Cumulative Distribution Function (CDF) of E2E Latency for URLLC Traffic.

Figure 9. Heatmap of VNF-Node Mapping Evolution over Time Steps.

Table 1. Comparison of previous state-of-the-art studies.

Reference	Focus Area	Methodology	Limitations
[9]	Latency-aware VNF placement using ILP	ILP optimization	Not scalable or adaptable in real time
[10]	Scalability limitations of ILP-based VNF placement	Comparative evaluation	Fails under dynamic network conditions
[11]	SFC migration using static system metrics	Heuristic-based migration	Lacks predictive triggers
[12]	SFC migration survey on real-time adaptability	Literature survey	No link to orchestrator logic
[13]	URLLC-eMBB traffic isolation in slicing	5G slicing policy analysis	No dynamic isolation mechanism
[14]	AI for telemetry data analysis (traffic prediction, anomaly detection)	ML-based analytics	Not integrated with orchestration
[15]	REASON architecture for AI-based orchestration	Modular AI controller	Simulation-based, lacks real NFV integration
[16]	Distributed AI-native orchestration with ML agents	Distributed ML orchestration	Conceptual, not ETSI MANO-compliant
[17]	ML-based slice type classification and handover prediction	Supervised learning on simulation data	Limited to simulated 6G data

Table 2. List of the experiment’s parameters and their descriptions.

Parameter	Value/Description
Number of compute nodes	6
Number of service chains	20 (10 URLLC, 10 eMBB)
VNFs per chain	3–5
Latency SLA (URLLC)	1 ms
Latency SLA (eMBB)	10–20 ms
CPU per node	8 vCPUs
Bandwidth per link	1–10 Gbps
Forecast model	LSTM (2-layer, 64 units)
Forecast window	10 steps
Forecast interval	Every 50 ms
Simulation duration	1000 time steps
AI inference latency	<2 ms
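
The parameters in Table 2 can be collected into a single configuration object, which makes a few derived quantities easy to check. The sketch below is illustrative only: the class name `ExperimentConfig`, the helper `sliding_windows`, and the synthetic `trace` are hypothetical and do not come from the paper. It also assumes that the 10-step forecast window is the length of the input history fed to the LSTM, which Table 2 does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Values copied from Table 2; field names are hypothetical."""
    compute_nodes: int = 6
    urllc_chains: int = 10
    embb_chains: int = 10
    vcpus_per_node: int = 8
    urllc_sla_ms: float = 1.0
    lstm_layers: int = 2
    lstm_units: int = 64
    forecast_window_steps: int = 10
    forecast_interval_ms: float = 50.0
    simulation_steps: int = 1000
    max_inference_ms: float = 2.0


def sliding_windows(series, window):
    """Yield (history, target) pairs: `window` past samples and the next one."""
    for start in range(len(series) - window):
        yield series[start:start + window], series[start + window]


cfg = ExperimentConfig()

# Placeholder latency trace in ms (synthetic, not measured data).
trace = [0.8 + 0.001 * (t % 50) for t in range(cfg.simulation_steps)]

# Assumption: the window is the LSTM input length, one target per window.
pairs = list(sliding_windows(trace, cfg.forecast_window_steps))

total_vcpus = cfg.compute_nodes * cfg.vcpus_per_node
# Upper bound on the share of each forecast interval spent on inference.
inference_share = cfg.max_inference_ms / cfg.forecast_interval_ms

print(f"training pairs: {len(pairs)}")
print(f"total vCPUs: {total_vcpus}")
print(f"inference share of interval: below {inference_share:.0%}")
```

Under these assumptions, an inference latency below 2 ms takes up less than 4% of each 50 ms forecast interval, which leaves most of the interval for acting on the forecast.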

Table 3. Orchestration Cost Summary (Latency + Migration Overhead).

Orchestrator	Total Latency (ms)	Total Migration Overhead (ms)	Total Cost (ms)
Predictive	170	16	186
Reactive	220	45	265
Static	310	0	310
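
The third column of Table 3 is consistent with an unweighted sum of the first two. The short check below reproduces that arithmetic and the resulting relative cost reductions. The function names are hypothetical, and the unweighted sum is inferred from the tabulated values rather than taken from the paper's optimization formulation.

```python
# Rows copied from Table 3: (total latency, total migration overhead).
TABLE_3 = {
    "Predictive": (170, 16),
    "Reactive": (220, 45),
    "Static": (310, 0),
}


def total_cost(latency, migration_overhead):
    """Unweighted sum, as implied by the table caption and its last column."""
    return latency + migration_overhead


def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline


costs = {name: total_cost(*row) for name, row in TABLE_3.items()}

for name in ("Reactive", "Static"):
    pct = 100 * relative_reduction(costs[name], costs["Predictive"])
    print(f"Predictive vs {name}: {pct:.1f}% lower total cost")
```

With the tabulated values, the predictive orchestrator's total cost is about 29.8% below the reactive baseline and 40.0% below the static one.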

Table 4. Summary of Orchestration Performance Metrics.

Metric	Predictive	Reactive	Static
Avg Latency (ms)	0.85	0.93	1.07
SLA Violations	3	15	29
VNF Migrations	8	22	0
Forecast RMSE (ms)	0.032	0.083	0.107
Forecast MAE (ms)	0.025	0.064	0.092
Total Orchestration Cost (ms)	186	265	310
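
For reference, the two forecast error metrics reported in Table 4 follow the standard definitions of root mean squared error and mean absolute error. The sketch below computes both for a small pair of latency series. The sample values are made up for illustration and are not data from the experiments.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical latency samples in ms (illustrative only).
observed_ms = [0.80, 0.90, 0.85, 1.00]
predicted_ms = [0.82, 0.87, 0.85, 0.96]

print(f"RMSE: {rmse(observed_ms, predicted_ms):.4f} ms")
print(f"MAE:  {mae(observed_ms, predicted_ms):.4f} ms")
```

RMSE is never smaller than MAE on the same data, because squaring weights large errors more heavily. The values in Table 4 follow this ordering for all three orchestrators.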

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
