1. Introduction
Software-defined networking (SDN) has emerged as a transformative paradigm that decouples the control plane from the data plane, enabling centralized network management and programmability. While this architecture offers significant advantages in terms of flexibility and scalability, it introduces unique security challenges: the centralized controller becomes a single point of failure, and the traditional “perimeter defense” model proves inadequate against lateral movement attacks by malicious insiders [
1,
2,
3].
To address these limitations, the
zero-trust architecture (ZTA) has gained substantial traction as a security model predicated on the principle of “never trust, always verify” [
4,
5]. Unlike conventional network security that relies on coarse-grained perimeter boundaries, ZTA mandates continuous authentication and fine-grained access control for every resource and connection. When integrated with SDN’s programmable infrastructure, ZTA enables dynamic, software-defined security policies that can adapt to evolving threat landscapes in real time [
6,
7,
8].
Recent research has explored the convergence of these two paradigms. Guo et al. [4] proposed the Intelligent Zero-Trust Security Framework for SDN (IZTSDN), which combines deep learning-based anomaly detection (CALSeq2Seq) with user behavior trust assessment (UBTA). Their framework demonstrated promising results: 99.56% detection accuracy on SDN-specific datasets and approximately 80.5% throughput maintenance under distributed denial-of-service (DDoS) attacks. However, while the study established the functional viability of integrating deep learning with zero-trust principles, it left critical operational questions unanswered.
Specifically, the framework’s end-to-end performance remains uncharacterized. The original evaluation focused on component-level accuracy and aggregate throughput protection; it did not quantify (i) the end-to-end latency introduced by the multi-stage security pipeline, from single packet authorization (SPA) requests through trust value calculation to access grant decisions; (ii) the real-time detection latency, that is, the temporal window between malicious flow appearance and controller enforcement action; (iii) the control-plane resource overhead (CPU utilization, memory consumption, and inter-component communication bandwidth) imposed by concurrent deep learning inference, trust computation, and standard OpenFlow operations [9, 10, 11]; or (iv) the scalability degradation profile, that is, how the latency and overhead evolve as the network size increases beyond small-scale simulation environments [12, 13].
These performance dimensions are not merely implementation details; they determine whether an intelligent security framework can transition from proof-of-concept to operationally viable deployment at a moderate scale. Recent surveys on moving target defense and game-theoretic security strategies further highlight that, without rigorous performance evaluation, even theoretically sound defense mechanisms fail to achieve practical deployment [
14,
15]. Consequently, quantifying the operational cost of intelligent security integration is not merely an engineering formality but a prerequisite for production viability. In high-speed networks, excessive detection latency creates a “window of vulnerability” where attacks cause damage before mitigation [
16,
17,
18]. Similarly, uncontrolled control-plane overhead risks controller saturation, ironically creating the very denial-of-service conditions that the framework aims to prevent [
19,
20,
21].
The measurement of security–performance trade-offs is particularly critical for deep learning-enhanced SDN controllers. Unlike lightweight rule-based systems, deep neural networks impose substantial computational costs [
22,
23,
24,
25]. Recent work by Tang et al. [
26] demonstrated that even optimized deep learning IDS implementations in SDN controllers require careful latency management to achieve ∼1–2 ms per-flow processing. However, their evaluation did not integrate the trust-based authentication overhead, nor did they characterize resource consumption under sustained loads [
27,
28,
29].
This paper addresses the research gap by presenting the first comprehensive end-to-end performance analysis of an intelligent zero-trust SDN framework [
30]. We substantiate this claim of novelty by noting that, while prior works such as DeepIDS [
26] report system-level throughput and latency, and SDP implementations [
18,
31] quantify the authentication overhead, no existing study has simultaneously considered
(i) fine-grained end-to-end latency decomposition across authentication, detection, and mitigation stages;
(ii) the control-plane resource overhead (CPU, memory, bandwidth) under both normal and attack conditions;
(iii) scalability degradation profiles from 4 to 64 nodes; and
(iv) comparative benchmarking against the baseline SDN, SDP, and DeepIDS within a unified framework.
Table 4 explicitly enumerates these unmeasured dimensions in IZTSDN and comparable frameworks, confirming that our evaluation is the first to address all four dimensions concurrently. We extend the IZTSDN architecture to quantify operational costs that were previously unmeasured, providing network operators with empirical data to assess deployment feasibility.
Our specific contributions are as follows. First, we develop a fine-grained measurement methodology that isolates the authentication latency, detection latency, and policy enforcement latency within the integrated pipeline; this reveals the critical path bottlenecks that dominate end-to-end delay [
31,
32]. Second, we characterize the control-plane overhead by quantifying CPU utilization, memory consumption, and the controller-to-gateway communication overhead under varying load conditions from normal operation to high-volume attack scenarios [
33,
34,
35]. Third, we conduct a scalability analysis evaluating performance degradation as the network topology scales beyond the original four-node simulation to larger deployments with multiple switches, hosts, and concurrent flows, identifying the operational limits of the framework [
36]. Fourth, we provide comparative performance benchmarking that contextualizes IZTSDN’s overhead against that of the baseline SDN (without security), traditional software-defined perimeter (SDP) implementations, and the DeepIDS framework, demonstrating the operational cost of intelligent security integration [
26].
To clearly delineate the boundary between our work and the original IZTSDN framework [
4], we emphasize that the CALSeq2Seq and UBTA modules, as well as the baseline MiniIZTA testbed, are inherited from Guo et al. [
4]. Our incremental contributions lie in the extended testbed instrumentation, the fine-grained measurement methodology (latency decomposition, resource monitoring, and scalability profiling), and the quantitative performance characterization that were not provided in the original work.
The remainder of this paper is organized as follows.
Section 2 reviews related work on SDN security frameworks with an emphasis on performance evaluation methodologies.
Section 3 provides the technical background of the IZTSDN architecture and identifies the specific performance dimensions under investigation.
Section 4 details our experimental testbed, measurement instrumentation, and evaluation metrics.
Section 5 presents quantitative results for latency, overhead, and scalability. Finally,
Section 7 summarizes our contributions and identifies future research directions.
2. Related Work
The integration of security mechanisms into software-defined networking has been extensively studied, yet the systematic evaluation of their operational performance, particularly latency and control-plane overhead, remains underexplored. This section reviews three key research threads: (1) deep learning-based intrusion detection in SDN with performance considerations; (2) zero-trust architecture implementations and their computational costs; and (3) comprehensive SDN security frameworks that evaluate both effectiveness and efficiency. We conclude by identifying the specific gap addressed in this work.
2.1. Deep Learning-Based Security in SDN: Accuracy vs. Efficiency
Recent years have witnessed significant advancements in applying deep learning to SDN security, with researchers achieving high detection accuracy for various attack vectors. However, the computational cost of these sophisticated models presents a critical deployment consideration that is often underreported.
Tang et al. [
26] proposed DeepIDS, a deep learning approach for intrusion detection in SDN that explicitly addresses this balance. Their framework utilizes both fully connected deep neural networks (DNN) and gated recurrent neural networks (GRU-RNN), achieving 80.7% and 90% detection accuracy, respectively. Crucially, they conducted a system-level performance evaluation measuring throughput, latency, and resource utilization—a rarity in this domain. Their results demonstrate that GRU-RNN, while more accurate, introduces approximately 7% latency overhead and 4% throughput degradation in networks exceeding 64 switches, highlighting the performance–security trade-off inherent in complex neural architectures.
Similarly, Novaes et al. [
19] developed a GAN-based detection system for DDoS attacks in SDN, emphasizing real-time operation. Their adversarial deep learning approach achieves detection with minimal false positives, although the generator–discriminator architecture imposes substantial memory requirements on the controller. Khan and Akhunzada [
24] proposed a hybrid CNN-LSTM architecture for medical IoT environments, evaluating computational complexity alongside detection accuracy—an approach that we extend to general SDN deployments.
The CALSeq2Seq model proposed by Guo et al. [
4] represents the current state-of-the-art in detection accuracy (99.56%), combining a 1D-CNN for feature extraction with self-attention mechanisms and LSTM-Seq2Seq for temporal modeling. However, while the authors acknowledge that “hardware acceleration in training … can probably be controlled to less than 10 s” per step, they provide no empirical data on inference latency, controller CPU utilization, or memory consumption during real-time operation—precisely the operational metrics required for production deployment decisions.
Other notable approaches include that of Sanagavarapu and Sridhar [
34], who developed SDPredictNet—a topology-based SDN neural routing framework with traffic prediction analysis using ANN and LSTM architectures. Ahuja et al. [
16] proposed automated DDoS attack detection using ensemble machine learning methods in SDN, achieving 98.8% accuracy on SDN-specific datasets. Banitalebi Dehkordi et al. [
33] conducted comprehensive comparisons of machine learning and statistical methods for DDoS detection in SDN, providing valuable insights into the trade-offs between different algorithmic approaches.
Table 1 summarizes key deep learning approaches for SDN security, highlighting the scarcity of comprehensive performance evaluation.
The pattern is clear: while the detection accuracy has reached satisfactory levels (>98% in recent works), the operational cost of these intelligent mechanisms, particularly when integrated with control-plane functions, remains largely uncharacterized. This creates a critical knowledge gap for network operators who must assess whether security enhancements will saturate controller resources or introduce unacceptable latency in high-speed environments.
2.2. Zero-Trust Architecture: Security Benefits and Performance Costs
The zero-trust model, formalized by NIST in SP 800-207, has been progressively adopted in SDN environments to address the limitations of perimeter-based security [
2]. However, the “never trust, always verify” principle inherently introduces computational overhead through continuous authentication and dynamic authorization.
Moubayed et al. proposed a software-defined perimeter (SDP) framework implementing zero-trust principles in SDN, achieving 75.5% throughput maintenance under attacks—notably lower than IZTSDN’s 80.5%. Their evaluation focused on the connectivity establishment time and attack resistance, while they did not quantify the latency of trust verification or the computational cost of cryptographic operations in the control plane.
Sallam et al. developed an integrated SDP-SDN architecture that maintains 75% throughput during port scanning and DDoS attacks. While they demonstrate the feasibility of combining these paradigms, their evaluation lacks granular latency decomposition or resource utilization metrics. Ramezanpour and Jagannath [
5] explored intelligent zero-trust architectures for 5G/6G networks, emphasizing machine learning integration, but their analysis remained conceptual, without empirical performance data.
Recent microservice-based implementations provide relevant insights. Research on the zero-trust security architecture (ZTSA) in Kubernetes environments reports authentication latency of 35 ms on average, with CPU utilization increasing by eight percentage points and throughput dropping by 5.5% due to cryptographic operations and token validation. While not SDN-specific, these findings suggest that zero-trust mechanisms impose measurable overhead that must be accounted for in performance-critical network infrastructure.
Fang et al. [
6] proposed THP, a novel authentication scheme to prevent multiple attacks in SDN-based IoT networks, demonstrating the importance of robust authentication mechanisms in constrained environments. Weng et al. [
8] developed BENBI, a scalable and dynamic access control scheme for the northbound interface of SDN-based VANET, addressing authentication latency in vehicular scenarios.
Table 2 compares zero-trust implementations in SDN and related environments.
The absence of comprehensive performance metrics across these studies, particularly when deep learning is integrated with zero-trust authentication, motivates our systematic evaluation.
2.3. Comprehensive SDN Security Frameworks: Effectiveness and Efficiency
Beyond individual security mechanisms, several researchers have proposed holistic frameworks combining multiple defense layers. These works are most directly comparable to our target of analysis, as they involve integrated pipelines where the cumulative overhead becomes a critical concern.
Han et al. [
7] proposed OverWatch, a cross-plane DDoS attack defense framework with collaborative intelligence in SDN. Their framework combines coarse-grained traffic monitoring on switches with fine-grained machine learning classification on the controller, achieving 96% detection accuracy. While they claim that it imposes a “small communication overhead”, no quantitative data are provided on controller resource consumption or detection latency.
Carvalho et al. [
20] developed an ecosystem for anomaly detection and mitigation in SDN, featuring traffic collection, detection, and reporting modules. However, they acknowledge that “algorithm complexity is large and easy to cause CPU overload”—a limitation that underscores the need for systematic overhead evaluation, which we address.
Recent comprehensive frameworks have begun addressing performance more rigorously. The blockchain-based security framework for heterogeneous SDN environments reports specific latency metrics: average authentication latency of 20–30 ms between homogeneous controllers (30 ms for heterogeneous), with registration times of approximately 0.10 s per controller. Their throughput evaluation shows 20 transactions per second with stable performance across Ryu, ONOS, and OpenDaylight controllers.
Similarly, SecuNet-4D provides a four-dimensional security framework (detection, defense, decision, dynamic adaptation) with explicit performance targets: threat detection accuracy ≥90%, policy enforcement latency <150 ms, and throughput up to 2000 Mbps. Their evaluation shows that dynamic policies achieve 4% packet loss versus 8% for static policies, with CPU usage at 45% (dynamic) versus 60% (static).
Other relevant frameworks include that of Javeed et al. [
23], who proposed an SDN-enabled hybrid deep learning-driven framework for detecting emerging cyber threats in IoT environments. Garg et al. [
22] developed a hybrid deep learning-based anomaly detection scheme for suspicious flow detection in SDN from a social multimedia perspective. Ye et al. [
32] presented a DDoS attack detection method based on SVM in software-defined networks, providing a comparative analysis with neural network approaches.
Yungaicela-Naula et al. [
21] proposed a flexible SDN-based framework for slow-rate DDoS attack mitigation using deep reinforcement learning, demonstrating the potential of adaptive security mechanisms. Santos et al. [
27] conducted comprehensive evaluations of machine learning algorithms for DDoS attack detection in SDN, providing practical implementation insights.
However, none of these frameworks integrates deep learning-based anomaly detection with zero-trust authentication, leaving the performance characteristics of this specific combination, as embodied in IZTSDN, uncharacterized.
Table 3 presents a comprehensive comparison of integrated SDN security frameworks, highlighting the unique position of IZTSDN and the specific metrics that remain unmeasured.
We acknowledge that the studies aggregated in
Table 1,
Table 2 and
Table 3 are heterogeneous in terms of datasets (Bennett University SDN dataset, CICIDS2017, custom topologies), hardware (physical servers, VMs, Mininet emulation), evaluation metrics (accuracy-only vs. latency-only vs. multi-dimensional), and test conditions (attack type, intensity, duration). Consequently, direct numerical comparison across rows must be performed with caution. Our contribution is not to establish a ranked leaderboard but to identify the systematic absence of specific operational metrics, particularly end-to-end latency decomposition and control-plane resource profiling, that prevents informed deployment decisions. The methodological heterogeneity itself reinforces our motivation: the lack of standardized evaluation benchmarks for intelligent SDN security frameworks necessitates the comprehensive, unified measurement methodology that we present.
2.4. Research Gap and Contribution
The literature review reveals a consistent pattern: while recent SDN security frameworks achieve high detection accuracy and demonstrate functional viability, studies lack the systematic characterization of the operational costs incurred by intelligent, multi-layered security mechanisms. Specifically, end-to-end latency decomposition is absent from works combining deep learning with zero-trust authentication; the cumulative delay across SPA processing, trust calculation, deep learning inference, and policy enforcement remains unmeasured. Furthermore, the control-plane resource overhead for integrated security pipelines is rarely quantified; CPU utilization, memory consumption, and inter-component communication bandwidth are critical for capacity planning yet typically unreported. Additionally, scalability degradation profiles are underexplored; most evaluations use small-scale topologies (4–10 nodes), leaving the performance characteristics in networks beyond small-scale testbeds unknown. Finally, comparative operational benchmarking against baseline SDN and alternative security architectures is missing, preventing informed adoption decisions.
This paper addresses these gaps by presenting the first comprehensive end-to-end performance analysis of IZTSDN [
4], quantifying the latency and overhead costs of integrating deep learning-based anomaly detection with zero-trust authentication in software-defined networks.
3. Background and Problem Statement
This section recapitulates the IZTSDN architecture and threat model from Guo et al. [4] to establish the necessary context. Section 4 then details our original experimental extensions: the performance instrumentation layer, the latency decomposition methodology, the resource monitoring infrastructure, and the scalability scenarios, none of which were present in the original MiniIZTA testbed.
3.1. The IZTSDN Architecture
The Intelligent Zero-Trust Security Framework for Software-Defined Networking (IZTSDN), proposed by Guo et al. [
4], integrates deep learning-based threat detection with dynamic zero-trust access control to address fundamental vulnerabilities in SDN environments. The framework specifically targets the centralized controller’s susceptibility to distributed denial-of-service (DDoS) attacks [
16,
18,
33] and the inadequacy of perimeter-based security models [
2] against lateral movement by compromised insiders [
1,
3]. As illustrated in
Figure 1, IZTSDN comprises five interconnected modules operating across three planes—the data plane, control plane, and intelligent engine plane—coordinated through a multi-stage security pipeline.
The CNN Self-Attention LSTM-Seq2Seq (CALSeq2Seq) model serves as the framework’s behavioral analysis engine, processing network flows through a four-stage pipeline. First, in the feature extraction stage, one-dimensional convolutional neural networks (1D-CNNs) with 32 filters extract spatial features from raw flow table data, followed by max-pooling operations. Second, the feature fusion stage employs a scaled dot-product self-attention mechanism that computes
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V \tag{1}$$
where $Q$, $K$, and $V$ represent query, key, and value transformation matrices, and $d_k$ denotes the key dimensionality. Third, temporal modeling is performed via an LSTM-based Seq2Seq encoder–decoder architecture with 64 hidden units, capturing long-range temporal dependencies in flow sequences. Fourth, the classification stage uses a fully connected layer with softmax activation to output a binary classification (normal/abnormal), achieving 99.56% accuracy on the Bennett University SDN dataset. While the model demonstrates high detection accuracy, its computational complexity, which grows quadratically with the sequence length for the self-attention mechanism and linearly with the number of time steps for LSTM processing, introduces significant inference overhead when deployed on the SDN controller. Guo et al. [
4] acknowledge that hardware acceleration can reduce the training time to “less than 10 s” per step, yet they provide no empirical measurement of the real-time inference latency or controller resource consumption during live traffic processing. Similar deep learning architectures have been explored in SDN security contexts: Garg et al. [
22] proposed a hybrid deep learning-based anomaly detection scheme combining a CNN and LSTM for suspicious flow detection; Cao et al. [
17] developed a spatial–temporal graph convolutional network for detecting and mitigating DDoS attacks in SDN; and Ravi et al. [
25] presented a deep learning feature fusion approach for intrusion detection in SDN-based IoT networks. These works collectively demonstrate the potential of deep learning for SDN security while highlighting the need for performance characterization.
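To make the cost of the attention stage concrete, the following is a minimal pure-Python sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. It is an illustration only, not the CALSeq2Seq implementation from Guo et al.; the function names and the toy input are assumptions introduced here.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V for row-major lists of lists."""
    d_k = len(K[0])
    scale = math.sqrt(d_k)
    output = []
    for q in Q:
        # One score per key: this nested iteration is why the cost grows
        # with (number of queries) x (number of keys).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[j] for w, v in zip(weights, V))
            for j in range(len(V[0]))
        ])
    return output

# Toy input: 3 flow-feature vectors of dimension 2 (illustrative values only).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(X, X, X)
```

Because every query is scored against every key, doubling the number of flow records in a sequence roughly quadruples the number of score computations, which is the behavior the latency measurements are meant to expose.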
The User Behavior Trust Authentication (UBTA) algorithm implements continuous, dynamic trust evaluation based on the “never trust, always verify” principle [
2]. The trust value $T$ is computed as a weighted sum of behavioral factors:
$$T = \sum_{i} w_i \, t_i \tag{2}$$
where $t_i$ represents the trust evaluation factor for dimension $i$ (0: no trust, 0.5: temporary trust, 1: complete trust), and $w_i$ denotes dynamically adjusted weights. For real-time prediction, UBTA incorporates temporal decay (Equation (3)), in which a decay factor governs the predicted trust value at cycle $k$. Access decisions follow threshold-based authorization: browsing (≥0.5), file download (≥0.7), upload (≥0.8), modification (≥0.9), and deletion (≥0.95). While this enables fine-grained access control, the computational cost of continuous trust recalculation, particularly when combined with deep learning inference, remains unquantified. Fang et al. [
6] proposed THP, a novel authentication scheme to prevent multiple attacks in SDN-based IoT networks, demonstrating the importance of robust authentication mechanisms in constrained environments. Weng et al. [
8] developed BENBI, a scalable and dynamic access control scheme for the northbound interface of SDN-based VANET, addressing authentication latency in vehicular scenarios.
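The weighted-sum trust value and the threshold-based authorization described above can be sketched as follows. This is a hedged illustration rather than the UBTA implementation: the operation names, the example weights, and the normalization of weights to sum to one are assumptions made here, and the temporal decay step is deliberately omitted because its exact form is not given in this section.

```python
# Minimum trust value per operation, as listed in the text.
THRESHOLDS = {
    "browse": 0.5,
    "download": 0.7,
    "upload": 0.8,
    "modify": 0.9,
    "delete": 0.95,
}

def trust_value(factors, weights):
    """Weighted sum of trust factors (assumption: weights are normalised)."""
    if len(factors) != len(weights) or not factors:
        raise ValueError("factors and weights must be non-empty and equal length")
    if any(f not in (0.0, 0.5, 1.0) for f in factors):
        raise ValueError("each factor must be 0, 0.5, or 1")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * f for w, f in zip(weights, factors)) / total_weight

def allowed_operations(trust):
    """Operations whose threshold the trust value meets."""
    return [op for op, minimum in THRESHOLDS.items() if trust >= minimum]

# Hypothetical user: three behavioural dimensions, illustrative weights.
t = trust_value(factors=[1.0, 0.5, 1.0], weights=[0.5, 0.3, 0.2])
ops = allowed_operations(t)
```

With these illustrative inputs the trust value is 0.85, so browsing, download, and upload are permitted while modification and deletion are not. The arithmetic itself is cheap; the cost the paper targets comes from recomputing it continuously alongside historical data queries.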
The IZTSDN framework operates through two distinct phases. In Phase 1 (Pre-Connection Authentication), the user initiates a single packet authorization (SPA) request to the controller; the data collection module retrieves historical behavioral data; and the UBTA engine calculates the current trust value via Equations (2) and (3). If the trust value $T$ meets the required authorization threshold, the controller installs flow rules via the policy enforcement component and notifies the gateway to establish a mutual TLS (mTLS) channel; otherwise, the request is silently dropped. In Phase 2 (Post-Connection Monitoring), CALSeq2Seq continuously monitors authorized traffic flows; upon anomaly detection, the controller immediately issues blocking flow rules to the gateway, and attacked ports are blocked for 120 s (with the duration increasing dynamically with the attack frequency). This multi-stage pipeline introduces multiple sequential processing steps that compound to create a window of vulnerability, that is, the temporal gap between threat appearance and mitigation action.
3.2. Performance Dimensions and Research Gap
Despite IZTSDN’s demonstrated functional effectiveness (99.56% detection accuracy, 80.5% throughput maintenance under DDoS attacks), critical operational aspects remain uncharacterized. The original evaluation utilized a minimal testbed of four virtual machines and measured only the aggregate throughput and detection accuracy, leaving four fundamental performance dimensions unquantified.
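The cumulative latency through the security pipeline ($L_{\mathrm{total}}$) comprises three critical components: the authentication latency $L_{\mathrm{auth}}$ (SPA processing and trust calculation), the detection latency $L_{\mathrm{det}}$ (deep learning inference), and the mitigation latency $L_{\mathrm{mit}}$ (policy enforcement). The cumulative security latency $L_{\mathrm{total}}$ is defined as the worst-case sequential path through the pipeline:
$$L_{\mathrm{total}} = L_{\mathrm{auth}} + L_{\mathrm{det}} + L_{\mathrm{mit}} \tag{7}$$
We emphasize that Equation (7) represents a compositional upper bound rather than an operational seriality constraint. In practice, authentication (Phase 1: pre-connection) and detection/mitigation (Phase 2: post-connection) operate in distinct temporal phases; $L_{\mathrm{total}}$ is employed for capacity planning and bottleneck identification, not for real-time scheduling.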
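One way to obtain per-stage latencies and their additive total is to wrap each pipeline stage in a timer. The sketch below shows the idea with a small context manager; the stage names and the `placeholder_stage` function are hypothetical stand-ins, and this is not the instrumentation layer used in the testbed.

```python
import time
from contextlib import contextmanager

class StageTimer:
    """Accumulates wall-clock time per named pipeline stage, in milliseconds."""

    def __init__(self):
        self.ms = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.ms[name] = self.ms.get(name, 0.0) + elapsed_ms

    def total_ms(self):
        # Additive composition: the sum of the per-stage latencies.
        return sum(self.ms.values())

    def shares(self):
        # Fraction of the total contributed by each stage.
        total = self.total_ms()
        return {n: v / total for n, v in self.ms.items()} if total else {}

def placeholder_stage(duration_s):
    """Stand-in for real work (authentication, inference, flow-rule install)."""
    time.sleep(duration_s)

timer = StageTimer()
with timer.stage("auth"):
    placeholder_stage(0.002)
with timer.stage("detect"):
    placeholder_stage(0.005)
with timer.stage("mitigate"):
    placeholder_stage(0.001)
```

Calling `timer.shares()` afterwards gives the fraction of the summed latency attributable to each stage, which is the quantity a bottleneck analysis would compare across topology sizes.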
The current SDN security literature provides concerning baseline figures: DeepIDS [
26] reports an approximately 7% latency increase with GRU-RNN processing, while controller benchmarking demonstrates a 40% latency rise under stress conditions. Industry guidance suggests that the zero-trust solution latency should be “measured in seconds or less; otherwise, analysis will be too late” to prevent exploit impacts. However, the specific breakdown of where latency accumulates in IZTSDN’s integrated pipeline, particularly the deep learning inference time and trust computation overhead, remains absent. Tang et al. [
26] explicitly measured system-level performance in their DeepIDS framework, including throughput, latency, and resource utilization—a rarity in this domain. Their work highlights the importance of quantifying the performance–security trade-off inherent in complex neural architectures.
The SDN controller’s centralized architecture creates a computational bottleneck when hosting security functions. Research demonstrates that controller performance degrades significantly under load: Ryu experiences a 38% increase in packet loss during DDoS attacks [
16], while handling over 500 flow setups per second causes CPU utilization to exceed 90% and the latency to increase by up to 60% [
18]. IZTSDN compounds this challenge by requiring the controller to concurrently execute standard OpenFlow operations (flow table management, packet-in handling) [
1,
12], deep learning inference (CALSeq2Seq forward passes with attention computation), trust computation (UBTA algorithm execution with historical data queries), and cryptographic operations (mTLS establishment and key management) [
6]. The resulting resource consumption—CPU utilization, memory consumption, and controller-to-gateway communication bandwidth—has not been systematically measured. Keshari et al. [
9] conducted a systematic review of quality of service (QoS) in SDN, highlighting the critical importance of resource management in controller performance. Nisar et al. [
3] provided a comprehensive survey on SDN architectures, applications, and security, identifying the resource overhead as a key challenge for security mechanism deployment.
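As a rough illustration of how the CPU and memory cost of a single controller-side task could be sampled, the sketch below uses only the Python standard library. It is an assumption-laden example: `placeholder_inference` stands in for a model forward pass, and `tracemalloc` reports Python-level allocations only, not the resident memory of the whole controller process.

```python
import time
import tracemalloc

def profile(workload, *args):
    """Run workload(*args); return its result, CPU seconds, and peak traced KiB."""
    tracemalloc.start()
    cpu_start = time.process_time()
    try:
        result = workload(*args)
        cpu_s = time.process_time() - cpu_start
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, cpu_s, peak_bytes / 1024.0

def placeholder_inference(n):
    """Stand-in for a model forward pass: allocates and sums n squares."""
    values = [i * i for i in range(n)]
    return sum(values)

result, cpu_s, peak_kib = profile(placeholder_inference, 50_000)
```

Repeating such a measurement under normal and attack load, and for each concurrent function (OpenFlow handling, inference, trust computation), would yield the kind of per-component overhead profile discussed above.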
Performance evaluation in virtualized SDN environments reveals that scaling introduces significant overhead. Latency tests typically indicate considerable degradation due to additional software layers, with virtualization showing higher transmission delays and CPU consumption versus physical deployments [
11,
13]. Controller benchmarking demonstrates that the throughput degrades by 7–10% when scaling from 16 to 64 switches, with the memory overhead determining system stability [
26]. Guo et al.’s [
4] original evaluation utilized only four virtual machines in a minimal topology consisting of a controller, legal user, illegal user, gateway, and resources. The framework’s behavior at a production scale—hundreds of switches, thousands of hosts, and concurrent attack flows—remains entirely uncharacterized. This gap is critical because security mechanisms that function effectively in small testbeds may saturate controller resources in operational deployments. Selvi and Thamilselvan [
35] proposed an intelligent traffic prediction framework for 5G networks using SDN and fusion learning, demonstrating the challenges of scaling intelligent mechanisms in software-defined environments. Sanagavarapu and Sridhar [
34] developed SDPredictNet, a topology-based SDN neural routing framework that addresses scalability concerns through traffic prediction analysis.
While IZTSDN’s 80.5% throughput maintenance exceeds that of Moubayed et al.’s SDP implementation (75.5%), a meaningful comparison requires normalized operational metrics.
Table 4 summarizes the specific unmeasured dimensions.
Han et al. [
7] proposed OverWatch, a cross-plane DDoS attack defense framework with collaborative intelligence in SDN, achieving 96% detection accuracy but lacking quantitative overhead data. Carvalho et al. [
20] developed an ecosystem for anomaly detection and mitigation in SDN, acknowledging that the algorithm complexity is large and CPU overload can easily occur—a limitation that underscores the need for systematic overhead evaluation. Ramezanpour and Jagannath [
5] explored intelligent zero-trust architectures for 5G/6G networks, emphasizing machine learning integration but without empirical performance data. Santos et al. [
27] evaluated machine learning algorithms for DDoS attack detection in SDN, providing practical implementation insights but limited scalability analysis.
3.3. Problem Statement
The integration of deep learning-based anomaly detection with zero-trust authentication in SDN creates a complex, multi-stage security pipeline whose operational costs are critical for production deployment yet remain entirely unquantified.
Given the IZTSDN framework’s demonstrated functional effectiveness, what are the end-to-end latency (authentication, detection, and mitigation components), control-plane resource overhead (CPU, memory, bandwidth), and scalability limits when operating under realistic network loads? Furthermore, how do these operational costs compare against baseline SDN and alternative security architectures?
Addressing this problem requires the systematic consideration of (1) latency decomposition across pipeline stages; (2) resource consumption under varying load conditions; (3) degradation profiles as the topology scales; and (4) comparative benchmarking against frameworks with published performance data. The following sections present our extended MiniIZTA testbed, the measurement methodology, and quantitative results addressing these critical operational questions.
5. Experimental Results and Discussion
This section presents the quantitative findings from our end-to-end performance evaluation of IZTSDN. We analyze the latency decomposition, the control-plane overhead, and the scalability limits, and we contextualize our results against the baseline SDN and alternative security frameworks. All experiments were conducted on the extended MiniIZTA testbed described in Section 4, with measurements averaged over 10 runs (n = 1000 samples per metric).
5.1. End-to-End Latency Decomposition
Our first set of experiments isolates latency contributions across the IZTSDN security pipeline, revealing the critical path bottlenecks that dominate the end-to-end delay.
Figure 2 presents the latency breakdown across four topology scales, from the baseline four-node configuration (replicating Guo et al.’s original testbed [4]) to our maximum 64-node deployment. The results reveal pronounced latency growth with scale: as the topology size increases 16-fold (4 to 64 nodes), the mean total latency increases 3.9-fold (60.9 ms to 235.6 ms).
The deep learning inference component emerges as the dominant cost, consuming 73–85% of the total latency across all scales. At 64 nodes, CALSeq2Seq inference alone contributes 178.4 ms (mean) with a 95th percentile of 245 ms, indicating significant variability due to attention mechanism complexity and batch processing delays. This is consistent with our hypothesis that the self-attention computation creates a scalability bottleneck. Notably, the authentication latency exhibits moderate growth from 12.5 ms (4 nodes) to 45.1 ms (64 nodes), reflecting the UBTA algorithm’s database query overhead as the historical behavior dataset scales. The mitigation latency remains relatively stable (3–15 ms), confirming that OpenFlow flow-mod operations are not the limiting factor. Critical Finding: At 32 nodes, the 95th percentile total latency (208.5 ms) exceeds the 100 ms threshold recommended for real-time security systems, suggesting that IZTSDN’s current implementation reaches its operational limits at a moderate scale.
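The 64-node breakdown can be checked with a few lines of arithmetic. The sketch below uses only the mean values quoted above; the mitigation term is not reported individually at this scale, so it is inferred here as the residual of the total, which is an assumption of this example rather than a measured value.

```python
# Mean latency components at 64 nodes, in milliseconds.
# TOTAL, INFERENCE and AUTH are the values quoted in the text; the mitigation
# term is NOT reported separately and is inferred as the residual (assumption).
TOTAL_MS = 235.6
INFERENCE_MS = 178.4
AUTH_MS = 45.1


def latency_shares(total_ms, inference_ms, auth_ms):
    """Return {stage: (latency in ms, share of total in %)}."""
    mitigation_ms = total_ms - inference_ms - auth_ms  # residual, assumed
    stages = {
        "inference": inference_ms,
        "authentication": auth_ms,
        "mitigation": mitigation_ms,
    }
    return {
        name: (round(ms, 1), round(100.0 * ms / total_ms, 1))
        for name, ms in stages.items()
    }


shares = latency_shares(TOTAL_MS, INFERENCE_MS, AUTH_MS)
for stage, (ms, pct) in shares.items():
    print(f"{stage:>14}: {ms:6.1f} ms  ({pct:4.1f}% of total)")
```

Under this assumption the inference share falls inside the reported 73–85% band and the residual falls inside the reported 3–15 ms mitigation range.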
To characterize tail behavior, which is critical for understanding worst-case performance, Figure 3 presents cumulative distribution functions for the end-to-end latency across scenarios.
Under normal traffic, IZTSDN achieves an acceptable median latency (83 ms) with P95 at 127 ms. However, under DDoS attack conditions (10,000 packets/s), the distribution shifts dramatically: the median latency increases to 142 ms, P95 reaches 285 ms, and P99 extends to 528 ms, 6.4 times the normal-traffic median. The heavy tail under attack conditions reflects queue buildup at the controller’s packet-in handler and inference batching delays. When the CALSeq2Seq engine saturates, subsequent packets queue for processing, creating the observed long-tail distribution. This window of vulnerability, in which malicious traffic has arrived but still awaits classification, represents a critical operational risk. Comparatively, DeepIDS [26] demonstrates tighter latency bounds (P95: 89 ms) but at lower accuracy (88%), while the baseline SDN maintains sub-10 ms latency but provides no security functionality. IZTSDN’s accuracy–latency trade-off is evident: 99.56% detection accuracy comes with significantly higher latency variance than competing approaches.
The model ingests flow-table feature vectors (packet count, byte count, duration, protocol flags, inter-arrival statistics). Inference is performed with a batch size of 1 (single flow per inference call) to minimize the detection latency, at the cost of reduced CPU vector-unit utilization. No hardware acceleration (GPU, TPU, FPGA) is employed in the reported experiments; all inference is executed on the CPU via TensorFlow 2.8.0 eager execution. A 100-flow warm-up sequence precedes each experiment to stabilize TensorFlow’s graph optimization and memory allocation caches. The standalone inference throughput, measured in isolation from OpenFlow processing, is 42 flows/second on the testbed CPU, establishing a theoretical lower bound of about 23.8 ms (1/42 s) on the per-flow inference latency.
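The queue buildup described above can be illustrated with a toy single-server FIFO model. The sketch below is not the testbed code: it assumes evenly spaced arrivals and a constant service time equal to the reciprocal of the reported standalone throughput (42 flows/second), and the two arrival rates are hypothetical values chosen to fall below and above that capacity.

```python
import statistics

SERVICE_RATE = 42.0  # flows/s, standalone inference throughput quoted above


def simulate_fifo(arrival_rate, n_flows=1000, service_rate=SERVICE_RATE):
    """Per-flow sojourn times (seconds) for a single-server FIFO queue.

    Evenly spaced arrivals and a constant service time are simplifying
    assumptions of this sketch, not properties measured on the testbed.
    """
    service_time = 1.0 / service_rate
    server_free_at = 0.0
    sojourn = []
    for i in range(n_flows):
        arrival = i / arrival_rate
        start = max(arrival, server_free_at)  # wait while the engine is busy
        server_free_at = start + service_time
        sojourn.append(server_free_at - arrival)
    return sojourn


def percentile(samples, p):
    """p-th percentile (1..99) via the stdlib 'inclusive' quantile method."""
    return statistics.quantiles(samples, n=100, method="inclusive")[p - 1]


below_capacity = simulate_fifo(30.0)  # hypothetical load under 42 flows/s
above_capacity = simulate_fifo(60.0)  # hypothetical load over 42 flows/s

for label, lat in (("30 flows/s", below_capacity), ("60 flows/s", above_capacity)):
    print(f"{label}: P50={percentile(lat, 50) * 1e3:.1f} ms, "
          f"P99={percentile(lat, 99) * 1e3:.1f} ms")
```

Below capacity every flow leaves after one service time, whereas above capacity each flow waits longer than the one before it, so the upper percentiles grow with the length of the overload period. A real controller would also see bursty arrivals and variable inference times, which this model omits.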
5.2. Control-Plane Resource Overhead
Our resource monitoring experiments quantify the computational tax imposed by integrating deep learning and zero-trust authentication on the SDN controller.
Figure 4 illustrates the CPU and memory utilization over a 300 s experiment, with a DDoS attack initiated at t = 60 s. The temporal dynamics reveal distinct operational phases.
During the pre-attack phase (0–60 s), IZTSDN maintains a steady-state overhead: 25% baseline CPU for framework operations, 5% inference overhead (periodic flow analysis), and 8% trust calculation for ongoing sessions. Memory consumption stabilizes at 320 MB, 2.1 times that of the baseline SDN controller (150 MB), reflecting the loaded CALSeq2Seq model (180 MB) and working buffers. Upon attack initiation (t = 60 s), the resource consumption escalates rapidly: the inference CPU rises to 35–50% as CALSeq2Seq processes the attack traffic; the trust CPU increases to 20% due to authentication requests from spoofed sources; and the control CPU rises to 30% due to handling flow table updates and blocking rules. Saturation Event: During the attack, the total CPU utilization exceeds 90%, marking the controller’s operational limit. Beyond this point, packet processing delays increase super-linearly, and flow table updates exhibit 200+ ms latency spikes. This saturation occurs despite the attack rate remaining constant, indicating resource exhaustion rather than the traffic volume as the limiting factor. The memory consumption grows gradually during the attack, reaching a peak of 440 MB (2.9 times that of the baseline) due to enlarged flow tables, connection state tracking, and inference batch buffers. No memory leaks were detected (tracemalloc snapshots stable at 60 s intervals), confirming the implementation’s stability under stress.
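The per-component figures above can be combined into a simple budget. The sketch below adds the reported CPU shares and computes the memory ratios against the 150 MB baseline; treating the components as additive and non-overlapping is an assumption of this example.

```python
BASELINE_MEM_MB = 150  # baseline SDN controller footprint quoted above

# CPU shares in percent, as quoted in the text.
PRE_ATTACK_CPU = {"framework": 25, "inference": 5, "trust": 8}
# Inference is reported as a range during the attack, so store (low, high).
ATTACK_CPU = {"inference": (35, 50), "trust": (20, 20), "control": (30, 30)}


def total_range(components):
    """Sum (low, high) pairs component-wise (assumes additive shares)."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


def memory_ratio(mem_mb, baseline_mb=BASELINE_MEM_MB):
    """Memory footprint relative to the baseline controller."""
    return round(mem_mb / baseline_mb, 1)


pre_attack_total = sum(PRE_ATTACK_CPU.values())
attack_low, attack_high = total_range(ATTACK_CPU)

print(f"pre-attack CPU: {pre_attack_total}%")
print(f"attack CPU:     {attack_low}-{attack_high}%")
print(f"memory:         {memory_ratio(320)}x steady state, {memory_ratio(440)}x peak")
```

Under the additivity assumption, the upper end of the attack-phase range is the only part that crosses the 90% saturation level, which matches the observation that saturation appears once the inference share approaches its maximum.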
Table 5 details the resource attribution by IZTSDN component.
The CALSeq2Seq inference engine emerges as the dominant bottleneck, consuming 55% of the controller CPU during attacks. The observed inference latency is a product of both model complexity and the execution environment. On the model side, the self-attention mechanism’s quadratic complexity and the sequential decoding of the LSTM-Seq2Seq impose fundamental arithmetic costs. On the environment side, the monolithic integration of TensorFlow/Keras within Ryu’s single-threaded Python event loop introduces substantial runtime overhead: Python’s Global Interpreter Lock (GIL) prevents parallel packet processing, tensor allocation under TensorFlow’s eager execution adds non-negligible per-inference latency, and the absence of cross-flow batching leaves CPU vector units underutilized. We therefore attribute the bottleneck to a combination of (i) the quadratic attention architecture, which dominates the FLOP count, and (ii) the Python/TensorFlow execution stack and Ryu’s event-driven architecture, which amplify the wall clock time beyond the raw computational requirements. This distinction is critical because architectural optimizations (linear attention) and environmental optimizations (GPU offloading, gRPC microservices) yield orthogonal improvements. Because this load is concentrated in a single component running inside the Ryu controller’s event loop, it forms a serialization point that limits parallelism and exacerbates queuing delays.
5.3. Scalability Degradation Profile
To assess production viability, we evaluate the performance degradation as the network topology scales beyond the original four-node testbed.
Figure 5 presents the scalability degradation profile across six topology configurations, from 4 to 64 nodes, under a constant attack load (10,000 packets/s).
The results demonstrate super-linear degradation. Non-linear least-squares regression of the mean total latency
against the node count
n (data points:
) yields a power-law fit
with a coefficient of determination
(95% CI for exponent:
). This super-linear growth (
) is significantly worse than in ideal
scaling (
, F-test against linear model). Alternative models tested include logarithmic (
) and exponential (
); the power-law model is statistically preferred according to the Akaike information criterion (AIC). At 64 nodes, the mean latency reaches 235.6 ms (3.9× baseline), the P99 latency extends to 528.4 ms (5.5× baseline), throughput retention drops to 62% (vs. Guo et al.’s claimed 80.5% [4]), and CPU utilization saturates at 94%. The throughput degrades approximately logarithmically with the node count, dropping from 3.5 Gbps (baseline) to 2.17 Gbps (64 nodes). This degradation stems from controller saturation rather than data-plane limitations: packet captures confirm no switch-level drops, indicating that all losses occur at the control-plane processing stage. Saturation Point Identification: The knee in the scalability curve occurs between 32 and 48 nodes, where the latency transitions from moderate growth (95–142 ms) to rapid escalation (142–235 ms). This corresponds to CPU utilization crossing 80%, beyond which scheduler inefficiencies and the context switching overhead compound the base computational load.
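For readers who want to reproduce this kind of scaling fit, the sketch below estimates a power law of the form latency = a · n^b. It swaps in ordinary least squares on log-transformed data as a simpler stand-in for the non-linear least-squares procedure named above, and it runs on synthetic, noise-free points generated from a known law, not on the testbed measurements. The node counts in `demo_nodes` are likewise hypothetical.

```python
import math


def fit_power_law(nodes, latencies):
    """Fit latency = a * n**b by least squares in log-log space.

    Returns (a, b, r_squared); r_squared is computed on the log-transformed
    values, so it is not directly comparable to an R^2 on the raw scale.
    """
    xs = [math.log(n) for n in nodes]
    ys = [math.log(t) for t in latencies]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                      # exponent = slope in log-log space
    log_a = y_mean - b * x_mean        # intercept = log of the prefactor
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic demonstration data from a known law (a = 5, b = 1.3).
# These are NOT the measured latencies; node counts are hypothetical.
demo_nodes = [4, 8, 16, 32, 48, 64]
demo_latency = [5.0 * n ** 1.3 for n in demo_nodes]

a, b, r2 = fit_power_law(demo_nodes, demo_latency)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.4f}")
```

An exponent above 1 indicates super-linear growth and one below 1 indicates sub-linear growth. The log-log fit weights relative rather than absolute errors, so its exponent can differ from a non-linear least-squares estimate on noisy data.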
Relative to competing frameworks, IZTSDN’s scalability profile is mixed. DeepIDS [26] demonstrates superior scaling (tested to 64 switches with a 7% latency increase), attributed to its lighter GRU-RNN architecture versus CALSeq2Seq’s attention mechanism. However, DeepIDS achieves lower accuracy (88% vs. 99.56%), illustrating the fundamental accuracy–scalability trade-off in neural security architectures. Traditional SDP implementations [18, 31] exhibit better throughput retention (75–76%) but provide no threat detection capability, rendering a direct comparison incomplete. IZTSDN’s integrated security functionality justifies the higher overhead, but the magnitude of degradation at scale (38% throughput loss at 64 nodes) limits its deployment viability in large data centers.
5.4. Security–Performance Trade-Off Analysis
To contextualize IZTSDN’s operational costs, we compare it against the baseline SDN and alternative security frameworks using multi-dimensional efficiency metrics.
Figure 6 presents a grouped comparison of the latency, accuracy, CPU overhead, and throughput retention across six configurations.
The comparison reveals distinct operational profiles. Baseline SDN exhibits minimal latency (2.5 ms) with zero security, serving as the cost reference point. SDP [18] provides moderate latency (45 ms) for authentication-only protection with no detection capability (0% accuracy). OverWatch [7] offers a balanced profile (65 ms, 96% accuracy), using traditional machine learning. DeepIDS [26] delivers efficient deep learning (58 ms, 88% accuracy) and serves as a baseline for relative efficiency calculation. IZTSDN under normal conditions achieves high accuracy (99.56%) at a moderate cost (83 ms, 68% CPU). IZTSDN under attack conditions preserves accuracy but the latency triples (235 ms) and CPU saturates (94%). To contextualize the accuracy–latency trade-off, we report a comparative efficiency score rather than a dimensionless ratio. Let $\eta = A / L$ (accuracy points per ms latency), where $A$ is the detection accuracy in percent and $L$ is the latency in ms. Using DeepIDS as the reference ($\eta_{\mathrm{ref}} = 88/58 \approx 1.52$), we compute the relative efficiency: OverWatch $\eta \approx 1.48$ (0.97× reference), IZTSDN-normal $\eta \approx 1.20$ (0.79× reference), IZTSDN-attack $\eta \approx 0.42$ (0.28× reference). We caution that these values combine metrics from different evaluation contexts (DeepIDS tested on CICIDS2017, IZTSDN on the Bennett University SDN dataset) and serve only as coarse indicators, not rigorous rankings. The key insight is qualitative: IZTSDN preserves high accuracy but at a substantially higher latency cost under attacks, whereas lighter frameworks (DeepIDS, OverWatch) offer more stable latency profiles.
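The relative efficiency values can be recomputed directly from the accuracy and latency figures quoted in this subsection. The sketch below takes the score to be accuracy divided by latency (accuracy points per ms) and normalizes by DeepIDS; the function and dictionary names are illustrative only.

```python
# (accuracy in %, latency in ms) for each configuration, as quoted above.
FRAMEWORKS = {
    "DeepIDS": (88.0, 58.0),
    "OverWatch": (96.0, 65.0),
    "IZTSDN-normal": (99.56, 83.0),
    "IZTSDN-attack": (99.56, 235.0),
}


def efficiency(accuracy_pct, latency_ms):
    """Accuracy points delivered per millisecond of latency."""
    return accuracy_pct / latency_ms


def relative_efficiency(frameworks, reference="DeepIDS"):
    """Efficiency of every framework as a multiple of the reference."""
    ref = efficiency(*frameworks[reference])
    return {
        name: round(efficiency(*values) / ref, 2)
        for name, values in frameworks.items()
    }


relative = relative_efficiency(FRAMEWORKS)
for name, ratio in relative.items():
    print(f"{name:>14}: {ratio:.2f}x reference")
```

Because the score is a plain ratio, it inherits the caveat stated above: accuracy and latency were measured on different datasets and testbeds, so the multiples are only coarse indicators.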
For operational deployment in single-controller, virtualized SDN environments, we assess IZTSDN’s cost–benefit ratio across three operational scenarios. In a low-risk environment with normal traffic dominant, IZTSDN provides exceptional value: 99.56% accuracy with 83 ms latency (acceptable for most applications) and 80.5% throughput retention; the 2.1× memory overhead is manageable with modern server hardware (256+ GB RAM). In a high-threat environment with frequent attacks, the operational costs become prohibitive: the latency rises to 235 ms (P99: 528 ms), violating the real-time requirements for latency-sensitive applications (financial trading, industrial control), and CPU saturation at 94% creates a self-inflicted DoS risk where the security framework itself could cause controller failure. In a large-scale deployment exceeding 64 nodes, scalability limits are exceeded: the super-linear latency growth and 38% throughput degradation suggest that IZTSDN in its current form is unsuitable for production data centers beyond 32 nodes without architectural modification.
5.5. Bottleneck Analysis and Optimization Opportunities
Our measurements identify specific architectural constraints and allow the proposal of targeted optimizations. The latency breakdown reveals three primary bottlenecks. First, CALSeq2Seq inference serialization is the largest contributor, accounting for 73–85% of the end-to-end latency and 55% of the controller CPU during attacks; the deep learning engine executes as a blocking call within Ryu’s packet-in handler, preventing parallel flow processing. Optimization: Migrate inference to a dedicated GPU-accelerated service via gRPC/REST API, enabling asynchronous processing and batching efficiency, with expected gains of 60–70% latency reduction. Second, the self-attention mechanism has quadratic complexity ($O(L^2)$ in the sequence length $L$), so its computational cost grows quadratically as sequences lengthen. Optimization: Replace full attention with linear attention variants or switch to efficient Transformer alternatives (Performer, Linformer), with expected gains of 40–50% inference speedup and less than 1% accuracy loss. Third, controller CPU saturation (90%+ utilization) results from Python’s Global Interpreter Lock (GIL) and TensorFlow’s thread pool contention, limiting multi-core utilization. Optimization: Deploy IZTSDN as a distributed microservice architecture with the inference service on GPU nodes, trust calculation on CPU nodes, and the controller focused solely on OpenFlow operations, with expected gains of 3–5× in throughput capacity.
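The first optimization, moving inference off the packet-in path and batching queued flows, can be sketched with the Python standard library. This is a minimal in-process illustration and not the proposed gRPC/GPU service: `BatchingInferenceWorker` and `threshold_classifier` are hypothetical names, and the classifier is a stub standing in for CALSeq2Seq. A background thread alone does not bypass the GIL, so the sketch shows only the non-blocking submit-and-batch shape of the design.

```python
import queue
import threading
from concurrent.futures import Future


class BatchingInferenceWorker:
    """Classifies queued flows on a background thread, in small batches."""

    def __init__(self, classify_batch, max_batch=32, max_wait_s=0.005):
        self._classify_batch = classify_batch
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s
        self._queue = queue.Queue()
        self.batch_sizes = []  # recorded so batching behaviour can be inspected
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, features):
        """Non-blocking: enqueue one flow and return a Future for its label."""
        future = Future()
        self._queue.put((features, future))
        return future

    def close(self):
        """Process what is queued, then stop the worker thread."""
        self._queue.put(None)  # sentinel
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:
                return
            batch, stop = [item], False
            # Gather more flows until the batch is full or the queue idles.
            while len(batch) < self._max_batch:
                try:
                    nxt = self._queue.get(timeout=self._max_wait_s)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            labels = self._classify_batch([feat for feat, _ in batch])
            self.batch_sizes.append(len(batch))
            for (_, future), label in zip(batch, labels):
                future.set_result(label)
            if stop:
                return


def threshold_classifier(batch):
    """Stub model (assumption): flags flows by a fixed packet-rate cutoff."""
    return ["attack" if f["pkt_rate"] > 1000 else "normal" for f in batch]


worker = BatchingInferenceWorker(threshold_classifier, max_batch=8)
futures = [worker.submit({"pkt_rate": r}) for r in (10, 5000, 20, 9000)]
labels = [f.result(timeout=2) for f in futures]
worker.close()
print(labels, worker.batch_sizes)
```

In a controller, the packet-in handler would call `submit` and install the flow rule from a completion callback instead of waiting on the result. The `max_wait_s` parameter trades a small added delay for larger batches.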
Based on our overhead measurements, we hypothesize the following resource optimization strategies, with the expected gains derived from published benchmarks in related domains; empirical validation remains for future work. The first is model quantization: convert CALSeq2Seq from FP32 to INT8 precision, achieving a typical 2–4× speedup with minimal accuracy degradation (expected: 99.2% vs. 99.56%). The second is incremental trust evaluation: cache UBTA results for stable connections, reducing the trust calculation frequency from per-packet to per-flow, with an expected 50% reduction in the authentication latency. The third is an early exit mechanism: implement confidence thresholding in CALSeq2Seq so that high-confidence predictions exit early without full Seq2Seq decoding, yielding an expected 30% average latency reduction for clear attack/normal cases. The fourth is control-plane offloading: deploy P4-based smart NICs or programmable switches for initial flow classification, reducing the controller packet-in rate by 60–80%.
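The incremental trust evaluation idea can be illustrated with a small time-to-live cache keyed by flow. The sketch below is hypothetical: `FlowTrustCache`, the 30 s TTL, and the constant trust function are illustrative stand-ins, and the UBTA computation itself is not reproduced.

```python
import time


class FlowTrustCache:
    """Caches one trust score per flow for a limited time (TTL)."""

    def __init__(self, compute_trust, ttl_s=30.0, clock=time.monotonic):
        self._compute_trust = compute_trust  # stand-in for the UBTA call
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}      # flow_key -> (score, expiry_time)
        self.evaluations = 0    # number of full trust computations performed

    def trust(self, flow_key):
        """Return the cached score, recomputing only after expiry."""
        now = self._clock()
        cached = self._entries.get(flow_key)
        if cached is not None and now < cached[1]:
            return cached[0]
        score = self._compute_trust(flow_key)
        self.evaluations += 1
        self._entries[flow_key] = (score, now + self._ttl_s)
        return score

    def invalidate(self, flow_key):
        """Drop a cached score, e.g. when the detector flags the flow."""
        self._entries.pop(flow_key, None)


# Usage with a fake clock so the example is deterministic.
fake_now = [0.0]
cache = FlowTrustCache(lambda key: 0.9, ttl_s=30.0, clock=lambda: fake_now[0])
flow = ("10.0.0.1", "10.0.0.2", 6, 40000, 443)  # hypothetical 5-tuple

for _ in range(100):          # 100 packets of the same flow
    cache.trust(flow)
evaluations_in_first_window = cache.evaluations

fake_now[0] = 31.0            # TTL has elapsed
cache.trust(flow)
print(evaluations_in_first_window, cache.evaluations)
```

The TTL bounds how stale a trust score can become, and `invalidate` lets the anomaly detector force an immediate re-evaluation. Choosing the TTL is a security trade-off that would need the empirical validation noted above.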
5.6. Discussion and Summary
Our comprehensive performance evaluation reveals that IZTSDN achieves exceptional detection accuracy (99.56%) but imposes significant operational costs: a 3.9× latency increase and a 4.2× CPU overhead at scale, with saturation occurring at 32–48 nodes under attack conditions. The deep learning inference component (CALSeq2Seq) emerges as the dominant bottleneck, consuming 55% of the controller CPU during attacks and creating a serialization point that limits scalability. IZTSDN’s relative efficiency is already below that of DeepIDS and OverWatch under normal operation (0.79× the DeepIDS reference, versus 0.97× for OverWatch), and it degrades severely under attacks (0.28×), precisely when security is most critical. This attack-induced degradation represents the primary barrier to production deployment. The super-linear scalability profile suggests architectural constraints beyond the raw computational load, likely stemming from Python’s GIL and TensorFlow’s threading model. Our identified optimizations (GPU offloading, model quantization, distributed microservices) may offer pathways to 3–5× performance improvements, potentially extending viable deployment to 128+ nodes. Recommendation: IZTSDN in its current form is suitable for small-to-medium deployments (4–16 nodes) with moderate security requirements. For large-scale or high-threat environments, we recommend implementing the identified optimizations (particularly GPU-accelerated inference offloading) or adopting a hybrid architecture combining IZTSDN’s accuracy with DeepIDS’s efficiency for a tiered threat response.