1. Introduction
Recent projections indicate that the number of connected devices globally will reach several billion in the coming years, due to the proliferation of internet technologies, such as the Internet of Things (IoT) and its associated systems and devices [1, 2]. A parallel trend is also noticeable in the growth of cloud technologies, including cloud-based software, surveillance solutions, real-time network automation, and distributed connections [
3]. In relation to this, software-defined networking (SDN), which introduces a decoupling of control and data planes in IoT networks, has also become a prevalent technology, highlighting the need for scalable and intelligent network management solutions [
4]. Both past and ongoing research point to SDN as a foundational component of IoT architectures [
5,
6,
7]. One early contribution in this area employed multilayer SDN controllers specifically designed for heterogeneous vehicular IoT traffic, demonstrating end-to-end quality of service (QoS) guarantees through network-calculus-based scheduling mechanisms [
8].
SDN, which leverages the OpenFlow protocol, is a rapidly evolving technology that significantly enhances network management, administration, and monitoring processes [
4]. By enabling programmable interaction with data plane elements and providing network administrators with a holistic view of the network through a controller, SDN positions the controller as the network’s intelligence, which is responsible for managing devices in the data plane via the application-programming interface (API) [
4]. This capability has led to the adoption of SDN-based frameworks in advanced IoT network topologies, including fifth-generation (5G) and next-generation network architectures, such as sixth-generation (6G) [
9,
10,
11].
While the programmability, flexibility, and decoupled control of SDN are critical to its effective implementation, these same features can introduce significant security challenges [12]. IoT networks and other SDN-reliant systems are susceptible to a number of security risks, such as an increased likelihood of distributed denial-of-service (DDoS) flooding attacks [13, 14]. These vulnerabilities highlight the pressing need for advanced and intelligent security frameworks that can mitigate evolving threats, such as DDoS flooding attacks, while preserving the operational benefits of SDN-based architectures in network environments such as IoT systems.
DDoS flooding attacks are coordinated assaults that use a network of compromised devices, called a botnet, to overwhelm a system. The goal is to exhaust the network’s bandwidth or shut down specific servers and devices, making them unavailable to legitimate users [15]. These attacks constitute a significant portion of the most dangerous malicious traffic on the internet [16]. Typically, attackers deploy botnets to execute these operations covertly [17]. As a result, end users of the compromised device nodes often remain unaware that their devices and internet protocol (IP) addresses are being used to perpetrate DDoS flooding attacks. IoT devices are especially vulnerable to coordinated DDoS flooding attacks because of how they are built and because the internet does not enforce strict traffic control on individual devices [16]. This combination makes it easier for attackers to target and overwhelm IoT devices. The vulnerability is compounded by the rapid growth in the number of deployed IoT devices, even though their security is improving over time [18]. In the face of increasingly complex DDoS flooding attacks, innovative and proactive security techniques that address the particular vulnerabilities of IoT networks are needed.
As IoT networks that utilize SDN architectures expand in complexity and scale, optimizing their performance and security presents significant challenges that demand robust and well-optimized network configurations [
19]. Identifying critical network performance metrics and assessing their influences on the overall network performance through sensitivity analysis (SA) have consistently been put forward as pivotal tools for enhancing the reliability of network configurations [
20]. This is primarily because developing a reliable IoT network model requires a comprehensive understanding of how variations in model parameters influence the network’s behavior and the accuracy of its predictive outcomes. SA plays a critical role in this process by systematically analyzing the importance of input parameters, evaluating their contributions to the network model’s outputs, and quantifying the impacts of individual inputs on the overall system performance [
21].
SA techniques are commonly categorized based on their scope, as either local or global, and their framework, as either deterministic or statistical [
22,
23]. As discussed in [
21,
24,
25], statistical frameworks for SA are generally derived from principles associated with the design of experiments (DOE). These frameworks also classify SA methods into local and global approaches, depending on the parameter space under consideration and the specific objectives to be achieved [
26]. SA, both local (LSA) and global (GSA), is critical for understanding the influences of different parameters on the performance of IoT networks that are SDN reliant [
27]. Therefore, advancing the application of SA in SDN-reliant IoT networks is vital for developing adaptive and data-driven models that enhance predictive accuracy and system resilience.
The application of artificial intelligence (AI) techniques, machine-learning (ML) techniques in particular, is growing significantly in numerous fields and disciplines, including SA within SDN [10, 26]. In the context of LSA, ML-based approaches often rely on gradient-based methods to evaluate how specific parameters affect performance metrics [28]. Techniques such as gradient-boosting machines (GBMs) and artificial neural networks (ANNs) are widely used to approximate gradients efficiently to support parameter tuning and optimization [28, 29]. ANNs, in particular, offer enhanced capabilities to capture complex dependencies between parameters and performance for ML-assisted LSA in SDN environments [30]. These capabilities not only facilitate the identification of critical parameters but also provide actionable insights for improving SDN configurations for IoT networks.
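
As an illustration of the gradient-based idea behind LSA, the following minimal sketch estimates local sensitivities with central finite differences around a baseline configuration. The function `toy_latency_ms`, the parameter names, and the baseline values are hypothetical stand-ins for a trained performance model and are not taken from this work.

```python
from typing import Callable, Dict


def local_sensitivity(
    model: Callable[[Dict[str, float]], float],
    baseline: Dict[str, float],
    rel_step: float = 1e-4,
) -> Dict[str, float]:
    """Central-difference estimate of d(output)/d(parameter) at `baseline`."""
    gradients = {}
    for name, value in baseline.items():
        # Step proportional to the parameter's magnitude (absolute if it is 0).
        h = rel_step * abs(value) if value != 0 else rel_step
        up = dict(baseline, **{name: value + h})
        down = dict(baseline, **{name: value - h})
        gradients[name] = (model(up) - model(down)) / (2.0 * h)
    return gradients


def toy_latency_ms(p: Dict[str, float]) -> float:
    # Hypothetical stand-in for a trained model (assumption, not the paper's):
    # a queueing-style delay term plus a small flow-table lookup penalty.
    headroom = p["link_capacity_pps"] - p["packet_rate_pps"]
    return 1000.0 / headroom + 0.01 * p["flow_table_entries"]


baseline = {
    "link_capacity_pps": 1000.0,
    "packet_rate_pps": 600.0,
    "flow_table_entries": 200.0,
}
grads = local_sensitivity(toy_latency_ms, baseline)
for name, g in grads.items():
    print(f"{name}: {g:+.5f} ms per unit")
```

Because the estimate is taken at a single operating point, it says nothing about behavior far from the baseline, which is the gap that GSA is meant to cover.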
For GSA, ML techniques, such as Gaussian processes [31], can play an important role in improving traditional methods, such as Monte Carlo simulations [32]. By training ML models on subsets of data, predictions can be extended across a larger parameter space, enabling a more comprehensive analysis [
33]. Gaussian processes are also effective for modeling the distribution of network performance metrics under varying parametric conditions, providing a probabilistic framework that supports GSA [
34]. Furthermore, Monte Carlo simulations, when combined with these ML models, provide a detailed evaluation of how parameter variations affect the overall network performance [
35]. ML techniques, such as regression methods for surrogate modeling, further streamline GSA by approximating the behaviors of complex network systems, reducing computational costs, and improving scalability [
26,
35]. These ML-based approaches emphasize the potential of ML-assisted SA methodologies to optimize SDN configurations for IoT network applications and improve their performance and reliability in increasingly complex IoT network environments.
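
The surrogate-plus-Monte-Carlo workflow described above can be sketched as follows. To keep the example dependency-free, a linear least-squares model is swapped in for the Gaussian process, and `expensive_simulation` is a hypothetical placeholder for a costly network simulation; neither choice is taken from this work.

```python
import random
from typing import List, Sequence


def fit_linear_surrogate(X: List[Sequence[float]], y: List[float]) -> List[float]:
    """Ordinary least squares via the normal equations; returns [bias, w1, ..., wd]."""
    rows = [[1.0, *x] for x in X]
    n = len(rows[0])
    # Augmented system (A^T A | A^T y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coeffs: Sequence[float], x: Sequence[float]) -> float:
    return coeffs[0] + sum(w * v for w, v in zip(coeffs[1:], x))


def expensive_simulation(x: Sequence[float]) -> float:
    # Hypothetical metric over two inputs normalized to [0, 1] (assumption).
    return 2.0 + 3.0 * x[0] + 0.5 * x[1]


rng = random.Random(0)

# Step 1: train the surrogate on a small number of "expensive" runs.
train_X = [[rng.random(), rng.random()] for _ in range(30)]
train_y = [expensive_simulation(x) for x in train_X]
coeffs = fit_linear_surrogate(train_X, train_y)

# Step 2: Monte Carlo over the whole input space using only the cheap surrogate.
mc_outputs = sorted(
    predict(coeffs, [rng.random(), rng.random()]) for _ in range(20000)
)
mc_mean = sum(mc_outputs) / len(mc_outputs)
mc_p95 = mc_outputs[int(0.95 * len(mc_outputs))]
print(f"mean={mc_mean:.3f}, 95th percentile={mc_p95:.3f}")
```

The computational saving comes from step 2: the 20,000 evaluations hit the surrogate rather than the simulator, so only the 30 training runs are expensive.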
Practitioners often aim to enhance the effectiveness and efficiency of ML models by regulating and optimizing key processes in SDN environments [
36]. This practice inherently aligns with the principles of DOE, a data-driven approach that plays a crucial role in SA [
25,
37,
38]. DOE provides a structured experimental framework that is essential for addressing the complexities of both SA and ML challenges, particularly in the behavioral analysis of IoT networks [
39]. For instance, QoS is a vital metric in IoT networks [
40]. DOE can be used to systematically identify relationships between factors that influence QoS in IoT networks, thus improving network performance while minimizing the need for extensive experimental trials [
25,
41]. Furthermore, DOE can facilitate a targeted investigation of cause-and-effect relationships in scenarios involving the application of ANNs and evolutionary algorithms (EAs), both of which are commonly applied in IoT networks [
42,
43]. This structured approach reduces reliance on trial-and-error methods, saving time and computational resources. Hence, the integration of DOE with ML clearly offers great potential for enhancing model interpretability, reproducibility, and optimization efficiency within complex network environments, such as IoT networks.
A popular example of DOE in practice is Latin hypercube sampling (LHS) [44, 45]. LHS is a widely used technique for addressing complex and high-dimensional problems, including ML-assisted global optimization of complex models [46, 47]. LHS works by dividing the range of each variable into equal-probability strata and drawing one sample from each stratum, which ensures uniform coverage of each variable across its range and is particularly advantageous for simulation-based optimization tasks in computational intelligence [46]. This approach significantly improves the convergence rates of ML models, such as ANNs, making it a valuable tool in data-driven analytics [47]. By integrating DOE methodologies, such as LHS, with ML for SA implementation, more robust insights can be obtained about model behavior, enabling the development of more reliable and resilient network systems. This is why extending the application of DOE within ML-assisted SA frameworks remains a promising direction for research efforts aimed at optimizing the performance and reliability of network systems, such as SDN-reliant IoT networks.
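
The stratification step can be illustrated with a short, dependency-free sketch of basic LHS. The parameter names and ranges below are hypothetical examples of network inputs, not the design used in this work.

```python
import random
from typing import Dict, List, Tuple


def latin_hypercube(
    bounds: Dict[str, Tuple[float, float]], n_samples: int, seed: int = 0
) -> List[Dict[str, float]]:
    """Basic LHS: each variable's range is cut into n_samples equal strata,
    and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One point drawn uniformly inside each of the n equal-width strata.
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        # Shuffle so that strata are paired at random across variables.
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


# Hypothetical input ranges (assumptions for illustration only).
bounds = {
    "packet_rate_pps": (100.0, 10000.0),
    "flow_timeout_s": (1.0, 60.0),
    "link_bandwidth_mbps": (10.0, 1000.0),
}
samples = latin_hypercube(bounds, n_samples=10)
for s in samples[:3]:
    print(s)
```

With plain random sampling, ten draws could easily leave parts of a variable's range empty; here every tenth of each range is guaranteed to contain exactly one point.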
Generative adversarial networks (GANs), a class of ML frameworks, have become popular in recent years due to their ability to generate synthetic data by learning the underlying distribution of real-world datasets [48]. Originally introduced in [49], GANs typically comprise two neural networks: a generator, which creates synthetic data, and a discriminator, which aims to differentiate between generated and real data [50]. These networks are trained simultaneously in a zero-sum game, in which the generator iteratively refines its output to deceive the discriminator while the discriminator learns to tell generated data from real data [51]. Through this iterative process, the generator can create highly realistic synthetic data, making GANs an effective tool for a variety of computing and networking applications [52, 53]. The growing role of GANs in optimizing the performance and addressing the complexities of modern network systems [54] further emphasizes their potential to support ML-driven SA in SDN-reliant IoT networks.
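
For reference, the zero-sum game described above is commonly written as the following minimax objective, which is the standard formulation from the original GAN literature. The symbols are not defined elsewhere in this section and are introduced here only for illustration: $G$ is the generator, $D$ the discriminator, $p_{\mathrm{data}}$ the distribution of real data, and $p_z$ the noise prior fed to the generator. The GAN variant used in this work may adopt a different loss.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximizes $V$ by assigning high probability to real samples and low probability to generated ones, while the generator minimizes $V$ by producing samples that the discriminator scores as real.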
This work introduces a new data-driven framework that hybridizes LHS-based DOE, GAN-based synthetic data generation, and ANN-assisted SA to facilitate an enhanced behavioral study, including anomaly detection and classification (ADC), in SDN-reliant IoT networks. Termed DOE-GAN-SA, the proposed hybrid framework leverages simulated network scenarios, LHS-driven augmentation of network performance data, GAN-enabled synthetic data generation, and ANN-assisted SA to provide a comprehensive approach to behavioral analysis within SDN-reliant IoT networks. DOE-GAN-SA exploits the complementary roles of LHS-based DOE, GAN-driven synthetic data generation, and ANN-based supervised learning to support ADC and SA in SDN-reliant IoT network environments. The key contributions of this work include:
Using LHS and GAN for data augmentation in SDN-reliant IoT networks;
Showing the application of a newly augmented SDN-reliant IoT network dataset for detecting and classifying DDoS flooding attacks;
Improving the mitigation of DDoS flooding attacks through improved detection accuracy and ANN-assisted SA;
Demonstrating DOE-GAN-SA as a new hybrid ADC and SA framework for SDN-reliant IoT networks.
To the best of our knowledge, and based on the literature reviewed in Section 2, this study is the first to introduce a hybrid approach combining DOE and ML techniques for the specific purpose of conducting SA on network performance metrics within SDN-reliant IoT networks. Although the individual methodologies employed are well-established within their respective domains, this work represents the first documented integration of these techniques for this targeted application. The structure of the remainder of this paper is as follows: Section 2 reviews the recent literature relevant to the research conducted to provide a foundation for the study. Section 3 outlines the SDN architecture and IoT network topology utilized to simulate various network scenarios to establish the technical context. The basic techniques guiding the formulation of the proposed DOE-GAN-SA framework are introduced in Section 4, and the proposed DOE-GAN-SA framework is detailed in Section 5, highlighting its components and functionalities. Section 6 describes the experimental setup, presents the results, and discusses the findings, offering insights into DOE-GAN-SA’s performance, scope, and recommended approach for its practical adoption. Finally, Section 7 provides the concluding remarks, summarizing the key contributions and potential directions for future work.
2. Related Work
SDN is a widely used approach for scaling heterogeneous IoT deployments, primarily due to its decoupling of control and data planes. However, much of the existing research in this area still relies heavily on heuristic-based parameter tuning, often overlooking the importance of feedback mechanisms that link anomaly detection with flow-rule adaptation [
55]. Early contributions, such as [
8], demonstrated the feasibility of SDN for vehicular IoT infrastructures via the multipurpose infrastructure for network applications (MINA)-SDN controller, yet their study did not address the critical constraints related to sensor energy consumption and embedded security mechanisms. Subsequent works, including the survey by [
56], emphasized the necessity of slice isolation and controller scalability in supporting large-scale IoT deployments. Similarly, Ref. [
5] introduced software-defined APIs designed for smart city infrastructures to enable the shared use of gateways and cloud services, accompanied by some quantitative analysis using a case study. More recently, Ref. [
6] showcased a multilevel architecture that effectively reduced packet loss in smart home networks; however, several challenges persist. Further highlighting these challenges, a comprehensive meta-analysis, conducted by [
7] and spanning over 160 studies, identified microservice-based controllers, flow table compression techniques, and energy-efficient routing as pivotal components in the secure deployment of SDN-reliant IoT systems for smart communities. Collectively, these studies highlight the importance of programmability and slice isolation, among other design strategies, in supporting the development of secure and scalable IoT infrastructures. Nonetheless, none of the reviewed approaches adequately addresses data-scarce SA or integrates ADC mechanisms directly within SDN architectures, an evident gap that the DOE-GAN-SA framework proposed in this work seeks to bridge.
While GANs have found several applications across multiple disciplines [57], their adoption for ADC and SA, particularly in SDN-reliant IoT networks, is arguably still in its early stages. Since the introduction of GANs in [49], numerous studies have demonstrated their practicality in data augmentation. For instance, the work in [
58] investigated the use of GANs to generate realistic network traffic data, enabling the simulation of network behaviors under various conditions. However, this study, like other similar recent works [
59], did not specifically address ADC and SA, highlighting the untapped prospect of GAN-based synthetic data generation in supporting robust ADC and SA within SDN-reliant IoT network environments. Supporting this perspective is a recent study that demonstrated the effectiveness of GANs in augmenting datasets necessary for analyzing complex high-dimensional systems and improving classification performance [
60]. Despite these promising developments, the current literature indicates that the full potential of GANs in the context of ADC and SA within SDN-reliant IoT networks remains largely unexplored, offering significant opportunities for further research. Hence, the present work explores the integration of GANs into ADC and SA workflows with the aim of exploiting their synthetic data generation capabilities for deeper insight into parameter-driven behaviors within complex network environments.
SA in SDN-reliant networks, such as IoT networks, has traditionally relied on standard techniques, like variance-based methods and Monte Carlo simulations [24, 61]. These conventional approaches, while effective for several applications, often require large datasets and substantial computational resources [62], making them less practical for low-latency, real-time, large-scale systems, such as IoT networks [63]. Variance-based SA methods, for example, are well suited to measuring the contribution of each input factor to the overall variance of the output, which helps to identify key influential parameters [30, 64]. However, in scenarios that involve a large number of parameters or complex interaction effects, they can become impractical due to increased computational demands and unstable estimates [65]. To address these challenges, researchers have explored alternative techniques, such as sample-based estimations, that reduce computational overhead without sacrificing analytical accuracy [66]. However, this often comes at the cost of introducing new design parameters [66].
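
To make the variance-based idea concrete, the sketch below estimates first-order Sobol indices with a Monte Carlo pick-freeze scheme (a Saltelli-style estimator). The function `toy_metric` and its three normalized inputs are hypothetical, and the example also shows why such methods are costly: it needs `n * (dim + 2)` model evaluations.

```python
import random
from typing import Callable, List, Sequence


def first_order_sobol(
    model: Callable[[Sequence[float]], float],
    dim: int,
    n: int = 20000,
    seed: int = 0,
) -> List[float]:
    """Estimate first-order Sobol indices for independent inputs on [0, 1]."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(dim)] for _ in range(n)]
    B = [[rng.random() for _ in range(dim)] for _ in range(n)]
    fA = [model(a) for a in A]
    fB = [model(b) for b in B]

    pooled = fA + fB
    mean = sum(pooled) / len(pooled)
    total_var = sum((y - mean) ** 2 for y in pooled) / len(pooled)

    indices = []
    for i in range(dim):
        # Matrix A with column i taken from B: only input i differs from A.
        fABi = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Centering fB leaves the expectation unchanged and reduces noise.
        numerator = sum(
            (yb - mean) * (yabi - ya) for yb, yabi, ya in zip(fB, fABi, fA)
        ) / n
        indices.append(numerator / total_var)
    return indices


def toy_metric(x: Sequence[float]) -> float:
    # Hypothetical additive metric (assumption): the third input has no effect.
    return 4.0 * x[0] + 2.0 * x[1] + 0.0 * x[2]


sobol = first_order_sobol(toy_metric, dim=3)
print([round(s, 2) for s in sobol])
```

For this additive toy function the exact indices are 0.8, 0.2, and 0, so the estimates should land close to those values; with interacting inputs, the first-order indices would no longer sum to one.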
Statistical sampling methods, such as LHS, are well-known for their efficiency in parameter selection and analysis [
67]. Unlike random sampling, LHS ensures a more uniform coverage of the input space to improve robustness and minimize computational costs [
47]. Despite its advantages, LHS is not without its own limitations. One significant drawback of LHS is its inability to thoroughly account for statistical relationships between input variables [
47,
67], which can compromise SA accuracy in systems such as SDN-reliant IoT networks, where inputs could be highly correlated [
39]. Combining LHS with GANs offers a promising solution to this limitation. GANs are capable of generating synthetic datasets that tend to preserve some statistical properties of real-world network data while also covering the input space of real-world network traffic reasonably well [68]. Combining LHS-based data and GAN-generated data in a unified framework, as carried out in this work, harmonizes their advantages in data augmentation. This hybrid approach has the potential to improve SA in SDN-reliant IoT networks, enabling more effective analyses of such complex and high-dimensional systems while meeting the demands of real-time applications.
Although GANs have demonstrated significant potential in synthetic data generation across various applications [
69], as established earlier, their application for SA in SDN-reliant networks, such as IoT networks, remains underexplored. Much of the existing research in this area has concentrated more on enhancing GAN architectures for data generation or refining synthetic data generation techniques using GANs [
69,
70]. Therefore, there is a gap in studies that harness the complementary strengths of GANs and LHS to perform SA in SDN-reliant IoT networks. Conventional SA frameworks, while effective in many scenarios, are often purpose built and may not necessarily offer additional insights into other operations, such as data augmentation and ADC, when implemented. These limitations also emphasize the need for innovative approaches that can efficiently address the complexities of modern network systems by providing more robust insights while maintaining analytical precision and complementing existing SA methods. Hence, exploring a hybrid SA framework that combines DOE and ML techniques, as typified in this work, can potentially redefine the methodological landscape for analyzing intelligent, large-scale network infrastructures in a resource-efficient and scalable manner.
Specifically, this work addresses the gaps discussed above by proposing the DOE-GAN-SA framework, which integrates LHS-based DOE with GAN-driven synthetic data generation to enable efficient and scalable ADC and ANN-assisted SA in SDN environments. GANs are leveraged to generate high-quality synthetic datasets that preserve some properties of real-world network data while enhancing data diversity. This capability is paired with the structured uniform sampling offered by LHS, which ensures comprehensive coverage of the input parameter space with reduced computational costs. By combining these methodologies, the DOE-GAN-SA framework facilitates a more robust and comprehensive approach to SA and ADC, enabling the precise identification of critical parameters that impact SDN performance. As a result, the DOE-GAN-SA framework introduced in this study is expected to help create more efficient and reliable IoT network setups by enhancing ADC and improving SA for SDN-reliant IoT networks.