Review

A Comprehensive Review: The Evolving Cat-and-Mouse Game in Network Intrusion Detection Systems Leveraging Machine Learning

1 Departments of Cybersecurity and Petroleum Systems Control Engineering, Tikrit University, Tikrit 34001, Iraq
2 Department of Cybersecurity, Tikrit University, Tikrit 34001, Iraq
3 College of Health and Medical Technologies-Al-Dour, Northern Technical University, Mosul 41003, Iraq
4 Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL 32816, USA
* Author to whom correspondence should be addressed.
J. Cybersecur. Priv. 2026, 6(1), 13; https://doi.org/10.3390/jcp6010013
Submission received: 8 October 2025 / Revised: 5 December 2025 / Accepted: 23 December 2025 / Published: 4 January 2026

Abstract

Machine learning (ML) techniques have significantly enhanced decision support systems, rendering them more accurate, efficient, and faster. However, ML classifiers used to secure networks face a disproportionate risk from sophisticated adversarial attacks compared with other areas, such as spam filtering and virus detection, and this introduces a continuous competition between malicious users and defenders. Attackers probe ML models with inputs specifically crafted to evade detection and force inaccurate predictions. This paper presents a comprehensive review of attack and defensive techniques in ML-based NIDSs. It highlights the serious challenges these systems currently face in preserving robustness against adversarial attacks. Our analysis shows that, despite their currently superior performance, ML-based NIDSs require urgent attention to develop more robust techniques that can withstand such attacks. Finally, we discuss the existing approaches for generating adversarial attacks and reveal the limitations of current defensive approaches. We also highlight the most recent advancements, such as hybrid defensive techniques that integrate multiple strategies to prevent adversarial attacks in NIDSs, and the ongoing challenges they face.

1. Introduction

The increasing quantity, scale, and sophistication of network attacks have had a broad range of consequences for individuals and victimized enterprises, ranging from large monetary losses to widespread power outages. The goal of these attacks is to gain illegal access to data flows or disrupt the services offered to customers. Such attacks have significant impacts not only on economic and financial affairs, but also on national and cultural security [1,2,3]. Therefore, public and private sector institutions are expected to spend very large amounts of money to develop efficient techniques that can identify and eradicate breaches, given the significant negative impacts that such attacks generate [4]. Cyberattacks can cause serious damage to a country's financial systems, economic stability, national security, and cultural protection. Because of these high risks, it is essential to block such attacks, whether they originate inside or outside the country, and whether they target government or business organizations [5].
In computer servers and modern networks, firewalls and other rule-based security solutions have been commonly applied to block active attacks. However, these solutions cannot completely detect malicious activities or contemporary threats [6]. As a result, intrusion detection systems (IDSs) have been proposed to continuously monitor traffic and raise alarm signals when any suspicious events take place. The IDS analyzes the network by collecting sufficient data and information in order to detect unusual sensor node behavior, illegal access, improper use, unauthorized users, hackers, cloaked malware, and many other risks [7].
IDSs can broadly be categorized as network- and host-based intrusion detection systems (NIDS and HIDS). HIDS involves monitoring all the network ports and configuration parameters of the target device, which necessitates host-specific settings. Typically, HIDS, considered a passive technique [8], is implemented on single systems, each of which needs a small program to monitor the operating system and create alarms. NIDS, in contrast, keeps track of all incoming and outgoing packets in a computer network, while HIDS only monitors specific actions, such as which applications are allowed to run and which files can be accessed [9]. This article reviews the existing NIDS studies that focus on analyzing the flow of information between computers, i.e., network traffic. Accordingly, NIDS can effectively sniff out suspicious activity and protect network hosts from sophisticated attacks. Note that both forms of IDS use either signature- or anomaly-based detection to evaluate the data. Figure 1 shows the IDS (NIDS/HIDS) integrated with the ML workflow, including signature and anomaly detection. As of 2025, adversarial attacks have evolved, with new NIST taxonomies emphasizing network-specific vulnerabilities and mitigations [10].
The signature-based IDS employs pre-determined and pre-configured attack signatures to discover malicious tasks and activities. It is appropriate for detecting predefined types of attacks but cannot be used to detect unexpected ones. As a result, an attacker can build special code to fool the system once he or she obtains access to the IDS. In contrast, the anomaly-based IDS techniques have been designed to recognize anomalous data patterns and are commonly employed to discover frauds, flaws, and invasions. More specifically, anomaly-based IDS techniques analyze the user activities in the system logs to determine which are legal and which are not. This method can be leveraged to prevent new types of attacks without any prior knowledge [6].
The NIDS is considered one of the most promising active security solutions against adversarial attacks [11,12]. It is designed especially to help network administrators detect and prevent unusual attacks, for which important and sensitive data should be collected to protect the systems and prohibit all aberrant behaviors [13]. Since traditional NIDSs cannot prevent sophisticated attacks, especially in large networks, ML has been shown to effectively mitigate such attacks with low computational time.

1.1. Motivation

The increase in sophisticated adversarial attacks (AAs) on ML-based NIDSs has motivated us to present this review paper. As ML models have become more deeply incorporated into network security, they have also become more susceptible to adversarial manipulation. Given such sophisticated adversarial attacks, network security has become a continuous arms race between assailants and defenders. Studies from 2025 show evasion rates of up to 90% in IoT NIDSs, as reviewed in a recent taxonomy of AAs on DL-based detection [14]. Adversaries constantly develop specialized techniques to bypass detection systems, and defenders should therefore continually update and improve their approaches to prevent such attacks. More specifically, the main objective of this work is to offer an in-depth review of the current state of adversarial examples and defenses in ML-based NIDSs. Unlike previous reviews [15,16,17,18,19,20,21,22,23,24,25,26], which mainly offer general surveys, this review advances the field by synthesizing real-world feasibility, quantitative metrics, and unified defense gaps. We carefully review and analyze studies on attack and defense techniques in order to reveal the real current challenges in existing works and present possible future research. This paper also shows the necessity of leveraging real implemented attacks with modern datasets to soundly assess ML-based NIDSs. Detailed future research directions in this domain are provided in the discussion section.

1.2. Contributions of This Paper

The paper’s contributions are as follows:
  • We give an in-depth review of existing ML-based NIDS attack and defense techniques, including a detailed overview of NIDS, its types, ML techniques, commonly used datasets in NIDS, attacker models, and possible detection methods.
  • The types of powerful attack and defense techniques and the most realistic datasets in NIDS are carefully classified.
  • Existing attack and defensive approaches on adversarial learning in the NIDS are analyzed and evaluated.
  • We identify the real current challenges in detection and defensive techniques that need to be further investigated and remedied in order to produce more robust NIDS models and realistic datasets. Revealing these current challenges guides us to provide insights about future research directions in the area of detecting AAs in ML-based NIDS. This has been explored in detail in the discussion section, highlighting our findings and revealing the limitations of current existing approaches.
  • A comprehensive analysis has been conducted to show the impact of adversarial breaches on ML-based NIDS, including black-, gray-, and white-box attacks.

2. Background

2.1. ML Techniques in NIDS

ML approaches have been utilized to recognize valuable features in network traffic and to detect zero-day attacks, i.e., new, previously unseen threats that are difficult to detect with traditional signature-based methods. ML-based NIDSs improve the generalization of detection techniques to detect more sophisticated attacks [13]. Moreover, it has been shown that ML-based NIDSs can significantly mitigate attacks aimed at bypassing or fooling the system [4]. ML algorithms are mainly classified into three types: reinforcement learning, unsupervised learning, and supervised learning [27,28].
Given the wide use of Artificial Intelligence (AI) in the network security domain, revealing the main weaknesses of ML classifiers to adversarial methods is essential [29]. In cyber defense, adversarial machine learning (AML) attacks pose significant challenges to ML security [13]. These assaults exploit the intrinsic sensitivity of ML models to their internal parameters in order to create strong infected examples that steer the model toward the attackers' goal. Note that one of the strongest attacks impacting ML models is adversarial perturbation [30]. In order to develop models and execute classification tasks, machine learning algorithms need sufficiently large datasets. The learned models can then produce wrong decisions with high confidence due to skillfully crafted perturbations introduced into genuine inputs, called adversarial examples (AEs) [13]. In this review, we cover ML-based NIDS research from 2020 to 2025. Out of 150 screened published papers, 50 papers have been selected because of their direct relevance to adversarial attacks and defenses. GAN- and gradient-based attacks, e.g., DeepFool, have been emphasized due to their prevalence. It is worth noting that even though other emerging techniques, such as AutoML [31], have recently received attention in the NIDS area, they have not been included in this study since this work concentrates on adversarial attack and defensive techniques, rather than automated model techniques.

2.2. Adversarial Machine Learning (AML)

AML involves incorporating perturbations into the given data input to deceive the ML model, which in turn leads to the inaccurate detection outcomes desired by the assaulter. The injected malicious data must remain unnoticed by observers. Crafting such imperceptible perturbations is easier in other domains; in computer vision, for instance, modifying only a few pixels can produce unexpected outcomes. In network traffic, however, this is more challenging due to the unique features and constraints of the domain [32]. To be more specific, AML sits at the crossroads of ML and computer security, and it is usually a battle between two agents [33]. The first agent is a malicious payload intruder whose goal is to infiltrate a specific network, while the second agent's aim is to protect the network from risks caused by the payload [27]. Table 1 illustrates the factors that need to be considered in any attack, including descriptions, NIDS-specific examples, and relevant references. These factors primarily involve knowledge, timing, goals, and capability.

2.2.1. Knowledge

AML threats are broadly partitioned into black-, gray-, and white-box [13]:
  • Black-box attack: No information about the parameters or the classifier’s structure is required for the malicious agents. They should be able to access the input and the output of the models, and the rest is considered a black box.
  • Gray-box attack: Limited information or access to the system should be known to an attacker, such as accessing the preparation of the dataset and the predicted labels (training dataset), or applying a limited number of queries to the model.
  • White-box attack: In this type, it has been assumed that the attacker has complete knowledge and information about the classifier and the used hyperparameters.

2.2.2. Timing

Timing plays a crucial role in modeling AAs. A recent study showed that classification evasion is among the most prevalent attacks in NIDSs, providing a taxonomy of DL-based adversarial attacks [14]. Generally, there are two types of attacks on ML-based NIDSs: evasion and poisoning attacks. The details of these two attacks are demonstrated in Figure 2.
  • Evasion Attack: This attack is employed during the testing phase, where an attacker attempts to force the ML model to classify observations wrongly. In the context of network-based IDSs, the attacker tries to prevent the detection system from detecting malicious or unusual events, so that the NIDS wrongly classifies the behavior as benign. More specifically, four scenarios might occur: (1) Confidence reduction, which reduces the certainty score to cause wrong classification; (2) Misclassification, in which the attacker modifies the result to produce a class different from the original class; (3) Intended wrong prediction, in which the malicious agent generates an instance that fools the model into classifying the behaviour as a targeted, incorrect class; and (4) Source or target misclassification, in which the adversary changes the result class of a certain attack example to a particular target class [34].
  • Poisoning attack: It occurs during the training phase, in which an assaulter tampers with the model or the training data to yield inaccurate predictions. Poisoning attacks mainly involve data insertion (injection), logic corruption, and data poisoning or manipulation. Data insertion occurs when an assailant inserts hostile or harmful data inputs into the original data without changing its characteristics or labels. The phrase "data manipulation" refers to an adversary adjusting the original training data to compromise the ML model. Logic corruption refers to an adversary's attempt to modify the internal model structure, decision-making logic, or hyperparameters to make the ML model malfunction [27].

2.2.3. Goals

Attackers can use a specific algorithm for different reasons. However, in most cases, they either have a specific purpose in mind and require the algorithm to produce a particular result (a targeted attack), or they want to lower the trustworthiness of the algorithm via inducing mistakes (an untargeted attack) [35].

2.2.4. Capability

This refers to the activities an assailant may perform on the target system, including the AI system. The attacker's level of access to the ML detector can be characterized as limited access (can only read the output result), full access (can read and adjust the internal structure of the model and its outputs), or no access to any part [30].
With the increasing interest in leveraging ML techniques in the area of network security, many hostile or malicious attacks against these technologies have become prevalent. This highlights the need for extensive research on solutions and mitigation techniques, as well as review efforts in the field to address these assaults. AML has been presented in many publications in the image and text recognition fields. However, comprehensive studies are still needed to carefully address the issue of adversarial examples in the NIDS area. Even though there are many recent studies, surveys, and reviews in the area of NIDS, many challenges have not been explored yet. Building on the efforts presented in [15,16,17,19,20,21,22,23,25], this paper provides a detailed review of adversarial assaults and their countermeasures to effectively reveal current challenges in the literature and help set up future research directions. The main purpose of this study is to audit and assess advancements in adversarial learning applied to the NIDS space and to highlight the key challenges and areas that require further examination in order to achieve worthwhile improvements in ML-based NIDSs. AML exploits ML's sensitivities and vulnerabilities. According to the 2025 NIST taxonomy, such attacks are categorized based on lifecycle stages and attacker goals [10].

2.3. Adversarial Attacks Based on Machine Learning

The commonly utilized and adopted techniques in cybersecurity are provided in this section. These techniques have played a critical role in addressing both defensive and offensive facets of managing systems and shielding sensitive data.

2.3.1. Generative Adversarial Networks (GANs)

GANs, introduced by Goodfellow et al. in 2014, are a type of neural network model widely employed for generative tasks. They are among the most promising tools in many applications, such as unsupervised learning, image creation, earthquake engineering, and data augmentation [36,37]. A discriminator (D) and a generator (G) are the two components of any GAN model. G creates examples that accurately imitate the original ones that exist in the main traffic [24,38]. When the adversarial learning process is used to train the network, G imitates the data from noisy random inputs. The outcomes are then passed to the D model, which assesses the generated samples and outputs the probability that the data were produced by the original primary dataset or by G. To maximize this probability, G is retrained using the results generated by D. The G and D models play a two-player, zero-sum, mini-max game, in which the generator maximizes the probability value and the discriminator reduces it [39]. Compared to ZOO, GANs are more effective in black-box NIDS settings, with 2025 reviews highlighting their impact on evasion success in network traffic [14].
The loss of the G is computed via the following Equation (1):
$$L_G = \mathbb{E}_{M \in S_{attack},\, N}\,\big[D\big(G(M, N)\big)\big] \tag{1}$$
where N is the n-dimensional noise vector and M is the m-dimensional attack sample vector. D refers to the discriminator, and G refers to the generator. L_G is the generator loss, and S_attack is the set of attack instances. E denotes the expectation taken over all of G's random inputs. Note that it is essential to decrease the value of L_G to elevate the accuracy of G. Two inputs are required to train the D model: (i) adversarial data created by G, and (ii) labels estimated by the IDS. The loss function of D is calculated with the following equation:
$$L_D = \mathbb{E}_{s \in B_{benign}}\,D(s) - \mathbb{E}_{s \in B_{attack}}\,D(s) \tag{2}$$
where E denotes the expectation over the generated data, classified as malicious or benign, and s is a sample from the data collection generated by the G model and used to train the D model. The benign samples are represented by B_benign, while the attack samples are represented by B_attack [40].
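To make Equations (1) and (2) concrete, the following minimal Python sketch computes the two losses from arrays of discriminator scores. It is an illustration only: the function names, the sign conventions followed, and the random scores standing in for discriminator outputs are assumptions, not taken from the reviewed works.

```python
import numpy as np

def generator_loss(d_scores_fake):
    # L_G = E[ D(G(M, N)) ] (Equation (1)); per the convention above, the generator
    # is retrained to drive this value down. d_scores_fake are discriminator
    # scores on generated (adversarial) traffic samples.
    return np.mean(d_scores_fake)

def discriminator_loss(d_scores_benign, d_scores_attack):
    # L_D = E[ D(s_benign) ] - E[ D(s_attack) ] (Equation (2)): the discriminator
    # learns to separate benign samples from generated attack samples.
    return np.mean(d_scores_benign) - np.mean(d_scores_attack)

# Toy usage with random numbers standing in for discriminator outputs.
rng = np.random.default_rng(0)
print(generator_loss(rng.normal(size=64)))
print(discriminator_loss(rng.normal(1.0, 1.0, 64), rng.normal(-1.0, 1.0, 64)))
```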

2.3.2. Zero-Order Optimization (ZOO)

ZOO, also known as bandit or gradient-free optimization, is used to estimate the gradients of functions that are difficult to calculate with traditional methods. Instead of relying on direct gradient information, ZOO approximates gradients by evaluating the function itself, making it useful for optimizing complex problems [1]. Interestingly, ZOO can handle many challenging tasks where gradients are hard to calculate, e.g., attacking large neural networks, network control, reinforcement learning, and tuning deep neural network (DNN) hyperparameters [41].
This technique approximates the function gradient by evaluating the function at nearby points and observing how the output changes. Additionally, it works well in black-box optimization settings, where the model's hyperparameters cannot be accessed [42]. ZOO can create AEs to attack NNs in black-box settings and make the NN models produce incorrect predictions [43], even when the classifier structure is inaccessible to an attacker [41,42]. The optimization problem of ZOO is formulated in the following Equations (3) and (4):
$$\min_{x'} \; \|x' - x\|_2^2 + c \cdot f(x', t) \quad \text{s.t.} \quad x' \in [0, 1]^p \tag{3}$$
in which c > 0 is a regularization parameter and p is the dimension of the input column vector. X represents the original data input associated with the label l, and X′ is the adversarial example associated with the target label t, i.e., f(X′) = t while f(X) = l. The loss function f(X′, t) can be computed using Equation (4):
$$f(X', t) = \max\Big\{ \max_{l \neq t} \log\big[F(X')\big]_l - \log\big[F(X')\big]_t,\; -\kappa \Big\} \tag{4}$$
where F(X′) denotes the classifier's output probabilities over the K classes, and κ ≥ 0 is a tuning parameter used to improve the transferability of the attack. ĝi represents the estimated gradient and can be calculated using the finite-difference approach in Equation (5):
$$\hat{g}_i = \frac{\partial f(x)}{\partial x_i} \approx \frac{f(x + h e_i) - f(x - h e_i)}{2h} \tag{5}$$
where e_i is the i-th standard basis vector and h is a small constant. ZOO employs Newton's method with the Hessian approximation ĥi, as given in the following equation.
$$\hat{h}_i = \frac{\partial^2 f(x)}{\partial x_i^2} \approx \frac{f(x + h e_i) - 2 f(x) + f(x - h e_i)}{h^2} \tag{6}$$
This approach is effective in approximating the gradient and Hessian while maintaining approximately the same performance as the C&W attack, and no training of the models or further knowledge about the target model is required. However, while the ZOO method often requires significant computational time and cost due to its query intensity, e.g., thousands of queries per sample [41,42], recent optimizations based on the Hessian approximation have been shown to decrease the number of required queries by about 20–50% in black-box settings. These improvements increase the efficiency of black-box NIDS attacks, though ZOO remains slower than gradient-based approaches, such as FGSM [44].
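As a rough illustration of the coordinate-wise finite-difference estimates in Equations (5) and (6), the sketch below queries a black-box scoring function at x ± h·e_i and returns the estimated gradient and diagonal Hessian entry for one coordinate. The scoring function and variable names are illustrative assumptions; a real attack would repeat this over many coordinates and samples.

```python
import numpy as np

def zoo_gradient_hessian(f, x, i, h=1e-4):
    """Estimate g_i (Eq. 5) and h_i (Eq. 6) for coordinate i using only queries to f."""
    e_i = np.zeros_like(x)
    e_i[i] = 1.0
    f_plus = f(x + h * e_i)
    f_minus = f(x - h * e_i)
    g_i = (f_plus - f_minus) / (2 * h)               # finite-difference gradient
    h_i = (f_plus - 2 * f(x) + f_minus) / (h ** 2)   # finite-difference Hessian
    return g_i, h_i

# Toy black-box loss: distance of a "traffic feature vector" from a target point.
f = lambda v: float(np.sum((v - 0.5) ** 2))
x = np.zeros(10)
print(zoo_gradient_hessian(f, x, i=3))
```

Note that each coordinate estimate costs several model queries, which is exactly why ZOO is query-intensive in black-box settings.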

2.3.3. Kernel Density Estimation (KDE)

KDE is a powerful statistical approach used to estimate the probability density function (PDF) of a random variable directly from observed data. In contrast to parametric methods that require specific distributional assumptions, KDE does not make any assumptions about the data's distribution. This makes KDE especially useful for analyzing complex or unknown data patterns, helping to accurately capture the behavior of real-world datasets [45].
KDE smooths data points by placing a kernel function (like a symmetric, non-negative Gaussian) over each point, spreading its influence over a range. This creates a smooth density estimate that reveals the data’s true distribution [46]. KDE uses this kernel to find the distribution of a variable without requiring prior assumptions, making it a non-parametric method. The density is estimated purely from the observed data using the following equation:
$$\hat{f}_h(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \tag{7}$$
In the equation, h represents the bandwidth (controlling smoothness), x is the input variable, and the kernel function is given by K, which is bounded, non-negative, and integrates to unity (normalized) [47]. Due to KDE's flexibility, it is used in many areas, such as image processing, the ensemble Kalman filter (EnKF), and the analysis of microearthquake (MEQ) data in seismology [45]. In cybersecurity, this technique works with distance-based Support Vector Data Description (SVDD) to spot adversarial traffic, boosting performance, detection accuracy, and outlier identification, especially on the NSL-KDD dataset [48].
KDE is powerful because it can reveal data distributions, especially when they have complex shapes, such as multiple peaks that parametric methods struggle with. For instance, when applied to the National Alzheimer’s Coordinating Center (NACC) dataset, KDE successfully uncovered hidden patterns, showing multiple peaks in cognitive test score distributions [49]. However, KDE’s performance depends heavily on the bandwidth, a smoothing parameter that controls how much detail is preserved. A large bandwidth oversmooths the data, hiding important features like multiple peaks, whereas a small bandwidth creates a noisy estimate that does not reflect the true distribution [50]. That is why choosing the right bandwidth is crucial for obtaining an accurate density estimate.
Recently, KDE has been combined with advanced techniques to assess a changing dynamic geothermal system. It is used to approximate the probability value of such a system and compute its probability density function (PDF) under different situations, e.g., thermal breakthrough limits and how long reservoirs might last. This approach gives a fuller picture of geothermal resources than traditional static methods [45]. Because of this, KDE is a powerful non-parametric tool for finding PDFs in datasets, making it useful for many applications. These include analyzing risks in renewable energy systems, spotting patterns in medical data, and evaluating geothermal resources. However, to obtain precise density estimates, it is important to carefully select both the kernel type and bandwidth [51].
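The following short sketch is a minimal NumPy implementation of Equation (7) with a Gaussian kernel over a one-dimensional feature; the bandwidth, the bimodal toy data, and the feature interpretation are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def gaussian_kde(x_grid, samples, h=0.5):
    # Equation (7): f_h(x) = (1 / (n*h)) * sum_i K((x - x_i) / h), Gaussian kernel K.
    u = (x_grid[:, None] - samples[None, :]) / h
    K = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return K.sum(axis=1) / (len(samples) * h)

# Toy bimodal data standing in for a traffic feature (e.g., packet size).
rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(0, 1, 500), rng.normal(5, 1, 500)])
grid = np.linspace(-4, 9, 200)
density = gaussian_kde(grid, samples, h=0.5)
print(density.max())  # the estimated PDF reveals both peaks of the distribution
```

Varying h in this sketch directly shows the oversmoothing/undersmoothing trade-off discussed above.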

2.3.4. DeepFool

DeepFool is an effective way to create adversarial examples that fool deep neural networks (DNNs). It operates in the white-box setting, where attackers have complete knowledge of the classifier's design. Unlike targeted attacks, it simply seeks the smallest possible change that makes the model misclassify the input by studying where the model switches between different classes [52]. While DNNs work very well on tasks like language processing, speech recognition, and image analysis [53], they can be tricked by tiny (hard-to-notice) changes in input data into producing wrong outputs. This weakness is especially dangerous in important areas like self-driving cars and security systems [54]. In cybersecurity, DeepFool can create "poisoned" data that fool the model but do not break it, using techniques like Pearson's correlation to filter out useless features, known as a poisoning attack [55]. By studying how the model works and slowly changing the original input, DeepFool iteratively applies minimal perturbations until the decision boundary is crossed, efficiently generating small but highly impactful adversarial changes [56]. Recent 2025 attacks adapt DeepFool for DL-based NIDSs, as categorized in a taxonomy of deep learning adversarial methods [14].
DeepFool begins with a zero perturbation vector. At each step it calculates how close the input is to the classification boundary and finds the smallest required modification to cross the decision threshold by linearly approximating the model near the given input. The changes depend on the gradient differences and the perturbation direction: DeepFool measures the input's distance to the decision threshold and adds the minimum noise needed to fool the model [23]. It searches the input space for the nearest boundary and the shortest crossing path, and a small overshoot is added to ensure the input fully crosses the boundary. The process repeats, adding tiny perturbations and checking the result, until the model makes a mistake [23]. In other words, DeepFool first finds the minimum change required to reach the classification boundary and then, starting from the last obtained position, repeats the process until it successfully creates an adversarial example. Mathematically, the smallest required change to make an adversarial sample is shown in Equation (8).
$$\delta(X; f) = \min_{r} \|r\|_2 \quad \text{s.t.} \quad f(X + r) \neq f(X) \tag{8}$$
where r is the smallest needed change, and δ measures how robust the linear classifier f is at the input X. The model works as f(x) = Wᵀ·x + b, where the bias and the weights are represented by b and W, respectively. The DeepFool attack provides an accurate and efficient way to test how strong ML models are against attacks. It creates adversarial examples with smaller changes than the FGSM and JSMA methods yet tricks models more often. However, it requires more computing power than both [52].
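For a binary linear classifier f(x) = wᵀx + b, the minimal perturbation in Equation (8) has a closed form, which the sketch below illustrates as a simplified single-step version of the iterative procedure described above. The weights, bias, input, and overshoot value are illustrative assumptions.

```python
import numpy as np

def deepfool_linear(x, w, b, overshoot=0.02):
    """Minimal L2 perturbation pushing x across the hyperplane w^T x + b = 0."""
    f_x = float(w @ x + b)
    r = -f_x * w / (np.linalg.norm(w) ** 2)   # Equation (8) for a linear model
    return x + (1 + overshoot) * r            # small overshoot to fully cross the boundary

w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.3, 0.1, 0.4])
x_adv = deepfool_linear(x, w, b)
print(np.sign(w @ x + b), np.sign(w @ x_adv + b))  # the predicted class flips
```

For a multi-class DNN, the same idea is applied to a local linearization of the model and iterated until the label changes, as described in the paragraph above.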

2.3.5. Fast Gradient Sign Method (FGSM)

FGSM is an effective technique to produce infected instances [57]. These malicious samples are specially modified inputs, in which each pixel is changed enough to make the ML classifiers produce incorrect outputs. The method works by calculating how modifications on the input impact the model’s error (the gradient), then adjusting each pixel in the direction that maximizes this error. The amount of change is controlled by a setting called epsilon (ε), which limits the change on the given input. The final changed image (the AE) can be calculated using Equation (9).
$$X_{adv} = X + \varepsilon \cdot \mathrm{sign}\big(\nabla_X J(\theta, X, Y)\big) \tag{9}$$
where ε is a small fixed value, and ∇ indicates the gradient of the loss function J (how steep the slope is); the model's parameters, original input, and true label are θ, X, and Y, respectively. The method works in three stages: (1) it finds the loss gradient, (2) scales it to a maximum size of ε, and (3) adds it to X to create the adversarial example Xₐᵈᵥ. While FGSM is fast at generating AEs, it is less effective than advanced methods because it produces only one AE per input and does not explore all possible attacks. Another limitation is that it is a white-box assault, which restricts its use when attackers have limited access. However, it is still helpful for developers to test their models' robustness against insider (white-box) attacks [23].
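As an illustration of Equation (9), the sketch below crafts an FGSM perturbation against a simple logistic-regression "detector" whose loss gradient with respect to the input is available in closed form. The weights, bias, input, and ε are illustrative values, not parameters from the reviewed papers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps=0.1):
    # For logistic regression, the gradient of the cross-entropy loss J
    # with respect to the input is (p - y) * w.
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)            # Equation (9)

w = np.array([2.0, -1.5, 0.7, 0.0])
b = -0.2
x = np.array([0.6, 0.2, 0.1, 0.9])              # "malicious" flow, label y = 1
x_adv = fgsm(x, y=1.0, w=w, b=b)
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))  # attack probability drops
```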

2.3.6. The Carlini and Wagner (C&W) Attack

In this method, optimization techniques are employed to generate stronger assaults than the older L-BFGS approach [58]. Carlini et al. modified the objective function and removed L-BFGS's box constraints. They tested three attack types using different distance measures (L0, L2, and L∞) and replaced the cross-entropy loss with a hinge loss. A key innovation was introducing a variable k instead of directly optimizing the perturbation δ, which avoids box constraints. The approach is mathematically defined in Equations (10) and (11):
$$\min_{\delta} \; D(X, X + \delta) + c \cdot f(X + \delta) \quad \text{s.t.} \quad X + \delta \in [0, 1]^n \tag{10}$$
$$X + \delta = \tfrac{1}{2}\big(\tanh(k) + 1\big) \tag{11}$$
The C&W attack uses a carefully chosen constant c > 0, where δ represents the malicious perturbation, D(·,·) measures distances (L0, L2, or L∞), and the loss function f(X + δ) succeeds when the model predicts the attack's target. As shown in Equation (11), k replaces δ as the optimized variable. Although formulated as a white-box attack, C&W can transfer between networks, enabling black-box attacks against ML security systems with partial knowledge. While more effective than L-BFGS at bypassing advanced defenses, such as distillation-based defense and adversarial training, it requires more computing power than FGSM or JSMA [22].

2.3.7. The Jacobian-Based Saliency Map Attack (JSMA)

The JSMA was introduced by Papernot et al. [59] to fool DNNs. It works by utilizing the Jacobian matrix to identify which input features most affect the model’s predictions, then modifies those features to fool the model. Unlike FGSM, which modifies all pixels slightly, JSMA changes only a few key pixels to create adversarial examples. It builds a saliency map from network gradients to identify which pixels will most effectively trick the model, then modifies them one by one. The process repeats until either the attack succeeds or reaches the maximum allowed pixel changes. For an input image X with label l (f(X) = l), the attack adds a small change δ to create X’ that becomes misclassified (f(X’) = l *), as follows:
$$\arg\min_{\delta_X} \; \|\delta_X\| \quad \text{s.t.} \quad f(X') = f(X + \delta_X) = l^{*} \tag{12}$$
For the input X, the Jacobian matrix (the forward derivative) is given in Formula (13):
$$J_f(X) = \frac{\partial f(X)}{\partial X} = \left[\frac{\partial f_j(X)}{\partial X_i}\right]_{i \in 1..M,\; j \in 1..N} \tag{13}$$
Compared to FGSM, JSMA requires more computing time because it calculates saliency values. However, it changes fewer features in the input, creating adversarial examples that look much more like the original samples [22].
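A highly simplified, greedy version of the saliency idea is sketched below for the same illustrative logistic model used in the FGSM sketch: at each step, the single feature whose weight most influences the attack probability is perturbed by a step θ, until the prediction flips or a change budget is exhausted. The model, step size, feature bounds, and budget are assumptions for illustration; the real JSMA builds the saliency map from the full Jacobian of a DNN.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def greedy_saliency_attack(x, w, b, theta=0.1, max_changes=5):
    """Perturb one feature at a time toward the benign class (a JSMA-like greedy loop)."""
    x_adv = x.copy()
    for _ in range(max_changes):
        # For this logistic model, the saliency of feature i is proportional to |w_i|.
        i = int(np.argmax(np.abs(w)))
        x_adv[i] -= theta * np.sign(w[i])      # push the most salient feature
        x_adv = np.clip(x_adv, 0.0, 1.0)       # keep features in a valid range
        if sigmoid(w @ x_adv + b) < 0.5:       # stop once classified as benign
            break
    return x_adv

w = np.array([2.0, -1.5, 0.7, 0.0])
b = -0.2
x = np.array([0.6, 0.2, 0.1, 0.9])             # "malicious" flow
x_adv = greedy_saliency_attack(x, w, b)
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))  # only one feature was modified
```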

2.3.8. Projected Gradient Descent (PGD)

PGD builds on BIM by adding a projection step to stay within allowed limits. Developed by Madry et al. (2018), this optimization method finds the minimum of a function while satisfying the constraints, such as the maximum perturbation size. It works by repeatedly moving in the direction that reduces the function most (following the negative gradient), then adjusting to stay within the permitted area. This projection step guarantees that the solution always obeys the constraints. The method is summarized in Expression (14).
$$\Pi_{C_\varepsilon}(X) = \arg\min_{z \in C_\varepsilon} \|z - X\|, \qquad X_{N+1}^{adv} = \Pi_{C_\varepsilon}\Big(X_N^{adv} + \alpha \cdot \mathrm{sign}\big(\nabla_X J(X_N^{adv}, Y)\big)\Big) \tag{14}$$
Here, Cε = {z : d(x, z) ≤ ε} refers to the allowed set of changes, Π_Cε represents the projection back into Cε, and α controls the step size. For instance, when using the L∞ distance d(x, z) = ||x − z||∞, the projection Π_Cε(z) simply clips z to [x − ε, x + ε]. Y and J are the correct label and the loss function, respectively, N counts iterations, and X is a given input. PGD guarantees that solutions stay within the allowed limits, which is very useful for constrained applications, but the projection step is slow, especially under complicated constraints [22,60].
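The projected iteration in Expression (14) can be sketched as follows for the same illustrative logistic model used in the FGSM sketch and an L∞ ball of radius ε, where the projection reduces to a clip onto [x − ε, x + ε]. The step size, iteration count, and model parameters are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_linf(x, y, w, b, eps=0.1, alpha=0.02, steps=20):
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(w @ x_adv + b)
        grad_x = (p - y) * w                        # loss gradient w.r.t. the input
        x_adv = x_adv + alpha * np.sign(grad_x)     # gradient-sign step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # projection onto the L-inf ball
    return x_adv

w = np.array([2.0, -1.5, 0.7, 0.0])
b = -0.2
x = np.array([0.6, 0.2, 0.1, 0.9])                  # "malicious" flow, label y = 1
x_adv = pgd_linf(x, y=1.0, w=w, b=b)
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))    # detection confidence decreases
```

The same loop without the projection back onto [x − ε, x + ε] after every step corresponds to the BIM iteration described in the next subsection.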

2.3.9. Basic Iterative Method (BIM)

BIM, introduced in 2017, improves on FGSM by using multiple small steps instead of one big change [61]. It repeatedly applies FGSM to an input, making tiny adjustments in the direction that most increases the classifier's error. The created adversarial example looks almost like the original but fools the model. BIM starts with an initial solution and gradually refines it using gradient-sign steps. After each step, the modified image is adjusted to keep pixel changes within limits. The method is summarized in Expression (15):
$$X_0^{adv} = X, \qquad X_{N+1}^{adv} = \mathrm{Clip}_{X, \varepsilon}\Big(X_N^{adv} + \alpha \cdot \mathrm{sign}\big(\nabla_X J(X_N^{adv}, Y)\big)\Big) \tag{15}$$
where the loss function is given by J, Y represents its true label, X refers to the given image, N counts iterations, and α controls how much we change the input. The Clip function keeps the adversarial example within two limits: the ε range (x ± ε) and valid input values. BIM starts with the original image, calculates how changes affect the loss (the gradient), and then makes small controlled adjustments in that direction. After each change, it clips the values to stay within the bounds. This process repeats either until successful or for N iterations [22]. Table 2 shows a comparison of adversarial attack techniques in NIDS, including threat type, strengths, weaknesses, tested datasets, references, and example success rates.
Given the comparative analysis of adversarial attack and defense strategies, GANs and DeepFool can be compared as follows:
In terms of attacks, GANs generate more realistic infected samples, which can effectively bypass NIDSs but require hours to days for training purposes, and this, in turn, makes them impractical and infeasible in many applications. DeepFool carefully calculates minimal perturbations using mathematical operations, and, based on our recent study analysis supported by experimental results [62], it has been shown that precise training of the DeepFool technique produces stronger samples than those generated by the GAN attacks. Note that generating such samples can be performed in seconds with high stealth advantages.
In terms of defenses, while GANs can be utilized to create unpredictable attack vectors, their randomness hinders systematic defense development by making comprehensive countermeasures hard to generate. DeepFool's deterministic approach allows defenders to understand the attack technique accurately, which enables targeted countermeasures and rapid security-testing cycles. Further, DeepFool's consistency ensures that defensive approaches remain reliable across different samples and instances, unlike GANs, where different patterns can deceive selective defenses. In this respect, DeepFool outperforms GANs.
Note that the most common attack techniques are FGSM, which is highly efficient in white-box attacks [57], and GAN-based attacks, which can create realistic adversarial samples [36]. The most common defenses are adversarial training, which can improve the robustness of the model by 20–40% [63], and feature squeezing, which has been shown to mitigate evasion attacks effectively [64]. We focused on comparing GANs and DeepFool since they are currently the two most effective adversarial techniques: GANs are known for generating highly evasive samples, while DeepFool excels in efficiency and precision. Our recent findings demonstrate that, with careful and proper training, DeepFool can outperform GANs while offering significant advantages in terms of speed and practicality.

3. Datasets Used in NIDS

3.1. Types of Datasets Used

This subsection details the most commonly employed datasets in NIDS-based cybersecurity. Table 3 describes these frequently utilized datasets.
  • NSL-KDD dataset [65]: It represents the updated KDDCup99 dataset. It has 41 features: 13 for content connection features, 9 for temporal features (found during the two-second time window), 9 for individual TCP connection features, and 10 for other general features.
  • UNSW-NB15 dataset [66]: This dataset includes data with sizes of 100 GB, including both malicious and normal traffic records, where each sample has 49 features created by using many feature extraction techniques. It is mainly partitioned into test and training sets, 82,332 and 175,341 instances, respectively.
  • BoT-IoT dataset [67]: It has been generated from a physical network incorporating botnet and normal samples from traffic. It consists of the serious attacks, including DoS, DDoS, information theft, fingerprinting, and service scanning.
  • Kyoto 2006+ dataset [68]: It was established and revealed by Song et al. It is a collection of real traffic coming from 32 honeypots with various features from almost three years (November 2006 to August 2009), including a total of over 93 million samples [69]. Each of these records has 24 features captured from network traffic flows; 14 of them are represented as KDD or DARPA, while the rest are added as new features.
  • CIC dataset [18]: It has been generated to assess the NIDS performance and train the models further. The first generation of these datasets was CICIDS2017. It was collected from a synthetic network traffic for 5 days. The collected data are in flow and packet formats with 80 different features [4].
  • CSE-CIC-IDS2018 dataset [70]: The Canadian Institute for Cybersecurity (CIC) and Communications Security Establishment (CSE) have implemented this on AWS (Amazon Web Services), based in Fredericton. It represents the extended CIC-IDS2017 and was collected to reflect real cyber threats, including eighty features, network traffic, and system logs [71,72]. The network’s infrastructure contains 50 attacking computers, 420 compromised machines, and 30 servers.
  • CIC-IDS2019 dataset [73]: This one includes new attacks and addresses some flaws discovered in the previous ones, CIC-IDS2018 and CIC-IDS2017 [74]. It contains 87 features, with records classified as normal or abnormal, and numerous samples of DDoS-based attacks.
  • The ADFA-LD dataset [75]: It was derived from the host-based IDSs and includes samples recorded from two operating systems. Many important attacks in this dataset have been collected from zero-day and malware-based attacks.
  • KDD CUP 99 dataset [76]: This dataset contains a total of 23 attacks with 41 features and an additional one, the 42nd feature, employed to divide the connection as either normal or attack.
  • CTU-13 dataset [77]: It is a synthetic dataset that concentrates mainly on botnet network traffic and contains a wide range of captures for anomaly detection on critical infrastructures, covering attack types such as port scanning and DDoS across 13 botnet capture scenarios. This dataset was originally presented in two formats: a full Pcap of normal and malicious packets and a labeled bidirectional NetFlow. Table 3 illustrates the advantages and disadvantages of commonly used datasets in NIDS, highlighting their strengths, limitations, and supporting references for evaluating adversarial contexts. It is worth mentioning that while older datasets like KDDCup99 are often justified for baseline comparisons, they can skew the results and do not reflect modern network behaviors. To ensure sound experimental validation, recent taxonomies on adversarial attacks against NIDSs [14] have been considered, since they better capture current traffic patterns and threat actions.
Given the current datasets used in NIDSs, the effectiveness of adversarial defenses and attacks depends mainly on the volume and diversity of the training data. For example, CIC-IDS2019 is the most comprehensive dataset among those listed, since it incorporates extensive contemporary attack types, e.g., DDoS variants, web-based threats, and advanced persistent threats, which reflect the modern cybersecurity landscape. Such diversity benefits DeepFool's boundary-based approach by providing a richer feature space and more precise decision-boundary identification, which makes the dataset well suited for training the classifier.

3.2. Data Preparation

  • Data Preprocessing: Handle missing values, e.g., by removing them or imputing with the median/mean/mode, and remove irrelevant or duplicated data.
  • Data Transformation: Apply log transforms to skewed numerical features and convert categorical fields to binary, e.g., flags and attack types.
  • Handling Anomalies: Filter noise and outliers, e.g., using z-scores, the IQR, or isolation forest methods.
  • Aggregation: Create summary statistics for network flows, e.g., total packets and average duration, as well as group traffic data by sessions or flows.
  • Standardize Data Format: Ensure all datasets share a consistent naming, labels, and formats.
  • Encoding: Include binary and multi-class classification.
  • Splitting and Balancing: Split into train/test/validation set and use SMOTE or undersampling to address class imbalance.
  • Data Types: Convert columns to appropriate data types, e.g., float, int, or category.
  • Dimensionality Reduction: Incorporate t-SNE or PCA to decrease dimensionality.
  • Feature Engineering: Encode categorical variables (one-hot or label encoding), normalize numeric features (min–max/standardization), extract temporal features, and use correlation analysis or feature importance metrics, e.g., mutual information or tree-based models, to retain only the most relevant features. A minimal preprocessing sketch is given after this list.
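The hedged sketch below combines several of the steps above into a minimal scikit-learn/imbalanced-learn pipeline. The toy flow table, column names, and the choice of SMOTE for balancing are assumptions for illustration, not a prescription tied to any of the reviewed datasets.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from imblearn.over_sampling import SMOTE

# Tiny illustrative flow table; in practice these records come from a NIDS dataset.
df = pd.DataFrame({
    "duration":    [1.2, 0.4, None, 3.1, 0.9, 2.2, 0.1, 1.7] * 10,
    "total_bytes": [300, 80, 1500, 60, 900, 120, 40, 2000] * 10,
    "protocol":    ["tcp", "udp", "tcp", "tcp", "udp", "tcp", "icmp", "tcp"] * 10,
    "label":       ["benign", "benign", "attack", "benign",
                    "benign", "benign", "benign", "attack"] * 10,
})

X = df.drop(columns=["label"])
y = (df["label"] != "benign").astype(int)        # binary encoding of the class label

# Impute + scale numeric features, one-hot encode the categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), ["duration", "total_bytes"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["protocol"]),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

X_train_prep = preprocess.fit_transform(X_train)  # fit only on training data
X_test_prep = preprocess.transform(X_test)
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train_prep, y_train)
print(X_train_bal.shape, X_test_prep.shape)
```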

3.3. Feature Reduction

In order to effectively elevate the resiliency of the IDS, the valuable and important features must be extracted from the dataset to decrease the complexity of the model and the dimensionality of the data. Generally, there are five main reduction approaches, known as "PCA", "AE", "SNDAE", "RFE", and "SSAE", as follows:
  • Principal component analysis (PCA): It is leveraged to obtain valuable features in which the feature’s dimensions are decreased. This will reduce both the model’s computational time and complexity. In cybersecurity, PCA is used to obtain valuable features from the network traffic [38].
  • Recursive feature elimination (RFE): This one can be employed to choose preferable features from the entire set of features in the given input data, where only high-ranked features are retained, while the ones with low ranks are deleted. This approach is used to eliminate redundant features and retrieve preferable features [11].
  • The stacked sparse autoencoder (SSAE): It is a deep learning model used to select high-ranking features from behavior and activity data. SSAE was first presented to automatically extract deep sparse features for classification, and the resulting low-dimensional sparse features have been employed to implement various fundamental classifiers [78].
  • Autoencoder (AE): AE is an unsupervised feature learning method used for both network anomaly detection and malware classification. Using a deep AE, the original input is transformed into an improved representation through hierarchical feature learning, in which each level captures a different degree of complexity. A linear activation function in all units of an AE captures a subspace of the principal components of the input. Moreover, it is anticipated that incorporating non-linear activation functions into an AE can help to detect more valuable features [54,78,79].
  • Stacked Non-Symmetric Deep Autoencoder (SNDAE): It uses the novel NDAE (non-symmetric deep autoencoder) approach in unsupervised learning for obtaining valuable features. A classification model built from stacked NDAEs has been shown to achieve excellent feature reduction when the RF algorithm was combined with two datasets: NSL-KDD and KDDCup99 [54].
It is worth mentioning that both the PCA and RFE approaches are among the most widely used and promising techniques. PCA can be employed to reduce the dimensionality of relatively large datasets with linearly correlated features, while RFE is adopted when selecting the most relevant features is essential. The autoencoder can be employed with complex and non-linear data, such as images, sound, and text. The SSAE technique handles complex data better than the plain AE approach. The SNDAE is most effective for very large, high-dimensional, and noisy datasets, since it can be leveraged to produce deep, non-linear representations that are resistant to noise; such representations greatly refine the classifier's performance.
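As a brief illustration of the two most widely used options above, the sketch below applies PCA and RFE with scikit-learn. The synthetic data, the number of retained components/features, and the choice of a random-forest estimator for RFE are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 41))                  # e.g., 41 NSL-KDD-style features
y = (X[:, 0] + 0.5 * X[:, 5] > 0).astype(int)   # toy labels for the illustration

# PCA: project onto the top components that explain most of the variance.
X_pca = PCA(n_components=10).fit_transform(X)

# RFE: recursively drop the lowest-ranked features using a tree-based estimator.
rfe = RFE(RandomForestClassifier(n_estimators=50, random_state=0),
          n_features_to_select=10).fit(X, y)
X_rfe = X[:, rfe.support_]

print(X_pca.shape, X_rfe.shape)                 # both reduce 41 features to 10
```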

3.4. Feature Extraction

It is not easy to directly expose intrusions in raw network packets due to the large volume of data traffic in the network. Therefore, to spot these intrusions faster, the valuable features in the raw packet headers should be carefully extracted, and the NIDS model should be trained on these extracted features [80]. Given the fundamental unit of each feature, the feature extraction process can be divided into flow- or packet-based extraction. A summary of each feature extraction type, with its advantages and disadvantages, is given below; a brief flow-aggregation sketch follows the list.
1.
Feature Extraction based on Flow: This technique is the most commonly used one, where details from network packets belonging to the same connection or flow are extracted. Every extracted feature refers to the main connection between two given devices. The extracted features in this method are mainly obtained from the header of the network packet and can be classified into three categories, as follows:
  • Aggregate information can be used to compute the number of specific packet features in the connection, like how many bytes have been sent, the number of flags used, and packets transferred.
  • Summary statistics describe the whole connection in detail, including the protocol/service used and the duration of the connection.
  • Statistical information is used to evaluate statistics of the packet features in the network connection, e.g., the mean and the standard deviation of interarrival time and packet size of the network [81].
2.
Feature Extraction based on Packet: This is another feature extraction technique that can be used to obtain specific features at the packet level, rather than at the connection or flow level. Usually, packet-based features are statistically aggregated data similar to the flow features, but with additional correlation analysis between the inter-packet time and the packet size exchanged between two devices. The features can be extracted using different time windows to minimize traffic variations and capture the sequential nature of the data packets. Packet-based feature extraction techniques can be employed in different detection systems [82,83].
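As referenced above, the following minimal pandas sketch illustrates flow-based aggregation: packets sharing the same source, destination, and protocol are grouped into a flow, and the aggregate, summary, and statistical features described in the list are derived from the packet headers. The column names and toy packet records are illustrative assumptions.

```python
import pandas as pd

# Hypothetical per-packet records parsed from packet headers.
packets = pd.DataFrame({
    "src":       ["A", "A", "A", "B", "B"],
    "dst":       ["B", "B", "B", "A", "A"],
    "proto":     ["tcp"] * 5,
    "timestamp": [0.00, 0.05, 0.20, 0.01, 0.30],
    "size":      [60, 1500, 1500, 60, 800],
})

flows = (packets
         .sort_values("timestamp")
         .groupby(["src", "dst", "proto"])
         .agg(packet_count=("size", "count"),                        # aggregate information
              total_bytes=("size", "sum"),
              duration=("timestamp", lambda t: t.max() - t.min()),   # summary statistics
              mean_size=("size", "mean"),                            # statistical information
              std_size=("size", "std"))
         .reset_index())
print(flows)
```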

4. Adversarial Attack Against ML-Based NIDS Models

4.1. Black-Box Attack

4.1.1. Poisoning Attack

Multiple gradient approximation approaches have been presented to accurately estimate the gradient for the attack’s performance. In [84], the local search technique has been introduced for evaluating the impact of some pixels at the output to approximately calculate the gradient. The ZOO attack can be used to approximate the gradient by checking how output scores change when making small adjustments to the input near x [85]. In a similar way, Ilyas et al. [86] employed a natural evolution approach to obtain the predicted attack value by utilizing a search function [87].
In [88], the authors presented a technique operating as a black-box attack, where two attack strategies, data poisoning and model stealing, were used to target ML-based NIDSs. First, they used an improved version of SMOTE, called A-SMOTE, to create synthetic training data from a small labeled dataset. The created data were employed to train a substitute DNN. Next, they applied a technique called Center Drifting Boundary Pattern (CBP) to generate adversarial examples (AEs) against the substitute model. These AEs were then used to poison the ML-based NIDS. Various datasets, Kyoto 2006+, NSL-KDD, and WSN-DS, with different classifiers (SVM, NB, and LR with linear, sigmoid, and RBF kernels), were employed to assess the proposal. Experimental findings showed that, for the WSN-DS dataset, their method reduced the average model accuracy from 97.71% to 89.61%, demonstrating a significant degradation in detection performance.
In [89], a new black-box AA approach, namely the Hierarchical Adversarial Attack (HAA), has been introduced. It creates strong attacks that deceive GNN-based NIDSs in IoT networks at a low cost. First, it employs a saliency map to recognize and modify key features with small changes. Then, it selects the most vulnerable nodes for attack using a hierarchical method derived from the Random Walk with Restart (RWR) technique. The authors tested HAA on two GNN models, the Jumping Knowledge Network (JK-Net) [91] and the Graph Convolutional Network (GCN) [92], using the UNSW-SOSR2019 dataset [90]. They compared HAA with three other methods: Improved Random Walk with Restart (iRWR) [93], Greedily Corrected Random Walk (GCRW) [94], and Resistive Switching Memory (RSM) [95]. The findings illustrated that HAA reduced the accuracy of the two GNN models by over 30%. However, HAA has not been thoroughly checked to verify whether it works against adversarial defense techniques. Another technique to create adversarial attacks in real time, utilizing heatmaps from Class Activation Mapping (CAM) and Grad-CAM++, was proposed by Kotak et al. The authors revealed the weaknesses in ML-based IoT NIDSs by leveraging payload-based IoT identification models, such as CNNs, Global Average Pooling (GAP), and Fully Connected Neural Network (FCN) models [96]. These models process the first 784 bytes of the TCP payload and convert them into a 28 × 28 monochrome image in the IoT Trace dataset [97]. The findings demonstrated that the presented defensive model offered the best performance compared to the others.

4.1.2. Evasion Attack

In [98], a practical black-box attack was proposed to attack different NIDSs in query-limited, decision-based settings. This attack can be used to distinguish benign and anomalous samples so as to easily bypass strong NIDSs. Query-efficient spherical local subspaces with a manifold approach have been employed to create AEs. An algorithm has been implemented against the attacked classifier to reduce the number of queries and reveal the anomaly thresholds. Moreover, the anomaly embedding space has been created using the spherical local subspaces. This technique is effective in exposing anomalies based on threshold decisions, especially when the thresholds between anomalous and benign samples are ambiguous [99]. The proposal has been evaluated employing Isolation Forests with the CICIDS2018 dataset [100], Deep Support Vector Data Description (DSVDD), Adversarially Learned Anomaly Detection (ALAD) [101], the Deep Autoencoding Gaussian Mixture Model (DAGMM) [102], and AnoGAN [103]. The findings showed that this attack accomplished a success rate of more than 70% against all classifiers used.
Aiken and Scott-Hayward [104] studied whether adversarial attacks could fool an anomaly-based NIDS in an SDN. They developed a tool called Hydra to generate adversarial evasion threats by altering three main features (payload size, two-way traffic flow, and packet rate) to avoid the detection of TCP-SYN DDoS attacks. These adversarial examples were tested on Neptune, an ML-based SDN detection system that uses traffic features and multiple models. The dataset included infected traffic from the DARPA SYN flood dataset and normal traffic from the CICIDS2017 dataset. Different models (RF, SVM, LR, and KNN) were evaluated. Results showed that small changes in traffic features could drop Neptune's detection accuracy from 100% to 0% for TCP-SYN DDoS attacks. The outcomes illustrated that LR, SVM, and RF were the most vulnerable classifiers, while the most robust one was KNN. A limitation of this work is that Hydra only works for TCP-SYN saturation attacks.
Yan et al. [105] developed DOS-WGAN, a method using Wasserstein GANs with a gradient penalty to create DoS traffic that mimics normal traffic and evades NIDS detection. They tested it on a CNN classifier using the KDDCup99 dataset, and the outcomes illustrated that the detection rate was reduced by 47.6%. Separately, Shu et al. [106] proposed Gen-AAL, a GAN-based active-learning attack to fool black-box ML-based NIDSs with minimal queries. Their method required only 25 labeled samples to train the GAN and achieved a 98.86% evasion rate when tested on a GBDT-powered NIDS trained and validated on the CICIDS2017 dataset. Guo et al. proposed another black-box attack to generate malicious traffic. First, they trained a model to estimate the decision threshold of the model they wanted to attack. Then, the BIM was leveraged to generate hostile network traffic with a valid structure and to tune the model's hyperparameters. Different models, Residual Network (ResNet), MLP, SVM, CNN, and KNN, with two different datasets, CSE-CIC-IDS2018 and KDD99, were utilized to evaluate the introduced attack, and it was pointed out that the attack effectively degraded the model performance [107].
TANTRA, which modifies network traffic by changing a given attack's packet arrival times, was suggested by Sharon et al. The adversaries employed an LSTM to learn and accurately approximate the normal timing patterns between packets. TANTRA then modified the attack traffic using the LSTM to match these normal timing patterns. While TANTRA showed high success in evading detection, adjusting time delays could weaken the attack's effectiveness. However, the authors did not report whether the manipulated traffic still kept its harmful nature [108].
Zolbayar et al. [109] developed NIDSGAN, which uses predefined constraints and modifies the loss function to create harder-to-detect attacks. Besides fooling the discriminator into mistaking adversarial traffic for benign, the loss function also reduces the differences between adversarial and original features. Separately, the authors in [110] suggested IoTGAN to change the IoT device traffic to evade the models. The IoTGAN trains a substitute model in black-box scenarios as the discriminator, while the generator learns to add deceptive perturbations. The attack was tested on five ML classifiers (KNN, RF, SVM, NNs, and DT [111]) using the real-world UNSW IoT Trace dataset [97] from 28 IoT devices. Results showed over 90% evasion success against all target models.
In [80], an approach was presented to automatically modify network data in gray- and black-box scenarios without breaking its functionality. The approach uses GANs to create adversarial examples (AEs) and Particle Swarm Optimization to optimize them for evading detection. The attack was tested on multiple ML-/DL-based NIDSs using the Kitsune and CICIDS2017 datasets, achieving over 97% evasion success in half the cases. However, the GAN-based generation process is slow and resource-heavy, making it impractical for real-time attacks, especially in IoT networks. In [5], a ZOO attack in black-box settings was developed to test how well an RF-based NIDS could resist attacks; the attack produces entirely new adversarial examples (AEs) that the system cannot detect. Tests showed that this attack dropped the model's accuracy to 48%. To defend against it, the authors used a GAN to generate AEs and augmented the training data with them, which raised the model's accuracy back up to 76% [5].
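To make the zeroth-order idea behind ZOO-style black-box attacks concrete, the following is a minimal sketch that assumes only query access to the victim model's class probabilities. The dataset, the MLP stand-in for the NIDS (used because a smooth probability surface makes the finite-difference estimates informative, unlike tree ensembles), and all hyperparameters are illustrative assumptions rather than the setup of the cited works.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Toy stand-in for a black-box NIDS: the attacker may only call predict/predict_proba.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

def zoo_attack(x, benign_label=0, step=0.05, h=1e-2, iters=300, seed=0):
    """Zeroth-order coordinate descent: estimate the gradient of the
    malicious-class probability with symmetric finite differences and
    move against it until the sample is classified as benign."""
    rng = np.random.default_rng(seed)
    x_adv = x.copy()
    malicious_label = 1 - benign_label
    for _ in range(iters):
        if victim.predict(x_adv[None])[0] == benign_label:
            break                                      # evasion achieved
        i = rng.integers(x_adv.size)                   # pick a random coordinate
        e = np.zeros_like(x_adv); e[i] = h
        p_plus = victim.predict_proba((x_adv + e)[None])[0, malicious_label]
        p_minus = victim.predict_proba((x_adv - e)[None])[0, malicious_label]
        grad_i = (p_plus - p_minus) / (2 * h)          # finite-difference estimate
        x_adv[i] -= step * np.sign(grad_i)             # lower the malicious score
    return x_adv

x_mal = X[y == 1][0]
x_adv = zoo_attack(x_mal)
print("label before:", victim.predict(x_mal[None])[0],
      "label after:", victim.predict(x_adv[None])[0])
```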
Researchers in 2023 developed a method named SGAN-IDS that generates adversarial examples (AEs) capable of evading NIDS models. They tested it using the CICIDS2017 dataset and five different machine learning-based IDSs. The system combines GAN technology with self-attention mechanisms to produce attacks that slip past detection systems. Tests showed that SGAN-IDS lowered the models' detection rate by 15.93%, confirming the effectiveness of the presented attack [112]. Other researchers tested poisoning attacks on deep learning-based NIDS by injecting malicious data into training sets at different rates (1% to 50%). Results showed that while accuracy only dropped slightly to 0.93, other metrics revealed bigger problems: PPV was 0.082, FPR reached 0.29, and MSE hit 67%. These numbers demonstrate that poisoning attacks can seriously harm DL models. The study [55] exposed how vulnerable DL-based NIDSs are to these attacks, indicating the real need for strong defenses. The experiments proved that poisoned data can significantly damage performance and are hard to detect.
The study in [62] tested four attack methods (ZOO, GAN, DeepFool, and KDE) with three major datasets: ADFA-LD, CICIDS2018, and CICIDS2019. Researchers evaluated these attacks on a trained ML-based NIDS classifier to determine which attack was the strongest. The results clearly showed that DeepFool performed best among all the tested attacks.

4.2. White-Box Attack

4.2.1. Poisoning Attack

Fan et al. [113] pointed out problems with current approaches that use gradient-based attacks to test Adversarial Training (AdvTrain) defenses [114]. They proposed a new attack technique named the Non-Gradient Attack (NGA) and a better testing standard called the Composite Criterion (CC), which considers both the attack success rate and accuracy. The NGA works by searching for adversarial examples outside the decision boundary and then slowly moving them closer to the real data while keeping them misclassified. Tests on the CIFAR-10 and CIFAR-100 datasets [115] compared AdvTrain's performance against four common gradient attacks (C&W, FGSM, PGD, and BIM). Results showed that previous estimates of DNN-based IoT NIDS security might have been too optimistic. The new NGA + CC method gives a more accurate way to test these systems, both with and without AdvTrain defenses. The authors noted that NGA currently converges slowly and plan to address this in future work.
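The "start from an already-misclassified point, then walk it toward the original" idea behind such non-gradient attacks can be illustrated with a toy binary-search sketch; this is not the authors' NGA algorithm, and the data, surrogate model, and step count below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=1).fit(X, y)

def non_gradient_search(x_mal, x_benign_pool, steps=30):
    """Pick a benign-classified seed (already 'adversarial' for a malicious
    sample), then binary-search toward the original malicious point while the
    model keeps predicting 'benign'."""
    seed = x_benign_pool[0]
    lo, hi = 0.0, 1.0          # hi = fraction of the malicious point mixed in
    for _ in range(steps):
        mid = (lo + hi) / 2
        cand = (1 - mid) * seed + mid * x_mal
        if clf.predict(cand[None])[0] == 0:   # still benign -> move closer
            lo = mid
        else:                                  # crossed the boundary -> back off
            hi = mid
    return (1 - lo) * seed + lo * x_mal

x_mal = X[y == 1][0]
benign_pool = X[(y == 0) & (clf.predict(X) == 0)]
x_adv = non_gradient_search(x_mal, benign_pool)
print("predicted label:", clf.predict(x_adv[None])[0], "(0 = benign)")
print("L2 distance to original malicious sample:", np.linalg.norm(x_adv - x_mal))
```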

4.2.2. Evasion Attack

Researchers in [116] used Mutual Information (MI) to identify the most informative features in DoS and benign traffic. They crafted attacks by reducing the differences between DoS and benign traffic in these key features using l1-norm minimization. The attack was assessed using DNN and SVM models with the NSL-KDD dataset, and the DNN's DoS detection accuracy fell by 70.7%. Abusnaina et al. [117] investigated whether standard adversarial attack methods are effective against flow-based IDS in SDN networks. They found that these methods are not suitable because flow features are interconnected (unlike image pixels) and attacks must create realistic network flows. Their solution, FlowMerge, creates believable attack flows by blending original flows with representative "mask" flows from a target class using either averaging or accumulation. When tested against a CNN classifier, standard methods (C&W, ElasticNet, DeepFool, MIM, PGD) achieved 99.84% untargeted attack success, whereas FlowMerge achieved even higher success for targeted attacks while maintaining realistic flows. The study also showed that adversarial training could defend against standard attacks but not against FlowMerge. This method is limited since it depends on certain assumptions about traffic features.
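As an illustration of the blending idea, the toy sketch below merges a malicious flow's feature vector with a benign "mask" flow either by averaging or by accumulating count-like features. The feature names and the split into count versus rate features are illustrative assumptions, not the feature set of [117].

```python
FEATURES = ["duration", "pkts_fwd", "pkts_bwd", "bytes_fwd", "bytes_bwd", "mean_iat"]
COUNT_FEATURES = {"pkts_fwd", "pkts_bwd", "bytes_fwd", "bytes_bwd"}  # additive quantities

def flow_merge(malicious, mask, mode="average"):
    """Blend a malicious flow with a benign 'mask' flow, feature by feature."""
    merged = {}
    for f in FEATURES:
        if mode == "accumulate" and f in COUNT_FEATURES:
            merged[f] = malicious[f] + mask[f]        # counts add up when flows mix
        else:
            merged[f] = (malicious[f] + mask[f]) / 2  # rates/timings are averaged
    return merged

malicious_flow = {"duration": 0.4, "pkts_fwd": 900, "pkts_bwd": 3,
                  "bytes_fwd": 54000, "bytes_bwd": 180, "mean_iat": 0.0004}
benign_mask = {"duration": 12.0, "pkts_fwd": 45, "pkts_bwd": 40,
               "bytes_fwd": 21000, "bytes_bwd": 380000, "mean_iat": 0.27}

print(flow_merge(malicious_flow, benign_mask, mode="average"))
print(flow_merge(malicious_flow, benign_mask, mode="accumulate"))
```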
Hashemi et al. [118] tested a white-box attack using the CIC-IDS2018 dataset on different NIDSs: BiGAN [119], DAGMM, and Kitsune [83]. This method, similar to the work in [120], used just three simple packet changes: delaying packets, splitting packets, or adding new packets. The attacks were created by repeatedly trying these changes and checking whether they lowered the NIDS's anomaly score below its detection threshold. In [12], AIDAE (Anti-Intrusion Detection AutoEncoder), a generative model that creates fake network features that appear normal to IDSs, was proposed. It learns patterns from real network data and generates convincing fake features without needing access to the target IDS during training. AIDAE uses both an autoencoder and a GAN and handles both numerical and categorical features. The proposal was assessed using the NSL-KDD, CIC-IDS-2017, and UNSW-NB15 datasets with seven detection methods (Adaboost, CNN, LSTM, KNN, DT, RF, and LR), and the results showed that it significantly decreased the detection accuracy to about 7.11%.
In [33], the genetic algorithm (GA), GAN, and particle swarm optimization (PSO) were used to produce strong adversarial examples. The proposal was tested on several classifiers, including BAG, DT, QDA, LDA, SVM, MLP, GB, NB, RF, KNN, and LR, with the NSL-KDD and UNSW-NB15 datasets. The authors compared these techniques against a Monte Carlo (MC) simulation that creates random infected data. The average evasion rates were 96.19% (MC), 99.99% (PSO), 99.95% (GA), and 92.60% (GAN) on NSL-KDD, and 98.06% (PSO), 99.53% (MC), 99.61% (GAN), and 100% (GA) on UNSW-NB15. Teuffenbach et al. [121] introduced a method to create adversarial examples (AEs) for NIDS while respecting constraints on the network traffic features. Features are grouped, and each group receives a weight based on how easy it is to modify. This technique uses the C&W attack to optimize perturbations while considering these constraints and weights. Attack difficulty depends on two limits: how many features can be changed (feature budget) and how much they can be altered (perturbation budget). This method ensures realistic adversarial flows by only modifying the accessible and independent features (with an average change of less than 0.2). The technique was tested against FGSM and C&W attacks using DNN, DBN, and AE models with two datasets: NSL-KDD and CICIDS2017. The results illustrated that the AE classifier was the most robust against the created adversarial samples but had the lowest accuracy in detecting anomalies compared to the DBN and DNN models. Meanwhile, DNN and DBN handled adversarial examples better for DoS attacks than for PortScan or Probe attacks.
Wang et al. [122] pointed out that many studies incorrectly reuse image-based adversarial example (AE) generation methods for NIDS, which do not fit the unique characteristics of network traffic. They developed a Constraint-Iteration Fast Gradient Sign Method (CIFGSM) that accounts for network traffic complexity, feature types, and their relationships. Testing CIFGSM against IFGSM using MLP, DT, and CNN models with the NSL-KDD dataset confirmed that CIFGSM performed better in accuracy, feature matching, Euclidean distance, and matrix rankings. Under CIFGSM attacks, classifier accuracy dropped to 0.25 (DT), 0.68 (CNN), and 0.73 (MLP). Anthi et al. employed the Information Gain Ratio, a feature-ranking technique for distinguishing malicious traffic from benign traffic, and produced perturbations using the top-ranked features. The resulting AAs were assessed using SVM, Bayesian Network, DT, and RF models, with the AAs produced by changing the affected features either individually or simultaneously. The experimental findings indicated that the presented technique can create strong AAs for DoS attacks and is effective against systems utilizing supervised ML-based NIDS [123].
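A minimal sketch of the constraint-aware iterative FGSM idea is given below. It assumes a plain logistic-regression surrogate (so the input gradient has a closed form) together with an illustrative mask of modifiable features and valid-range clipping; it is not the CIFGSM implementation of [122].

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=10, random_state=2)
clf = LogisticRegression(max_iter=1000).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]
lo_bounds, hi_bounds = X.min(axis=0), X.max(axis=0)   # observed feature validity ranges

def constrained_ifgsm(x, y_true, mask, eps=0.05, iters=20):
    """Iterative FGSM restricted to maskable features and valid ranges."""
    x_adv = x.copy()
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))     # P(malicious)
        grad = (p - y_true) * w                        # d(cross-entropy)/dx in closed form
        x_adv += eps * np.sign(grad) * mask            # ascend the loss, masked features only
        x_adv = np.clip(x_adv, lo_bounds, hi_bounds)   # keep features plausible
    return x_adv

mask = np.zeros(10); mask[[0, 3, 7]] = 1.0             # assume only 3 features are modifiable
x_mal = X[y == 1][0]
x_adv = constrained_ifgsm(x_mal, y_true=1, mask=mask)
print("before:", clf.predict(x_mal[None])[0], " after:", clf.predict(x_adv[None])[0])
```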
In [124], a DL model was used to evaluate ML-based IoT-NIDS. An integrated CNN and LSTM (LSTM–CNN) classifier for IoT-NIDS was built, and its robustness against AEs was assessed under realistic device conditions, including changes in CPU load, temperature adjustments, and device restarts. The proposal was evaluated using the LwHBench dataset and exposed to several AEs, such as PGD, Boundary Attack, JSMA, BIM, MIM, FGSM, and C&W [125]. The integrated classifier achieved an average F1 score of 96% with an 80% TPR in system detection. Although the classifier remained robust to temperature-based attacks when exposed to various evasion AEs, a few attacks, e.g., BIM, MIM, and FGSM, bypassed the detection system. Furthermore, both adversarial training and model distillation were incorporated to refine the classifier's performance; in fact, combining adversarial training with model distillation produced a robust defensive technique.
This study [126] examined how well deep learning (DL)-based NIDS can withstand popular adversarial evasion attacks such as C&W, JSMA [44], PGD [45], and FGSM [42]. The goal was to trick the NIDS into labeling malicious traffic as normal by feeding it carefully crafted adversarial examples. The attacks noticeably hurt the system's performance, lowering its AUC, F-score, recall, precision, and accuracy. Based on classification reports and confusion matrices, C&W proved to be a strong attack, with an impact on the IDS comparable to FGSM, JSMA, and PGD; under the C&W attack, the AUC reached about 64, higher than under FGSM (59.232) and PGD (58.485) but lower than under JSMA (68.037). The experiments used the CICIDS2017 dataset.

4.3. Gray-Box Attack

4.3.1. Poisoning Attack

In [127], the authors introduce a framework to classify and explain various attacks that could target feature selection algorithms. This builds on existing attack models employed to assess the security of unsupervised and supervised learning systems. Using this framework, they then define poisoning attack strategies against common embedded feature selection methods, such as ridge regression, elastic net, and LASSO (least absolute shrinkage and selection operator). The study in [128] created poisoning attacks against SVMs. These attacks work by adding carefully designed malicious data to the training set, which makes the SVM perform worse. Most machine learning methods assume that training data come from normal, clean sources—but this is not always true for security applications. The attack method uses gradient ascent (a gradient-based optimization technique) to mimic how the SVM finds its best solution. It works with different types of SVMs (even complex non-linear ones) by computing the attack in the original input space. Tests show that this approach consistently finds ways to make the classifier perform much worse by exploiting weak spots in the learning process.
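The threat model can be illustrated with a much-simplified poisoning sketch: rather than the gradient-ascent procedure of [128], this toy version greedily picks, from a pool of label-flipped candidates, the poison point whose inclusion hurts validation accuracy the most. The dataset, candidate pool, and budget are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=8, random_state=3)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=3)

def greedy_poison(X_tr, y_tr, n_poison=10, n_candidates=40, seed=3):
    """Greedily append label-flipped, jittered copies of training points that
    most reduce accuracy on the attacker's validation set."""
    rng = np.random.default_rng(seed)
    Xp, yp = X_tr.copy(), y_tr.copy()
    for _ in range(n_poison):
        best_acc, best = np.inf, None
        for _ in range(n_candidates):
            i = rng.integers(len(X_tr))
            cand_x = X_tr[i] + rng.normal(0, 0.3, X_tr.shape[1])  # jittered point
            cand_y = 1 - y_tr[i]                                  # flipped label
            acc = SVC(kernel="rbf").fit(
                np.vstack([Xp, cand_x]), np.append(yp, cand_y)
            ).score(X_val, y_val)
            if acc < best_acc:                                    # keep the most damaging
                best_acc, best = acc, (cand_x, cand_y)
        Xp, yp = np.vstack([Xp, best[0]]), np.append(yp, best[1])
    return Xp, yp

clean_acc = SVC(kernel="rbf").fit(X_tr, y_tr).score(X_val, y_val)
Xp, yp = greedy_poison(X_tr, y_tr)
poisoned_acc = SVC(kernel="rbf").fit(Xp, yp).score(X_val, y_val)
print(f"clean accuracy: {clean_acc:.3f}  poisoned accuracy: {poisoned_acc:.3f}")
```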

4.3.2. Evasion Attack

In [129], an investigation of how AEs affect a DNN classifier was presented, showing that an attacker is able to generate strong AEs that deceive the model. The experiments employed WGAN, C&W, and ZOO attacks. Under all three attacks, the effectiveness of the DNN model on the NSL-KDD dataset was drastically reduced, implying that highly damaging attacks can be launched without even knowing the detection model's underlying workings. The ZOO, WGAN, and C&W attacks resulted in 70 percent, 62 percent, and 24 percent reductions in F1 score, respectively. The substitute-model attack was found to be inefficient, while the ZOO attack proved the most successful and robust. Note that this approach is computationally costly since it must repeatedly query the detection system to update its gradient estimates, which could limit its utility in real-world circumstances because the NIDS limits the number of queries an attacker can make.
Packet-Level Attacks (PLAs) work directly on network packets to create practical adversarial examples. Homoliak et al. [120] tested this approach using a gradient boosting (GB) threat model against five basic ML classifiers (Naive Bayes, Decision Trees, SVM, Logistic Regression, and Kernel Density Naive Bayes) on their custom ASNM-NPBO dataset. The attacks randomly applied six types of packet changes to trick the classifiers: delaying packets, losing packets, damaging packets, duplicating packets, rearranging packet order, and breaking up payloads. Each modification was tested to see if it could bypass detection.
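A toy sketch of these packet-level mutations (delay, loss, duplication, payload splitting, and mild reordering) on a simplified packet representation is shown below; the (timestamp, size) representation and the mutation choices are illustrative assumptions rather than the setup of [120].

```python
import random

def mutate_packets(packets, rng=None):
    """Apply random packet-level mutations to a list of (timestamp, size) pairs."""
    rng = rng or random.Random(0)
    out = []
    for ts, size in packets:
        op = rng.choice(["delay", "drop", "duplicate", "split", "keep"])
        if op == "delay":
            out.append((ts + rng.uniform(0.01, 0.2), size))        # artificial latency
        elif op == "drop":
            continue                                               # simulated packet loss
        elif op == "duplicate":
            out.extend([(ts, size), (ts + 0.001, size)])           # retransmission
        elif op == "split":
            half = size // 2
            out.extend([(ts, half), (ts + 0.0005, size - half)])   # payload fragmentation
        else:
            out.append((ts, size))
    if len(out) >= 2 and rng.random() < 0.5:                       # occasional reordering
        i = rng.randrange(len(out) - 1)
        out[i], out[i + 1] = out[i + 1], out[i]
    return out

flow = [(0.000, 1500), (0.010, 1500), (0.021, 400), (0.030, 60)]
print(mutate_packets(flow))
```

In the cited study, each such mutated flow would then be re-extracted into features and replayed against the classifiers to check whether it bypasses detection.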
As mentioned before, all attack techniques in the gray-box attack setting require only partial knowledge, e.g., access to the training dataset.

4.4. Combination of Poisoning and Evasion Attacks

Another attack scenario explored a certain type of DoS-based AE designed to evade an ANN-based DoS IDS [130]. The researchers proposed an enhanced boundary approach that analyzes the features within DoS traffic, with the aim of improving the Mahalanobis distance by changing the discrete and continuous features of DoS examples. Two datasets, CICIDS2017 and KDDcup99, with two trained ANN models, were used, and the experimental results revealed that the proposed technique can generate adversarial DoS samples with few queries, further decreasing the predicted output of the true class to about 49 percent.
Apruzzese et al. [30] tested attacks using the CTU-13 dataset on three ML-based NIDS models: RF, MLP, and K-NN. Normally, these classifiers performed well, with recall scores from 0.93 (K-NN) to 0.97 (RF). For the attack, they modified three features—bytes exchanged, connection duration, and total packets—by adding random values within their normal ranges. This caused performance to drop sharply, with recall falling to 0.31 (MLP) and 0.34 (RF). The study then tested two defenses: adversarial retraining improved recall to 0.49–0.60, while removing the attacked features before training worked even better, boosting recall to 0.76–0.89. In [40], polymorphic DDoS attacks were created using GANs; polymorphic attacks constantly change their attack patterns (by modifying feature counts and swapping features) to evade detection systems. These evolving attacks successfully bypassed detection systems while keeping false alarms rare. The study, conducted using RF, NB, LR, and DT models with the CICIDS2017 dataset, showed that detection rates plummeted to 5.23% (modified feature counts) and 3.89% (swapped features). While traditional incremental training defenses failed against these attacks, the method proves valuable for generating synthetic attack data to strengthen NIDS training and improve the detection of emerging threats. Table 4 presents a brief summary of the studies on adversarial attack generation against ML-based NIDS approaches. Note that feasibility can be assessed based on the computational time and the detectability, such as whether the attack exceeds the anomaly detection thresholds [99]. In general, poisoning attacks are less feasible than evasion attacks since they require access to the training pipeline, which is often more restricted.
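The simple feature-perturbation idea used in [30]—adding random, in-range increments to a handful of flow features—can be sketched as follows; the column names, ranges, and perturbation scale are illustrative assumptions rather than the values from the study.

```python
import numpy as np

rng = np.random.default_rng(7)
# columns: [bytes_exchanged, duration_s, total_packets]
malicious_flows = np.array([[48000., 0.8, 520.],
                            [51000., 0.9, 560.]])
benign_ranges = {"bytes_exchanged": (200., 90000.),
                 "duration_s": (0.1, 60.),
                 "total_packets": (2., 800.)}

def perturb(flows, ranges, rng):
    """Add small random increments to each feature, clipped to its benign range."""
    flows = flows.copy()
    for col, (name, (lo, hi)) in enumerate(ranges.items()):
        delta = rng.uniform(0, (hi - lo) * 0.1, size=len(flows))  # small in-range shift
        flows[:, col] = np.clip(flows[:, col] + delta, lo, hi)
    return flows

print(perturb(malicious_flows, benign_ranges, rng))
```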
In summary, evasion attacks, e.g., FGSM and PGD, are the most common and dominant owing to their efficiency in different real-world scenarios. Such attacks usually achieve 70–90% evasion rates on different datasets [14,126] and incur low computational cost, e.g., seconds per sample. In contrast, poisoning attacks, e.g., data injection, can take hours and require direct access to the training data, which makes them very difficult to implement in real, dynamic networks. It is worth noting that evasion attacks transfer poorly in black-box settings, since query limits make them detectable, and they are vulnerable to robust defenses, such as adversarial training, which can reduce their effectiveness by 20–50% [131]. Moreover, recent works confirmed that hybrid evasion–poisoning attacks can elevate success rates up to 85%; however, they can be more easily detected since they produce unusual and anomalous traffic patterns [10].

5. Defending ML-Based NIDS Models

5.1. Mitigation of Black-Box Attack

5.1.1. Mitigating Poisoning Attack

Many scholars have introduced various defensive techniques to prevent AEs. For example, the authors in [132] introduced a safety-net system that combines an attack detector with the original classifier, which tests the deeper layers of the classifier to spot AEs. Metzen [133] explored a similar approach. Another defense called Reject on Negative Impact (RONI) [134] analyzed each training sample’s effect on accuracy and removed those that significantly hurt performance.
The paper in [135] evaluated an outlier detector as a defense against malicious data injection attacks. The idea was to find and eliminate suspicious training data using statistical checks. This method has been checked to determine whether it can spot poisoned data made by two strong attack methods: the input instance-key strategy and the Blended Accessory Injection strategy. The results showed that the outlier detector failed to catch any poisoned samples from either attack. This means that these attack methods can create poisoned data that blend in and avoid detection.

5.1.2. Mitigating Evasion Attack

In [11], GANs were incorporated to create AEs that change only non-essential network traffic features without affecting functionality. The authors also proposed using GAN-generated AEs during training to strengthen defenses. Testing on the KDDCup99 dataset with multiple classifiers (DNN, LR, SVM, KNN, NB, RF, DT, GB) showed that initial accuracy ranged from 43.44% (SVM) to 65.38% (GB). After GAN-based training, performance improved significantly, with accuracy reaching 86.64% (LR) and 79.31% (KNN). Wang et al. [136] adapted mutation testing from software engineering to detect adversarial examples in DNNs. Their method works by randomly changing the neural network's structure during operation and checking how often these changes affect the output labels. This "label change rate" helps to identify suspicious inputs. The approach successfully detected C&W attacks [58] even without prior knowledge of the attack method.
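The label-change-rate idea can be sketched as follows: the trained network's weights are perturbed many times and the fraction of runs in which the predicted label flips is recorded; inputs sitting unusually close to the decision boundary—as adversarial examples often do—tend to show higher rates. The model, noise scale, and thresholding below are illustrative assumptions, not the implementation of [136].

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=4)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=4).fit(X, y)

def label_change_rate(x, n_mutants=50, sigma=0.05, seed=4):
    """Fraction of weight-mutated copies of the model that change x's label."""
    rng = np.random.default_rng(seed)
    base = clf.predict(x[None])[0]
    original = [w.copy() for w in clf.coefs_]
    flips = 0
    for _ in range(n_mutants):
        for w, w0 in zip(clf.coefs_, original):     # add Gaussian noise to the weights
            w[...] = w0 + rng.normal(0.0, sigma, w0.shape)
        flips += int(clf.predict(x[None])[0] != base)
    for w, w0 in zip(clf.coefs_, original):         # restore the original weights
        w[...] = w0
    return flips / n_mutants

print("label change rate for a clean sample:", label_change_rate(X[0]))
# In deployment, inputs whose rate exceeds a calibrated threshold would be
# flagged as suspected adversarial examples.
```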
In 2022, the authors of [32] first prepared and cleaned the NSL-KDD dataset by converting all symbolic values into numerical ones. PCA was then incorporated to extract features, and the data were classified using SVM, LR, and RF models. Their findings showed that the SVM achieved the best accuracy (98%), followed by RF (85%) and LR (78%), measured using accuracy, precision, and recall metrics [32]. Separately, Faker and Dogdou combined big data analytics with deep learning to improve intrusion detection systems. They tested three classifiers—Random Forest, Gradient Boosting Tree (as an ensemble), and a Deep Feed Forward Neural Network (DNN). The DNN performed best, achieving the highest accuracy of 99.16% in binary classification and 97.01% in the multiclass setting [137]. In 2022, researchers developed a defense system combining an RF model with a GAN-powered deep learning approach. They trained this model on a blended dataset containing both real network data and artificial samples produced by ZOO and GAN attack methods. The team also used PCA (principal component analysis) to choose the most valuable features from the generated adversarial examples (AEs), which helped to improve the model during training. Test results showed that the system's accuracy improved by 27% compared to previous methods [5].
The researchers developed a DeepFool-based defense method to protect the system against four different threat models: DeepFool, KDE, ZOO, and GAN. They tested this approach utilizing CICIDS2019, CICIDS2018, and ADFA-LD. The experimental findings demonstrated that their defensive solution outperformed other models by detecting more attacks successfully [62]. More specifically, the detection rates have been improved by about 15–25% on average against the DeepFool and ZOO, and 10–20% on average against the GANs due to the precise selection of the boundary perturbations.

5.2. Mitigation of White-Box Attack

5.2.1. Mitigating Poisoning Attack

Benzaïd et al. [63] proposed a DDoS self-protection technique, which resists malicious attacks. The proposal employed software-defined networking and DL to automatically spot and prevent DDoS threats. It showed good performance in improving server response time and reducing system load. The detection model was implemented leveraging a Multilayer Perceptron (MLP) trained on both normal and DDoS traffic from the CICIDS2017 dataset. To defend against attacks, adversarial training was applied using adversarial examples (AEs) created with the FGSM technique.
Benaddi et al. [138] employed Generative Adversarial Networks (GANs) to train NIDS based on Distributional Reinforcement Learning (DRL). This approach helped to detect rare network threats and refined the robustness and the performance of NIDS in Industrial Internet of Things (IIoT) environments. Similarly, in [139], the authors introduced the Decentralized Swift Vigilance (Desvig) approach, which incorporates Conditional GANs (C-GANs) [140] to enable fast-response, high-efficiency security solutions for industrial settings. Benaddi et al. [141] introduced C-GAN (Conditional GAN) to strengthen a hybrid LSTM–CNN-based NIDS for IoT networks. They used this as an external training network to refine the classifier’s robustness. The approach was actually adapted from an auxiliary classifier GAN (AC-GAN) framework [142].

5.2.2. Mitigating Evasion Attack

Stochastic Activation Pruning (SAP) [143] is a post hoc defense method against adversarial attacks. During the neural network’s forward pass, it randomly drops the activations at each layer, with smaller activations being more likely to be dropped. To balance this, the remaining activations in later layers are scaled up to maintain input strength. By pruning weaker activations, SAP reduces the chance that adversarial changes will build up in deeper layers, improving the model’s resistance to adversarial examples.
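A minimal numpy sketch of the SAP idea for a single hidden layer is shown below: activations are sampled with probability proportional to their magnitude, and the survivors are rescaled by the inverse of their keep probability so the layer output stays unbiased in expectation. The layer size and keep fraction are illustrative assumptions.

```python
import numpy as np

def sap_layer(activations, keep_frac=0.5, rng=None):
    """Stochastic Activation Pruning for one layer of post-ReLU activations."""
    rng = rng or np.random.default_rng(0)
    mag = np.abs(activations)
    probs = mag / mag.sum()                              # sample proportional to magnitude
    n_draws = max(1, int(keep_frac * activations.size))
    drawn = rng.choice(activations.size, size=n_draws, replace=True, p=probs)
    keep_prob = 1.0 - (1.0 - probs) ** n_draws           # P(unit survives the draws)
    mask = np.zeros_like(activations)
    kept = np.unique(drawn)
    mask[kept] = 1.0 / keep_prob[kept]                   # inverse-probability rescaling
    return activations * mask

h = np.random.default_rng(1).uniform(0, 1, size=16)      # toy post-ReLU activations
print(np.round(sap_layer(h), 3))
```

In a full network, this pruning step would be applied to every layer during the forward pass at inference time, which is what makes SAP a post hoc defense requiring no retraining.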
In [144], a defensive technique was presented to counter AEs: a DNN classifier was trained against AEs using the min–max method on the UNSW-NB15 dataset. The maximization step constructs AEs that increase the loss, while the minimization step serves as the protection method, decreasing the loss over the injected adversarial samples during the training phase. Four different models were leveraged to create AEs, and, to improve the classifier's resilience, the model was trained on both the generated AEs and the normal instances for each technique. During testing, the classifiers were attacked with four groups of AEs for every strategy. Among all adversarial attack strategies, the model trained with AEs produced via dFGSMS showed the lowest evasion results, while, among all other models, the BGAS-trained model withstood all attack strategies best. PCA was also used for dimensionality reduction and mostly improved the robustness of the trained models against AEs. Compared to the initial trial, the evasion rates were lowered by about a factor of three, and the trained models were equally resilient to all forms of AEs.
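The min–max recipe can be illustrated with a hedged sketch on a logistic-regression stand-in: the inner maximization crafts FGSM perturbations of the training inputs, and the outer minimization updates the weights on those perturbed inputs. The synthetic dataset, surrogate model, and hyperparameters are assumptions for illustration, not the setup of [144].

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=10, random_state=5)
rng = np.random.default_rng(5)
w, b = rng.normal(size=X.shape[1]) * 0.01, 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

eps, lr = 0.1, 0.1
for epoch in range(200):
    p = sigmoid(X @ w + b)
    grad_x = (p - y)[:, None] * w                 # inner max: per-sample input gradient
    X_adv = X + eps * np.sign(grad_x)             # FGSM perturbation (approximate worst case)
    p_adv = sigmoid(X_adv @ w + b)                # outer min: train on the perturbed batch
    grad_w = X_adv.T @ (p_adv - y) / len(y)
    grad_b = np.mean(p_adv - y)
    w -= lr * grad_w
    b -= lr * grad_b

acc_clean = np.mean((sigmoid(X @ w + b) > 0.5) == y)
X_test_adv = X + eps * np.sign((sigmoid(X @ w + b) - y)[:, None] * w)
acc_adv = np.mean((sigmoid(X_test_adv @ w + b) > 0.5) == y)
print(f"clean accuracy: {acc_clean:.3f}  FGSM accuracy: {acc_adv:.3f}")
```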
Researchers in [4] introduced Reconstruction from Partial Observation (RePO) to make an unsupervised denoising autoencoder better at detecting AEs in NIDS for both packet- and flow-based data. They compared RePO with baseline methods such as Kitsune [83], BiGAN [119], and DAGMM, and used the adversarial attacks from [118]. Testing on the CICIDS2017 dataset, RePO enhanced adversarial detection by 29% and 45% compared to the baselines. The authors in [145] introduced a method to reveal AAs during inference by analyzing neural network activations. They trained an artificial neural network (ANN) on part of the CICIDS2017 dataset and recorded its activations during testing. These activations were then leveraged to train and evaluate different models—Adaboost, ANN, Random Forest (RF), and SVM—to identify adversarial examples (AEs). Their results showed that RF and KNN achieved a high recall of 0.99 against evasion attacks generated using PGD, C&W, BIM, and FGSM.
Peng et al. [29] developed a way to spot adversarial samples (ASD) using a bidirectional GAN (BiGAN) to shield NIDS from attacks. Their system has three pieces: (1) an encoder, (2) a discriminator, and (3) a generator. The generator studies normal traffic patterns during training. The ASD then looks for oddities in the data by checking reconstruction errors, while the discriminator examines these errors. This lets the ASD block suspicious data before it hits the NIDS. Tests revealed that without ASD, the DNN's performance dropped sharply: 60.46% (FGSM attack), 28.23% (PGD), and 46.5% (MI-FGSM). With ASD, performance bounced back—rising 26.46% against PGD and 11.85% against FGSM—though its impact on MI-FGSM remained unclear. Ganesan and Sarac [146] used a feature selection method (RFR) to eliminate the features that attackers exploit to fool detection systems. They experimented with smaller sets of features to create a group of ML classifiers—LR, RF, SVM, and DNN—that is harder to trick. The tests used three datasets, KDDCup99, CICIDS, and DARPA, with Hydra creating the attack data. The results proved that this group of models working together could catch attacks that a single model using all features would miss.
The research examined whether attack methods that work on one ML-based NIDS could fool other, differently designed NIDS models when the attacker has no internal knowledge (black-box conditions) [147]. To circumvent a DNN model, both FGSM and PGD were utilized to create AEs. The transferability of these examples was then investigated across prominent ML techniques: LDA, DT, RF, LR, and SVM. The trials showed that the DNN classifier was degraded the most, while the remaining classifiers were degraded to varying but lesser degrees owing to differences in their model designs. Furthermore, the study found that a combination of detection systems was more resilient to transferable threats than an individual classifier, and the Detect-and-Reject strategy used helped to minimize the effects of adversarial attack transferability across ML-based detection classifiers.
Novaes et al. [148] used a GAN to detect DDoS attacks in SDN networks. Since GANs can generate adversarial traffic, they were also used for defense through adversarial training. The GAN performed better than MLP, CNN, and LSTM models when tested on the CICDDoS 2019 dataset and simulated SDN traffic. Yumlembam et al. [149] introduced a GAN-based system to strengthen Android malware detection using Graph Neural Networks (GNNs). Their method uses a GNN to create graph embeddings from API call graphs, combining 'Permission' and 'Intent' features to boost malware classification. They developed VGAE-MalGAN, which adds fake nodes and edges to API graphs, tricking GNN-based malware detectors while keeping the original malware behavior intact. VGAE-MalGAN has two parts: a generator (a modified variational graph autoencoder) and a substitute detector (a GraphSAGE model). Experiments showed that retraining the model with VGAE-MalGAN makes it more resistant to attacks while maintaining high detection accuracy. The study used the CMaldroid and Drebin datasets for evaluation.
To make the model stronger against attacks, three defense methods were examined: adversarial training, high confidence, and Gaussian data augmentation (GDA). After applying these defenses, the model's confidence score improved significantly in all four attack scenarios. This study examined different advanced AAs and their defensive techniques in NIDS, supported by detailed experiments; the results suggest that DL-based NIDS should not be used for critical, real-time applications unless strong defenses are in place to block adversarial attacks [126]. The study in [64] introduced a defensive approach to mitigate the strong C&W attacks. During training, GDA with adjusted adversarial training was employed, while during testing, the Feature Squeezing (FS) technique was executed on adversarial samples before sending them to the NIDS for classification. The method was tested using the newer CIC-DDoS-2019 dataset, and the results were measured using classification reports and confusion matrices [64]. The 2025 hybrid framework boosts robustness by 20–30% against evasion attacks, enhancing defenses through integrated adversarial training [131].

5.3. Mitigation of Gray-Box Attack

5.3.1. Mitigating Poisoning Attack

In [150], a defense method is proposed against causative attacks, which work by tricking an email filter into learning incorrectly when it is trained on malicious emails, leading to misclassifications. Each attack email worsens the filter's performance, so by measuring the harmful impact of each email, the damaging ones can be eliminated from the training set. The detection technique measures an email's impact by comparing filter performance with and without it; this technique is named Reject On Negative Impact (RONI). To test it, a small 50-email validation set (V) and a 20-email training set (T) were repeatedly assessed five times to obtain reliable estimates. For each test email Q, two models—one using T alone and another using T∪Q—were trained, and Q's impact was measured as the average change in misclassifications on V. If Q's effect is harmful, it is removed. RONI was tested with 120 normal spam emails and dictionary attacks (using Aspell and Usenet dictionaries). The results showed that RONI can effectively prevent dictionary threats, with all attack emails detected without wrongly flagging harmless ones. Note that attack emails dropped the true-negative rate (correctly classified safe emails) by about 6.8, while normal spam decreased it by at most 4.4, making the rejection threshold effective. However, RONI fails against focused attacks because they target future emails, leaving almost no trace in the training data.
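A toy sketch of RONI-style filtering in a generic setting is shown below: each candidate training sample is admitted only if adding it does not degrade accuracy on a held-out validation set beyond a threshold. The classifier, sample sizes, threshold, and the simulated label-flip poisoning are illustrative assumptions, not the email-filter setup of [150].

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=10, random_state=6)
X_base, y_base = X[:60], y[:60]          # trusted base training set
X_val,  y_val  = X[60:120], y[60:120]    # held-out validation set
X_cand, y_cand = X[120:180], y[120:180].copy()
y_cand[:15] = 1 - y_cand[:15]            # simulate 15 poisoned (label-flipped) samples

def roni_filter(X_cand, y_cand, threshold=0.0):
    """Keep a candidate only if adding it does not hurt validation accuracy."""
    base_acc = GaussianNB().fit(X_base, y_base).score(X_val, y_val)
    keep = []
    for i in range(len(X_cand)):
        Xi = np.vstack([X_base, X_cand[i]])
        yi = np.append(y_base, y_cand[i])
        acc_with = GaussianNB().fit(Xi, yi).score(X_val, y_val)
        if base_acc - acc_with <= threshold:      # no negative impact -> accept
            keep.append(i)
    return np.array(keep)

kept = roni_filter(X_cand, y_cand)
print(f"kept {len(kept)} of {len(X_cand)} candidates; "
      f"poisoned samples kept: {np.sum(kept < 15)} of 15")
```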

5.3.2. Mitigating Evasion Attack

Apruzzese et al. [151] introduced AppCon, a combined technique that integrates strong defenses to protect NIDS against adversarial evasion threats while preserving high performance. AppCon reduces the number of malicious instances with which an opponent can bypass the detection system. Robust AEs were created by slightly modifying combinations of the infected flow features [152]. AppCon was assessed on finely built NIDS classifiers, involving the CTU-13 dataset with different models, and the results demonstrated that the proposal decreased the effectiveness of the evasion threats by more than 50% without reducing classifier performance.
Siganos et al. (2023) created an AI-based IDS for IoT systems, making it easier to understand using SHapley Additive exPlanations (SHAP) with both ML and DL models. This helped to explain how the IDS works, removing the “black-box” problem. The system performed well in detection and met the growing need for Explainable AI in complex systems. For testing, they used two balanced datasets, including IEC 60870-5-104, and compared ten different ML models. Among these models, RF was the most accurate, achieving an F1-score of 66% [153]. In 2023, researchers developed an AI-powered NIDS to fix data imbalance problems. The system generates realistic artificial data and monitors training balance to address this issue. When tested with the autoencoder, DNN, and CNN detection models, it performed better than existing methods. Experiments used both real-world and IoT datasets, showing major improvements—hitting 93.2% and 87% accuracies on NSL-KDD and UNSW-NB15 datasets, respectively. The system also effectively detected network attacks in IoT data [154].
In 2024, researchers created an attention–GAN method to better detect cyber threats. This approach generates powerful attack samples that help train detection models to catch more threats. By combining a GAN with attention mechanisms, the system improves the detection of sophisticated attacks. Tests using both the CICIDS2017 and KDD Cup datasets showed impressive results—reaching 97% for recall, precision, and F1-score, and 99.69% accuracy. The method also includes a new defense strategy against future threats [155]. Unified defenses remain underexplored owing to the challenges of combining evasion (inference-time) and poisoning (training-phase) mitigations, such as conflicting optimization goals and increased complexity [131]. Hybrid ensembles boost model robustness by about 20–30%, but they still require standardized benchmarks for evaluation.

5.4. Mitigating the Combination of Poisoning and Evasion Attacks

Qureshi et al. [156] proposed a threat detection technique named RNN-ADV, which employs the Artificial Bee Colony (ABC) algorithm with a Random Neural Network to spot attack attempts. The authors checked how well it withstands JSMA threats and compared it with a DNN trained on the NSL-KDD dataset. Under attack, RNN-ADV performed better, scoring an F1 of 44.65% for DoS attacks and 52.60% for benign traffic, while the DNN only managed 35.69% and 25.89% in the same cases. The results showed that RNN-ADV classified adversarial examples with higher precision and accuracy than the DNN. Table 5 summarizes the research studies on defending ML-based NIDS; adversarial training can improve detection performance by approximately 15–40% against evasion attacks but struggles with poisoning attacks, typically providing less than 50% effectiveness. Hybrid techniques have been shown to be promising, but they remain underexplored due to the significant overhead they require [10,131].
Considering the current defensive techniques, recent studies indicate that adversarial training and GAN-based defenses are the most effective and robust against evasion attacks, improving model robustness by 15–40% [63,147]. However, anomaly detection approaches perform poorly against poisoning attacks, with effectiveness usually below 50% [135]. Unfortunately, most existing defensive approaches are attack-specific and barely address hybrid attacks, owing to the high computational overhead required and the lack of unified evaluation benchmarks [131]; their real-world performance also drops significantly in high-speed networks, where latency constraints limit timely detection. Note that hybrid frameworks, such as ensemble-based ones [131], can mitigate both poisoning and evasion attacks by combining adversarial training with detection approaches, improving model robustness by about 20–30% against combined attacks.

6. Discussion

In this section, we suggest and give insight into some prospective concepts for further research. NIDSs operate in hostile environments and are therefore subject to many adversarial attacks. To mitigate this dilemma, designing solid systems that are resistant to these attacks is necessary. The experimental studies reviewed above indicate that evasion attacks succeed at a high rate against systems that lack dedicated defenses. We analyzed and summarized the ML-based NIDS research of the past five years, from 2020 to 2025, as shown in Figure 3.
Adversarial attacks can largely compromise and degrade NIDS classifiers. This is mainly due to the transferability of AEs across architectures: an opponent can build a substitute model and later use the resulting hostile examples to attack other targeted systems.
In general, many survey studies have been conducted, and most of the proposed mitigation techniques aim to improve the overall performance of network systems in order to strengthen the defense mechanism. However, only a few of these studies provided defensive solutions and evaluated them against several factors, such as time, memory, complexity, cost, power, and suitability for the target application; even these failed to incorporate essential methods such as robust optimization, data sanitization, and game theory.
From the tables given in this work, several observations can be made. The first is that many studies used the widely deprecated KDD99 dataset, as in [11,163,164], which has many flaws and does not represent current attacks on real-world applications. Consequently, emerging and realistic large-scale datasets, such as CSECICIDS2019 and LITNET-2020, are needed to address sophisticated attacks and to reflect modern network configurations as user bases grow significantly. Using these recent datasets enables scholars to obtain results that are accurate and consistent with modern technology networks to prevent advanced threats. This issue is visible in the experimental validations: many studies that depend on the obsolete KDDCup99 dataset, e.g., [11,163,164], report evasion attack rates exceeding 90%, but these results have recently been shown to be unreliable. As noted by NIST [10], such datasets fail to capture modern network conditions, including specific vulnerabilities and traffic characteristics; consequently, researchers dramatically overestimate the effectiveness of the attacks and underestimate the robustness of modern intrusion detection defenses [14]. Other researchers leverage more than one dataset in their experiments, as in [12,40,165], which is an important practice for the development of sophisticated systems. Evaluating these experiments on large datasets against adversarial attacks should demonstrate that the proposed technique is robust and resilient. Furthermore, such findings allow researchers to better understand the impact of sophisticated attacks on current detection approaches by taking up-to-date datasets into consideration.
We also note that only a few proposed techniques involve poisoning attacks, as in [40,130,166], whereas the majority concentrate on evasion attacks at inference time, as in [159,160,161]. This is an important observation, since poisoning attacks are hard to carry out under real conditions, where attackers are typically unable to control the training data. In addition, many works have concentrated on specific attack techniques, notably FGSM [160], which is the most widely used because of its simplicity, efficiency, and prevalence, while other methods, such as the ZOO method (a black-box attack), are rarely applied in the NIDS field. It should be noted that FGSM (a white-box attack) may not be suitable for the network domain since it perturbs every feature to produce hostile examples; this exaggerates the adversary's knowledge and assumes that the attackers have complete, fine-grained control to alter all system features. Most studies have also focused on the adversary-knowledge factor (white-box attacks), in which the opponent is strong in attacking the target system. By contrast, few studies consider the weaker black-box adversary, as in [129,159,162], in which the adversary has no knowledge of the attacked system and can only submit hostile data and monitor the outputs in order to understand the system's behavior. However, it is essential to take into account the number of attempts a malicious agent needs before triggering the detection mechanism. To adequately evaluate these systems, there must be diversity in the attacker-model factors in order to understand their feasibility and suitability across a wide variety of NIDS scenarios. This conclusion demonstrates the potential of new research avenues for evaluating more complicated yet realistic adversarial samples.
A noteworthy problem is that many researchers have recommended solutions, such as ensemble learning [163,164] and feature reduction [157], that have significant flaws, rendering them insufficient for protecting NIDSs from adversarial assaults. Feature removal degrades detection performance even in the non-adversarial context, while ensemble learning lacks interpretability and requires extensive computation. Such methods may also fail to resist adversarial samples that transfer across models. As a result, implementing a highly robust architecture for ML systems that can withstand adversarial attacks remains a research challenge. Moreover, although most of the research concentrates on conventional networks, fewer studies investigate adversarial assaults in IoT and wireless networks.
Many studies have focused on the image domain, and only a few are concerned with networks. This may be because features in the image domain are unconstrained, unlike network features, which involve different data types, large sizes, errors, and many forms—numerical, continuous, binary, correlated, and interdependent. When these constraints are not fully addressed, the generated hostile traffic is incorrect or mismatched.
Furthermore, not all studies have considered the possibility of initiating these assaults in real-world circumstances. A typical research project may start from an arbitrary threat model and then examine the attack's impacts, with no or inadequate consideration of the scenario's plausibility [27]. Others assume that an adversary can conduct an infinite number of attempts against the NIDS without being detected [130]. Many criteria, such as the required processing power, the number of changeable features, and the magnitude of injected instances, limit the choice of a suitable adversarial generation approach [13]. Determining the impact of AAs on ML models is essential for building a more robust and secure system, since cybersecurity systems should always account for real threats and adversaries. As a result, there is a gap between academic and real-world contexts that must be bridged in the interest of mutual benefit. We would like to highlight the critical keys that can help open the door for future research, as follows:
  • There is no defensive technique, including NIDS or hybrid ML with NIDS, that has been presented yet that can provide a very high attack detection rate against strong attacks, such as GAN, DeepFool, and KDE. In [44], extensive experiments have been conducted, and it has been pointed out that no defensive approaches were able to prevent sophisticated attacks generated via incorporating different datasets at different percentage rates of attacks.
  • Although GANs have been shown to be the most promising and powerful attacks, the recent study in [44] confirmed that other attack models, e.g., DeepFool, can be stronger than GANs if they have been trained carefully.
  • The implementation of adversarial attacks has been extensively explored in deep detail in both the image and speech processing areas; however, their impact on NIDS remains underexplored. Creating adversarial examples in real time from network traffic at the physical layer has not been proposed or demonstrated yet, which makes this an especially promising direction for future research.
  • Many ML-based NIDSs can detect attacks very accurately, but this usually makes them slower and more resource-intensive to run. On the other hand, simpler detection systems can work faster and more cheaply by removing less-critical features. While this approach makes the system more efficient, it does cause a drop in accuracy [158].
  • Current defensive techniques typically address evasion or poisoning attacks separately, leaving models subject to combined threats. Our analysis shows that very few techniques can effectively mitigate both attacks at once. To close this gap, researchers should develop strong unified defenses to prevent both attack types, and this, in turn, makes the detection systems safer against real-world adversarial approaches.
  • Unified defenses against both evasion and poisoning attacks, consistent with the 2025 NIST taxonomy [10], can be implemented by leveraging emerging ensemble-based frameworks [131]. For instance, hybrid adversarial-training ensembles have been shown to achieve 20–30% improvements in model robustness. These defensive approaches can be further strengthened through technical pathways such as hardware acceleration, e.g., FPGA and GPU pipelines for real-time adversarial example generation, which can reduce latency by approximately 50% in SDN testbeds [60,104].
  • Bridge academic–real-world gaps with physical-layer attacks, informed by 2025 reviews on NIDS-specific adversarial impacts [14]. For example, physical-layer simulations, including packet-level perturbation models deployed in SDN testbeds [103], enable realistic evaluation of adversarial traffic behavior under operational network conditions.

7. Conclusions

In this paper, we have investigated ML techniques and their importance in NIDS and provided a brief explanation of NIDSs and their correlation with ML, because NIDSs address one of the most crucial cybersecurity problems in today's advanced technologies. Supervised ML performs well on labeled attacks, unsupervised techniques are efficient at detecting anomalies, and DL models excel at handling complex attack patterns; however, they are still not robust to perturbations. The most important point is how attackers exploit these weaknesses to launch attacks. Despite their superior efficiency and performance, ML-based NIDSs are susceptible to small perturbations injected into legitimate traffic to trick the detection systems, resulting in disastrous repercussions for network security. AML remains a threat; 2025 advancements, such as enhanced ensemble frameworks for IDS robustness, offer a promising solution. Because this aspect of ML-based NIDS has received increasing but still insufficient attention, it needs to be addressed promptly, as NIDSs need more robust defense techniques against black-box adversarial attacks. This study provides an in-depth review of adversarial examples against ML-based NIDS and lays out possible directions for future work. Our main findings indicate that evasion attacks are more serious and continue to dominate adversarial attacks; therefore, it is important to develop efficient and robust NIDS techniques that can withstand such strong attacks. Moreover, there is a need for practical research on physical-layer attacks in order to bridge the gap between controlled laboratory experiments and real-world network conditions. The future of IDS is moving towards combined AI techniques and real-time defensive approaches that can operate at the physical layer. The adversarial cat-and-mouse game will persist but can be mitigated with unified hybrid techniques. Further future directions include hardware-accelerated robustness through GPU- and FPGA-based defensive techniques.

Author Contributions

Conceptualization, Q.A. and J.-S.Y.; methodology, Q.A.; software, Q.A. and M.A.; validation, Q.A., M.A. and S.A.; formal analysis, Q.A.; investigation, Q.A. and M.A.; resources, J.-S.Y. and M.A.; data curation, Q.A., O.T.K. and S.A.A.; writing—original draft preparation, Q.A.; writing—review and editing, Q.A., M.A., S.A., O.T.K., S.A.A. and J.-S.Y.; visualization, Q.A. and M.A.; supervision, J.-S.Y.; project administration, J.-S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ahmad, S.; Arif, F.; Zabeehullah, Z.; Iltaf, N. Novel approach using deep learning for intrusion detection and classification of the network traffic. In Proceedings of the 2020 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), Tunis, Tunisia, 22–24 June 2020. [Google Scholar]
  2. Alasad, Q.; Lin, J.; Yuan, J.-S.; Fan, D.; Awad, A. Resilient and secure hardware devices using ASL. ACM J. Emerg. Technol. Comput. Syst. 2021, 17, 1–26. [Google Scholar] [CrossRef]
  3. Alasad, Q.; Yuan, J.-S.; Subramanyan, P. Strong logic obfuscation with low overhead against IC reverse engineering attacks. ACM Trans. Des. Autom. Electron. Syst. 2020, 25, 1–34. [Google Scholar] [CrossRef]
  4. Hashemi, M.J.; Keller, E. Enhancing robustness against adversarial examples in network intrusion detection systems. In Proceedings of the 2020 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), Leganes, Spain, 10–12 November 2020. [Google Scholar]
  5. Alahmed, S.; Alasad, Q.; Hammood, M.M.; Yuan, J.-S.; Alawad, M. Mitigation of black-box attacks on intrusion detection systems-based ML. Computers 2022, 11, 115. [Google Scholar] [CrossRef]
  6. Aboueata, N.; Alrasbi, S.; Erbad, A.; Kassler, A.; Bhamare, D. Supervised machine learning techniques for efficient network intrusion detection. In Proceedings of the 2019 28th International Conference on Computer Communication and Networks (ICCCN), Valencia, Spain, 29 July–1 August 2019. [Google Scholar]
  7. Kumari, A.; Mehta, A.K. A hybrid intrusion detection system based on decision tree and support vector machine. In Proceedings of the 2020 IEEE 5th International Conference on Computing Communication and Automation (ICCCA), Greater Noida, India, 30–31 October 2020. [Google Scholar]
  8. Sah, G.; Banerjee, S. Feature reduction and classification techniques for intrusion detection system. In Proceedings of the 2020 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 28–30 July 2020. [Google Scholar]
  9. Serinelli, B.M.; Collen, A.; Nijdam, N.A. Training guidance with kdd cup 1999 and nsl-kdd data sets of anidinr: Anomaly-based network intrusion detection system. Procedia Comput. Sci. 2020, 175, 560–565. [Google Scholar] [CrossRef]
  10. Vassilev, A.; Oprea, A.; Fordyce, A.; Anderson, H. Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations; NIST Trustworthy and Responsible AI Report; NIST AI 100-2e2025; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2025.
  11. Usama, M.; Asim, M.; Latif, S.; Qadir, J.; Al-Fuqaha, A. Generative adversarial networks for launching and thwarting adversarial attacks on network intrusion detection systems. In Proceedings of the 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), Tangier, Morocco, 24–28 June 2019. [Google Scholar]
  12. Chen, J.; Wu, D.; Zhao, Y.; Sharma, N.; Blumenstein, M.; Yu, S. Fooling intrusion detection systems using adversarially autoencoder. Digit. Commun. Netw. 2021, 7, 453–460. [Google Scholar] [CrossRef]
  13. Alatwi, H.A.; Aldweesh, A. Adversarial black-box attacks against network intrusion detection systems: A survey. In Proceedings of the 2021 IEEE World AI IoT Congress (AIIoT), Seattle, WA, USA, 10–13 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 34–40. [Google Scholar]
  14. Hasan, M.M.; Islam, R.; Mamun, Q.; Islam, M.Z.; Gao, J. Adversarial attacks on deep learning-based network intrusion detection systems: A taxonomy and review. SSRN 5096420. 2025. Available online: https://ssrn.com/abstract=5096420 (accessed on 7 October 2025).
  15. Alqahtani, A.; AlShaher, H. Anomaly-Based Intrusion Detection Systems Using Machine Learning. J. Cybersecur. Inf. Manag. 2024, 14, 20. [Google Scholar]
  16. Alatwi, H.A.; Morisset, C. Adversarial machine learning in network intrusion detection domain: A systematic review. arXiv 2021, arXiv:2112.03315. [Google Scholar] [CrossRef]
  17. Rawal, A.; Rawat, D.; Sadler, B.M. Recent advances in adversarial machine learning: Status, challenges and perspectives. In Proceedings of the Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III, Online, 12–16 April 2021; Volume 11746. [Google Scholar]
  18. Ahmad, Z.; Khan, A.S.; Shiang, C.W.; Abdullah, J.; Ahmad, F. Network intrusion detection system: A systematic study of machine learning and deep learning approaches. Trans. Emerg. Telecommun. Techn. 2021, 32, e4150. [Google Scholar] [CrossRef]
  19. Ozkan-Okay, M.; Samet, R.; Aslan, Ö.; Gupta, D. A comprehensive systematic literature review on intrusion detection systems. IEEE Access 2021, 9, 157727–157760. [Google Scholar] [CrossRef]
  20. Rosenberg, I.; Shabtai, A.; Elovici, Y.; Rokach, L. Adversarial machine learning attacks and defense methods in the cyber security domain. ACM Comput. Surv. 2021, 54, 1–36. [Google Scholar] [CrossRef]
  21. Jmila, H.; Khedher, M.I. Adversarial machine learning for network intrusion detection: A comparative study. Comput. Netw. 2022, 214, 109073. [Google Scholar] [CrossRef]
  22. Khazane, H.; Ridouani, M.; Salahdine, F.; Kaabouch, N. A holistic review of machine learning adversarial attacks in IoT networks. Future Internet 2024, 16, 32. [Google Scholar] [CrossRef]
  23. Alotaibi, A.; Rassam, M.A. Adversarial machine learning attacks against intrusion detection systems: A survey on strategies and defense. Future Internet 2023, 15, 62. [Google Scholar] [CrossRef]
  24. Lim, W.; Yong, K.S.C.; Lau, B.T.; Tan, C.C.L. Future of generative adversarial networks (GAN) for anomaly detection in network security: A review. Comput. Secur. 2024, 139, 103733. [Google Scholar] [CrossRef]
  25. Pacheco, Y.; Sun, W. Adversarial Machine Learning: A Comparative Study on Contemporary Intrusion Detection Datasets. In Proceedings of the International Conference on Information Systems Security and Privacy, Virtual, 11–13 February 2021; pp. 160–171. [Google Scholar]
  26. He, K.; Kim, D.D.; Asghar, M.R. Adversarial machine learning for network intrusion detection systems: A comprehensive survey. IEEE Commun. Surv. Tutor. 2023, 25, 538–566. [Google Scholar] [CrossRef]
  27. Ibitoye, O.; Abou-Khamis, R.; Shehaby, M.E.; Matrawy, A.; Shafiq, M.O. The Threat of Adversarial Attacks on Machine Learning in Network Security—A Survey. arXiv 2019, arXiv:1911.02621. [Google Scholar] [CrossRef]
  28. Liu, Q.; Li, P.; Zhao, W.; Cai, W.; Yu, S.; Leung, V.C.M. A survey on security threats and defensive techniques of machine learning: A data driven view. IEEE Access 2018, 6, 12103–12117. [Google Scholar] [CrossRef]
  29. Peng, Y.; Fu, G.; Luo, Y.; Hu, J.; Li, B.; Yan, Q. Detecting adversarial examples for network intrusion detection system with GAN. In Proceedings of the 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 16–18 October 2020. [Google Scholar]
  30. Apruzzese, G.; Colajanni, M.; Ferretti, L.; Marchetti, M. Addressing adversarial attacks against security systems based on machine learning. In Proceedings of the 2019 11th International Conference On Cyber Conflict (CyCon), Tallinn, Estonia, 28–31 May 2019. [Google Scholar]
  31. Liu, H. Automated Network Defense: A Systematic Survey and Analysis of AutoML Paradigms for Network Intrusion Detection. Appl. Sci. 2025, 15, 10389. [Google Scholar] [CrossRef]
  32. Apruzzese, G.; Andreolini, M.; Ferretti, L.; Marchetti, M.; Colajanni, M. Modeling realistic adversarial attacks against network intrusion detection systems. Digit. Threat. Res. Pract. 2022, 3, 1–19. [Google Scholar] [CrossRef]
  33. Alhajjar, E.; Maxwell, P.; Bastian, N. Adversarial machine learning in network intrusion detection systems. Expert Syst. Appl. 2021, 186, 115782. [Google Scholar] [CrossRef]
  34. Ayub, M.A.; Johnson, W.A.; Talbert, D.A.; Siraj, A. Model evasion attack on intrusion detection systems using adversarial machine learning. In Proceedings of the 2020 54th Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, 18–20 March 2020. [Google Scholar]
  35. Silva, S.H.; Najafirad, P. Opportunities and challenges in deep learning adversarial robustness: A survey. arXiv 2020, arXiv:2007.00753. [Google Scholar] [CrossRef]
  36. Marano, G.C.; Rosso, M.M.; Aloisio, A.; Cirrincione, G. Generative adversarial networks review in earthquake-related engineering fields. Bull. Earthq. Eng. 2024, 22, 3511–3562. [Google Scholar] [CrossRef]
  37. Bourou, S.; El Saer, A.; Velivassaki, T.-H.; Voulkidis, A.; Zahariadis, T. A review of tabular data synthesis using GANs on an IDS dataset. Information 2021, 12, 375. [Google Scholar] [CrossRef]
  38. Soleymanzadeh, R.; Kashef, R. Efficient intrusion detection using multi-player generative adversarial networks (GANs): An ensemble-based deep learning architecture. Neural Comput. Appl. 2023, 35, 12545–12563. [Google Scholar] [CrossRef]
  39. Dutta, I.K.; Ghosh, B.; Carlson, A.; Totaro, M.; Bayoumi, M. Generative adversarial networks in security: A survey. In Proceedings of the 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), New York, NY, USA, 28–31 October 2020. [Google Scholar]
  40. Chauhan, R.; Heydari, S.S. Polymorphic adversarial DDoS attack on IDS using GAN. In Proceedings of the 2020 International Symposium on Networks, Computers and Communications (ISNCC), Montreal, QC, Canada, 20–22 October 2020. [Google Scholar]
  41. Chen, P.-Y.; Zhang, H.; Sharma, Y.; Yi, J.; Hsieh, C.-J. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Salt Lake City, UT, USA, 14–18 October 2017. [Google Scholar]
  42. Golovin, D.; Karro, J.; Kochanski, G.; Lee, C.; Song, X.; Zhang, Q. Gradientless descent: High-dimensional zeroth-order optimization. arXiv 2019, arXiv:1911.06317. [Google Scholar]
  43. Kumar, S.; Gupta, S.; Buduru, A.B. BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization. arXiv 2024, arXiv:2405.06049. [Google Scholar]
  44. Ye, H.; Huang, Z.; Fang, C.; Li, C.J.; Zhang, T. Hessian-Aware Zeroth-Order Optimization. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 4869–4877. [Google Scholar] [CrossRef] [PubMed]
  45. Tian, X.; Kong, Y.; Gong, Y.; Huang, Y.; Wang, S.; Du, G. Dynamic geothermal resource assessment: Integrating reservoir simulation and Gaussian Kernel Density Estimation under geological uncertainties. Geothermics 2024, 120, 103017. [Google Scholar] [CrossRef]
  46. Aghaei, E.; Serpen, G. Host-based anomaly detection using Eigentraces feature extraction and one-class classification on system call trace data. arXiv 2019, arXiv:1911.11284. [Google Scholar]
  47. Pillonetto, G.; Aravkin, A.; Gedon, D.; Ljung, L.; Ribeiro, A.H.; Schön, T.B. Deep networks for system identification: A survey. Automatica 2025, 171, 111907. [Google Scholar] [CrossRef]
  48. Ahsan, M.; Khusna, H.; Wibawati; Lee, M.H. Support vector data description with kernel density estimation (SVDD-KDE) control chart for network intrusion monitoring. Sci. Rep. 2023, 13, 19149. [Google Scholar] [CrossRef]
  49. Chen, Y.-C. A tutorial on kernel density estimation and recent advances. Biostat. Epidemiol. 2017, 1, 161–187. [Google Scholar] [CrossRef]
  50. Węglarczyk, S. Kernel density estimation and its application. In ITM Web of Conferences; EDP Sciences: Les Ulis, France, 2018; Volume 23, p. 37. [Google Scholar]
  51. Petrovsky, D.V.; Rudnev, V.R.; Nikolsky, K.S.; Kulikova, L.I.; Malsagova, K.M.; Kopylov, A.T.; Kaysheva, A.L. PSSNet—An accurate super-secondary structure for protein segmentation. Int. J. Mol. Sci. 2022, 23, 14813. [Google Scholar] [CrossRef]
  52. Moosavi-Dezfooli, S.-M.; Fawzi, A.; Frossard, P. Deepfool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2574–2582. [Google Scholar]
  53. Fatehi, N.; Alasad, Q.; Alawad, M. Towards adversarial attacks for clinical document classification. Electronics 2022, 12, 129. [Google Scholar] [CrossRef]
  54. Shone, N.; Ngoc, T.N.; Phai, V.D.; Shi, Q. A deep learning approach to network intrusion detection. IEEE Trans. Emerg. Top. Comput. Intell. 2018, 2, 41–50. [Google Scholar] [CrossRef]
  55. Alahmed, S.; Alasad, Q.; Yuan, J.-S.; Alawad, M. Impacting robustness in deep learning-based NIDS through poisoning attacks. Algorithms 2024, 17, 155. [Google Scholar] [CrossRef]
  56. Jakubovitz, D.; Giryes, R. Improving DNN robustness to adversarial attacks using Jacobian regularization. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018. [Google Scholar]
  57. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
  58. Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on security and privacy (sp), San Jose, CA, USA, 22–26 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 39–57. [Google Scholar]
  59. Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z.B.; Swami, A. The limitations of deep learning in adversarial settings. In Proceedings of the 2016 IEEE European symposium on security and privacy (EuroS&P), Saarbruecken, Germany, 21–24 March 2016; IEEE: Piscataway, NJ, USA, 2016. [Google Scholar]
  60. Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards deep learning models resistant to adversarial attacks. arXiv 2017, arXiv:1706.06083. [Google Scholar]
  61. Kurakin, A.; Goodfellow, I.J.; Bengio, S. Adversarial examples in the physical world. In Artificial Intelligence Safety and Security; Chapman and Hall/CRC: Boca Raton, FL, USA, 2018; pp. 99–112. [Google Scholar]
  62. Ahmed, M.; Alasad, Q.; Yuan, J.-S.; Alawad, M. Re-Evaluating Deep Learning Attacks and Defenses in Cybersecurity Systems. Big Data Cogn. Comput. 2024, 8, 191. [Google Scholar] [CrossRef]
  63. Benzaïd, C.; Boukhalfa, M.; Taleb, T. Robust self-protection against application-layer (D) DoS attacks in SDN environment. In Proceedings of the 2020 IEEE Wireless Communications and Networking Conference (WCNC), Seoul, Republic of Korea, 25–28 May 2020. [Google Scholar]
  64. Roshan, M.K.; Zafar, A. Boosting robustness of network intrusion detection systems: A novel two phase defense strategy against untargeted white-box optimization adversarial attack. Expert Syst. Appl. 2024, 249, 123567. [Google Scholar] [CrossRef]
  65. Aljawarneh, S.; Aldwairi, M.; Yassein, M.B. Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J. Comput. Sci. 2018, 25, 152–160. [Google Scholar] [CrossRef]
  66. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015. [Google Scholar]
  67. Alosaimi, S.; Almutairi, S.M. An intrusion detection system using BoT-IoT. Appl. Sci. 2023, 13, 5427. [Google Scholar] [CrossRef]
  68. Song, J.; Takakura, H.; Okabe, Y.; Eto, M.; Inoue, D.; Nakao, K. Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. In Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, Salzburg, Austria, 10 April 2011. [Google Scholar]
  69. Manisha, P.; Gujar, S. Generative Adversarial Networks (GANs): What it can generate and What it cannot? arXiv 2018, arXiv:1804.00140. [Google Scholar]
  70. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP 2018, 1, 108–116. [Google Scholar]
  71. Liu, L.; Wang, P.; Lin, J.; Liu, L. Intrusion detection of imbalanced network traffic based on machine learning and deep learning. IEEE Access 2020, 9, 7550–7563. [Google Scholar] [CrossRef]
  72. Kilincer, I.F.; Ertam, F.; Sengur, A. Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Comput. Netw. 2021, 188, 107840. [Google Scholar] [CrossRef]
  73. Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology, Chennai, India, 1–3 October 2019. [Google Scholar]
  74. Rizvi, S.; Scanlon, M.; Mcgibney, J.; Sheppard, J. Application of artificial intelligence to network forensics: Survey, challenges and future directions. IEEE Access 2022, 10, 110362–110384. [Google Scholar] [CrossRef]
  75. Singh, G.; Khare, N. A survey of intrusion detection from the perspective of intrusion datasets and machine learning techniques. Int. J. Comput. Appl. 2022, 44, 659–669. [Google Scholar] [CrossRef]
  76. Rampure, V.; Tiwari, A. A rough set based feature selection on KDD CUP 99 data set. Int. J. Database Theory Appl. 2015, 8, 149–156. [Google Scholar] [CrossRef]
  77. Sharma, A.; Babbar, H. Detecting cyber threats in real-time: A supervised learning perspective on the CTU-13 dataset. In Proceedings of the 2024 5th International Conference for Emerging Technology (INCET), Belgaum, India, 24–26 May 2024. [Google Scholar]
  78. Yan, B.; Han, G. Effective feature extraction via stacked sparse autoencoder to improve intrusion detection system. IEEE Access 2018, 6, 41238–41248. [Google Scholar] [CrossRef]
  79. Yousefi-Azar, M.; Varadharajan, V.; Hamey, L.; Tupakula, U. Autoencoder-based feature learning for cyber security applications. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017. [Google Scholar]
  80. Han, D.; Wang, Z.; Zhong, Y.; Chen, W.; Yang, J.; Lu, S.; Shi, X.; Yin, X. Evaluating and improving adversarial robustness of machine learning-based network intrusion detectors. IEEE J. Sel. Areas Commun. 2021, 39, 2632–2647. [Google Scholar] [CrossRef]
  81. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009. [Google Scholar]
  82. Thing, V.L. IEEE 802.11 network anomaly detection and attack classification: A deep learning approach. In Proceedings of the 2017 IEEE Wireless Communications and Networking Conference (WCNC), San Francisco, CA, USA, 19–22 March 2017. [Google Scholar]
  83. Mirsky, Y.; Doitshman, T.; Elovici, Y.; Shabtai, A. Kitsune: An ensemble of autoencoders for online network intrusion detection. arXiv 2018, arXiv:1802.09089. [Google Scholar] [CrossRef]
  84. Narodytska, N.; Kasiviswanathan, S.P. Simple Black-Box Adversarial Attacks on Deep Neural Networks. In CVPR Workshops; IEEE: Piscataway, NJ, USA, 2017; Volume 2. [Google Scholar]
  85. Ghadimi, S.; Lan, G. Stochastic first-and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 2013, 23, 2341–2368. [Google Scholar] [CrossRef]
  86. Ilyas, A.; Engstrom, L.; Athalye, A.; Lin, J. Black-box adversarial attacks with limited queries and information. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 261–273. [Google Scholar]
  87. Wierstra, D.; Schaul, T.; Glasmachers, T.; Sun, Y.; Schmidhuber, J. Natural evolution strategies. arXiv 2011, arXiv:1106.4487. [Google Scholar] [PubMed]
  88. Li, P.; Zhao, W.; Liu, Q.; Liu, X.; Yu, L. Poisoning machine learning based wireless IDSs via stealing learning model. In Proceedings of the International Conference on Wireless Algorithms, Systems, and Applications, Tianjin, China, 20–22 June 2018. [Google Scholar]
  89. Zhou, X.; Liang, W.; Li, W.; Yan, K.; Shimizu, S.; Wang, K.I.-K. Hierarchical adversarial attacks against graph-neural-network-based IoT network intrusion detection system. IEEE Internet Things J. 2021, 9, 9310–9319. [Google Scholar] [CrossRef]
  90. Hamza, A.; Gharakheili, H.H.; Benson, T.A.; Sivaraman, V. Detecting volumetric attacks on IoT devices via SDN-based monitoring of MUD activity. In Proceedings of the 2019 ACM Symposium on SDN Research, San Jose, CA, USA, 3–4 April 2019; pp. 36–48. [Google Scholar]
  91. Xu, K.; Li, C.; Tian, Y.; Sonobe, T.; Kawarabayashi, K.-I.; Jegelka, S. Representation learning on graphs with jumping knowledge networks. In International Conference on Machine Learning; PMLR: New York, NY, USA, 2018; pp. 5453–5462. [Google Scholar]
  92. Kipf, T. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
  93. Zhou, X.; Liang, W.; Wang, K.I.-K.; Huang, R.; Jin, Q. Academic influence aware and multidimensional network analysis for research collaboration navigation based on scholarly big data. IEEE Trans. Emerg. Top. Comput. 2018, 9, 246–257. [Google Scholar] [CrossRef]
  94. Ma, J.; Ding, S.; Mei, Q. Towards more practical adversarial attacks on graph neural networks. Adv. Neural Inf. Process. Syst. 2020, 33, 4756–4766. [Google Scholar]
  95. Sun, Z.; Ambrosi, E.; Bricalli, A.; Ielmini, D. In-memory PageRank accelerator with a cross-point array of resistive memories. IEEE Trans. Electron Devices 2020, 67, 1466–1470. [Google Scholar] [CrossRef]
  96. Kotak, J.; Elovici, Y. Adversarial attacks against IoT identification systems. IEEE Internet Things J. 2022, 10, 7868–7883. [Google Scholar] [CrossRef]
  97. Sivanathan, A.; Gharakheili, H.H.; Loi, F.; Radford, A.; Wijenayake, C.; Vishwanath, A. Classifying IoT devices in smart environments using network traffic characteristics. IEEE Trans. Mob. Comput. 2018, 18, 1745–1759. [Google Scholar] [CrossRef]
  98. Tian, J. Adversarial vulnerability of deep neural network-based gait event detection: A comparative study using accelerometer-based data. Biomed. Signal Process. Control. 2022, 73, 103429. [Google Scholar] [CrossRef]
  99. Kuppa, A.; Grzonkowski, S.; Asghar, M.R.; Le-Khac, N.-A. Black box attacks on deep anomaly detectors. In Proceedings of the 14th International Conference on Availability, Reliability and Security, Canterbury, UK, 26–29 August 2019; pp. 1–10. [Google Scholar]
  100. Liu, F.T.; Ting, K.M.; Zhou, Z.-H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008. [Google Scholar]
  101. Zenati, H.; Romain, M.; Foo, C.-S.; Lecouat, B.; Chandrasekhar, V. Adversarially learned anomaly detection. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018. [Google Scholar]
  102. Zong, B.; Song, Q.; Min, M.R.; Cheng, W.; Lumezanu, C.; Cho, D.; Chen, H. Deep autoencoding gaussian mixture model for unsupervised anomaly detection. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  103. Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Langs, G.; Schmidt-Erfurth, U. f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks. Med. Image Anal. 2019, 54, 30–44. [Google Scholar] [CrossRef]
  104. Aiken, J.; Scott-Hayward, S. Investigating adversarial attacks against network intrusion detection systems in sdns. In Proceedings of the 2019 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN), Dallas, TX, USA, 12–14 November 2019. [Google Scholar]
  105. Yan, Q.; Wang, M.; Huang, W.; Luo, X.; Yu, F.R. Automatically synthesizing DoS attack traces using generative adversarial networks. Int. J. Mach. Learn. Cybern. 2019, 10, 3387–3396. [Google Scholar] [CrossRef]
  106. Shu, D.; Leslie, N.O.; Kamhoua, C.A.; Tucker, C.S. Generative adversarial attacks against intrusion detection systems using active learning. In Proceedings of the 2nd ACM Workshop on wireless Security and Machine Learning, Linz, Austria, 13 July 2020. [Google Scholar]
  107. Guo, S.; Zhao, J.; Li, X.; Duan, J.; Mu, D.; Jing, X. A Black-Box Attack Method against Machine-Learning-Based Anomaly Network Flow Detection Models. Secur. Commun. Netw. 2021, 2021, 5578335. [Google Scholar] [CrossRef]
  108. Sharon, Y.; Berend, D.; Liu, Y.; Shabtai, A.; Elovici, Y. Tantra: Timing-based adversarial network traffic reshaping attack. IEEE Trans. Inf. Forensics Secur. 2022, 17, 3225–3237. [Google Scholar] [CrossRef]
  109. Zolbayar, B.-E.; Sheatsley, R.; McDaniel, P.; Weisman, M.J.; Zhu, S.; Zhu, S.; Krishnamurthy, S. Generating practical adversarial network traffic flows using NIDSGAN. arXiv 2022, arXiv:2203.06694. [Google Scholar] [CrossRef]
  110. Hou, T. IoTGAN: GAN powered camouflage against machine learning based IoT device identification. In Proceedings of the 2021 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), Los Angeles, CA, USA, 13–15 December 2021. [Google Scholar]
  111. Bao, J.; Hamdaoui, B.; Wong, W.-K. Iot device type identification using hybrid deep learning approach for increased iot security. In Proceedings of the 2020 International Wireless Communications and Mobile Computing (IWCMC), Limassol, Cyprus, 15–19 June 2020. [Google Scholar]
  112. Aldhaheri, S.; Alhuzali, A. SGAN-IDS: Self-attention-based generative adversarial network against intrusion detection systems. Sensors 2023, 23, 7796. [Google Scholar] [CrossRef]
  113. Fan, M.; Liu, Y.; Chen, C.; Yu, S.; Guo, W.; Wang, L. Toward Evaluating the Reliability of Deep-Neural-Network-Based IoT Devices. IEEE Internet Things J. 2021, 9, 17002–17013. [Google Scholar] [CrossRef]
  114. Wong, E.; Rice, L.; Kolter, J.Z. Fast is better than free: Revisiting adversarial training. arXiv 2020, arXiv:2001.03994. [Google Scholar] [CrossRef]
  115. Krizhevsky, A.; Nair, V.; Hinton, G. The CIFAR-10 and CIFAR-100 Datasets. 2014. Available online: https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 3 February 2025).
  116. Usama, M.; Qadir, J.; Al-Fuqaha, A.; Hamdi, M. The adversarial machine learning conundrum: Can the insecurity of ML become the achilles’ heel of cognitive networks? IEEE Netw. 2019, 34, 196–203. [Google Scholar] [CrossRef]
  117. Abusnaina, A.; Khormali, A.; Nyang, D.; Yuksel, M.; Mohaisen, A. Examining the robustness of learning-based ddos detection in software defined networks. In Proceedings of the 2019 IEEE Conference on Dependable and Secure Computing (DSC), Hangzhou, China, 23–25 June 2019. [Google Scholar]
  118. Hashemi, M.J.; Cusack, G.; Keller, E. Towards evaluation of nidss in adversarial setting. In Proceedings of the 3rd ACM CoNEXT Workshop on Big DATA, Machine Learning and Artificial Intelligence for Data Communication Networks, Orlando, FL, USA, 9 December 2019; pp. 14–21. [Google Scholar]
  119. Zenati, H.; Foo, C.S.; Lecouat, B.; Manek, G.; Chandrasekhar, V.R. Efficient GAN-based anomaly detection. arXiv 2018, arXiv:1802.06222. [Google Scholar]
  120. Homoliak, I.; Teknos, M.; Ochoa, M.; Breitenbacher, D.; Hosseini, S.; Hanacek, P. Improving network intrusion detection classifiers by non-payload-based exploit-independent obfuscations: An adversarial approach. arXiv 2018, arXiv:1805.02684. [Google Scholar] [CrossRef]
  121. Teuffenbach, M.; Piatkowska, E.; Smith, P. Subverting network intrusion detection: Crafting adversarial examples accounting for domain-specific constraints. In Proceedings of the International Cross-Domain Conference for Machine Learning and Knowledge Extraction, Dublin, Ireland, 25–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 301–320. [Google Scholar]
  122. Wang, Y.; Wang, Y.; Tong, E.; Niu, W.; Liu, J. A c-ifgsm based adversarial approach for deep learning based intrusion detection. In Proceedings of the International Conference on Verification and Evaluation of Computer and Communication Systems, Xi’an, China, 26–27 October 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 207–221. [Google Scholar]
  123. Anthi, E. Hardening machine learning denial of service (DoS) defences against adversarial attacks in IoT smart home networks. Comput. Secur. 2021, 108, 102352. [Google Scholar] [CrossRef]
  124. Sánchez, P.M.S.; Celdrán, A.H.; Bovet, G.; Pérez, G.M. Adversarial attacks and defenses on ML-and hardware-based IoT device fingerprinting and identification. Future Gener. Comput. Syst. 2024, 152, 30–42. [Google Scholar] [CrossRef]
  125. Sánchez, P.M.S.; Valero, J.M.J.; Celdrán, A.H.; Bovet, G.; Pérez, M.G.; Pérez, G.M. LwHBench: A low-level hardware component benchmark and dataset for Single Board Computers. arXiv 2022, arXiv:2204.08516. [Google Scholar] [CrossRef]
  126. Roshan, K.; Zafar, A.; Haque, S.B.U. Untargeted white-box adversarial attack with heuristic defence methods in real-time deep learning based network intrusion detection system. Comput. Commun. 2024, 218, 97–113. [Google Scholar] [CrossRef]
  127. Xiao, H.; Biggio, B.; Brown, G.; Fumera, G.; Eckert, C.; Roli, F. Is feature selection secure against training data poisoning? In International Conference on Machine Learning; PMLR: New York, NY, USA, 2015; pp. 1689–1698. [Google Scholar]
  128. Biggio, B.; Nelson, B.; Laskov, P. Poisoning attacks against support vector machines. arXiv 2012, arXiv:1206.6389. [Google Scholar]
  129. Yang, K.; Liu, J.; Zhang, C.; Fang, Y. Adversarial examples against the deep learning based network intrusion detection systems. In Proceedings of the MILCOM 2018-2018 IEEE Military Communications Conference (MILCOM), Los Angeles, CA, USA, 29–31 October 2018. [Google Scholar]
  130. Peng, X.; Huang, W.; Shi, Z. Adversarial attack against dos intrusion detection: An improved boundary-based method. In Proceedings of the 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA, 4–6 November 2019. [Google Scholar]
  131. Awad, Z.; Zakaria, M.; Hassan, R. An enhanced ensemble defense framework for boosting adversarial robustness of intrusion detection systems. Sci. Rep. 2025, 15, 14177. [Google Scholar] [CrossRef] [PubMed]
  132. Lu, J.; Issaranon, T.; Forsyth, D. Safetynet: Detecting and rejecting adversarial examples robustly. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017. [Google Scholar]
  133. Metzen, J.H.; Genewein, T.; Fischer, V.; Bischoff, B. On detecting adversarial perturbations. arXiv 2017, arXiv:1702.04267. [Google Scholar] [CrossRef]
  134. Barreno, M.; Nelson, B.; Joseph, A.D.; Tygar, J.D. The security of machine learning. Mach. Learn. 2010, 81, 121–148. [Google Scholar] [CrossRef]
  135. Chen, X.; Liu, C.; Li, B.; Lu, K.; Song, D. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv 2017, arXiv:1712.05526. [Google Scholar] [CrossRef]
  136. Wang, J.; Dong, G.; Sun, J.; Wang, X.; Zhang, P. Adversarial sample detection for deep neural network through model mutation testing. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), Montreal, QC, Canada, 25–31 May 2019. [Google Scholar]
  137. Raghuvanshi, A.; Singh, U.K.; Sajja, G.S.; Pallathadka, H.; Asenso, E.; Kamal, M.; Singh, A.; Phasinam, K. Intrusion detection using machine learning for risk mitigation in IoT-enabled smart irrigation in smart farming. J. Food Qual. 2022, 2022, 3955514. [Google Scholar] [CrossRef]
  138. Benaddi, H.; Jouhari, M.; Ibrahimi, K.; Ben Othman, J.; Amhoud, E.M. Anomaly detection in industrial IoT using distributional reinforcement learning and generative adversarial networks. Sensors 2022, 22, 8085. [Google Scholar] [CrossRef]
  139. Li, G.; Ota, K.; Dong, M.; Wu, J.; Li, J. DeSVig: Decentralized swift vigilance against adversarial attacks in industrial artificial intelligence systems. IEEE Trans. Ind. Inform. 2019, 16, 3267–3277. [Google Scholar] [CrossRef]
  140. Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784. [Google Scholar] [CrossRef]
  141. Benaddi, H.; Jouhari, M.; Ibrahimi, K.; Benslimane, A.; Amhoud, E.M. Adversarial attacks against iot networks using conditional gan based learning. In Proceedings of the GLOBECOM 2022—2022 IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022; pp. 2788–2793. [Google Scholar]
  142. Odena, A.; Olah, C.; Shlens, J. Conditional image synthesis with auxiliary classifier gans. In International Conference on Machine Learning; PMLR: New York, NY, USA, 2017; pp. 2642–2651. [Google Scholar]
  143. Dhillon, G.S.; Azizzadenesheli, K.; Lipton, Z.C.; Bernstein, J.; Kossaifi, J.; Khanna, A.; Anandkumar, A. Stochastic activation pruning for robust adversarial defense. arXiv 2018, arXiv:1803.01442. [Google Scholar] [CrossRef]
  144. Khamis, R.A.; Shafiq, M.O.; Matrawy, A. Investigating resistance of deep learning-based ids against adversaries using min-max optimization. In Proceedings of the ICC 2020—2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020. [Google Scholar]
  145. Pawlicki, M.; Choraś, M.; Kozik, R. Defending network intrusion detection systems against adversarial evasion attacks. Future Gener. Comput. Syst. 2020, 110, 148–154. [Google Scholar] [CrossRef]
  146. Ganesan, A.; Sarac, K. Mitigating evasion attacks on machine learning based nids systems in sdn. In Proceedings of the 2021 IEEE 7th International Conference on Network Softwarization (NetSoft), Tokyo, Japan, 28 June–2 July 2021. [Google Scholar]
  147. Debicha, I.; Debatty, T.; Dricot, J.-M.; Mees, W.; Kenaza, T. Detect & reject for transferability of black-box adversarial attacks against network intrusion detection systems. In Proceedings of the International Conference on Advances in Cyber Security, Penang, Malaysia, 24–25 August 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 329–339. [Google Scholar]
  148. Novaes, M.P.; Carvalho, L.F.; Lloret, J.; Proença, M.L. Adversarial Deep Learning approach detection and defense against DDoS attacks in SDN environments. Future Gener. Comput. Syst. 2021, 125, 156–167. [Google Scholar] [CrossRef]
  149. Yumlembam, R.; Issac, B.; Jacob, S.M.; Yang, L. Iot-based android malware detection using graph neural network with adversarial defense. IEEE Internet Things J. 2022, 10, 8432–8444. [Google Scholar] [CrossRef]
  150. Nelson, B.; Barreno, M.; Chi, F.J.; Joseph, A.D.; Rubinstein, B.I.P.; Saini, U.; Sutton, C.; Tygar, J.D.; Xia, K. Misleading learners: Co-opting your spam filter. In Machine Learning in Cyber Trust: Security, Privacy, and Reliability; Springer: Berlin/Heidelberg, Germany, 2009; pp. 17–51. [Google Scholar]
  151. Apruzzese, G.; Andreolini, M.; Marchetti, M.; Colacino, V.G.; Russo, G. AppCon: Mitigating evasion attacks to ML cyber detectors. Symmetry 2020, 12, 653. [Google Scholar] [CrossRef]
  152. Apruzzese, G.; Colajanni, M. Evading botnet detectors based on flows and random forest with adversarial samples. In Proceedings of the 2018 IEEE 17th International Symposium on Network Computing and Applications (NCA), Cambridge, MA, USA, 1–3 November 2018. [Google Scholar]
  153. Siganos, M.; Radoglou-Grammatikis, P.; Kotsiuba, I.; Markakis, E.; Moscholios, I.; Goudos, S.; Sarigiannidis, P. Explainable ai-based intrusion detection in the internet of things. In Proceedings of the 18th International Conference on Availability, Reliability and Security, Benevento, Italy, 29 August–1 September 2023; pp. 1–10. [Google Scholar]
  154. Park, C.; Lee, J.; Kim, Y.; Park, J.-G.; Kim, H.; Hong, D. An enhanced AI-based network intrusion detection system using generative adversarial networks. IEEE Internet Things J. 2022, 10, 2330–2345. [Google Scholar] [CrossRef]
  155. Sen, M.A. Attention-GAN for anomaly detection: A cutting-edge approach to cybersecurity threat management. arXiv 2024, arXiv:2402.15945. [Google Scholar]
  156. Qureshi, A.-U.-H.; Larijani, H.; Mtetwa, N.; Yousefi, M.; Javed, A. An adversarial attack detection paradigm with swarm optimization. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020. [Google Scholar]
  157. Punitha, A.; Vinodha, S.; Karthika, R.; Deepika, R. A feature reduction intrusion detection system using genetic algorithm. In Proceedings of the 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN), Pondicherry, India, 29–30 March 2019. [Google Scholar]
  158. Alasad, Q.; Hammood, M.M.; Alahmed, S. Performance and Complexity Tradeoffs of Feature Selection on Intrusion Detection System-Based Neural Network Classification with High-Dimensional Dataset. In Proceedings of the International Conference on Emerging Technologies and Intelligent Systems, Riyadh, Saudi Arabia, 9–11 May 2022; Springer: Berlin/Heidelberg, Germany; pp. 533–542. [Google Scholar]
  159. Usama, M.; Qayyum, A.; Qadir, J.; Al-Fuqaha, A. Black-box adversarial machine learning attack on network traffic classification. In Proceedings of the 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), Tangier, Morocco, 24–28 June 2019. [Google Scholar]
  160. Warzyński, A.; Kołaczek, G. Intrusion detection systems vulnerability on adversarial examples. In Proceedings of the 2018 Innovations in Intelligent Systems and Applications (INISTA), Thessaloniki, Greece, 3–5 July 2018. [Google Scholar]
  161. Zhao, S.; Li, J.; Wang, J.; Zhang, Z.; Zhu, L.; Zhang, Y. attackgan: Adversarial attack against black-box ids using generative adversarial networks. Procedia Comput. Sci. 2021, 187, 128–133. [Google Scholar] [CrossRef]
  162. Lin, Z.; Shi, Y.; Xue, Z. Idsgan: Generative adversarial networks for attack generation against intrusion detection. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Chengdu, China, 16–19 May 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 79–91. [Google Scholar]
  163. Waskle, S.; Parashar, L.; Singh, U. Intrusion detection system using PCA with random forest approach. In Proceedings of the 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2–4 July 2020. [Google Scholar]
  164. Mirza, A.H. Computer network intrusion detection using various classifiers and ensemble learning. In Proceedings of the 2018 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 2–5 May 2018. [Google Scholar]
  165. Fitni, Q.R.S.; Ramli, K. Implementation of ensemble learning and feature selection for performance improvements in anomaly-based intrusion detection systems. In Proceedings of the 2020 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT), Bali, Indonesia, 7–8 July 2020. [Google Scholar]
  166. Li, P.; Liu, Q.; Zhao, W.; Wang, D.; Wang, S. Chronic poisoning against machine learning based IDSs using edge pattern detection. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018. [Google Scholar]
Figure 1. Combination of IDSs with the workflow of ML.
Figure 2. The flow of the evasion and poisoning attacks.
Figure 3. Analysis of research on ML-based NIDSs from 2020 to 2025.
Table 1. Summary of attacker model factors in adversarial machine learning.

| Factor | Description | Examples in NIDS Context | Refs. |
|---|---|---|---|
| Knowledge | Degree of knowledge the attacker has about the target model. | Black-box: only input–output access. Gray-box: limited dataset access. White-box: full access to model parameters. | [13] |
| Timing | When the attack occurs in the ML lifecycle. | Evasion: during testing, to induce misclassification (real-time traffic). Poisoning: during training, to corrupt data (data tampering). | [14,27,34] |
| Goals | The attacker's objective. | Targeted: specific misclassification. Non-targeted: general error induction. | [35] |
| Capabilities | The attacker's access to, and possible actions on, the system. | Full access: modify model internals. Partial: limited access to the model. Limited: query outputs only. | [10,15,16,17,18,19,20,21,22,23,24,30] |
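The factors in Table 1 are often encoded directly in the threat-model configuration of an experiment. Below is a minimal, hypothetical Python sketch of how such an attacker model could be recorded alongside an evaluation run; the enum values and field names are illustrative and are not taken from any cited work.

```python
from dataclasses import dataclass
from enum import Enum

class Knowledge(Enum):
    BLACK_BOX = "input-output access only"
    GRAY_BOX = "limited dataset access"
    WHITE_BOX = "full access to model parameters"

class Timing(Enum):
    EVASION = "test time (real-time traffic)"
    POISONING = "training time (data tampering)"

@dataclass
class AttackerModel:
    knowledge: Knowledge
    timing: Timing
    targeted: bool      # targeted vs. non-targeted goal
    capability: str     # e.g., "query outputs only"

# Example: a black-box, test-time, non-targeted evasion attacker.
threat = AttackerModel(Knowledge.BLACK_BOX, Timing.EVASION,
                       targeted=False, capability="query outputs only")
print(threat)
```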
Table 2. Comparison of adversarial attack techniques.

| Attack Tech. | Type (White/Black/Gray) | Strengths | Weaknesses | Success Rate in NIDS (Examples) | Datasets Tested | Refs. |
|---|---|---|---|---|---|---|
| GANs | Black/White/Gray | High evasion; realistic samples | Computationally intensive | 98% evasion [40] | CICIDS2017 | [14,24,36,37,38,39,40] |
| ZOO | Black | Gradient-free; effective in black-box settings | High query count; slow | 97% on DNNs [41] | NSL-KDD | [1,41,42,44] |
| KDE | White/Black | Non-parametric density estimation | Bandwidth sensitivity | 95% outlier detection [48] | NSL-KDD | [45,46,47,48,49,50,51] |
| DeepFool | Black/White | Minimal perturbations | Compute-heavy; white-box only | 90% misclassification [52] | KDDCup99 | [14,24,52,56] |
| FGSM | White | Fast generation | Less transferable | 97% on CNN [57] | KDDCup99 | [23,57] |
| C&W | White | Optimized for distance norms (L0/L2/L∞) | Resource-intensive | 95% bypass [58] | Various | [22,58] |
| JSMA | White | Targets key features | Slow; feature-specific | 92% targeted [59] | NSL-KDD | [22,59] |
| PGD | White | Constrained optimization | Iterative; compute-costly | 96% robust test [60] | CICIDS2017 | [22,60] |
| BIM | White | Multi-step improvements | Similar to PGD but simpler | 94% evasion [61] | CICIDS2017 | [22,61] |
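To make the white-box, single-step nature of the FGSM entry in Table 2 concrete, the sketch below shows the core perturbation step in PyTorch. It is a minimal example under assumed conditions (a differentiable classifier over preprocessed, continuous flow features); a realistic NIDS attack would additionally have to respect protocol and feature-range constraints, which are omitted here.

```python
import torch
import torch.nn as nn

def fgsm_example(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                 epsilon: float = 0.05) -> torch.Tensor:
    """Craft an FGSM adversarial example: x_adv = x + eps * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), y)
    loss.backward()                              # gradient w.r.t. the input
    x_adv = x_adv + epsilon * x_adv.grad.sign()  # single-step perturbation
    # Clamp to the normalized feature range; domain-specific constraints
    # (e.g., keeping categorical/flag features valid) are not handled here.
    return x_adv.clamp(0.0, 1.0).detach()
```

The same one-step update underlies BIM and PGD, which iterate it with projection onto an epsilon-ball, which is why their cost entries in Table 2 differ mainly in the number of gradient evaluations.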
Table 3. Pros and cons of commonly used datasets in NIDS.

| Dataset | Pros | Cons | Impact on Adversarial Robustness |
|---|---|---|---|
| KDDCup99 | Widely used benchmark; large scale | Outdated (1999); redundant records; lacks modern attacks | Skews results due to obsolete patterns; underestimates modern evasion rates by 10–20% in 2025 studies [14]; overoptimistic validity in adversarial experiments [10]. |
| NSL-KDD | Reduced redundancy compared with KDD; balanced classes | Still based on 1999 data; limited realism | Inflates robustness claims by ignoring contemporary traffic; evasion success underestimated by 15% [14]. |
| UNSW-NB15 | Realistic modern traffic; includes 9 attack families | Imbalanced; some synthetic elements | Better for robustness testing, but skews results if not balanced; underestimates poisoning by 5–10% in hybrid attacks [10]. |
| CIC-IDS2017–2019 | Comprehensive real traffic; multi-class attacks | High dimensionality; processing-intensive | Minimizes skew in robustness claims; accurate for 2025 evasion rates (up to 90% in IoT) [14]; recommended for valid experiments. |
| BoT-IoT | IoT-specific; recent botnet simulations | Focused on IoT; limited generalizability | Reduces skew for IoT NIDS but overestimates general robustness; evasion rates accurate at 80–90% [14]. |
| Kyoto 2006+ | Honeypot-based; long-term data | Older (2006+); lacks latest zero-day attacks | Skews claims in long-term studies; underestimates current adversarial impacts by 20% [10]. |
| CTU-13 | Real botnet captures; 13 scenarios | Malware-focused; dated (2011) | Moderate skew; useful for poisoning tests but underestimates evasion in modern networks [14]. |
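Several of the dataset weaknesses listed in Table 3, such as the record redundancy of KDDCup99 and the class imbalance of UNSW-NB15, can be quantified before any adversarial experiment is run. The sketch below is a generic, assumption-laden example: the file path and label column name are placeholders rather than fixed properties of these datasets.

```python
import pandas as pd

def dataset_health_report(csv_path: str, label_col: str = "label") -> None:
    """Print the duplicate ratio and class distribution of a NIDS dataset CSV."""
    df = pd.read_csv(csv_path)
    dup_ratio = df.duplicated().mean()
    print(f"records: {len(df)}, duplicate ratio: {dup_ratio:.2%}")
    print("class distribution (fraction per label):")
    print(df[label_col].value_counts(normalize=True))

# Hypothetical usage (file name and column are illustrative):
# dataset_health_report("unsw_nb15_training.csv", label_col="attack_cat")
```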
Table 4. Summary of studies on adversarial attacks (poisoning and evasion).

| Attack Tech. | Description | NIDS-Specific Examples | Refs. | Computational Cost | Detectability |
|---|---|---|---|---|---|
| GANs | Generator–discriminator model for crafting evasive samples; maximizes deception. | Bypassing NIDS with synthetic traffic [11,38]. | [14,36,37,38,39,40] | Medium (training-intensive but efficient in black-box settings). | Low (evasion rates up to 90% in network traffic; hard to detect [14]). |
| ZOO | Gradient-free optimization for black-box attacks; approximates gradients via queries. | Attacking DNN-based NIDS without model access [41,42,43,44]. | [41,42,43,44] | High (thousands of queries per sample; reduced by 20–50% with Hessian-aware variants [41,42]). | Medium (query intensity may expose the attack; slower than FGSM). |
| KDE | Non-parametric density estimation for anomaly crafting; smooths perturbations. | Spotting adversaries in NSL-KDD traffic [45,46,47,48,49,50,51]. | [45,46,47,48,49,50,51] | Low (statistical; no heavy training). | High (reveals multi-peak patterns; easier to detect in complex data [50]). |
| DeepFool | Iterative minimal perturbations to cross decision boundaries; white-box focus. | Poisoning DL-based NIDS with small changes [14,52,56]. | [14,52,56] | Medium (iterative but efficient for DNNs). | Low (small perturbations; adapted for 2025 DL-NIDS evasion success [14]). |
| FGSM | Single-step gradient-based; adds noise via the sign of the gradients. | Fast evasion in the testing phase [57]. | [57] | Low (one-step computation). | Medium (larger perturbations; more detectable than DeepFool). |
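The high query cost attributed to ZOO in Table 4 follows directly from how it approximates gradients: every coordinate update requires fresh queries to the black-box detector. The NumPy-only sketch below illustrates symmetric-difference gradient estimation over a random subset of coordinates; it is a simplified illustration, not the full ZOO algorithm, which additionally uses coordinate-wise Adam/Newton updates and attack-specific loss terms.

```python
import numpy as np

def zo_gradient(score_fn, x, h=1e-3, n_coords=16, rng=None):
    """Estimate d score / d x for a black-box score_fn via symmetric differences.

    score_fn(x) -> float, e.g., the detector's 'malicious' probability.
    Querying only n_coords random coordinates per call limits the query budget."""
    rng = rng or np.random.default_rng()
    grad = np.zeros_like(x, dtype=float)
    for i in rng.choice(x.size, size=min(n_coords, x.size), replace=False):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        grad[i] = (score_fn(x + e) - score_fn(x - e)) / (2.0 * h)
    return grad

# Hypothetical usage: take a small step that lowers the 'malicious' score.
# x_adv = x - 0.01 * zo_gradient(detector_score, x)
```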
Table 5. Summary of defensive studies against adversarial attacks.

| Defense Tech. | Description | NIDS-Specific Examples | Refs. | Success Rate vs. Combined Attacks |
|---|---|---|---|---|
| Adversarial Training | Train models on adversarial examples to build robustness. | Enhancing NIDS against evasion [4,63]. | [63,131,137,138,139,140] | 15–25% improvement; vulnerable to strong combined attacks [63,131]. |
| Feature Selection | Reduce dimensionality to eliminate vulnerable features. | Genetic algorithm for IDS efficiency [157,158]. | [157,158] | 10–20% robustness gain; limited against poisoning + evasion (drops to 5% in hybrids [14]). |
| Ensemble Methods | Combine multiple models for detection, e.g., random forests. | Boosting against white-box attacks [159,160,161,162]. | [131,159,160,161,162] | 20–30% improvement in ensembles; effective against combined (evasion + poisoning) attacks [31,131]. |
| Hybrid Defenses | Integrate training + detection (e.g., GAN-based anomaly detection). | Unified against both attack types [10,131]. | [10,131] | 25–35% in 2025 frameworks; closes the gap for combined threats [14,131]. |
| Detection-Based | Detect perturbations via isolation forests or autoencoders. | Kitsune for online anomalies [100,101,102,103]. | [100,101,102,103] | 10–15% for evasion; low (5–10%) for combined attacks without unification [10]. |
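To make the adversarial training row in Table 5 concrete, the following PyTorch sketch augments each mini-batch with FGSM-perturbed copies before the optimizer step. It is a minimal illustration under assumed conditions (continuous, normalized features and a standard cross-entropy classifier) and does not reproduce any specific defense evaluated in the cited studies.

```python
import torch
import torch.nn as nn

def adversarial_training_step(model, optimizer, x, y, epsilon=0.05):
    """One training step on a mix of clean and FGSM-perturbed samples."""
    criterion = nn.CrossEntropyLoss()

    # Craft FGSM examples from the current model state.
    x_pert = x.clone().detach().requires_grad_(True)
    criterion(model(x_pert), y).backward()
    x_adv = (x_pert + epsilon * x_pert.grad.sign()).clamp(0.0, 1.0).detach()

    # Train on clean and adversarial samples together.
    optimizer.zero_grad()
    batch_x = torch.cat([x, x_adv], dim=0)
    batch_y = torch.cat([y, y], dim=0)
    loss = criterion(model(batch_x), batch_y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Stronger variants of this defense typically replace the single FGSM step with multi-step PGD at a correspondingly higher training cost, which is consistent with the cost/robustness trade-offs reported across the cited studies.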