Article

EGGA: An Error-Guided Generative Augmentation and Optimized ML-Based IDS for EV Charging Network Security

1 Faculty of Business and Information Technology, Ontario Tech University, Oshawa, ON L1G 0C5, Canada
2 Department of Electrical and Computer Engineering, Western University, London, ON N6A 3K7, Canada
3 Department of Mathematics, Amrita School of Physical Sciences, Amrita Vishwa Vidyapeetham, Coimbatore 641112, India
* Author to whom correspondence should be addressed.
Future Internet 2026, 18(4), 202; https://doi.org/10.3390/fi18040202
Submission received: 1 March 2026 / Revised: 4 April 2026 / Accepted: 10 April 2026 / Published: 13 April 2026

Abstract

Electric Vehicle Charging Systems (EVCSs) are increasingly connected with the Internet of Things (IoT) and smart grid infrastructure, yet they face growing cyber risks due to expanded attack interfaces. These systems are vulnerable to various attacks that potentially impact both charging operations and user privacy. Intrusion Detection Systems (IDSs) are essential for identifying suspicious activities and mitigating risks to protect EVCS networks, but conventional Machine Learning (ML)-based IDSs are often unable to achieve optimal performance due to imbalanced datasets, complex traffic distributions, and human design limitations. In practice, EVCS traffic is typically multi-class, imbalanced, and safety-critical, where both missed attacks and false alarms can lead to denial of charging, service interruption, unnecessary incident escalation, financial loss, and reduced user trust. Automated ML (AutoML) and Generative Artificial Intelligence (GAI) have emerged as promising solutions in cybersecurity. Existing GAI and augmentation methods are mostly class-frequency-driven, but this does not necessarily improve the error-prone regions where IDSs actually fail. In this paper, we propose a GAI- and AutoML-based IDS that combines a Conditional Generative Adversarial Network (cGAN) with an optimized XGBoost model to improve the effectiveness of intrusion detection in EVCS networks and IoT systems. The proposed framework involves two techniques: (1) a novel cGAN-based error-guided generative augmentation (EGGA) method that extracts misclassified samples and generates a more robust training set for IDS development, and (2) an AutoML stage that automatically constructs an optimized XGBoost model based on Bayesian Optimization with Tree-structured Parzen Estimator (BO-TPE).
The main algorithmic novelty lies in EGGA, which uses model errors to guide generative augmentation toward difficult decision regions, while the overall pipeline represents a practical system-level integration of EGGA, XGBoost, and BO-TPE. To the best of our knowledge, this is the first work that combines GAI and AutoML to specifically improve detection on hard samples, enabling more autonomous and reliable identification of diverse cyber attacks in EV charging networks and IoT systems. Experiments are conducted on two benchmark EVCS and cybersecurity datasets, CICEVSE2024 and CICIDS2017, demonstrating consistent and statistically meaningful improvements over state-of-the-art IDS models. This research highlights the importance of combining automation, generative balancing, and optimized learning to strengthen cybersecurity solutions for EV charging networks and IoT systems.

1. Introduction

The global rise in Electric Vehicles (EVs) and the development of extensive charging infrastructure have transformed EV Charging Systems (EVCSs) into complex cyber-physical networks [1]. Instead of isolated devices, modern EVCSs now integrate many Internet-of-Things (IoT) components, such as Electric Vehicle Supply Equipment (EVSE), cloud servers, user apps, and smart grid interfaces [2]. This increased connectivity makes EVCS operations more convenient and intelligent, but the corresponding increased attack interface also exposes them to a wide range of cyber threats. Attackers might target the EV charging station’s firmware or backend communications (e.g., the Open Charge Point Protocol (OCPP)), or even misuse user authentication mechanisms to steal services [3]. For example, researchers have demonstrated a denial of charging attack that remotely halts charging services, causing service disruption and potential grid imbalances [4]. As EVCSs function as increasingly connected and service-dependent cyber-physical infrastructures, successful attacks can disrupt charging availability, compromise backend communications, misuse charging services, and undermine operational reliability and user trust [5,6]. Therefore, it is crucial to ensure the cybersecurity of charging networks for safety, reliability, and user trust.
Intrusion Detection Systems (IDSs) play a pivotal role in safeguarding IoT systems like EVCSs, because they can continuously monitor network traffic and device behaviors for signs of cyber attacks [3]. In EVCS deployments, an IDS can serve as a second layer of defense behind preventive controls (e.g., firewalls, encryption, and access control), which form the first layer of defense. IDSs can detect anomalous patterns indicative of severe attacks that bypass the preventive controls, and then corresponding countermeasures can be implemented to halt current attacks or prevent future attacks [7]. In EVCSs, an IDS can inspect network traffic between chargers and the central management system, or monitor internal EVSE logs, to flag anomalies and attacks. More broadly, in EVCS and smart grid environments, such IDSs are typically deployed as a critical component of a layered cybersecurity architecture, where detection complements preventive and response security mechanisms to improve overall cyber resilience [8,9].
Machine Learning (ML) models have been widely used to develop IDSs for IoT systems due to their strong capability to analyze large volumes of network traffic data and learn complex data distributions [10]. However, ML-based IDS effectiveness can be constrained by imbalanced datasets and complex traffic distributions, which may bias ML toward dominant classes and reduce its sensitivity to rare but high-impact events [11]. Generative Artificial Intelligence (GAI) models, such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and transformers, offer promising capabilities for enhancing IDSs by generating realistic synthetic data that improves training data quality, mitigates class imbalance, and increases detection robustness against rare and evolving attacks [12]. However, existing augmentation methods are mainly class-frequency-driven. They typically expand minority classes according to class counts, but this does not guarantee improvement in the error-prone regions where IDSs actually produce false positives and false negatives. On the other hand, conventional ML-based IDSs often rely on manually designed ML models, which require extensive human efforts and expertise. Automated ML (AutoML) is an advanced ML technique that aims to automatically optimize ML model performance, and Hyperparameter Optimization (HPO) is an essential procedure of AutoML [7,11]. In EV charging security, even small detection gains can be practically meaningful because they may reduce service interruption, denial of charging events, and unnecessary operational response to suspicious activities [5,6].
Therefore, this paper proposes a generative AI and AutoML-based optimized intrusion detection framework for EV charging network security. The proposed IDS is a multi-stage IDS pipeline that combines a high-performance base ML model with an error-guided generative balancing loop and an automated tuning stage. First, an XGBoost-based multi-class classifier is developed as a base model for initial intrusion detection [11]. Second, a conditional GAN (cGAN)-based error-guided generative augmentation (EGGA) method generates synthetic samples to mitigate imbalance and amplify misclassified or difficult samples within cross-validation, thereby concentrating augmentation on error-prone decision regions. Third, a Bayesian Optimization with Tree-structured Parzen Estimator (BO-TPE) optimization method is used to optimize the XGBoost hyperparameters to further improve detection performance [7]. This integration unifies generative AI with automated hyperparameter optimization in a single closed-loop IDS pipeline, improving both class balance and decision boundary quality under imbalanced, multi-class attack distributions.
The main contributions of this paper are summarized as follows:
  • It proposes EGGA, a novel error-guided generative augmentation strategy that identifies misclassified samples during cross-validation and uses a cGAN model to generate hard case-focused synthetic data and strengthen error-prone decision regions.
  • It proposes an AutoML-based optimized XGBoost model using BO-TPE for maximizing intrusion detection performance.
  • It designs a multi-stage closed-loop IDS pipeline for EV charging networks that tightly integrates a strong base learner, error-guided generative augmentation, and automated model tuning, enabling more autonomous and robust detection under imbalanced, multi-class, and nonstationary EVCS network traffic.
  • It evaluates the performance of the proposed IDS on two benchmark public cybersecurity datasets, CICEVSE2024 [13] and CICIDS2017 [14], and compares the performance with state-of-the-art GAI and optimized ML models.
  • The code for the proposed methods will be released and made publicly available on GitHub (https://github.com/LiYangHart/EGGA-Error-Guided-Generative-AI-and-Optimized-Machine-Learning-based-Intrusion-Detection-System (accessed on 9 April 2026)).
The main algorithmic novelty of this work lies in the proposed EGGA strategy. Specifically, EGGA extracts out-of-fold mistakes under stratified cross-validation, uses a cGAN to model the class conditional distribution of these difficult samples, and allocates synthetic samples proportionally to the observed error frequency of each class. The full framework then serves as an effective system-level integration by applying BO-TPE to optimize the final XGBoost detector trained on the EGGA augmented data. To the best of our knowledge, this is the first work that links error-focused generative AI models with AutoML-based model optimization, so the IDS learns difficult attack patterns more effectively and protects EV charging networks and IoT systems more reliably.
The remainder of this paper is organized as follows: Section 2 reviews related work on optimized ML and GAI-based IDS techniques for EVCS and IoT systems. Section 3 describes the proposed optimized IDS framework, focusing on the proposed cGAN-based EGGA and BO-optimized XGBoost methods. Section 4 presents the performance evaluation, including experimental setup, metrics, and discussion of results. Section 5 concludes the paper.

2. Related Work

The rapid advancement in optimized ML and generative AI techniques has reshaped how modern IDSs are designed for complex IoT environments such as EV charging networks. In this section, a comprehensive literature review is provided to introduce existing optimized ML and generative AI-based IDSs for EVCS networks, IoT systems, and modern networks.

2.1. Optimized ML-Based IDSs

Recent IDS studies increasingly emphasize optimized ML pipelines that reduce reliance on manual trial and error by systematically improving model design and hyperparameter configurations.
Bakro et al. [15] proposed a cloud-focused IDS that optimizes a Random Forest (RF) pipeline through hybrid bio-inspired feature selection. Their method combines Grasshopper optimization and a Genetic Algorithm (GA) to search for compact, high-value feature subsets, aiming to reduce high-dimensional cost while improving multi-class detection quality in cloud traffic. Elmasry et al. [16] proposed a double Particle Swarm Optimization (PSO)-based automated search strategy to optimize deep learning-based intrusion detection, including Deep Neural Networks (DNNs), Long Short-Term Memory (LSTM), and Deep Belief Networks (DBNs). This method uses swarm optimization to tune the feature set and learning hyperparameters during pretraining, enabling the IDS to adapt its architecture and achieve stronger detection results with less manual design effort.
As a popular deep learning (DL) method, Convolutional Neural Networks (CNNs) are also used for intrusion detection. Yang et al. [17] presented an optimized CNN-based IDS for vehicular networks, where transfer learning reduces training burden and hyperparameter optimization strengthens the final detector, targeting both intra-vehicle and external network attacks. Validation on Car-Hacking and CICIDS2017 datasets suggests that carefully optimized CNN transfer pipelines can generalize well across distinct vehicular security domains. Naeem et al. [18] proposed a multi-class vehicle security approach that uses deep transfer learning and CNNs as the backbone and applies GA-based optimization to tune the learning setup. This optimization-centered design aims to systematically improve performance for multi-category attack recognition in vehicle security pipelines.
Khan et al. [19] proposed Optimized Ensemble IDS (OE-IDS), an AutoML-guided ensemble IDS framework for modern networks, where AutoML is used to rank candidate learners and then integrates the selected models using soft voting to enhance intrusion detection performance on network traffic. Singh et al. [20] developed AutoML-Intrusion Detection (AutoML-ID) for wireless sensor networks, in which BO automatically selects and tunes the best-performing model from a predefined set of machine learning candidates to reduce manual design effort. The AutoML-ID outperforms many other DL and AutoML methods for intrusion detection.
While these optimization-driven approaches can substantially boost the effectiveness of ML-based IDSs, their improvements are still bounded by the quality and coverage of the available training data, especially under severe imbalance and hard-to-learn attack regions. These observations motivate complementary strategies that optimize not only the learner, but also the training data distribution, which naturally leads to generative augmentation-based IDS models.

2.2. Generative AI-Based IDSs

Generative AI modeling has increasingly been used in IDS research to address the data bottleneck by synthesizing realistic traffic samples to mitigate class imbalance and to better represent rare attack behaviors. In EVCS networks, Asim et al. [21] proposed VAE-XGBoost, a hybrid pipeline where a VAE model learns compact latent representations and supports more robust learning under imbalanced conditions, followed by XGBoost for the final attack classification in next-generation EV charging networks. This design reflects a broader trend of combining representation learning from generative models with powerful ML classifiers to obtain high accuracy while keeping inference efficient.
Lee and Park [22] introduced a GAN-based imbalanced data IDS that generates minority class samples to mitigate class imbalance issues, then evaluates detection performance using a downstream classifier such as Random Forest. Extending this idea to tabular IoT data, Habibi et al. [23] employed Conditional Tabular (CTGAN) to model and synthesize realistic tabular records for rare botnet behaviors, then trained ML detectors on the augmented data to improve botnet detection. These studies collectively suggest that generative AI-learned synthetic data can be more faithful than naive oversampling when features are highly coupled and non-linear, which is typical in network flow and IoT traffic.
Additionally, Bouzeraib et al. [24] combined horizontal federated learning with an optimized Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) to enhance IoT intrusion detection while maintaining data locality across clients. This work indicates that generative augmentation can be paired with privacy-aware training to improve robustness in distributed IoT deployments, though it also motivates careful validation of synthetic sample quality and its impact on false positives in safety-critical settings.
However, most existing generative augmentation strategies are class-frequency-driven, meaning they increase minority classes broadly rather than concentrating generation on the decision regions where the detector actually fails.

2.3. Literature Comparison and Our Contributions

To summarize, existing optimized ML-based IDSs have improved detection by enhancing ML model selection and hyperparameter tuning, while generative AI-based IDSs have improved learning under imbalance by adding synthetic samples. However, existing optimized ML pipelines [15,16,17,18,19,20] assume the training distribution is fixed, and existing generative augmentation pipelines [21,22,23,24] are class-level rather than error-level, so they do not directly reinforce the specific decision regions that cause false positives and false negatives in practice.
In EVCS and IoT settings, this limitation becomes more important because a small number of confusing samples can dominate operational risk, and simply adding more minority class points does not guarantee improved robustness on the hardest cases. Moreover, prior work typically applies generative augmentation and model optimization as separate steps, rather than as a closed loop that uses the model’s own errors to guide what data should be generated next. This gap motivates an error-guided generative augmentation loop that explicitly targets misclassified or difficult samples, combined with automated model tuning to further strengthen the final IDS.
Therefore, this work contributes a unified framework that combines optimized ML and generative AI in a closed-loop manner for EV charging network security. The main contributions of our proposed framework are as follows: First, EGGA is an error-guided cGAN-based augmentation strategy that extracts misclassified samples during cross-validation and generates targeted synthetic samples to strengthen error-prone decision regions, rather than performing only frequency-based balancing. Second, an AutoML-driven optimization stage applies BO-TPE to automatically tune XGBoost hyperparameters on the augmented training set, improving the final classifier beyond default settings or manually designed models. Third, a multi-stage IDS pipeline tightly integrates strong tree-based ML models, error-guided generative augmentation, and automated tuning to improve robustness under imbalanced, multi-class EVCS and general network traffic. The main algorithmic novelty of the proposed framework is EGGA, together with an effective system-level integration of EGGA, XGBoost, and BO-TPE for EV charging network security. The proposed IDS framework is evaluated on two public benchmark datasets, CICEVSE2024 and CICIDS2017, and is compared against state-of-the-art optimized ML and generative AI-based IDSs reviewed in this section to demonstrate its performance improvements.

3. Proposed cGAN-Based IDS Framework

3.1. System Overview

The overall architecture of the proposed EGGA-based optimized IDS is illustrated in Figure 1. The framework is designed as a multi-stage pipeline that integrates data quality improvement, error-guided generative augmentation, and automated model optimization into a single end-to-end intrusion detection workflow for EV charging networks and general network environments.
The framework takes network traffic records from CICEVSE2024 and CICIDS2017 as input and applies a consistent pre-processing block before any learning component. In the experimental workflow, each representative subset is first divided into stratified training and test splits using an 80/20 ratio, and the hold-out test set is reserved only for final evaluation. This block handles missing values via mean/mode imputation, converts categorical fields to numeric values via label encoding, and performs min-max normalization so that features share a comparable scale and the downstream generative model remains stable during training.
Next, the framework enters the proposed EGGA stage. Specifically, a base XGBoost classifier is trained under stratified cross-validation on the training set only, to identify misclassified samples across folds, which are treated as error samples that represent difficult and ambiguity-prone regions in the decision space. A cGAN model is then trained on these error samples and their class labels to generate targeted synthetic samples, producing an augmented version of the training set that strengthens the model on hard cases rather than relying solely on frequency-based class balancing; the held-out test set is never augmented.
Finally, the augmented dataset is used to train an optimized intrusion detector. In this stage, BO-TPE is employed to automatically tune the XGBoost hyperparameters using only the EGGA-augmented training set, yielding an optimized XGBoost model selected as the final IDS for deployment. The output of the framework is a multi-class decision that separates normal traffic from specific attack categories, enabling practical and accurate intrusion detection for EV charging systems and related IoT environments.

3.2. Data Pre-Processing

Before feeding data into the cGAN model and the ML classifier, several data pre-processing steps that are common in IDS development but tailored for EVCS data are performed, including data cleaning, encoding, and normalization [7].
In EVCS and IoT systems, certain features in some data samples might be missing. For example, a packet capture misses some fields, or an EV does not report a value due to unavailability. Many ML-based IDSs cannot directly address missing values or are negatively affected by them during training [7].
Therefore, it is crucial to address missing values. For numeric continuous features, mean imputation is used; for categorical features, mode imputation is used, which replaces missing values with the most frequent category [7]. Additionally, duplicated samples are dropped to avoid bias.
The EVCS data usually contains certain non-numerical features, such as IP addresses, protocol names, and charger IDs [13]. As many ML models are unable to process string or categorical data directly, encoding is required to convert categorical features into numerical features [11]. In the proposed framework, label encoding is used to transform these categorical features into integers based on the number of unique values for each feature [7]. It is used instead of other encoding methods, such as one-hot encoding, because it avoids unnecessary dimensionality expansion and reduces computational and memory overhead.
Additionally, normalization is applied to the dataset to avoid training biased ML models. As many ML models treat features with larger values as more important, data normalization is used to scale feature values into a comparable range [11]. In min-max normalization, features are scaled into the range of [0, 1]. This [0, 1] range also enables the proposed GAN models to use the sigmoid activation function for the generator output, since inconsistent feature scales could impede the learning process of both the generator and discriminator. The feature value after min-max normalization, x n , is represented by [7],
$x_n = \frac{x - \mathrm{min}}{\mathrm{max} - \mathrm{min}},$
where $x$ is the original value and $\mathrm{min}$ and $\mathrm{max}$ are the minimum and maximum values of the original feature.
After the pre-processing steps, the updated dataset serves as input to the cGAN-based EGGA method for generating a better dataset.
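As a minimal sketch of the pre-processing block described above, the following Python function applies mean/mode imputation, label encoding, and min-max normalization with scikit-learn. The function name and column handling are illustrative assumptions, not the authors' released code.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Data cleaning, encoding, and normalization as described above."""
    df = df.drop_duplicates().copy()  # drop duplicated samples to avoid bias
    for col in df.columns:
        if df[col].dtype == object:
            # Mode imputation for categorical features, then label encoding.
            df[col] = df[col].fillna(df[col].mode().iloc[0])
            df[col] = LabelEncoder().fit_transform(df[col].astype(str))
        else:
            # Mean imputation for numeric continuous features.
            df[col] = df[col].fillna(df[col].mean())
    # Min-max normalization scales every feature into [0, 1].
    df[df.columns] = MinMaxScaler().fit_transform(df[df.columns])
    return df
```

Scaling into [0, 1] also matches the sigmoid output of the generator used later in the EGGA stage.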

3.3. Proposed cGAN-Based Error-Guided Generative Augmentation (EGGA) Method

Designing an IDS framework for EV charging networks is challenging, because the data distribution is typically multi-class and strongly imbalanced, with rare attacks appearing sparsely and often overlapping with benign events in feature space [25]. In this case, even high-capacity classifiers can achieve strong average metrics while still failing on a small subset of ambiguous samples that dominate false alarms and missed detections during deployment. Moreover, common balancing strategies allocate augmentation budget based on class frequency rather than on the classifier’s empirical error patterns, which can leave hard decision regions underrepresented. Therefore, to enable error-focused learning principles, the proposed EGGA method introduces a closed-loop augmentation mechanism that uses misclassified samples as a data-driven proxy for difficult regions and then synthesizes additional training samples conditioned on their labels [26].
Accordingly, the proposed EGGA method integrates (i) cross-validated mistake extraction, (ii) conditional adversarial learning of the mistake distribution, and (iii) a mistake frequency allocation rule that determines how many synthetic samples to generate per class. The method treats generative augmentation as an optimization operator that iteratively reshapes the training distribution to reduce recurring decision errors.
Generative AI models are attractive for IDS development because they optimize the data distribution without discarding the majority class information that is essential for modeling normal behavior. In tabular security data, naive oversampling can distort joint feature dependencies, whereas neural generative models can learn complex multivariate structures and provide higher utility synthetic data for downstream classifiers [27]. Among generative models, GANs learn by adversarial training between a generator and a discriminator [22]. However, unconditional GANs can struggle in multi-class settings because the generator may collapse toward dominant modes, producing samples that do not reflect class-specific semantics. Conditional GANs address this limitation by conditioning both networks on the label, explicitly modeling the class-conditional distribution $p(x \mid y)$ [28].
Let $z \sim p(z)$ be a latent noise vector and $y \in \{1, \dots, C\}$ be a class label represented by a one-hot vector $\mathbf{y} \in \{0, 1\}^C$. A cGAN defines a generator $G(z, y)$ that outputs a synthetic feature vector $\tilde{x}$ and a discriminator $D(x, y)$ that estimates the probability that $x$ is real given the condition $y$. The standard minimax objective is [28]:
$\min_G \max_D \; \mathcal{L}_{\mathrm{cGAN}}(G, D) = \mathbb{E}_{(x, y) \sim p_{\mathrm{data}}}[\log D(x, y)] + \mathbb{E}_{z, y}[\log(1 - D(G(z, y), y))],$
where $\mathbb{E}_{(x, y) \sim p_{\mathrm{data}}}$ denotes expectation over the real data distribution and $\mathbb{E}_{z, y}$ denotes expectation over $z \sim p(z)$ and $y \sim p(y)$.
In EGGA, both $G$ and $D$ are implemented as multilayer perceptrons operating on concatenated inputs. Specifically, the generator takes $[z; y]$ and applies two hidden blocks with LeakyReLU activations and batch normalization, followed by an output layer that produces a synthetic vector in the feature dimension. The discriminator takes $[x; y]$ and applies two hidden layers with LeakyReLU, followed by a sigmoid output that represents $D(x, y) \in (0, 1)$.
Compared with a vanilla GAN, conditioning provides controllable sampling and reduces cross-class mixing. Additionally, compared with VAE-style models [21], cGANs directly optimize sample realism under an adversarial criterion rather than maximizing an evidence lower bound, which can produce overly smooth samples when the primary objective is decision boundary strengthening. In EGGA, the cGAN choice is driven by its label controllability and lightweight sampling cost, which are desirable for iterative augmentation in IDS pipelines.
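The architecture above can be sketched at the shape level in NumPy. This is only an illustration of how the concatenated inputs $[z; y]$ and $[x; y]$ flow through the two networks; batch normalization and adversarial training are omitted, and all layer sizes are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def dense(n_in, n_out):
    # Random initialization; weights are illustrative only (no training here).
    return rng.normal(0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out)

N_FEAT, N_CLASS, N_Z, H = 20, 5, 32, 64  # assumed dimensions

# Generator G(z, y): [z; y] -> two hidden blocks -> feature vector in [0, 1].
gW1, gb1 = dense(N_Z + N_CLASS, H)
gW2, gb2 = dense(H, H)
gW3, gb3 = dense(H, N_FEAT)

def generator(z, y_onehot):
    h = leaky_relu(np.concatenate([z, y_onehot], axis=1) @ gW1 + gb1)
    h = leaky_relu(h @ gW2 + gb2)
    return 1.0 / (1.0 + np.exp(-(h @ gW3 + gb3)))  # sigmoid output

# Discriminator D(x, y): [x; y] -> two hidden layers -> real probability.
dW1, db1 = dense(N_FEAT + N_CLASS, H)
dW2, db2 = dense(H, H)
dW3, db3 = dense(H, 1)

def discriminator(x, y_onehot):
    h = leaky_relu(np.concatenate([x, y_onehot], axis=1) @ dW1 + db1)
    h = leaky_relu(h @ dW2 + db2)
    return 1.0 / (1.0 + np.exp(-(h @ dW3 + db3)))
```

The sigmoid generator output is compatible with the min-max normalized features, which is why the pre-processing stage scales features into [0, 1].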
The proposed EGGA process can be summarized as an error-conditioned augmentation loop, as shown in Algorithm 1.
  • Step 1: Error mining via stratified cross-validation: Given the training set $D_{tr} = \{X_{tr}, y_{tr}\}$, EGGA runs stratified $K$-fold cross-validation with a base learner (default XGBoost configured for multi-class probabilities). In each fold, the learner is trained on $K-1$ folds and evaluated on the held-out fold. For each validation sample $i$, the predicted label is $\hat{y}_i = \arg\max_c p_\theta(y = c \mid x_i)$, and misclassified samples are collected when $\hat{y}_i \neq y_i$. The union across folds forms the mistake set:
    $\mathcal{M} = \bigcup_{k=1}^{K} \{ (x_i, y_i) : \hat{y}_i^{(k)} \neq y_i \}.$
    This design ensures that $\mathcal{M}$ contains samples that are difficult under out-of-fold evaluation, rather than merely capturing training noise. In addition, the code stores metadata such as fold index and predicted probabilities, but EGGA uses only the original feature structure and the ground-truth labels for generator training.
  • Step 2: Train a single cGAN on the aggregated mistake set: Let $X_m$ and $y_m$ denote the features and labels extracted from $\mathcal{M}$. EGGA trains one cGAN on $(X_m, y_m)$, where labels are transformed to one-hot vectors using a deterministic mapping class_to_index. During each epoch, EGGA samples a minibatch of real samples $(x, y)$ from $(X_m, y_m)$ and samples $z \sim \mathcal{N}(0, I)$. It also samples a minibatch of labels for synthetic generation by drawing from the empirical label distribution in $y_m$, producing $\tilde{y}$. The generator produces $\tilde{x} = G(z, \tilde{y})$, and the discriminator is updated by minimizing:
    $\mathcal{L}_D = -\mathbb{E}[\log D(x, y)] - \mathbb{E}[\log(1 - D(\tilde{x}, \tilde{y}))].$
    Then the generator is updated through the combined model by minimizing:
    $\mathcal{L}_G = -\mathbb{E}[\log D(G(z, \tilde{y}), \tilde{y})],$
    which encourages synthetic samples to be classified as real under the same label condition. The implementation further applies a simple stabilization heuristic by scaling the generator target label with a schedule factor that increases during training, which reduces overly aggressive generator updates in early epochs.
  • Step 3: Error-guided synthesis with mistake-proportional allocation: After training, EGGA generates synthetic samples per class using the learned condition. Let $n_c = |\{ (x, y) \in \mathcal{M} : y = c \}|$ be the mistake count for class $c$. EGGA allocates a synthetic budget:
    $s_c = m \cdot n_c,$
    where $m$ is a user-controlled multiplier. For each class $c$ with $s_c > 0$, EGGA samples $z_{1:s_c} \sim \mathcal{N}(0, I)$ and sets the condition to $y = c$, then generates $X_{syn}^{(c)} = G(z_{1:s_c}, c)$. Finally, the augmented training set is constructed as:
    $X_{tr}^{egga} = [X_{tr}; X_{syn}], \quad y_{tr}^{egga} = [y_{tr}; y_{syn}],$
    where $[\,\cdot\,;\,\cdot\,]$ denotes concatenation.
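Steps 1 and 3 can be sketched with scikit-learn as follows. A DecisionTreeClassifier stands in for the XGBoost base learner, and the function names and the toy multiplier are illustrative assumptions; the trained cGAN from Step 2 would then consume the mined mistakes and the per-class budgets.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

def mine_errors(X, y, n_splits=5, seed=0):
    """Step 1: collect indices of out-of-fold misclassified samples (set M)."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    err_idx = []
    for tr, va in skf.split(X, y):
        clf = DecisionTreeClassifier(random_state=seed).fit(X[tr], y[tr])
        pred = clf.predict(X[va])
        err_idx.extend(va[pred != y[va]])  # keep only mistakes
    return np.array(err_idx, dtype=int)

def synthetic_budget(y_mistakes, multiplier=3):
    """Step 3: allocate s_c = m * n_c synthetic samples for each class c."""
    classes, counts = np.unique(y_mistakes, return_counts=True)
    return dict(zip(classes.tolist(), (multiplier * counts).tolist()))
```

With these budgets, one would sample `s_c` latent vectors per class, condition the generator on one-hot(c), and concatenate the synthetic samples to the original training set.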
The proposed EGGA method provides three practical advantages. First, it is error-selective. The generator models the empirical distribution of misclassified samples, which concentrates synthesis on uncertain regions rather than globally oversampling minority classes. Second, it is class-controllable. Conditioning enables attack-specific synthesis and reduces the risk of generating ambiguous cross-class samples. Third, it is lightweight for iterative training. Sampling from a trained cGAN is computationally inexpensive compared with more complex likelihood-based or diffusion-based tabular generators, while still providing strong utility for downstream classifiers in many settings [26].
Algorithm 1: EGGA: Error-guided generative augmentation.
With the error-strengthened training set produced by EGGA, the next step is to optimize ML-based intrusion detection so that its capacity and regularization match the distribution of the enriched difficult samples.

3.4. Optimized XGBoost Model Using Bayesian Optimization with Tree-Structured Parzen Estimator (BO-TPE)

After generating an updated dataset with augmented difficult samples using the proposed cGAN-based EGGA method, the dataset is used to train an optimized XGBoost model using the BO-TPE method.
In this work, we adopt XGBoost, a scalable gradient-boosted decision tree (GBDT) algorithm that builds an additive ensemble of K regression trees to model complex non-linear decision functions efficiently [11]. Given a training set D = { ( x i , y i ) } i = 1 n , XGBoost predicts [29]:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F},$$
where F denotes the space of Classification and Regression Trees (CARTs). The learning objective combines empirical loss and a structural regularizer:
$$J = \sum_{i=1}^{n} \ell(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k),$$
with the common tree regularization:
$$\Omega(f) = \gamma T + \frac{\lambda}{2} \sum_{j=1}^{T} w_j^2,$$
where T is the number of leaves and w j is the score on leaf j. XGBoost optimizes (9) in a stage-wise manner using second-order approximations of the loss, which yields fast convergence and stable learning for tabular features when compared with many end-to-end deep models [29].
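The additive prediction and the tree regularizer above can be made concrete with a toy computation. The sketch below treats each "tree" simply as a function and is illustrative only; it is not the XGBoost library implementation.

```python
# Toy illustration of the XGBoost formulation above.
# Each "tree" f_k is modeled as a plain function x -> score, and the
# regularizer Omega(f) = gamma*T + (lambda/2) * sum_j w_j^2 is computed
# from a tree's leaf weights. All names and values are illustrative.

def ensemble_predict(trees, x):
    """Additive ensemble prediction: y_hat = sum over k of f_k(x)."""
    return sum(f(x) for f in trees)

def tree_regularizer(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + (lambda / 2) * sum of squared leaf scores."""
    T = len(leaf_weights)  # number of leaves
    return gamma * T + 0.5 * lam * sum(w * w for w in leaf_weights)

trees = [lambda x: 0.5 * x, lambda x: 1.0]                  # two toy "trees"
pred = ensemble_predict(trees, 2.0)                          # 0.5*2 + 1 = 2.0
omega = tree_regularizer([1.0, -2.0], gamma=1.0, lam=0.1)    # 2 + 0.25 = 2.25
```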
XGBoost is selected as the base intrusion detector due to the following reasons:
  • Boosted trees and XGBoost remain highly competitive or superior to deep neural networks on medium-scale tabular problems, often with lower tuning burden and stronger out-of-the-box performance [7].
  • The tree ensemble structure captures non-linear feature interactions that occur naturally in network behavior, while maintaining efficient inference that is suitable for real-time or near-real-time detection [30].
  • XGBoost includes explicit regularization and shrinkage mechanisms that are effective against overfitting on imbalanced security datasets, especially after the training distribution is reshaped by targeted augmentation [31].
These characteristics make XGBoost a strong choice as the final detector in a multi-stage IDS pipeline, where data improvement and model optimization are performed sequentially. XGBoost includes a rich set of hyperparameters that jointly control ensemble size, tree complexity, sampling, and regularization. The most influential ones in practice include the number of boosting rounds K (often denoted n_estimators), the maximum tree depth max_depth, and the shrinkage factor learning_rate, which together dominate the bias–variance trade-off and convergence behavior [7].
In this work, BO is instantiated using the TPE, a nonparametric density-based surrogate that models $p(\theta \mid y)$ rather than $p(y \mid \theta)$ and supports conditional, tree-structured search spaces [7]. Given an observation history $\mathcal{H} = \{(\theta_t, y_t)\}_{t=1}^{T}$, TPE splits configurations into a better set $\mathcal{H}^{(l)}$ and a worse set $\mathcal{H}^{(g)}$ using a quantile threshold $y^*$ [32]:
$$p(\theta \mid y, \mathcal{H}) = \begin{cases} l(\theta), & y \le y^*, \\ g(\theta), & y > y^*, \end{cases}$$
where $l(\theta)$ and $g(\theta)$ are Parzen window density estimates. The threshold $y^*$ is a data-dependent model parameter of BO-TPE, and it is automatically initialized and updated based on the current set of observed losses during optimization.
Candidate configurations are then proposed by maximizing the density ratio:
$$\theta_{\text{next}} = \arg\max_{\theta \in \Omega} \frac{l(\theta)}{g(\theta)},$$
which is equivalent to selecting points with high probability under good configurations and low probability under poor configurations [32]. Compared with BO using Gaussian processes, TPE is often more convenient for mixed discrete and continuous hyperparameters and high-dimensional spaces [32]. Compared with metaheuristic optimizers such as PSO and GA, BO-TPE typically uses fewer evaluations to reach strong configurations, which is important when each evaluation requires training and validating an IDS model [7].
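The TPE selection rule can be illustrated with a simplified one-dimensional sketch: split the observation history at a loss quantile, fit Gaussian Parzen densities for the good and bad sets, and pick the candidate maximizing their ratio. This is a didactic approximation under assumed defaults (fixed bandwidth, fixed quantile), not the Hyperopt implementation.

```python
import math

def parzen_density(points, x, bw=0.5):
    """Gaussian Parzen window density estimate at x."""
    if not points:
        return 1e-12
    norm = len(points) * bw * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bw) ** 2) for p in points) / norm

def tpe_propose(history, candidates, gamma=0.25, bw=0.5):
    """history: list of (theta, loss) pairs; return the next theta to try."""
    ranked = sorted(history, key=lambda pair: pair[1])
    n_good = max(1, int(gamma * len(ranked)))        # quantile threshold y*
    good = [theta for theta, _ in ranked[:n_good]]   # configurations with loss <= y*
    bad = [theta for theta, _ in ranked[n_good:]]    # configurations with loss > y*
    return max(candidates,
               key=lambda t: parzen_density(good, t, bw)
               / max(parzen_density(bad, t, bw), 1e-12))

# Low losses were observed near theta = 1, so TPE proposes the candidate
# closest to that region rather than the poorly performing region near 10.
history = [(0.9, 0.10), (1.1, 0.20), (9.0, 1.00), (10.0, 1.20)]
proposal = tpe_propose(history, candidates=[1.0, 5.0, 10.0])
```

In practice, each evaluation of a proposed configuration corresponds to training and validating an IDS model, which is why the sample efficiency of TPE matters here.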
Algorithm 2 describes the optimization process of the XGBoost model for optimizing the intrusion detection performance.
Algorithm 2: BO-TPE hyperparameter optimization for XGBoost on EGGA-augmented data.
Overall, the proposed IDS is a multi-stage pipeline that first improves the training distribution through EGGA and then maximizes detector utility through BO-TPE-optimized XGBoost. The key advantage of this design is that data optimization and model optimization are aligned: EGGA concentrates training density on empirically difficult regions, and BO-TPE then adjusts the ensemble capacity and regularization so that the final XGBoost model can exploit the enriched hard sample distribution without overfitting. This closed-loop combination strengthens detection performance on ambiguous cases while preserving efficiency, offering a practical path toward more autonomous and reliable IDS deployment in EV charging networks and related IoT systems.

4. Performance Evaluation

4.1. Experimental Setup

To develop and evaluate the proposed IDS, the models were implemented by extending the Scikit-learn [33], XGBoost [29], and Keras [34] libraries within the Python 3.7 environment. In the experiments, the models were trained on an Alienware Aurora R9 machine equipped with an Intel Core i9 9900K processor, 64 GB of RAM, and an NVIDIA GeForce RTX 2080 Ti GPU, representing a central server-class platform in EVCSs or IoT systems.
The proposed IDS framework is evaluated on two public benchmark datasets, namely CICEVSE2024 [13] and CICIDS2017 [14], to validate both EV charging-specific threat coverage and general network intrusion detection capability.
CICEVSE2024 [13] is a state-of-the-art cybersecurity dataset specifically designed for EVCS security research. It is released by the Canadian Institute for Cybersecurity (CIC) and is collected from an operational Level-2 EV charging station testbed that contains an EVSE unit, Raspberry Pi-based components, and communication equipment that enables real protocol interactions. Its testbed enables realistic data generation under both idle and charging states. CICEVSE2024 involves many network attacks, including various reconnaissance and denial of service scenarios targeting EV charging communication surfaces, which are valuable for controlled evaluation of multi-class IDS models. In this work, the IDS evaluation focuses on the network traffic datasets in CICEVSE2024.
CICIDS2017 [14] is a widely used intrusion detection benchmark dataset also released by the CIC. It contains labeled network flows with more than 80 features and corresponding traffic captures generated in various attack scenarios. The CICIDS2017 dataset includes benign traffic and multiple attack families executed during the capture period, including brute force attacks, denial of service variants, web-attacks, infiltration, botnet activity, and distributed denial of service traffic. This diversity makes CICIDS2017 useful for evaluating IDS models under heterogeneous traffic patterns and severe class imbalance, which remains a practical challenge in operational deployments.
Both datasets exhibit substantial scale and class imbalance in their original releases, which can significantly increase computational cost during model training, oversampling, and hyperparameter optimization. Therefore, for the purpose of this work, representative subsets are constructed at the class level to preserve attack diversity while ensuring reliable experimentation. In particular, the sampling strategy aims to retain sufficient samples per attack category to maintain statistically meaningful evaluation while reflecting the storage and processing constraints commonly encountered in edge and embedded deployments within EV charging ecosystems [3].
To evaluate the proposed model, both cross-validation and hold-out validation methods are used in the experiments. At the first stage of the proposed framework, five-fold cross-validation is implemented on the base ML model to identify all the difficult or misclassified samples across the entire training set. After the optimized ML model is trained on the new training set generated by the proposed cGAN-based EGGA method, its performance is evaluated on the test set. An 80/20 train–test split is used for hold-out validation, as this standard split leaves a sufficient number of samples in both partitions for reliable evaluation while avoiding overfitting. The hold-out 20% test set remains untouched throughout model development, and only the training set is used for the model development procedures, including error identification, cGAN training, synthetic sample allocation, and the HPO process using BO-TPE.
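The first-stage error identification can be sketched generically. The snippet below uses a plain (non-shuffled) k-fold split and a pluggable `train_predict` callback for brevity, whereas the paper uses stratified five-fold cross-validation with the base XGBoost model; all names are illustrative.

```python
# Sketch of cross-validation-driven error mining on the training set.
# train_predict(X_tr, y_tr, X_va) must return predicted labels for X_va;
# in the paper this role is played by the base XGBoost model.

def out_of_fold_errors(X, y, train_predict, k=5):
    """Return indices of training samples misclassified out-of-fold."""
    n = len(X)
    errors = []
    for fold in range(k):
        va = [i for i in range(n) if i % k == fold]   # validation fold
        tr = [i for i in range(n) if i % k != fold]   # remaining folds
        preds = train_predict([X[i] for i in tr], [y[i] for i in tr],
                              [X[i] for i in va])
        errors.extend(i for i, p in zip(va, preds) if p != y[i])
    return sorted(errors)

# A degenerate "always predict class 0" model misclassifies exactly the
# positive samples, which EGGA would then target for augmentation.
X = [[0.1], [0.9], [0.2], [0.3], [0.8]]
y = [0, 1, 0, 0, 1]
hard = out_of_fold_errors(X, y, lambda Xt, yt, Xv: [0] * len(Xv), k=5)
```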
Table 1 and Table 2 present the class composition of the CICEVSE2024 and CICIDS2017 datasets used in this study. For each class, the tables report the class label, total sample count in the final subset, class distribution percentage, original training sample count after the 80/20 split, the number of misclassified samples identified by the base XGBoost model during the five-fold cross-validation on the training split, the training sample count after EGGA, and the test sample count. As shown in these tables, both datasets are clearly imbalanced, with several minority attack classes containing far fewer samples than the dominant classes. This characteristic motivates the proposed EGGA method, which aims to strengthen learning in error-prone and underrepresented regions of the data distribution.
Representative subsets were constructed at the class level. For minority classes with limited samples (i.e., ICMP flood or fragmentation, normal class in CICEVSE2024, and infiltration), all original samples were retained. For the majority classes, representative subsets were randomly sampled from the original class distributions using a fixed random seed of 42 to ensure exact reproducibility. Using a fixed random seed is standard practice for improving reproducibility in ML experiments, and 42 is a commonly used implementation choice [35]. After subset construction, each dataset was partitioned into training and test sets using the same fixed seed of 42 and an 80/20 split. The exact same subset files and split partitions were used across all reproduced baselines to ensure a fair and consistent comparison.
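A seeded split of the kind described can be reproduced with a small sketch like the following; this mirrors the fixed-seed, 80/20 setup stated above but is not the authors' exact sampling code.

```python
import random

# Illustrative seeded 80/20 train-test split with a fixed seed of 42,
# so that every rerun (and every reproduced baseline) sees the same partition.

def seeded_split(indices, seed=42, test_frac=0.2):
    rng = random.Random(seed)                # local RNG; global state untouched
    idx = list(indices)
    rng.shuffle(idx)
    cut = int(round(len(idx) * (1 - test_frac)))
    return idx[:cut], idx[cut:]

train_idx, test_idx = seeded_split(range(100))
```

Because the RNG is seeded locally, calling the function again with the same arguments always yields the same partition.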
As network traffic data is usually imbalanced data, four classification metrics, including accuracy, precision, recall, and F1-scores, are used to assess the model performance comprehensively. Based on the number of true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs), these four metrics can be computed by the following formulas [36]:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} = \frac{2 \times TP}{2 \times TP + FP + FN}$$
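The four metrics above can be computed directly from the confusion counts; a small sketch for a single binary case (for multi-class IDS evaluation these would be derived per class and then averaged):

```python
# Compute accuracy, precision, recall, and F1 from TP, TN, FP, FN counts,
# following the formulas in the text.

def classification_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)   # equals 2PR / (P + R)
    return accuracy, precision, recall, f1

# Example confusion counts: 8 attacks caught, 90 benign flows passed,
# 1 false alarm, 1 missed attack.
acc, prec, rec, f1 = classification_metrics(tp=8, tn=90, fp=1, fn=1)
```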
Additionally, to evaluate the model efficiency, the model training time on the training set and the average model inference time per sample are monitored. Here, the reported training time includes the full offline model development process of each method. Therefore, for methods that use data re-balancing or synthetic augmentation, the reported training time includes both the re-balancing or augmentation time and the subsequent classifier training time. For the proposed full framework, the training time also includes the EGGA-related data generation stage and the BO-TPE-based hyperparameter optimization stage. We also reproduce and evaluate state-of-the-art methods in the literature [15,16,17,18,19,20,21,22,23,24,37] using the same metrics, and compare their performance with the proposed model’s performance.

4.2. Experimental Results and Discussion

Table 3 and Table 4 report the performance and efficiency comparisons between the proposed EGGA-based optimized IDS and representative state-of-the-art optimized ML and generative AI-based IDSs reproduced from the literature [15,16,17,18,19,20,21,22,23,24,37]. For fair comparison, the reproduced VAE, GAN, CTGAN, and WGAN-GP methods are evaluated within the same XGBoost-based IDS setting, while XGBoost + SMOTE is added as a simpler non-deep learning oversampling baseline. This additional control helps isolate whether the observed gain comes only from adding synthetic samples or from the proposed error-guided augmentation mechanism itself.
Table 5 further reports the optimized XGBoost hyperparameters selected by BO-TPE on the two datasets. The most influential parameters include the number of boosting rounds, tree depth, and learning rate. On both CICEVSE2024 and CICIDS2017, BO-TPE selects n_estimators = 80, indicating that a similar ensemble size is suitable for both datasets after EGGA. However, the optimal tree depth and learning rate differ. On CICEVSE2024, the optimal configuration is max_depth = 30 and learning_rate = 0.733, while on CICIDS2017, the optimal configuration is max_depth = 20 and learning_rate = 0.789. This result suggests that, although the two datasets benefit from a similar number of boosting rounds, their best model complexity and update strength are still dataset-dependent. Therefore, these results further justify the use of BO-TPE instead of relying on manually fixed or default hyperparameter settings.
Table 3 summarizes the results on CICEVSE2024, which represents EV charging network traffic. The proposed full pipeline (XGBoost + EGGA + BO-TPE) achieves the best overall performance, reaching 99.958% for accuracy and 99.957% for F1-score. This performance is higher than all compared optimized ML baselines in the literature, including GA-RF [15], PSO-LSTM [16], PSO-CNN [17], GA-CNN [18], OE-IDS [19], and AutoML-ID [20], and also higher than the reproduced generative augmentation variants that combine XGBoost with VAE [21], GAN [22], CTGAN [23], WGAN-GP [24], and Synthetic Minority Over-sampling Technique (SMOTE) [37]. SMOTE is another common data synthesis method that uses nearest neighbor strategies for data sample generation, and it is used as a non-deep learning method for comparison [37,38]. Compared with this simpler XGBoost + SMOTE baseline, the proposed EGGA-cGAN framework still achieves higher detection performance, indicating that the gain is not merely due to generic oversampling, but is more closely related to the proposed error-guided augmentation strategy.
The ablation study provides direct evidence of how the proposed stages contribute to the final gain. The base XGBoost model yields an F1-score of 99.916%, establishing a strong starting point for the pipeline. After applying EGGA, the performance increases to 99.944% F1, showing that error-guided conditional generation improves the classifier by strengthening the training distribution in the regions where the base learner makes mistakes. Importantly, this improvement is achieved without changing the classifier family, which supports the central hypothesis that a targeted augmentation policy can be an effective data-level optimization operator. Finally, BO-TPE further improves the F1-score to 99.957%, suggesting that after the training set is enhanced by EGGA, automated hyperparameter optimization can better align model capacity and regularization with the enriched hard-sample distribution.
Although the absolute improvement from 99.916% to 99.958% appears modest, it corresponds to a substantial relative error reduction. Using accuracy as an example, the error rate decreases from (100 − 99.917)% = 0.083% to (100 − 99.958)% = 0.042%, which is approximately a 49% reduction. In EV charging security, where even a small number of missed attacks or false alarms can translate into operational risk and high response cost, this reduction is practically meaningful.
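The quoted relative error reduction follows from a one-line computation, using the accuracy values reported in the text:

```python
# Relative error reduction on CICEVSE2024 accuracy, as computed in the text.
base_error = 100 - 99.917    # 0.083 % error rate for the base XGBoost
full_error = 100 - 99.958    # 0.042 % error rate for the full pipeline
reduction = (base_error - full_error) / base_error   # ~0.49, i.e., ~49 %
```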
Table 6 provides a more detailed class and error analysis on the CICEVSE2024 dataset. The total number of test errors is reduced from six for the original XGBoost model to three for the full proposed framework. The clearest gains appear on service detection, where the test error count decreases from two to zero, and the per-class F1-score improves from 99.661% to 99.797%, and on vulnerability scan, where the error count decreases from one to zero, and the per-class F1-score increases from 99.920% to 100.00%. For the rare ICMP flood or fragmentation class, the error count remains two because the test set contains only 18 samples, but the per-class F1-score still improves from 91.429% to 94.117%, indicating better handling of this hard minority class. Meanwhile, classes that are already near saturation remain stable. These results show that the proposed framework improves the most difficult EVCS-related classes without degrading the already well-classified classes. The confusion matrices of the original XGBoost and the full proposed framework on the CICEVSE2024 dataset are shown in Figure 2 and Figure 3.
From an efficiency perspective, as shown in Table 3, the full pipeline requires a longer training time (56.32 s) than the base XGBoost (6.86 s), mainly due to the additional stages of cross-validation-driven error mining, cGAN training, and BO-TPE search. However, the inference time remains extremely low and is the fastest among the compared methods (0.0018 ms per sample). This indicates that the proposed framework concentrates the additional computation in the offline training phase, while preserving near-real-time detection during deployment. Compared with optimization-heavy deep baselines, such as PSO-LSTM [16] and PSO-CNN [17], the proposed framework achieves both substantially higher detection performance and far lower inference latency, which is desirable for continuous monitoring in EVCS environments.
Table 4 reports the evaluation on CICIDS2017, a widely used benchmark with diverse enterprise network behaviors. The proposed full pipeline again obtains the best overall performance, achieving 99.832% accuracy and 99.829% F1-score. It outperforms the state-of-the-art optimized ML models [15,16,17,18,19,20], and also exceeds all reproduced generative augmentation combinations (XGBoost + VAE, GAN, CTGAN, and WGAN-GP) [21,22,23,24]. This consistency across datasets indicates that the proposed strategy generalizes beyond the EV-specific dataset and remains effective on a different traffic distribution. The same observation also holds on CICIDS2017, where the proposed framework remains stronger than the added SMOTE baseline. This further supports that the improvement is not obtained solely by adding extra synthetic samples, but by using error-guided augmentation to strengthen difficult decision regions.
The ablation trend on CICIDS2017 matches the observation on CICEVSE2024. The base XGBoost achieves 99.776% F1. With EGGA, the F1-score increases to 99.811%, and BO-TPE further improves it to 99.829%. In terms of relative error reduction, the accuracy error rate decreases from (100 − 99.779)% = 0.221% for the base model to (100 − 99.832)% = 0.168% for the full pipeline, corresponding to an approximate 24% reduction. This reduction supports that EGGA primarily targets boundary errors and class-overlap confusions, while BO-TPE refines the model configuration to best exploit the improved training set.
Table 7 provides a similar class-level view on the CICIDS2017 dataset. The total number of test errors decreases from 25 to 19. The most visible improvements are observed for DoS, where the error count decreases from seven to five and the per-class F1-score increases from 99.908% to 99.934%; for Web-attack, where the error count decreases from two to one and the per-class F1-score improves from 99.774% to 99.887%; and for Bot, where the error count decreases from two to one and the per-class F1-score improves from 98.958% to 99.348%. Brute force also improves slightly from two to one test errors. In contrast, infiltration remains the most challenging class, with three test errors and a 72.727% per-class F1-score in both cases, which is consistent with its extremely small test set size of only seven samples. Overall, these class-level results support that the proposed method mainly improves difficult and frequently confused attack classes, rather than merely inflating already saturated aggregate metrics. The confusion matrices of the original XGBoost and the full proposed framework on the CICIDS2017 dataset are shown in Figure 4 and Figure 5.
In addition to aggregate metrics, the added class and error analysis in Table 6 and Table 7, together with the confusion matrices in Figure 2, Figure 3, Figure 4 and Figure 5, provide task-oriented evidence of the usefulness of the generated samples. The improvements are mainly concentrated on difficult and frequently confused classes, which is consistent with the design objective of EGGA.
Regarding efficiency, as shown in Table 3 and Table 4, the inference time of the final detector remains at 0.0018 ms per sample, matching the base XGBoost and remaining substantially lower than the deep baselines. The training time increases to 78.39 s for the full pipeline, which reflects the additional offline optimization stages. This trade-off is reasonable in IDS practice, where models are typically trained periodically on a central server, while inference latency is the dominant constraint for online detection.
Across both datasets, the results support three main conclusions. First, EGGA provides consistent improvements over the base learner, indicating that error-guided conditional synthesis is an effective mechanism for improving robustness in imbalanced and multi-class intrusion detection. Second, BO-TPE yields additional gains beyond EGGA, demonstrating that data-level optimization and model-level optimization are complementary in the proposed design. Third, the final IDS maintains extremely low inference latency while achieving the best detection performance, which is critical for practical deployment in EV charging networks and other IoT-enabled security monitoring scenarios.
Lastly, although this paper focuses on IDS development and performance improvement, practical deployment in EVCS and smart grid settings should also consider broader defense layering and the potential vulnerability of ML-based IDS pipelines to adversarial manipulation or poisoning [39,40]. These important issues are beyond the main scope of the present work and will be investigated in future research.

5. Conclusions

EV charging infrastructures are evolving into large-scale cyber-physical and IoT-enabled systems, where security monitoring must detect diverse attacks with minimal operational delay. This paper presents EGGA, an error-guided generative augmentation and AutoML-based optimized IDS framework that integrates a cGAN-driven hard-sample augmentation loop with a BO-TPE-optimized XGBoost detector in a unified multi-stage pipeline. The key technical contribution is to treat model errors as a training signal for data optimization. EGGA identifies misclassified samples under stratified cross-validation and uses a conditional generator to synthesize label-consistent samples that strengthen error-prone decision regions, rather than performing only class-frequency-driven balancing. This error-guided data improvement is then coupled with BO-TPE hyperparameter optimization, enabling the final XGBoost model to better match its capacity and regularization to the enriched hard-sample distribution. Experiments on two public benchmark datasets, CICEVSE2024 and CICIDS2017, demonstrate that the proposed full pipeline achieves the best overall performance among the compared optimized ML and generative AI-based IDSs. On CICEVSE2024, it attains 99.958% accuracy and 99.957% F1-score, and on CICIDS2017, it achieves 99.832% accuracy and 99.829% F1-score. Moreover, the final detector preserves extremely low inference latency (0.0018 ms per sample on both datasets), which supports practical real-time or near-real-time deployment, while the additional computation is primarily shifted to the offline training stage. For future work, we will extend EGGA and the optimized IDSs to streaming and evolving EVCS environments, including continual learning and drift-aware updating to maintain detection reliability under nonstationary traffic.
We will also investigate broader deployment-oriented issues, including defense-in-depth integration and robustness against adversarial manipulation, poisoning, and related model vulnerability concerns.

Author Contributions

Conceptualization, L.Y. and G.K.; methodology, L.Y.; validation, L.Y. and G.K.; formal analysis, L.Y.; resources, L.Y.; data curation, L.Y.; writing—original draft preparation, L.Y.; writing—review and editing, L.Y. and G.K.; project administration, L.Y.; funding acquisition, L.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This project was made possible in part through the support of the National Cybersecurity Consortium and the Government of Canada (CSIN).

Data Availability Statement

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Tanyıldız, H.; Şahin, C.B.; Dinler, Ö.B.; Migdady, H.; Saleem, K.; Smerat, A.; Gandomi, A.H.; Abualigah, L. Detection of Cyber Attacks in Electric Vehicle Charging Systems Using a Remaining Useful Life Generative Adversarial Network. Sci. Rep. 2025, 15, 10092. [Google Scholar] [CrossRef] [PubMed]
  2. Morgan, E.F.; Ali, M.H. Digital Twin-Driven Cybersecurity for 5G/6G-Enabled Electric Vehicle Charging Infrastructure: A Review. Energies 2025, 18, 6048. [Google Scholar] [CrossRef]
  3. Fatemeh, D.; Li, Y.; Firouz, B.A.; Abdallah, S. On TinyML and Cybersecurity: Electric Vehicle Charging Infrastructure Use Case. IEEE Access 2024, 12, 108703–108730. [Google Scholar] [CrossRef]
  4. Gupta, K.; Panigrahi, B.K.; Joshi, A.; Paul, K. Demonstration of Denial of Charging Attack on Electric Vehicle Charging Infrastructure and Its Consequences. Int. J. Crit. Infrastruct. Prot. 2024, 46, 100693. [Google Scholar] [CrossRef]
  5. Ronanki, D.; Karneddi, H. Electric Vehicle Charging Infrastructure: Review, Cyber Security Considerations, Potential Impacts, Countermeasures, and Future Trends. IEEE J. Emerg. Sel. Top. Power Electron. 2024, 12, 242–256. [Google Scholar] [CrossRef]
  6. Johnson, J.; Berg, T.; Anderson, B.; Wright, B. Review of Electric Vehicle Charger Cybersecurity Vulnerabilities, Potential Impacts, and Defenses. Energies 2022, 15, 3931. [Google Scholar] [CrossRef]
  7. Yang, L. Optimized and Automated Machine Learning Techniques Towards IoT Data Analytics and Cybersecurity. Ph.D. Thesis, The University of Western Ontario (Canada), London, ON, Canada, 2022. [Google Scholar]
  8. Liu, M.; Teng, F.; Zhang, Z.; Ge, P.; Sun, M.; Deng, R.; Cheng, P.; Chen, J. Enhancing Cyber-Resiliency of DER-Based Smart Grid: A Survey. IEEE Trans. Smart Grid 2024, 15, 4998–5030. [Google Scholar] [CrossRef]
  9. Zhang, Z.; Liu, M.; Sun, M.; Deng, R.; Cheng, P.; Niyato, D.; Chow, M.-Y.; Chen, J. Vulnerability of Machine Learning Approaches Applied in IoT-Based Smart Grid: A Review. IEEE Internet Things J. 2024, 11, 18951–18975. [Google Scholar] [CrossRef]
  10. Gui, G.; Xue, Z.; Zhao, R.; Deng, X.; Zhan, M. Lightweight Intrusion Detection Methods Based on Artificial Intelligence for IoT Networks. Adv. Mach. Learn. Cyber-Attack Detect. IoT Netw. 2025, 193–225. [Google Scholar] [CrossRef]
  11. Yang, L.; Shami, A. Toward Autonomous and Efficient Cybersecurity: A Multi-Objective AutoML-Based Intrusion Detection System. IEEE Trans. Mach. Learn. Commun. Netw. 2025, 3, 1244–1264. [Google Scholar] [CrossRef]
  12. Radanliev, P.; Santos, O.; Ani, U.D. Generative AI Cybersecurity and Resilience. Front. Artif. Intell. 2025, 8, 1568360. [Google Scholar] [CrossRef]
  13. Buedi, E.D.; Ghorbani, A.A.; Dadkhah, S.; Ferreira, R.L. Enhancing EV Charging Station Security Using a Multi-Dimensional Dataset: CICEVSE2024. In Data and Applications Security and Privacy XXXVIII, Proceedings of the 38th Annual IFIP WG 11.3 Conference, DBSec 2024, San Jose, CA, USA, 15–17 July 2024; Springer: Cham, Switzerland, 2024; Volume 14901, pp. 171–190. [Google Scholar] [CrossRef]
  14. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), Funchal, Portugal, 22–24 January 2018; Volume 1, pp. 108–116. [CrossRef]
  15. Bakro, M.; Kumar, R.R.; Husain, M.; Ashraf, Z.; Ali, A.; Yaqoob, S.I.; Ahmed, M.N.; Parveen, N. Building a Cloud-IDS by Hybrid Bio-Inspired Feature Selection Algorithms Along With Random Forest Model. IEEE Access 2024, 12, 8846–8874. [Google Scholar] [CrossRef]
  16. Elmasry, W.; Akbulut, A.; Zaim, A.H. Evolving Deep Learning Architectures for Network Intrusion Detection Using a Double PSO Metaheuristic. Comput. Netw. 2020, 168, 107042. [Google Scholar] [CrossRef]
  17. Yang, L.; Shami, A. A Transfer Learning and Optimized CNN Based Intrusion Detection System for Internet of Vehicles. In Proceedings of the 2022 IEEE International Conference on Communications (ICC), Seoul, Republic of Korea, 16–20 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6.
  18. Naeem, H.; Ullah, F.; Krejcar, O.; Li, D.; Vasan, D. Optimizing Vehicle Security: A Multiclassification Framework Using Deep Transfer Learning and Metaheuristic-Based Genetic Algorithm Optimization. Int. J. Crit. Infrastruct. Prot. 2025, 49, 100745. [Google Scholar] [CrossRef]
  19. Khan, M.A.; Iqbal, N.; Imran; Jamil, H.; Kim, D.H. An Optimized Ensemble Prediction Model Using AutoML Based on Soft Voting Classifier for Network Intrusion Detection. J. Netw. Comput. Appl. 2023, 212, 103560. [Google Scholar] [CrossRef]
  20. Singh, A.; Amutha, J.; Nagar, J.; Sharma, S.; Lee, C.C. AutoML-ID: Automated Machine Learning Model for Intrusion Detection Using Wireless Sensor Network. Sci. Rep. 2022, 12, 9074. [Google Scholar] [CrossRef] [PubMed]
  21. Asim, M.; Sair, F.; Ishaq, M.; Cengiz, K.; Akleylek, S.; Ivković, N. VAE-XGBoost: A Hybrid Intrusion Detection System for next Generation EV Charging Networks. PeerJ Comput. Sci. 2026, 12, e3506. [Google Scholar] [CrossRef]
  22. Lee, J.H.; Park, K.H. GAN-Based Imbalanced Data Intrusion Detection System. Pers. Ubiquitous Comput. 2021, 25, 121–128. [Google Scholar] [CrossRef]
  23. Habibi, O.; Chemmakha, M.; Lazaar, M. Imbalanced Tabular Data Modelization Using CTGAN and Machine Learning to Improve IoT Botnet Attacks Detection. Eng. Appl. Artif. Intell. 2023, 118, 105669. [Google Scholar] [CrossRef]
  24. Bouzeraib, W.; Ghenai, A.; Zeghib, N. Enhancing IoT Intrusion Detection Systems Through Horizontal Federated Learning and Optimized WGAN-GP. IEEE Access 2025, 13, 45059–45076. [Google Scholar] [CrossRef]
  25. Jiang, J.; Tang, Q.; Wang, B.; Tang, X.; Zhu, F.; Yadav, K.; Filali, A.; Vasilakos, A.V. DEAL: Dynamic Ensemble Algorithm for Imbalanced Streaming Data in UAV-Assisted Consumer IoT. IEEE Trans. Consum. Electron. 2026, Early Access. [Google Scholar] [CrossRef]
  26. Wang, Z.; Wang, P.; Liu, K.; Wang, P.; Fu, Y.; Lu, C.T.; Aggarwal, C.C.; Pei, J.; Zhou, Y. A Comprehensive Survey on Data Augmentation. IEEE Trans. Knowl. Data Eng. 2026, 38, 47–66. [Google Scholar] [CrossRef]
  27. Shi, R.; Wang, Y.; Du, M.; Shen, X.; Chang, Y.; Wang, X. A Comprehensive Survey of Synthetic Tabular Data Generation. arXiv 2025, arXiv:2504.16506. [Google Scholar] [CrossRef]
  28. Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784. [Google Scholar] [CrossRef]
  29. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794. [Google Scholar]
  30. Le, T.-T.-H.; Oktian, Y.E.; Kim, H. XGBoost for Imbalanced Multiclass Classification-Based Industrial Internet of Things Intrusion Detection Systems. Sustainability 2022, 14, 8707. [Google Scholar] [CrossRef]
  31. Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why do tree-based models still outperform deep learning on tabular data? arXiv 2022, arXiv:2207.08815. [Google Scholar] [CrossRef]
  32. Bergstra, J.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for Hyper-Parameter Optimization. In Proceedings of the 25th International Conference on Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2011; pp. 2546–2554. [Google Scholar]
  33. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  34. Chollet, F. Keras. GitHub Repository. 2015. Available online: https://github.com/keras-team/keras (accessed on 9 April 2026).
  35. Pineau, J.; Vincent-Lariviere, P.; Sinha, K.; Larivière, V.; Beygelzimer, A.; d’Alché-Buc, F.; Fox, E.; Larochelle, H. Improving Reproducibility in Machine Learning Research: A Report from the NeurIPS 2019 Reproducibility Program. J. Mach. Learn. Res. 2021, 22, 1–20. [Google Scholar]
  36. Salo, F.; Injadat, M.; Nassif, A.B.; Shami, A.; Essex, A. Data Mining Techniques in Intrusion Detection Systems: A Systematic Literature Review. IEEE Access 2018, 6, 56046–56058. [Google Scholar] [CrossRef]
  37. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
  38. Jiang, J.; Liu, F.; Liu, Y.; Tang, Q.; Wang, B.; Zhong, G.; Wang, W. A Dynamic Ensemble Algorithm for Anomaly Detection in IoT Imbalanced Data Streams. Comput. Commun. 2022, 194, 250–257. [Google Scholar] [CrossRef]
  39. Zhang, Z.; Wang, B.; Liu, M.; Qin, Y.; Wang, J.; Tian, Y.; Ma, J. Limitation of Reactance Perturbation Strategy Against False Data Injection Attacks on IoT-Based Smart Grid. IEEE Internet Things J. 2024, 11, 11619–11631. [Google Scholar] [CrossRef]
  40. Liu, M.; Zhang, X.; Zhang, R.; Zhou, Z.; Zhang, Z.; Deng, R. Detection-Triggered Recursive Impact Mitigation Against Secondary False Data Injection Attacks in Cyber-Physical Microgrid. IEEE Trans. Smart Grid 2025, 16, 1744–1761. [Google Scholar] [CrossRef]
Figure 1. The proposed IDS overview.
Figure 2. The confusion matrix of the original XGBoost model on CICEVSE2024.
Figure 3. The confusion matrix of the full proposed framework on CICEVSE2024.
Figure 4. The confusion matrix of the original XGBoost model on CICIDS2017.
Figure 5. The confusion matrix of the full proposed framework on CICIDS2017.
Table 1. Composition of the CICEVSE2024 dataset used.
| Class Label (Attack Type or Normal) | Sample Counts | Class Distribution (%) | Original Training Sample Counts | XGBoost Error Counts in Five-Fold CV | Training Sample Counts After EGGA | Test Sample Counts |
|---|---|---|---|---|---|---|
| TCP floods | 7696 | 21.413 | 6157 | 1 | 6177 | 1539 |
| Stealth SYN scanning | 5807 | 16.157 | 4645 | 1 | 4665 | 1162 |
| Port scanning | 4739 | 13.185 | 3791 | 1 | 3811 | 948 |
| Service detection | 3690 | 10.267 | 2952 | 6 | 3072 | 738 |
| Identity rotation and rotation flood | 3421 | 9.518 | 2737 | 0 | 2737 | 684 |
| Vulnerability scan | 3110 | 8.653 | 2488 | 1 | 2508 | 622 |
| OS fingerprinting | 2787 | 7.754 | 2229 | 0 | 2229 | 558 |
| Aggressive scan | 1898 | 5.281 | 1518 | 1 | 1538 | 380 |
| UDP flood | 1577 | 4.388 | 1262 | 0 | 1262 | 315 |
| Slow request starvation | 1044 | 2.905 | 835 | 0 | 835 | 209 |
| ICMP flood or fragmentation | 90 | 0.250 | 72 | 5 | 172 | 18 |
| Normal | 82 | 0.228 | 66 | 0 | 66 | 16 |
Table 2. Composition of the CICIDS2017 dataset used.
| Class Label (Attack Type or Normal) | Sample Counts | Class Distribution (%) | Original Training Sample Counts | XGBoost Error Counts in Five-Fold CV | Training Sample Counts After EGGA | Test Sample Counts |
|---|---|---|---|---|---|---|
| Normal | 22,731 | 40.118 | 18,129 | 45 | 19,029 | 4602 |
| DoS | 19,035 | 33.595 | 15,225 | 15 | 15,525 | 3810 |
| Port scan | 7946 | 14.024 | 6382 | 9 | 6562 | 1564 |
| Brute force | 2767 | 4.883 | 2243 | 9 | 2423 | 524 |
| Web-attack | 2180 | 3.847 | 1736 | 14 | 2016 | 444 |
| Bot | 1966 | 3.470 | 1584 | 15 | 1884 | 382 |
| Infiltration | 36 | 0.064 | 29 | 6 | 149 | 7 |
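The "Training Sample Counts After EGGA" columns in Tables 1 and 2 are consistent with a fixed generation budget of 20 synthetic samples per misclassified training sample (e.g., 2952 + 6 × 20 = 3072 for Service detection; 29 + 6 × 20 = 149 for Infiltration). The sketch below illustrates only this count logic; the budget of 20 is inferred from the tables, and the code is an illustration, not the authors' implementation (which draws the synthetic samples from a fitted cGAN).

```python
# Error-guided generative augmentation (EGGA) count logic.
# Assumption: a fixed number of synthetic samples (20, inferred from
# Tables 1 and 2) is generated per misclassified training sample.

SAMPLES_PER_ERROR = 20

def egga_counts(train_counts, error_counts, samples_per_error=SAMPLES_PER_ERROR):
    """Return per-class training-set sizes after error-guided augmentation."""
    return {
        label: n + samples_per_error * error_counts.get(label, 0)
        for label, n in train_counts.items()
    }

# Reproduce two rows of Table 1 (CICEVSE2024):
train = {"Service detection": 2952, "ICMP flood or fragmentation": 72}
errors = {"Service detection": 6, "ICMP flood or fragmentation": 5}
print(egga_counts(train, errors))
# → {'Service detection': 3072, 'ICMP flood or fragmentation': 172}
```

Classes with zero cross-validation errors (e.g., UDP flood) keep their original training size, which matches the tables.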
Table 3. Performance evaluation of the proposed models (including ablation studies) and state-of-the-art methods on the CICEVSE2024 dataset.
| Category | Method | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | Training Time (s) | Avg Test Time per Sample (ms) |
|---|---|---|---|---|---|---|---|
| Optimized ML Methods in the Literature | GA-RF [15] | 99.917 | 99.916 | 99.917 | 99.916 | 36.86 | 0.0276 |
| | PSO-LSTM [16] | 83.948 | 81.537 | 83.948 | 82.174 | 219.09 | 0.1461 |
| | PSO-CNN [17] | 95.966 | 96.027 | 95.966 | 95.895 | 245.33 | 0.0723 |
| | GA-CNN [18] | 96.036 | 96.174 | 96.036 | 95.978 | 242.15 | 0.1130 |
| | OE-IDS [19] | 99.930 | 99.930 | 99.930 | 99.930 | 165.24 | 0.0386 |
| | AutoML-ID [20] | 99.930 | 99.930 | 99.930 | 99.930 | 116.23 | 0.0044 |
| Generative AI Methods in the Literature | XGBoost + EGGA-VAE [21] | 99.930 | 99.930 | 99.930 | 99.930 | 32.62 | 0.0024 |
| | XGBoost + EGGA-GAN [22] | 99.930 | 99.930 | 99.930 | 99.930 | 68.47 | 0.0033 |
| | XGBoost + EGGA-CTGAN [23] | 99.930 | 99.930 | 99.930 | 99.930 | 74.61 | 0.0020 |
| | XGBoost + EGGA-WGAN-GP [24] | 99.917 | 99.916 | 99.917 | 99.916 | 38.54 | 0.0030 |
| | XGBoost + SMOTE [37] | 99.930 | 99.930 | 99.930 | 99.930 | 7.39 | 0.0034 |
| Proposed Framework (with Ablation Studies) | XGBoost | 99.917 | 99.916 | 99.917 | 99.916 | **6.86** | 0.0022 |
| | XGBoost + EGGA-cGAN | 99.944 | 99.945 | 99.944 | 99.944 | 23.18 | 0.0023 |
| | Full Proposed Framework (XGBoost + EGGA-cGAN + BO-TPE) | **99.958** | **99.958** | **99.958** | **99.957** | 56.32 | **0.0018** |
Note: Bold values indicate the best results in each metric column.
Table 4. Performance evaluation of the proposed models (including ablation studies) and state-of-the-art methods on the CICIDS2017 dataset.
| Category | Method | Accuracy (%) | Precision (%) | Recall (%) | F1 (%) | Training Time (s) | Avg Test Time per Sample (ms) |
|---|---|---|---|---|---|---|---|
| Optimized ML Methods in the Literature | GA-RF [15] | 99.532 | 99.534 | 99.532 | 99.529 | 58.28 | 0.0294 |
| | PSO-LSTM [16] | 93.109 | 94.679 | 93.109 | 93.354 | 169.31 | 0.0289 |
| | PSO-CNN [17] | 94.635 | 95.700 | 94.635 | 94.844 | 218.19 | 0.0507 |
| | GA-CNN [18] | 94.617 | 95.614 | 94.617 | 94.783 | 241.14 | 0.0528 |
| | OE-IDS [19] | 99.541 | 99.543 | 99.541 | 99.583 | 243.12 | 0.0318 |
| | AutoML-ID [20] | 99.665 | 99.665 | 99.665 | 99.661 | 173.60 | 0.0045 |
| Generative AI Methods in the Literature | XGBoost + EGGA-VAE [21] | 99.806 | 99.806 | 99.806 | 99.802 | 58.71 | 0.0024 |
| | XGBoost + EGGA-GAN [22] | 99.797 | 99.797 | 99.797 | 99.794 | 80.82 | 0.0033 |
| | XGBoost + EGGA-CTGAN [23] | 99.779 | 99.780 | 99.779 | 99.776 | 86.38 | 0.0020 |
| | XGBoost + EGGA-WGAN-GP [24] | 99.797 | 99.797 | 99.797 | 99.794 | 47.65 | 0.0030 |
| | XGBoost + SMOTE [37] | 99.797 | 99.797 | 99.797 | 99.794 | 47.65 | 0.0030 |
| Proposed Framework (with Ablation Studies) | XGBoost | 99.779 | 99.780 | 99.779 | 99.776 | **10.32** | **0.0018** |
| | XGBoost + EGGA-cGAN | 99.815 | 99.815 | 99.815 | 99.811 | 42.02 | 0.0019 |
| | Full Proposed Framework (XGBoost + EGGA-cGAN + BO-TPE) | **99.832** | **99.833** | **99.832** | **99.829** | 78.39 | **0.0018** |
Note: Bold values indicate the best results in each metric column.
Table 5. XGBoost hyperparameter optimization results.
| Hyperparameter of XGBoost | Search Space | Optimal Value on the CICEVSE2024 Dataset | Optimal Value on the CICIDS2017 Dataset |
|---|---|---|---|
| n_estimators | [20, 200] | 80 | 80 |
| max_depth | [5, 100] | 30 | 20 |
| learning_rate | [0.001, 1] | 0.733 | 0.789 |
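The search space in Table 5 is explored in the paper with BO-TPE, a sequential model-based optimizer (commonly available through the Hyperopt library [32]). The dependency-free sketch below substitutes random search over the same ranges as a stand-in for TPE, with a placeholder objective in place of the five-fold CV score of XGBoost; the function and variable names are illustrative, not the authors' code.

```python
import random

# Search space from Table 5 (the paper optimizes it with BO-TPE;
# random search is used here only as a dependency-free stand-in).
SPACE = {
    "n_estimators": lambda rng: rng.randrange(20, 201, 10),
    "max_depth": lambda rng: rng.randrange(5, 101, 5),
    "learning_rate": lambda rng: rng.uniform(0.001, 1.0),
}

def search(objective, n_trials=30, seed=0):
    """Sample configurations and return the best (score, config) pair."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        cfg = {name: draw(rng) for name, draw in SPACE.items()}
        score = objective(cfg)  # in the paper: five-fold CV F1 of XGBoost
        if best is None or score > best[0]:
            best = (score, cfg)
    return best

def toy_objective(cfg):
    # Placeholder: prefers moderate depth and a learning rate near 0.7.
    return -abs(cfg["max_depth"] - 30) - abs(cfg["learning_rate"] - 0.7)

best_score, best_cfg = search(toy_objective)
print(best_cfg)
```

Unlike random search, TPE fits densities over past good and bad trials and samples candidates that maximize their ratio, which is why it converges faster on expensive objectives such as full cross-validation runs.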
Table 6. Class and error analysis on the CICEVSE2024 dataset.
| Class Label (Attack Type or Normal) | Test Set Sample Counts | Original XGBoost Error Counts on the Test Set | Original XGBoost Per-Class F1 (%) | Proposed Method Error Counts on the Test Set | Proposed Method Per-Class F1 (%) |
|---|---|---|---|---|---|
| TCP floods | 1539 | 0 | 99.976 | 0 | **100.00** |
| Stealth SYN scanning | 1162 | 0 | 100.00 | 0 | 100.00 |
| Port scanning | 948 | 0 | 100.00 | 0 | 100.00 |
| Service detection | 738 | 2 | 99.661 | **0** | **99.797** |
| Identity rotation and rotation flood | 684 | 0 | 100.00 | 0 | 100.00 |
| Vulnerability scan | 622 | 1 | 99.920 | **0** | **100.00** |
| OS fingerprinting | 558 | 0 | 100.00 | 0 | 100.00 |
| Aggressive scan | 380 | 0 | 100.00 | 0 | 100.00 |
| UDP flood | 315 | 1 | 99.841 | 1 | 99.841 |
| Slow request starvation | 209 | 0 | 99.761 | 0 | **100.00** |
| ICMP flood or fragmentation | 18 | 2 | 91.429 | 2 | **94.117** |
| Normal | 16 | 0 | 100.00 | 0 | 100.00 |
| Overall | 7189 | 6 | 99.916 | **3** | **99.957** |
Note: Bold values indicate the results that changed after applying the proposed EGGA method.
Table 7. Class and error analysis on the CICIDS2017 dataset.
| Class Label (Attack Type or Normal) | Test Set Sample Counts | Original XGBoost Error Counts on the Test Set | Original XGBoost Per-Class F1 (%) | Proposed Method Error Counts on the Test Set | Proposed Method Per-Class F1 (%) |
|---|---|---|---|---|---|
| Normal | 4602 | 9 | 99.729 | **8** | **99.794** |
| DoS | 3810 | 7 | 99.908 | **5** | **99.934** |
| Port scan | 1564 | 0 | 99.936 | 0 | 99.936 |
| Brute force | 524 | 2 | 99.713 | **1** | **99.714** |
| Web-attack | 444 | 2 | 99.774 | **1** | **99.887** |
| Bot | 382 | 2 | 98.958 | **1** | **99.348** |
| Infiltration | 7 | 3 | 72.727 | 3 | 72.727 |
| Overall | 11,333 | 25 | 99.776 | **19** | **99.829** |
Note: Bold values indicate the results that changed after applying the proposed EGGA method.
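The per-class F1 scores in Tables 6 and 7 follow the standard definition F1 = 2·TP / (2·TP + FP + FN), so a class can score below 100% even with zero missed samples of its own when samples from other classes are misclassified into it. A minimal sketch; the FP count below is illustrative (chosen to be consistent with the Service detection row of Table 6, not taken from the paper's confusion matrices):

```python
def per_class_f1(tp, fp, fn):
    """Per-class F1 = 2*TP / (2*TP + FP + FN), returned in percent."""
    denom = 2 * tp + fp + fn
    return 100.0 * 2 * tp / denom if denom else 0.0

# Illustrative: a class with no missed samples (FN = 0) still drops
# below 100% when three samples of other classes are predicted as it.
print(round(per_class_f1(tp=738, fp=3, fn=0), 3))  # → 99.797
```

This is why, in Table 6, Service detection shows zero proposed-method errors on the test set yet an F1 of 99.797 rather than 100.00.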
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yang, L.; Kirubavathi, G. EGGA: An Error-Guided Generative Augmentation and Optimized ML-Based IDS for EV Charging Network Security. Future Internet 2026, 18, 202. https://doi.org/10.3390/fi18040202
