Search Results (52)

Search Parameters:
Keywords = CSE-CIC-IDS-2018 dataset

22 pages, 580 KiB  
Article
The Choice of Training Data and the Generalizability of Machine Learning Models for Network Intrusion Detection Systems
by Marcin Iwanowski, Dominik Olszewski, Waldemar Graniszewski, Jacek Krupski and Franciszek Pelc
Appl. Sci. 2025, 15(15), 8466; https://doi.org/10.3390/app15158466 - 30 Jul 2025
Viewed by 249
Abstract
Network Intrusion Detection Systems (NIDS) driven by Machine Learning (ML) algorithms are usually trained on publicly available datasets of labeled traffic samples, where labels refer to traffic classes, usually one benign and multiple harmful. This paper studies the generalizability of models trained on such datasets. This issue is crucial when such a model is applied to actual internet traffic, because high performance measures obtained on datasets do not necessarily imply similar efficiency on real traffic. We propose a procedure consisting of cross-validation across various sets sharing some standard traffic classes, combined with t-SNE visualization. We apply it to investigate four well-known and widely used datasets: UNSW-NB15, CSE-CIC-IDS2018, BoT-IoT, and ToN-IoT. Our investigation reveals that the high accuracy of a model obtained on the set used for training is reproducible on others only to a limited extent. Moreover, the generalizability of benign traffic classes differs from that of harmful traffic. For deployment in an actual network environment, this implies that the data used to train the ML model must be selected carefully, by determining to what extent the classes present in the training dataset resemble those in the real target traffic environment. On the other hand, merging datasets may result in a more exhaustive data collection with a more diverse spectrum of training samples. Full article
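The cross-dataset validation procedure described above can be illustrated with a toy sketch: train on one dataset, then score the same model on a second dataset whose feature distributions are shifted. The data is synthetic and a nearest-centroid classifier stands in for the paper's ML models; none of this reproduces the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_centroids(X, y):
    # Fit a nearest-centroid classifier: one mean vector per traffic class.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def accuracy(centroids, X, y):
    labels = list(centroids)
    C = np.stack([centroids[c] for c in labels])
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    pred = np.array([labels[i] for i in d2.argmin(axis=1)])
    return float((pred == y).mean())

# Two synthetic "datasets" that share a benign (0) and an attack (1) class
# but differ by a distribution shift, mimicking captures from different testbeds.
def make_set(shift):
    X = np.vstack([rng.normal(0 + shift, 1, (200, 5)),
                   rng.normal(3 + shift, 1, (200, 5))])
    return X, np.array([0] * 200 + [1] * 200)

XA, yA = make_set(0.0)      # stands in for the training dataset
XB, yB = make_set(2.0)      # stands in for a second, shifted dataset

model = train_centroids(XA, yA)
print(f"in-dataset accuracy:    {accuracy(model, XA, yA):.2f}")
print(f"cross-dataset accuracy: {accuracy(model, XB, yB):.2f}")
```

The in-dataset score stays near 1.0 while the cross-dataset score collapses, which is exactly the generalizability gap the study measures on real NIDS datasets.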

17 pages, 3650 KiB  
Article
Towards Intelligent Threat Detection in 6G Networks Using Deep Autoencoder
by Doaa N. Mhawi, Haider W. Oleiwi and Hamed Al-Raweshidy
Electronics 2025, 14(15), 2983; https://doi.org/10.3390/electronics14152983 - 26 Jul 2025
Viewed by 154
Abstract
The evolution of sixth-generation (6G) wireless networks introduces a complex landscape of cybersecurity challenges due to advanced infrastructure, massive device connectivity, and the integration of emerging technologies. Traditional intrusion detection systems (IDSs) struggle to keep pace with such dynamic environments, often yielding high false alarm rates and poor generalization. This study proposes a novel and adaptive IDS that integrates statistical feature engineering with a deep autoencoder (DAE) to effectively detect a wide range of modern threats in 6G environments. Unlike prior approaches, the proposed system leverages the DAE’s unsupervised capability to extract meaningful latent representations from high-dimensional traffic data, followed by supervised classification for precise threat detection. Evaluated using the CSE-CIC-IDS2018 dataset, the system achieved an accuracy of 86%, surpassing conventional ML and DL baselines. The results demonstrate the model’s potential as a scalable and upgradable solution for securing next-generation wireless networks. Full article
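The two-step idea above (unsupervised latent extraction, then supervised classification) can be sketched minimally with a tied-weight linear autoencoder and a nearest-centroid classifier on synthetic data. The paper's actual DAE and classifier are deeper and are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic "flow features": 600 samples in 20-D lying near a 2-D manifold,
# with benign (class 0) and attack (class 1) modes. All shapes are illustrative.
d, k = 20, 2
basis = rng.normal(size=(k, d))
z = np.vstack([rng.normal(0, 1, (300, k)), rng.normal(4, 1, (300, k))])
X = z @ basis + rng.normal(0, 0.1, (600, d))
X = (X - X.mean(0)) / X.std(0)          # standardize features
y = np.array([0] * 300 + [1] * 300)

def mse(W):
    return float(((X @ W @ W.T - X) ** 2).mean())

# Tied-weight linear autoencoder: encoder W, decoder W.T, plain gradient descent.
W = rng.normal(0, 0.1, (d, k))
mse0 = mse(W)
for _ in range(300):
    E = X @ W @ W.T - X                              # reconstruction error
    grad = 2 * (X.T @ E @ W + E.T @ X @ W) / len(X)  # d(MSE)/dW
    W -= 0.002 * grad

Z = X @ W                                # latent representation
# Supervised stage on the latent codes: nearest class centroid.
c0, c1 = Z[y == 0].mean(0), Z[y == 1].mean(0)
pred = (np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)).astype(int)
print(f"reconstruction MSE: {mse0:.3f} -> {mse(W):.3f}")
print(f"latent-space accuracy: {(pred == y).mean():.2f}")
```

Even this linear sketch shows the mechanism: the autoencoder compresses high-dimensional traffic into a latent space where a simple supervised model separates the classes.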
(This article belongs to the Special Issue Emerging Technologies for Network Security and Anomaly Detection)

42 pages, 2224 KiB  
Article
Combined Dataset System Based on a Hybrid PCA–Transformer Model for Effective Intrusion Detection Systems
by Hesham Kamal and Maggie Mashaly
AI 2025, 6(8), 168; https://doi.org/10.3390/ai6080168 - 24 Jul 2025
Viewed by 504
Abstract
With the growing number and diversity of network attacks, traditional security measures such as firewalls and data encryption are no longer sufficient to ensure robust network protection. As a result, intrusion detection systems (IDSs) have become a vital component in defending against evolving cyber threats. Although many modern IDS solutions employ machine learning techniques, they often suffer from low detection rates and depend heavily on manual feature engineering. Furthermore, most IDS models are designed to identify only a limited set of attack types, which restricts their effectiveness in practical scenarios where a network may be exposed to a wide array of threats. To overcome these limitations, we propose a novel approach to IDSs by implementing a combined dataset framework based on an enhanced hybrid principal component analysis–Transformer (PCA–Transformer) model, capable of detecting 21 unique classes, comprising 1 benign class and 20 distinct attack types across multiple datasets. The proposed architecture incorporates enhanced preprocessing and feature engineering, followed by the vertical concatenation of the CSE-CIC-IDS2018 and CICIDS2017 datasets. In this design, the PCA component is responsible for feature extraction and dimensionality reduction, while the Transformer component handles the classification task. Class imbalance was addressed using class weights, adaptive synthetic sampling (ADASYN), and edited nearest neighbors (ENN). 
Experimental results show that the model achieves 99.80% accuracy for binary classification and 99.28% for multi-class classification on the combined dataset (CSE-CIC-IDS2018 and CICIDS2017), 99.66% accuracy for binary classification and 99.59% for multi-class classification on the CSE-CIC-IDS2018 dataset, 99.75% accuracy for binary classification and 99.51% for multi-class classification on the CICIDS2017 dataset, and 99.98% accuracy for binary classification and 98.01% for multi-class classification on the NF-BoT-IoT-v2 dataset, significantly outperforming existing approaches by distinguishing a wide range of classes, including benign and various attack types, within a unified detection framework. Full article
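The class-imbalance step above (ADASYN plus ENN) can be sketched in a few lines of NumPy on toy data. The real imbalanced-learn implementations handle many more edge cases; the parameters here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def knn_indices(X, Q, k):
    # Indices of the k nearest rows of X for each row of Q.
    d2 = ((Q[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def adasyn_like(X_min, X_all, y_all, n_new, k=5):
    # ADASYN-flavoured oversampling: minority points with more majority-class
    # neighbours receive proportionally more synthetic samples.
    nn_all = knn_indices(X_all, X_min, k + 1)[:, 1:]        # skip self
    hardness = (y_all[nn_all] != 1).mean(axis=1)            # majority ratio
    weights = hardness / hardness.sum() if hardness.sum() else np.full(len(X_min), 1 / len(X_min))
    counts = rng.multinomial(n_new, weights)
    nn_min = knn_indices(X_min, X_min, k + 1)[:, 1:]        # minority k-NN
    out = []
    for i, c in enumerate(counts):
        for _ in range(c):
            j = rng.choice(nn_min[i])
            out.append(X_min[i] + rng.random() * (X_min[j] - X_min[i]))
    return np.array(out)

def enn_clean(X, y, k=3):
    # Edited nearest neighbours: drop samples misclassified by their k-NN vote.
    nn = knn_indices(X, X, k + 1)[:, 1:]
    keep = (y[nn] == y[:, None]).mean(axis=1) >= 0.5
    return X[keep], y[keep]

# Imbalanced toy data: 200 benign vs 20 attack samples.
Xmaj = rng.normal(0, 1, (200, 4)); Xmin = rng.normal(2.5, 1, (20, 4))
X = np.vstack([Xmaj, Xmin]); y = np.array([0] * 200 + [1] * 20)

syn = adasyn_like(Xmin, X, y, n_new=180)
Xb = np.vstack([X, syn]); yb = np.concatenate([y, np.ones(len(syn), int)])
Xc, yc = enn_clean(Xb, yb)
print("class counts after balancing:", np.bincount(yb))
```

Oversampling equalizes the class counts, and the ENN pass then removes samples that sit on the wrong side of the local neighbourhood vote.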

34 pages, 2669 KiB  
Article
A Novel Quantum Epigenetic Algorithm for Adaptive Cybersecurity Threat Detection
by Salam Al-E’mari, Yousef Sanjalawe and Salam Fraihat
AI 2025, 6(8), 165; https://doi.org/10.3390/ai6080165 - 22 Jul 2025
Viewed by 346
Abstract
The escalating sophistication of cyber threats underscores the critical need for intelligent and adaptive intrusion detection systems (IDSs) to identify known and novel attack vectors in real time. Feature selection is a key enabler of performance in machine learning-based IDSs, as it reduces the input dimensionality, enhances the detection accuracy, and lowers the computational latency. This paper introduces a novel optimization framework called Quantum Epigenetic Algorithm (QEA), which synergistically combines quantum-inspired probabilistic representation with biologically motivated epigenetic gene regulation to perform efficient and adaptive feature selection. The algorithm balances global exploration and local exploitation by leveraging quantum superposition for diverse candidate generation while dynamically adjusting gene expression through an epigenetic activation mechanism. A multi-objective fitness function guides the search process by optimizing the detection accuracy, false positive rate, inference latency, and model compactness. The QEA was evaluated across four benchmark datasets—UNSW-NB15, CIC-IDS2017, CSE-CIC-IDS2018, and TON_IoT—and consistently outperformed baseline methods, including Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Quantum Genetic Algorithm (QGA). Notably, QEA achieved the highest classification accuracy (up to 97.12%), the lowest false positive rates (as low as 1.68%), and selected significantly fewer features (e.g., 18 on TON_IoT) while maintaining near real-time latency. These results demonstrate the robustness, efficiency, and scalability of QEA for real-time intrusion detection in dynamic and resource-constrained cybersecurity environments. Full article
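The quantum-inspired core of such a feature selector, probability amplitudes that are "measured" into concrete feature masks and then rotated toward good candidates, can be sketched as follows. The fitness proxy (class-mean separation minus a size penalty) and all constants are illustrative stand-ins for the paper's multi-objective fitness; the epigenetic gating is only hinted at in a comment.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy data: only the first 3 of 10 features carry class signal.
n, d, informative = 400, 10, 3
X = rng.normal(0, 1, (n, d))
y = rng.integers(0, 2, n)
X[:, :informative] += y[:, None] * 2.0

def fitness(mask):
    # Proxy objective: class-mean separation on the selected features,
    # penalised by subset size (stands in for accuracy/latency/compactness).
    if mask.sum() == 0:
        return -1.0
    mu0, mu1 = X[y == 0][:, mask].mean(0), X[y == 1][:, mask].mean(0)
    return float(np.linalg.norm(mu0 - mu1)) - 0.1 * mask.sum()

# Quantum-inspired bits: each feature has a selection probability; sampling
# "collapses" them into a concrete feature mask.
p = np.full(d, 0.5)
best_mask, best_fit = None, -np.inf
for _ in range(60):
    pop = rng.random((20, d)) < p                  # measure 20 candidates
    fits = np.array([fitness(m) for m in pop])
    elite = pop[fits.argmax()]
    if fits.max() > best_fit:
        best_fit, best_mask = fits.max(), elite.copy()
    # Rotate probabilities toward the elite candidate (an epigenetic-style
    # gate would additionally suppress some updates; omitted for brevity).
    p = np.clip(p + 0.1 * (elite - p), 0.05, 0.95)

print("selected features:", np.flatnonzero(best_mask))
```

On this toy problem the search concentrates on the informative features because extra features pay the size penalty without improving separation.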

29 pages, 669 KiB  
Article
LLM-Based Cyberattack Detection Using Network Flow Statistics
by Leopoldo Gutiérrez-Galeano, Juan-José Domínguez-Jiménez, Jörg Schäfer and Inmaculada Medina-Bulo
Appl. Sci. 2025, 15(12), 6529; https://doi.org/10.3390/app15126529 - 10 Jun 2025
Viewed by 895
Abstract
Cybersecurity is a growing area of research due to the constantly emerging new types of cyberthreats. Tools and techniques exist to keep systems secure against certain known types of cyberattacks, but are insufficient for others that have recently appeared. Therefore, research is needed to design new strategies to deal with new types of cyberattacks as they arise. Existing tools that harness artificial intelligence techniques mainly use artificial neural networks designed from scratch. In this paper, we present a novel approach for cyberattack detection using an encoder–decoder pre-trained Large Language Model (T5), fine-tuned to adapt its classification scheme for the detection of cyberattacks. Our system is anomaly-based and takes statistics of already finished network flows as input. This work makes significant contributions by introducing a novel methodology for adapting its original task from natural language processing to cybersecurity, achieved by transforming numerical network flow features into a unique abstract artificial language for the model input. We validated the robustness of our detection system across three datasets using undersampling. Our model achieved consistently high performance across all evaluated datasets. Specifically, for the CIC-IDS-2017 dataset, we obtained an accuracy, precision, recall, and F-score of more than 99.94%. For CSE-CIC-IDS-2018, these metrics exceeded 99.84%, and for BCCC-CIC-IDS-2017, they were all above 99.90%. These results collectively demonstrate superior performance for cyberattack detection, while maintaining highly competitive false-positive rates and false-negative rates. This efficacy is achieved by relying exclusively on real-world network flow statistics, without the need for synthetic data generation. Full article
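The key trick above, rendering numeric flow statistics as a discrete, language-like token sequence an encoder-decoder model can consume, might look like the sketch below. The binning scheme, token names, feature names, and ranges are all hypothetical; the paper's actual encoding is not reproduced here.

```python
# Illustrative only: map each numeric flow feature into a discrete token so a
# text-to-text model (e.g. T5) can consume flow statistics as a "sentence".

def flow_to_sentence(flow, bounds, n_bins=8):
    tokens = []
    for name, value in flow.items():
        lo, hi = bounds[name]
        # Clamp, then quantize the value into one of n_bins buckets.
        frac = (min(max(value, lo), hi) - lo) / (hi - lo) if hi > lo else 0.0
        bucket = min(int(frac * n_bins), n_bins - 1)
        tokens.append(f"{name}_{bucket}")
    return " ".join(tokens)

# Hypothetical flow statistics and per-feature ranges (not from the paper).
bounds = {"duration": (0.0, 120.0), "pkts": (0, 10_000), "bytes": (0, 1_000_000)}
flow = {"duration": 45.0, "pkts": 4200, "bytes": 880_000}
print(flow_to_sentence(flow, bounds))   # -> "duration_3 pkts_3 bytes_7"
```

The resulting pseudo-sentence can then be fed to a fine-tuned language model whose output vocabulary is adapted to attack/benign labels.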
(This article belongs to the Special Issue Advances in Cyber Security)

23 pages, 4049 KiB  
Article
ROSE-BOX: A Lightweight and Efficient Intrusion Detection Framework for Resource-Constrained IIoT Environments
by Silin Peng, Yu Han, Ruonan Li, Lichen Liu, Jie Liu and Zhaoquan Gu
Appl. Sci. 2025, 15(12), 6448; https://doi.org/10.3390/app15126448 - 8 Jun 2025
Viewed by 498
Abstract
The rapid advancement of the Industrial Internet of Things (IIoT) has transformed industrial automation, enabling real-time monitoring and intelligent decision making. However, increased connectivity exposes IIoT systems to sophisticated cyber threats, which may pose significant security risks, especially in resource-constrained IIoT environments where computational efficiency is critical. Existing intrusion detection solutions often suffer from high computational overhead and inadequate adaptability, rendering them impractical for real-time deployment in IIoT environments. To address these challenges, this study introduces a lightweight and efficient intrusion detection framework tailored for resource-constrained IIoT environments. Firstly, an XGBoost-assisted Random Forest (XGB-RF) method is proposed to select the most important features to obtain an optimal feature subset. Moreover, SMOTE (Synthetic Minority Oversampling Technique) is utilized to balance the optimal feature subset to improve detection precision. Furthermore, to reduce computing resource requirements and latency while improving detection performance, Bayesian optimization is applied to fine-tune the parameters of XGBoost (BO-XGBoost) to obtain the best detection results. Finally, extensive experiments on benchmark datasets, including CIC-IDS2017, CSE-CIC-IDS2018, and CIC-DDoS2019, demonstrate that the proposed method, which we call ROSE-BOX (Random Forest, Synthetic Minority Oversampling Technique, and BO-Xgboost), achieves a detection accuracy exceeding 99.85% while maintaining low latency and CPU occupancy rates. Our findings highlight the robustness, lightweight nature, and efficiency of ROSE-BOX, making it well-suited for real-time intrusion detection in resource-constrained IIoT environments. Full article

18 pages, 4079 KiB  
Article
A Scalable Hybrid Autoencoder–Extreme Learning Machine Framework for Adaptive Intrusion Detection in High-Dimensional Networks
by Anubhav Kumar, Rajamani Radhakrishnan, Mani Sumithra, Prabu Kaliyaperumal, Balamurugan Balusamy and Francesco Benedetto
Future Internet 2025, 17(5), 221; https://doi.org/10.3390/fi17050221 - 15 May 2025
Viewed by 709
Abstract
The rapid expansion of network environments has introduced significant cybersecurity challenges, particularly in handling high-dimensional traffic and detecting sophisticated threats. This study presents a novel, scalable Hybrid Autoencoder–Extreme Learning Machine (AE–ELM) framework for Intrusion Detection Systems (IDS), specifically designed to operate effectively in dynamic, cloud-supported IoT environments. The scientific novelty lies in the integration of an Autoencoder for deep feature compression with an Extreme Learning Machine for rapid and accurate classification, enhanced through adaptive thresholding techniques. Evaluated on the CSE-CIC-IDS2018 dataset, the proposed method demonstrates a high detection accuracy of 98.52%, outperforming conventional models in terms of precision, recall, and scalability. Additionally, the framework exhibits strong adaptability to emerging threats and reduced computational overhead, making it a practical solution for real-time, scalable IDS in next-generation network infrastructures. Full article
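The ELM half of the hybrid is easy to sketch: a random, untrained hidden layer plus output weights solved in closed form via the pseudoinverse, so there is no iterative training at all. The autoencoder compression stage is omitted here and the data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy two-class data standing in for compressed (autoencoded) flow features.
X = np.vstack([rng.normal(-1, 1, (200, 8)), rng.normal(1, 1, (200, 8))])
y = np.array([0] * 200 + [1] * 200)
T = np.eye(2)[y]                      # one-hot targets

# Extreme Learning Machine: random hidden layer, output weights in closed form.
L = 64                                # hidden units
Wh = rng.normal(size=(8, L)); bh = rng.normal(size=L)
H = np.tanh(X @ Wh + bh)              # fixed random feature map
beta = np.linalg.pinv(H) @ T          # least-squares output weights

pred = (H @ beta).argmax(axis=1)
print(f"training accuracy: {(pred == y).mean():.2f}")
```

This closed-form fit is what gives the AE-ELM combination its speed: the only learned dense solve is a single least-squares problem.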

22 pages, 3438 KiB  
Article
A High-Accuracy Advanced Persistent Threat Detection Model: Integrating Convolutional Neural Networks with Kepler-Optimized Bidirectional Gated Recurrent Units
by Guangwu Hu, Maoqi Sun and Chaoqin Zhang
Electronics 2025, 14(9), 1772; https://doi.org/10.3390/electronics14091772 - 27 Apr 2025
Viewed by 890
Abstract
Advanced Persistent Threat (APT) refers to a highly targeted, sophisticated, and prolonged form of cyberattack, typically directed at specific organizations or individuals. The primary objective of such attacks is the theft of sensitive information or the disruption of critical operations. APT attacks are characterized by their stealth and complexity, often resulting in significant economic losses. Furthermore, these attacks may lead to intelligence breaches, operational interruptions, and even jeopardize national security and political stability. Given the covert nature and extended durations of APT attacks, current detection solutions encounter challenges such as high detection difficulty and insufficient accuracy. To address these limitations, this paper proposes an innovative high-accuracy APT attack detection model, CNN-KOA-BiGRU, which integrates Convolutional Neural Networks (CNN), Bidirectional Gated Recurrent Units (BiGRU), and the Kepler optimization algorithm (KOA). The model first utilizes CNN to extract spatial features from network traffic data, followed by the application of BiGRU to capture temporal dependencies and long-term memory, thereby forming comprehensive temporal features. Simultaneously, the Kepler optimization algorithm is employed to optimize the BiGRU network structure, achieving globally optimal feature weights and enhancing detection accuracy. Additionally, this study employs a combination of sampling techniques, including Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links, to mitigate classification bias caused by dataset imbalance. Evaluation results on the CSE-CIC-IDS2018 experimental dataset demonstrate that the CNN-KOA-BiGRU model achieves superior performance in detecting APT attacks, with an average accuracy of 98.68%. This surpasses existing methods, including CNN (93.01%), CNN-BiGRU (97.77%), and Graph Convolutional Network (GCN) (95.96%) on the same dataset. 
Specifically, the proposed model demonstrates an accuracy improvement of 5.67% over CNN, 0.91% over CNN-BiGRU, and 2.72% over GCN. Overall, the proposed model achieves an average improvement of 3.1% compared to existing methods. Full article
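The Tomek-links half of the resampling combination above can be sketched directly (SMOTE omitted; toy data): a Tomek link is a cross-class pair of mutual nearest neighbours, and dropping its majority-class member cleans the class boundary.

```python
import numpy as np

rng = np.random.default_rng(9)
# Toy imbalanced data with an overlapping class boundary.
X = np.vstack([rng.normal(0, 1, (120, 2)), rng.normal(1.5, 1, (30, 2))])
y = np.array([0] * 120 + [1] * 30)

# A Tomek link is a pair of mutual nearest neighbours with different labels.
d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
np.fill_diagonal(d, np.inf)
nn = d.argmin(axis=1)                         # each point's nearest neighbour
drop = set()
for i, j in enumerate(nn):
    if nn[j] == i and y[i] != y[j]:           # mutual NN, different classes
        drop.add(i if y[i] == 0 else j)       # remove the majority-class member
keep = np.array([i not in drop for i in range(len(X))])
print(f"removed {len(drop)} majority samples on Tomek links")
```

Removing only the majority member of each link thins the ambiguous boundary region without discarding scarce attack samples.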
(This article belongs to the Special Issue Advanced Technologies in Edge Computing and Applications)

19 pages, 1222 KiB  
Article
A Comparative Study of Two-Stage Intrusion Detection Using Modern Machine Learning Approaches on the CSE-CIC-IDS2018 Dataset
by Isuru Udayangani Hewapathirana
Knowledge 2025, 5(1), 6; https://doi.org/10.3390/knowledge5010006 - 12 Mar 2025
Viewed by 1819
Abstract
Intrusion detection is a critical component of cybersecurity, enabling timely identification and mitigation of network threats. This study proposes a novel two-stage intrusion detection framework using the CSE-CIC-IDS2018 dataset, a comprehensive and realistic benchmark for network traffic analysis. The research explores two distinct approaches: the stacked autoencoder (SAE) approach and the Apache Spark-based (ASpark) approach. Each of these approaches employs a unique feature representation technique. The SAE approach leverages an autoencoder to learn non-linear, data-driven feature representations. In contrast, the ASpark approach uses principal component analysis (PCA) to reduce dimensionality and retain 95% of the data variance. In both approaches, a binary classifier first identifies benign and attack traffic, generating probability scores that are subsequently used as features alongside the reduced feature set to train a multi-class classifier for predicting specific attack types. The results demonstrate that the SAE approach achieves superior accuracy and robustness, particularly for complex attack types such as DoS attacks, including SlowHTTPTest, FTP-BruteForce, and Infilteration. The SAE approach consistently outperforms ASpark in terms of precision, recall, and F1-scores, highlighting its ability to handle overlapping feature spaces effectively. However, the ASpark approach excels in computational efficiency, completing classification tasks significantly faster than SAE, making it suitable for real-time or large-scale applications. Both methods show strong performance for distinct and well-separated attack types, such as DDOS attack-HOIC and SSH-Bruteforce. This research contributes to the field by introducing a balanced and effective two-stage framework, leveraging modern machine learning models and addressing class imbalance through a hybrid resampling strategy. 
The findings emphasize the complementary nature of the two approaches, suggesting that a combined model could achieve a balance between accuracy and computational efficiency. This work provides valuable insights for designing scalable, high-performance intrusion detection systems in modern network environments. Full article
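The two-stage design, a binary stage whose probability score then joins the feature set of the multi-class stage, can be sketched with simple stand-in models (logistic regression and nearest centroids on toy data, not the paper's SAE or Spark pipelines).

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy traffic: benign (0) plus two attack families (1, 2) with different signatures.
X = np.vstack([
    rng.normal(0, 1, (300, 6)),
    rng.normal(0, 1, (100, 6)) + np.array([3, 3, 3, 0, 0, 0]),
    rng.normal(0, 1, (100, 6)) + np.array([0, 0, 0, 3, 3, 3]),
])
y = np.array([0] * 300 + [1] * 100 + [2] * 100)
y_bin = (y > 0).astype(int)

# Stage 1: logistic regression (plain gradient descent) gives P(attack) per flow.
Xb = np.hstack([X, np.ones((len(X), 1))])     # add bias column
w = np.zeros(Xb.shape[1])
for _ in range(300):
    p = 1 / (1 + np.exp(-Xb @ w))
    w -= 0.1 * Xb.T @ (p - y_bin) / len(X)
score = (1 / (1 + np.exp(-Xb @ w)))[:, None]

# Stage 2: the stage-1 probability becomes an extra feature, and a simple
# nearest-centroid model predicts the specific class.
X2 = np.hstack([X, score])
cents = {c: X2[y == c].mean(0) for c in (0, 1, 2)}
pred = np.array([min(cents, key=lambda c: np.linalg.norm(r - cents[c])) for r in X2])
print(f"multi-class accuracy: {(pred == y).mean():.2f}")
```

The stage-1 score concentrates benign/attack evidence into one dimension, which is the mechanism both the SAE and ASpark variants share.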

24 pages, 1605 KiB  
Article
CGFL: A Robust Federated Learning Approach for Intrusion Detection Systems Based on Data Generation
by Shu Feng, Luhan Gao and Leyi Shi
Appl. Sci. 2025, 15(5), 2416; https://doi.org/10.3390/app15052416 - 24 Feb 2025
Cited by 1 | Viewed by 927
Abstract
The implementation of comprehensive security measures is a critical factor in the rapid growth of industrial control networks. Federated Learning has emerged as a viable solution for safeguarding privacy in machine learning. The effectiveness of pattern detection in models is diminished as a result of the difficulty in extracting attack information from extremely large datasets and obtaining an adequate number of examples for specific types of attacks. A robust Federated Learning method, CGFL, is introduced in this study to resolve the challenges presented by data distribution discrepancies and client class imbalance. By employing a data generation strategy to generate balanced datasets for each client, CGFL enhances the global model. It employs a data generator that integrates artificially generated data with the existing data from local clients by employing label correction and data generation techniques. The geometric median aggregation technique was implemented to enhance the security of the aggregation process. The model was simulated and evaluated using the CIC-IDS2017 dataset, NSL-KDD dataset, and CSE-CIC-IDS2018 dataset. The experimental results indicate that CGFL does an effective job of enhancing the accuracy of ICS attack detection in Federated Learning under imbalanced sample conditions. Full article
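The geometric-median aggregation step can be sketched with Weiszfeld's algorithm. Unlike the coordinate-wise mean, it is robust to a poisoned client update; the 2-D "updates" below are purely illustrative.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    # Weiszfeld's algorithm: iteratively re-weight points by inverse distance.
    x = points.mean(axis=0)
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - x, axis=1), eps)
        w = 1.0 / d
        x_new = (w[:, None] * points).sum(0) / w.sum()
        if np.linalg.norm(x_new - x) < eps:
            break
        x = x_new
    return x

# Hypothetical client updates: four honest clients near (1, 1) and one
# poisoned outlier. The mean is dragged away; the geometric median is not.
updates = np.array([[1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [1.0, 1.0], [50.0, -40.0]])
print("mean        :", updates.mean(axis=0))
print("geo. median :", np.round(geometric_median(updates), 3))
```

This robustness is why geometric-median aggregation hardens the Federated Learning averaging step against malicious or corrupted clients.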
(This article belongs to the Special Issue Advanced Computer Security and Applied Cybersecurity)

25 pages, 6393 KiB  
Article
Re-Evaluating Deep Learning Attacks and Defenses in Cybersecurity Systems
by Meaad Ahmed, Qutaiba Alasad, Jiann-Shiun Yuan and Mohammed Alawad
Big Data Cogn. Comput. 2024, 8(12), 191; https://doi.org/10.3390/bdcc8120191 - 16 Dec 2024
Cited by 2 | Viewed by 1949
Abstract
Cybersecurity attacks pose a significant threat to the security of network systems through intrusions and illegal communications. Measuring the vulnerability of cybersecurity is crucial for refining the overall system security to further mitigate potential security risks. Machine learning (ML)-based intrusion detection systems (IDSs) are mainly designed to detect malicious network traffic. Unfortunately, ML models have recently been demonstrated to be vulnerable to adversarial perturbation, and therefore enable potential attackers to crash the system during normal operation. Among different attacks, generative adversarial networks (GANs) have been known as one of the most powerful threats to cybersecurity systems. To address these concerns, it is important to explore new defense methods and understand the nature of different types of attacks. In this paper, we investigate four serious attacks, GAN, Zeroth-Order Optimization (ZOO), kernel density estimation (KDE), and DeepFool attacks, on cybersecurity. Deep analysis was conducted on these attacks using three different cybersecurity datasets, ADFA-LD, CSE-CICIDS2018, and CSE-CICIDS2019. Our results have shown that KDE and DeepFool attacks are stronger than GANs in terms of attack success rate and impact on system performance. To demonstrate the effectiveness of our approach, we develop a defensive model using adversarial training where the DeepFool method is used to generate adversarial examples. The model is evaluated against GAN, ZOO, KDE, and DeepFool attacks to assess the level of system protection against adversarial perturbations. The experiment was conducted by leveraging a deep learning model as a classifier with the three aforementioned datasets. The results indicate that the proposed defensive model refines the resilience of the system and mitigates the presented serious attacks. Full article
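For a linear binary classifier, the DeepFool perturbation used to generate adversarial examples has a closed form: the shortest step onto the decision hyperplane, plus a small overshoot. The weights below are toy values; deep models require DeepFool's iterative, linearized version.

```python
import numpy as np

def deepfool_linear(x, w, b, overshoot=0.02):
    # Minimal perturbation for f(x) = w.x + b: project onto the hyperplane,
    # then overshoot slightly so the sign actually flips.
    f = w @ x + b
    r = -f * w / (w @ w)              # shortest step to the decision boundary
    return x + (1 + overshoot) * r

w = np.array([2.0, -1.0]); b = 0.5
x = np.array([3.0, 1.0])              # classified positive: f(x) = 5.5
x_adv = deepfool_linear(x, w, b)
print("f(x)     =", w @ x + b)
print("f(x_adv) =", round(w @ x_adv + b, 4))
print("perturbation norm:", round(float(np.linalg.norm(x_adv - x)), 4))
```

Because the step is the minimal one, `f(x_adv)` lands just past the boundary at exactly `-overshoot * f(x)`, which is what makes DeepFool perturbations so small and hard to detect.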

45 pages, 3370 KiB  
Article
Adaptive Cybersecurity Neural Networks: An Evolutionary Approach for Enhanced Attack Detection and Classification
by Ahmad K. Al Hwaitat and Hussam N. Fakhouri
Appl. Sci. 2024, 14(19), 9142; https://doi.org/10.3390/app14199142 - 9 Oct 2024
Cited by 6 | Viewed by 3467
Abstract
The increasing sophistication and frequency of cyber threats necessitate the development of advanced techniques for detecting and mitigating attacks. This paper introduces a novel cybersecurity-focused Multi-Layer Perceptron (MLP) trainer that utilizes evolutionary computation methods, specifically tailored to improve the training process of neural networks in the cybersecurity domain. The proposed trainer dynamically optimizes the MLP’s weights and biases, enhancing its accuracy and robustness in defending against various attack vectors. To evaluate its effectiveness, the trainer was tested on five widely recognized security-related datasets: NSL-KDD, CICIDS2017, UNSW-NB15, Bot-IoT, and CSE-CIC-IDS2018. Its performance was compared with several state-of-the-art optimization algorithms, including Cybersecurity Chimp, CPO, ROA, WOA, MFO, WSO, SHIO, ZOA, DOA, and HHO. The results demonstrated that the proposed trainer consistently outperformed the other algorithms, achieving the lowest Mean Square Error (MSE) and highest classification accuracy across all datasets. Notably, the trainer reached a classification rate of 99.5% on the Bot-IoT dataset and 98.8% on the CSE-CIC-IDS2018 dataset, underscoring its effectiveness in detecting and classifying diverse cyber threats. Full article
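The core idea, training MLP weights by evolutionary search rather than backpropagation, can be sketched with a simple (mu + lambda) strategy on toy data; the paper's trainer and its comparison algorithms are far more elaborate.

```python
import numpy as np

rng = np.random.default_rng(6)
# Toy binary "benign / attack" data for the sketch.
X = np.vstack([rng.normal(-1, 1, (150, 4)), rng.normal(1, 1, (150, 4))])
y = np.array([0.0] * 150 + [1.0] * 150)

H = 6                                 # hidden units
n_w = 4 * H + H + H + 1               # all weights and biases, flattened

def forward(w, X):
    W1 = w[:4 * H].reshape(4, H); b1 = w[4 * H:4 * H + H]
    W2 = w[4 * H + H:4 * H + 2 * H]; b2 = w[-1]
    h = np.tanh(X @ W1 + b1)
    return 1 / (1 + np.exp(-(h @ W2 + b2)))

def mse(w):
    return float(((forward(w, X) - y) ** 2).mean())

# (mu + lambda) evolution: the best weight vectors survive and are mutated.
pop = rng.normal(0, 0.5, (30, n_w))
for gen in range(80):
    fits = np.array([mse(w) for w in pop])
    elite = pop[np.argsort(fits)[:10]]                        # 10 best survive
    children = elite[rng.integers(0, 10, 20)] + rng.normal(0, 0.1, (20, n_w))
    pop = np.vstack([elite, children])

best = pop[np.argmin([mse(w) for w in pop])]
acc = float(((forward(best, X) > 0.5) == y).mean())
print(f"best MSE: {mse(best):.3f}, accuracy: {acc:.2f}")
```

Because selection needs only the fitness value, this family of trainers sidesteps gradient computation entirely, which is what lets the paper swap in many different metaheuristics as optimizers.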

17 pages, 733 KiB  
Article
A Comparative Analysis of the TDCGAN Model for Data Balancing and Intrusion Detection
by Mohammad Jamoos, Antonio M. Mora, Mohammad AlKhanafseh and Ola Surakhi
Signals 2024, 5(3), 580-596; https://doi.org/10.3390/signals5030032 - 12 Sep 2024
Cited by 1 | Viewed by 1477
Abstract
Due to escalating network throughput and security risks, the exploration of intrusion detection systems (IDSs) has garnered significant attention within the computer science field. The majority of modern IDSs are constructed using deep learning techniques. Nevertheless, these IDSs still have a shortcoming: most datasets used for IDS training are highly imbalanced, with the volume of samples representing normal traffic significantly outweighing those representing attack traffic. This imbalance restricts the performance of deep learning classifiers on minority classes, as it can bias the classifier in favor of the majority class. To address this challenge, many solutions have been proposed in the literature. TDCGAN is an innovative Generative Adversarial Network (GAN) based on a model-driven approach used to address imbalanced data in IDS datasets. This paper investigates the performance of TDCGAN by employing it to balance data across four benchmark IDS datasets: CIC-IDS2017, CSE-CIC-IDS2018, KDD Cup 99, and BoT-IoT. Next, four machine learning methods are employed to classify the data, both on the imbalanced datasets and on the balanced ones. A comparison is then conducted between the results obtained from each to identify the impact of an imbalanced dataset on classification accuracy. The results demonstrated a notable enhancement in the classification accuracy for each classifier after the implementation of the TDCGAN model for data balancing. Full article

29 pages, 8035 KiB  
Article
A Novel Hybrid Unsupervised Learning Approach for Enhanced Cybersecurity in the IoT
by Prabu Kaliyaperumal, Sudhakar Periyasamy, Manikandan Thirumalaisamy, Balamurugan Balusamy and Francesco Benedetto
Future Internet 2024, 16(7), 253; https://doi.org/10.3390/fi16070253 - 18 Jul 2024
Cited by 11 | Viewed by 6388
Abstract
The proliferation of IoT services has spurred a surge in network attacks, heightening cybersecurity concerns. Essential to network defense, intrusion detection and prevention systems (IDPSs) identify malicious activities, including denial of service (DoS), distributed denial of service (DDoS), botnet, brute force, infiltration, and Heartbleed attacks. This study focuses on leveraging unsupervised learning to train detection models that counter these threats effectively. The proposed method utilizes basic autoencoders (bAEs) for dimensionality reduction and encompasses a three-stage detection model: one-class support vector machine (OCSVM) and deep autoencoder (dAE) attack detection, complemented by density-based spatial clustering of applications with noise (DBSCAN) for attack clustering. Accurately delineated clusters aid in mapping attack tactics. The MITRE ATT&CK framework establishes a “Cyber Threat Repository”, cataloging attacks and tactics and enabling immediate response based on priority. Leveraging preprocessed and unlabeled normal network traffic data, this approach enables the identification of novel attacks while mitigating the impact of imbalanced training data on model performance. The autoencoder method relies on reconstruction error, and OCSVM employs a kernel function to establish a hyperplane for anomaly detection, while DBSCAN employs a density-based approach to identify clusters, manage noise, accommodate diverse cluster shapes, and automatically determine the cluster count, ensuring scalability and minimizing false positives and false negatives. Evaluated on standard datasets such as CIC-IDS2017 and CSE-CIC-IDS2018, the proposed model outperforms existing state-of-the-art methods. Our approach achieves accuracies exceeding 98% on the two datasets, confirming its efficacy and effectiveness for application in efficient intrusion detection systems. Full article
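The DBSCAN clustering stage can be sketched with a minimal implementation on toy 2-D flows. Real deployments would use an indexed implementation rather than a full pairwise distance matrix, and the `eps`/`min_pts` values here are illustrative.

```python
import numpy as np

def dbscan(X, eps=0.8, min_pts=5):
    # Minimal DBSCAN: -1 marks noise, clusters are numbered from 0.
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    neigh = [np.flatnonzero(row <= eps) for row in d]
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neigh[i]) < min_pts:
            continue                      # already assigned, or not a core point
        labels[i] = cluster
        frontier = list(neigh[i])
        while frontier:                   # grow the cluster through core points
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neigh[j]) >= min_pts:
                    frontier.extend(neigh[j])
        cluster += 1
    return labels

rng = np.random.default_rng(7)
# Two dense "attack tactic" clusters plus scattered noise flows.
X = np.vstack([rng.normal(0, 0.2, (60, 2)),
               rng.normal(4, 0.2, (60, 2)),
               rng.uniform(-2, 6, (8, 2))])
labels = dbscan(X)
print("clusters found:", labels.max() + 1, "| noise points:", int((labels == -1).sum()))
```

Density-based grouping is what lets the pipeline both separate attack tactics into clusters and leave isolated flows flagged as noise.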
(This article belongs to the Special Issue Cybersecurity in the IoT)

21 pages, 3903 KiB  
Article
End-to-End Network Intrusion Detection Based on Contrastive Learning
by Longlong Li, Yuliang Lu, Guozheng Yang and Xuehu Yan
Sensors 2024, 24(7), 2122; https://doi.org/10.3390/s24072122 - 26 Mar 2024
Cited by 6 | Viewed by 2388
Abstract
The network intrusion detection system (NIDS) plays a crucial role as a security measure in addressing the increasing number of network threats. The majority of current research relies on feature-ready datasets that heavily depend on feature engineering. Conversely, the increasing complexity of network traffic and the ongoing evolution of attack techniques lead to a diminishing distinction between benign and malicious network behaviors. In this paper, we propose a novel end-to-end intrusion detection framework based on a contrastive learning approach. We design a hierarchical Convolutional Neural Network (CNN) and Gated Recurrent Unit (GRU) model to facilitate the automated extraction of spatiotemporal features from raw traffic data. The integration of contrastive learning amplifies the distinction between benign and malicious network traffic in the representation space. The proposed method exhibits enhanced detection capabilities for unknown attacks in comparison to the approaches trained using the cross-entropy loss function. Experiments are carried out on the public datasets CIC-IDS2017 and CSE-CIC-IDS2018, demonstrating that our method can attain a detection accuracy of 99.9% for known attacks, thus achieving state-of-the-art performance. For unknown attacks, a weighted recall rate of 95% can be achieved. Full article
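The contrastive objective can be sketched as a supervised contrastive loss: it is small when same-class embeddings cluster together and the classes repel, which is exactly the separation the framework optimizes. The embeddings below are toy vectors, not the output of the paper's CNN-GRU encoder.

```python
import numpy as np

def sup_contrastive_loss(Z, y, tau=0.5):
    # Supervised contrastive loss: for each anchor, same-class embeddings are
    # positives and all other embeddings act as negatives.
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    sim = Z @ Z.T / tau
    np.fill_diagonal(sim, -np.inf)             # exclude self-pairs
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    losses = []
    for i in range(len(Z)):
        pos = (y == y[i]) & (np.arange(len(Z)) != i)
        losses.append(-logp[i, pos].mean())
    return float(np.mean(losses))

rng = np.random.default_rng(8)
y = np.array([0] * 20 + [1] * 20)
mixed = rng.normal(0, 1, (40, 8))              # benign/malicious entangled
centers = np.array([[3.0] + [0.0] * 7, [-3.0] + [0.0] * 7])
separated = centers[y] + 0.3 * rng.normal(0, 1, (40, 8))

print(f"loss, entangled embeddings: {sup_contrastive_loss(mixed, y):.3f}")
print(f"loss, separated embeddings: {sup_contrastive_loss(separated, y):.3f}")
```

Minimizing this loss during training pushes the encoder toward the separated regime, which is what amplifies the benign/malicious distinction in the representation space.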
(This article belongs to the Special Issue Intrusion Detection Systems for IoT)
