One-Class Anomaly Detection for Industrial Applications: A Comparative Survey and Experimental Study

Paolini, Davide; Dini, Pierpaolo; Soldaini, Ettore; Saponara, Sergio

doi:10.3390/computers14070281


¹ Department of Information Engineering, University of Pisa, Via G. Caruso n.16, 56122 Pisa, Italy

² MOBI-EPOWERS Research Group, ETEC Department, Vrije Universiteit Brussel (VUB), 1050 Brussel, Belgium

* Author to whom correspondence should be addressed.

Computers 2025, 14(7), 281; https://doi.org/10.3390/computers14070281

Submission received: 21 May 2025 / Revised: 10 July 2025 / Accepted: 11 July 2025 / Published: 16 July 2025

(This article belongs to the Special Issue Intrusion Detection and Trust Provisioning in Edge-of-Things Environment)


Abstract

This article aims to evaluate the runtime effectiveness of various one-class classification (OCC) techniques for anomaly detection in an industrial scenario reproduced in a laboratory setting. To address the limitations posed by restricted access to proprietary data, the study explores OCC methods that learn solely from legitimate network traffic, without requiring labeled malicious samples. After analyzing major publicly available datasets, such as KDD Cup 1999 and TON-IoT, as well as the most widely used OCC techniques, a lightweight and modular intrusion detection system (IDS) was developed in Python. The system was tested in real time on an experimental platform based on Raspberry Pi, within a simulated client–server environment using the NFSv4 protocol over TCP/UDP. Several OCC models were compared, including One-Class SVM, Autoencoder, VAE, and Isolation Forest. The results showed strong performance in terms of detection accuracy and low latency, with the best outcomes achieved using the UNSW-NB15 dataset. The article concludes with a discussion of additional strategies to enhance the runtime analysis of these algorithms, offering insights into potential future applications and improvement directions.

Keywords: one-class classification; machine learning; cybersecurity; IDS runtime; TCP/UDP protocol; feature reduction; PCA

1. Introduction

The rapid advancement of digital technologies and industrial paradigms has given rise to a highly interconnected, cloud-based ecosystem, where smart devices and distributed infrastructures generate vast amounts of data [1,2]. Applications like online banking, IoT, real-time communications, and cloud services are now deeply embedded in both private and industrial domains [3,4,5,6]. However, this evolution has also amplified cyber threats, such as identity theft, ransomware, data breaches, and DoS attacks that affect individuals, businesses, and institutions [7,8,9]. In response, defense companies have increasingly embraced cybersecurity [10], investing in advanced detection and response systems [11], with applications extending beyond military to sectors like energy, transportation, and industrial control systems [12,13]. Intrusion detection systems (IDSs), in particular, have benefited from the integration of machine learning techniques [14,15,16]. Among these, one-class classification (OCC) models have proven effective in scenarios with scarce or absent malicious data [17,18], detecting anomalies by learning patterns of normal behavior [19]. This article combines a comparative survey of public cybersecurity datasets and OCC techniques with a runtime evaluation conducted on a custom-built network testbed that simulates real-world conditions. The analysis highlights the strengths and weaknesses of each method and outlines potential directions for future research and practical deployment. Designed as a hybrid contribution, this work integrates a broad methodological review with a focused experimental evaluation of OCC models best suited for latency-sensitive and integrable IDS scenarios.

2. Related Works

The integration of AI into cybersecurity has become increasingly critical, particularly in military contexts, where infrastructures face advanced and persistent threats [20,21]. Techniques such as ML, DL, statistical learning, and NLP have demonstrated strong capabilities in malware detection, phishing analysis, and automated threat classification [22,23,24]. However, their complexity often limits their use in real-time applications [25,26]. AI-based IDSs are essential for ensuring confidentiality, integrity, and availability in sensitive networks [27,28]. Leveraging clustering, entropy analysis, and edge cloud simulation, these systems effectively detect threats such as DDoS and code injection [29]. Public datasets (e.g., KDD99, UNSW-NB15, TON-IoT), however, often lack realism, limiting generalization to complex enterprise or defense settings [30,31,32,33]. OCC models offer a viable alternative, as they require only normal traffic for training and adapt well to evolving threats without prior attack knowledge [34,35]. At the same time, privacy restrictions in sectors such as defense hinder data sharing, requiring anonymization and encryption, often at the cost of data richness [36,37,38]. Consequently, robust preprocessing is essential to preserve model performance [39,40]. This work evaluates the suitability of widely used public datasets for training ML-based IDSs under realistic conditions, proposing a privacy-preserving framework for custom deployment without external data exposure. While existing surveys focus on OCC taxonomies and benchmarks [41,42], they often overlook runtime behavior. This study bridges that gap through a dual-layered approach, combining a critical survey with empirical evaluation on a modular, edge-based testbed, providing actionable insights for IDS deployment in constrained environments.

3. Methodology

This study addresses the gap in the literature regarding the runtime evaluation of ML-based intrusion detection techniques. The methodology is structured as follows:

Dataset Analysis, Selection Justification, and Bias Assessment:
A comparative analysis of widely adopted public datasets was conducted to identify those most suitable for intrusion detection research. The selection focused on well-established and frequently cited datasets to ensure relevance and reproducibility, while newer or poorly validated datasets were excluded. In addition, a dataset selection justification was performed to align each dataset’s characteristics with the objectives of the study. A dedicated bias analysis was also carried out to identify structural imbalances, outdated attack representations, and distributional anomalies, which could influence model performance and generalizability.
One-Class Anomaly Detection Techniques - Taxonomy and Selection:
An initial taxonomy of one-class classification (OCC) techniques was developed to categorize the most widely used anomaly detection models based on their methodological foundations. This analysis enabled a structured comparison of OCC approaches and informed the subsequent selection of models for evaluation. Only algorithms that have been extensively validated in offline settings using the previously selected benchmark datasets were considered for real-time experimentation.
Data Preprocessing:
Data cleaning involved removing missing, redundant, or non-informative features. Principal component analysis (PCA) was applied to reduce dimensionality while preserving at least 95% of the total variance. Correlation matrices and variance plots supported feature selection.
Experimental Setup:
A custom testbed simulating an NFSv4-based industrial network was developed. NFSv4 was selected due to its relevance in SCADA, PLC, and IoT-based systems, allowing centralized, non-redundant data access for monitoring and analysis.
Runtime Evaluation:
Models were tested in real time using key metrics such as accuracy, detection latency, precision, recall, F1-score, and runtime stability. Multiple dataset–algorithm combinations were benchmarked to identify the most robust and efficient configurations.
Conclusions and Future Work:
The study highlights the feasibility of deploying OCC-based IDS models in real-world conditions, going beyond traditional offline assessments. Proposed improvements include expanding the testbed complexity, integrating more recent datasets, and exploring next-generation ML approaches for enhanced evaluation under dynamic and constrained environments.
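
The 95% variance criterion used in the Data Preprocessing step can be illustrated with a minimal sketch. It assumes that the per-component variances (the eigenvalues of the feature covariance matrix) are already available; the function name `components_for_variance` and the sample eigenvalues are hypothetical and are not taken from the study's code.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, threshold=0.95):
    """Return the smallest number of principal components whose
    cumulative explained-variance ratio reaches `threshold`."""
    if not eigenvalues or any(v < 0 for v in eigenvalues):
        raise ValueError("eigenvalues must be a non-empty list of non-negative numbers")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total == 0:
        raise ValueError("total variance is zero")
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= threshold:
            return k
    return len(ordered)  # fallback for floating-point rounding


# Hypothetical eigenvalues for a six-feature dataset (illustrative only).
example_eigenvalues = [4.0, 3.0, 2.0, 0.6, 0.3, 0.1]
n_components = components_for_variance(example_eigenvalues)
```

In this illustrative case the first three components explain 90% of the variance and the fourth raises the total to 96%, so four components are retained. When scikit-learn is available, passing a fraction such as `PCA(n_components=0.95)` applies the same selection rule.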

4. Dataset

Designing an effective dataset for training IDS models is a fundamental step in developing modern security systems [43]. The dataset must not only be representative but also flexible enough to adapt to various network environments [44]. A key challenge in training OCC algorithms is ensuring that the dataset reflects traffic dynamics consistent with real-world enterprise networks [45]. Without this alignment, the model’s effectiveness may be compromised. Over time, several increasingly realistic datasets have been introduced, some becoming standard benchmarks in cybersecurity research [46]. The following section provides an overview of the most widely adopted public IDS datasets (as shown in Table 1), highlighting their structure and primary use cases.

4.1. KDD Cup 1999

The KDD Cup 1999 dataset [47] is one of the most widely adopted benchmarks for evaluating IDS and remains a historical reference in the field [48]. Derived from the DARPA 1998 Intrusion Detection Evaluation Program, it was built from raw TCP packet traces collected over nine weeks from a simulated military LAN environment. These traces were processed to generate over 4.8 million connection records, structured into 42 features describing traffic behavior such as duration, protocol type, TCP flags, and byte counts. The dataset classifies network activity into five major categories: Normal, DoS, R2L, U2R, and Probe, as illustrated in Figure 1 and Table 2.

Each category includes multiple attack types (e.g., SYN flood, buffer overflow, port scanning), representing various strategies and severities. However, the dataset has well-documented limitations: it contains a high percentage of duplicate records (78% in training, 89.5% in test) and exhibits significant class imbalance, as normal traffic dominates the training set while DoS attacks are overrepresented in testing. Furthermore, due to its age and synthetic origin, many of its attack patterns are outdated and lack key modern features such as IP addresses and temporal attributes [49]. Despite these issues, KDD Cup 1999 is still extensively used as a baseline benchmark for evaluating new IDS models, especially in early-stage research, although its results should be interpreted with caution when generalizing to real-world scenarios [50].

4.2. NSL-KDD

To address the well-known limitations of the KDD Cup 1999 dataset, particularly those related to data redundancy and class imbalance, the NSL-KDD dataset was developed [48]. This new dataset was created by carefully selecting a subset of records from the original KDD dataset, with the aim of preserving its usefulness while correcting its major shortcomings. NSL-KDD is the result of a refinement process that involved removing duplicate instances and overly frequent records, thereby improving the overall quality of the data available for training machine learning models [51,52]. Moreover, the reduced yet representative number of instances makes it less computationally demanding and more suitable for a realistic evaluation of algorithm performance. Despite these improvements, NSL-KDD still inherits some structural limitations [53]. Its data still originate from the simulated traffic of the DARPA 1998 network, which introduces the risk of bias because the dynamics typical of modern networks are not represented. One notable consequence of this limitation is the poor representation of low-footprint attacks, which are increasingly common and relevant in real-world scenarios today [54]. From a statistical perspective, NSL-KDD shows a markedly improved class distribution, which is shown for the training and test data in Figure 2. Table 3 summarizes the main characteristics of the dataset.

Normal traffic accounts for approximately 51.88% of the data, while anomalous traffic makes up the remaining 48.12%, resulting in a nearly perfectly balanced dataset. This balance is also reflected in the entropy between normal and anomalous classes, which reaches a value of 0.999, very close to the ideal for effective classification [55]. In contrast, the entropy between normal and anomalous traffic in the KDD-99 dataset was 0.719, while the entropy across the various attack categories was only 0.214 [56]. Because lower entropy corresponds to a more skewed distribution, these values indicate that the attack categories were even more unevenly represented than the benign and malicious classes, suggesting that the original dataset was not only imbalanced but also potentially misleading for training generalizable models. The introduction of NSL-KDD therefore marked an important step toward a more realistic, balanced, and consistent data foundation, even though it remains inevitably tied to an outdated generation of network data [57].

4.3. CICIDS 2017

The CICIDS 2017 dataset [58], developed by the Canadian Institute for Cybersecurity (CIC), represents a major step forward in the development of realistic benchmarks for evaluating IDS [59,60]. Designed to address the limitations of earlier datasets like KDD Cup 1999 and NSL-KDD, CICIDS 2017 provides a more accurate representation of contemporary enterprise network environments by including up-to-date attack types and realistic traffic patterns. Composed of approximately 2.8 million records, each representing a unique network flow, the dataset includes 80 features describing aspects such as IP addresses, ports, protocols, duration, packet and byte counts, and flow-based statistics [61]. This rich feature set enables the modeling of high-dimensional traffic behavior, essential for training advanced machine learning models [62]. CICIDS 2017 includes a wide range of attack scenarios, such as DoS and DDoS, brute force attempts (e.g., SSH/HTTP), port scans, web attacks like SQL injection and XSS, as well as botnet activity and infiltration with data exfiltration. The distribution of these categories across the dataset is shown in Figure 3.

Despite its value, the dataset was generated in a controlled testbed, and as such, it may not fully reflect the complexity and unpredictability of real-world networks. Some traffic distributions may also introduce biases, affecting model generalization. To improve robustness, it is recommended to use CICIDS 2017 alongside additional real-world data and apply cross-validation or data augmentation techniques [63]. Nevertheless, CICIDS 2017 remains one of the most comprehensive and accessible datasets for evaluating both traditional and deep-learning-based IDS approaches, making it a cornerstone for research and benchmarking in network security [64]. See Table 4 for the description of the main characteristics of the dataset itself.

4.4. UNSW-NB15

The UNSW-NB15 dataset, introduced in 2015, was developed to overcome the limitations of legacy benchmarks such as KDD Cup 1999 and NSL-KDD, which no longer reflect modern network environments or low-footprint threats [65,66,67]. Created at the Cyber Range Lab (University of New South Wales) using the IXIA PerfectStorm traffic generator linked to the CVE vulnerability database, it simulates both realistic benign and malicious network activity [68,69]. The testbed included virtual servers, routers, and clients within a diversified infrastructure. Traffic was collected during two sessions, producing over 100 GB of PCAP data, later converted into CSV format using tools such as tcpdump, Argus, Bro-IDS (Zeek), and custom C# scripts [70]. The final dataset consists of 2,540,044 records and 49 features, categorized as Basic, Flow, Content, Time, Additional, and Labelled features [71] as shown in Figure 4 and Table 5.

Attacks are grouped into nine categories: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms. Each record includes ground-truth labels with metadata such as timestamp, protocol, IP, and attack type [72]. With an entropy of 0.548 (normal vs. anomalous) and 0.514 among attack classes [73], the dataset ensures reasonable balance. Despite minor issues like the broad “Generic” label, UNSW-NB15 is widely regarded as a comprehensive and modern benchmark for both signature-based and anomaly-based IDS research [74]. Distributed across four CSV files and accompanied by ground-truth and event files [75], UNSW-NB15 improves significantly upon KDD99 in terms of attack diversity, feature richness, and network realism [76,77].

4.5. Bot-IoT

The Bot-IoT dataset [78,79], released in 2019 by UNSW Canberra, was specifically designed to emulate realistic IoT environments and modern botnet-based threats [80,81]. It was generated within a virtualized testbed simulating both benign IoT activity and malicious botnet traffic [82,83]. The dataset is distributed across multiple partitions, including the Raw Set (PCAP), Full Set (CSV), 5% Subset, and 10 Best Subset [84]. The Raw Set, totaling around 70 GB, contains binary traffic captures requiring preprocessing through tools like Wireshark, Zeek, or Argus [85]. For dataset construction, PCAP files were processed using Argus and stored in a MySQL database [86,87], from which the Full Set was exported, generating approximately 72 million records. Each instance includes 26 independent and 3 dependent features, though extended engineered features were not included in the original release [88,89].

Bot-IoT qualifies as a big data resource, featuring high volume, variety, and complexity, making it well suited for evaluating IDS in large-scale IoT contexts [90,91]. However, it suffers from extreme class imbalance, as illustrated in Figure 5 and Table 6. In binary classification, attack samples dominate; in multiclass tasks, DoS and DDoS traffic overwhelms less frequent categories, impairing model generalization and learning on minority classes.

4.6. TON-IoT

The TON-IoT dataset [92,93] is a modern large-scale benchmark designed for cybersecurity research involving ML and AI [94,95,96]. Developed within a three-tier edge–fog–cloud architecture [97], it replicates the distributed nature of real-world IoT/IIoT infrastructures [98,99]. The dataset integrates telemetry from IoT sensors, system logs from Windows/Linux environments, and network traffic captured via Zeek [100,101]. From the network traces, 43 features were extracted, along with binary and multiclass labels [102]. A total of nine attack types are represented: backdoor, DoS, DDoS, injection, MITM, password cracking, ransomware, scanning, and XSS [103], as shown in Figure 6 and Table 7.

Each attack scenario was crafted to emulate real industrial threats. Notably, the dataset exhibits class imbalance, with benign traffic outweighing malicious samples [104], reflecting operational network environments [105]. This characteristic enhances its relevance for anomaly detection, where rare but impactful events must be identified in large volumes of normal traffic. Due to its multimodal nature, realism, and diversity of features, TON-IoT supports a wide range of research applications, including intrusion detection, behavioral modeling, threat prediction, malware analysis, and adversarial ML [98,106]. It stands as a representative benchmark for evaluating modern cybersecurity solutions in connected infrastructures.

4.7. Dataset Selection Justification

Although several public IDS datasets exist, we intentionally selected a subset of four representative datasets (NSL-KDD, UNSW-NB15, CICIDS2017, TON_IoT) based on coverage, diversity, and relevance to the targeted runtime IDS scenario. This choice is motivated by the following considerations:

Attack class overlap: The selected datasets already include a comprehensive variety of attacks, such as DoS/DDoS, brute force, infiltration, reconnaissance, shellcode, and data exfiltration, which are also present in BoT-IoT and CSE-CIC-IDS2018. Therefore, including those additional datasets would not introduce substantially novel threat scenarios.
Source domain coverage: We ensured coverage across different environments: legacy datasets (NSL-KDD), enterprise-grade simulations (CICIDS2017), IoT-specific topologies (TON_IoT), and low-footprint attacks (UNSW-NB15). This combination already spans the intended industrial use cases.
Avoiding redundancy and dataset inflation: The number of publicly available IDS datasets has grown significantly (e.g., on platforms like Kaggle, OpenML, and CICLab). Including all would dilute experimental focus and increase computational cost without a corresponding gain in insight.
Quality and structure constraints: Some datasets, such as BoT-IoT, suffer from extreme class imbalance and redundancy, while CSE-CIC-IDS2018 is structurally similar to CICIDS2017 but with partial overlaps in feature sets and attack definitions. These factors would have introduced additional preprocessing complexity without offering unique evaluation benefits.

Therefore, we focused on a carefully chosen selection of four datasets that collectively reflect the diversity and relevance of modern network traffic patterns, both in traditional and IoT-enabled environments, without sacrificing clarity or reproducibility.

As shown in Table 8, all relevant attack classes present in BoT-IoT and CSE-CIC-IDS2018 are already well represented within the four datasets analyzed in this work. This justifies the decision to exclude redundant datasets without compromising coverage or diversity.

4.8. Dataset Bias Analysis

While previous sections discussed dataset limitations qualitatively, we now provide a quantitative assessment of class imbalance and entropy characteristics for the four datasets used in our experiments: NSL-KDD, UNSW-NB15, CICIDS2017, and TON_IoT. These metrics help evaluate the degree of dataset bias and its potential impact on model performance:

1. Imbalance Ratio (IR)
It is defined as the ratio between the number of instances in the majority class and that of the minority class:

$IR = \frac{\max(\text{class counts})}{\min(\text{class counts})} \qquad (1)$

In binary classification (normal vs. anomalous), an $IR$ close to 1 indicates balance, while high values reflect skewness and difficulty for models to generalize.
2. Class Entropy
Entropy $H$ is a measure of uncertainty or diversity in class labels:

$H = -\sum_{i=1}^{C} p_{i} \cdot \log_{2}(p_{i}) \qquad (2)$

where $p_{i}$ is the proportion of class $i$ and $C$ is the number of classes. Higher entropy indicates more uniform class distributions; lower entropy suggests dominance of one class.
3. Skewness of Entropy
We also compute the normalized entropy skewness $\delta_{H}$ with respect to the maximum entropy $H_{\max} = \log_{2}(C)$:

$\delta_{H} = 1 - \frac{H}{H_{\max}} \qquad (3)$

Values closer to 1 indicate severe imbalance; values close to 0 indicate uniformity.
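
The three metrics in Equations (1)–(3) can be computed directly from class counts. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the label list is synthetic: it simply reproduces the 51.88% / 48.12% split quoted earlier for NSL-KDD and does not use the actual dataset.

```python
import math
from collections import Counter


def imbalance_ratio(counts):
    """Eq. (1): majority class count divided by minority class count."""
    return max(counts) / min(counts)


def class_entropy(counts):
    """Eq. (2): Shannon entropy (base 2) of the class proportions."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def entropy_skewness(counts):
    """Eq. (3): 1 - H / H_max, with H_max = log2(C)."""
    if len(counts) < 2:
        raise ValueError("at least two classes are required")
    return 1.0 - class_entropy(counts) / math.log2(len(counts))


# Synthetic labels matching the 51.88% / 48.12% split; absolute counts are hypothetical.
labels = ["normal"] * 5188 + ["anomalous"] * 4812
counts = list(Counter(labels).values())

ir = imbalance_ratio(counts)        # close to 1: nearly balanced
h = class_entropy(counts)           # close to 1 bit, the binary maximum
delta_h = entropy_skewness(counts)  # close to 0: nearly uniform
```

For this split the entropy rounds to 0.999, which agrees with the value reported for NSL-KDD in Section 4.2.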

The evaluation metrics briefly described above were computed for each of the proposed datasets, with the corresponding results summarized in Table 9 below.

An examination of the results presented in Table 9 reveals that:

NSL-KDD exhibits near-perfect class balance, both in IR and in entropy. This makes it ideal for training, but it may not reflect real-world distributions.
UNSW-NB15 and CICIDS2017 show moderate imbalance (IR between 5 and 7), with relatively high entropy, indicating various types of attacks.
TON_IoT, while modern and representative, shows severe class skewness, with IR ≈ 26 and low entropy. This may lead to underfitting of rare attack behaviors and high false negatives.

These results quantitatively validate the performance differences reported in Section 7.2, where models trained on UNSW-NB15 consistently outperformed others. They also support the claim that the selection of the dataset critically affects the behavior and generalization of the OCC model.

5. ML Methods for One-Class Anomaly Detection

This section analyzes some of the most established and commonly used ML techniques for anomaly detection based on the OCC approach, with particular emphasis on their applicability in runtime environments. The analysis focuses on four algorithms considered particularly effective and versatile for a runtime application: AE [107], VAE [108,109], OCSVM [110], and iForest [111]. The fundamental characteristics of each method, as well as their functioning within the context of one-class anomaly detection, are briefly described below.

5.1. Taxonomy of One-Class Anomaly Detection Techniques

In this section, we present a taxonomy of one-class classification (OCC) methods used in anomaly detection, based on three orthogonal criteria: detection principle, model complexity, and runtime applicability. This classification helps contextualize the methods discussed in this work and highlights their respective trade-offs in terms of detection strategy, scalability, and deployment constraints.

1. Detection Principle (Core Mechanism):

Reconstruction-based: Models such as Autoencoder (AE) and Variational Autoencoder (VAE) aim to learn and reproduce normal input data, identifying anomalies via high reconstruction error.
Boundary-based: One-Class SVM (OCSVM) and Deep SVDD create a compact boundary around normal data, flagging any deviation.
Isolation-based: Isolation Forest separates anomalies by exploiting the fact that they are easier to isolate through random partitioning.
Density/Distance-based: LOF and kNN-based approaches evaluate local data density or proximity to detect outliers.

2. Learning Paradigm:

Shallow learning: Classical models (e.g., OCSVM, LOF, kNN, iForest) offer simplicity and interpretability, suitable for small or structured data.
Deep learning: AE, VAE, and Deep SVDD are more expressive and powerful, particularly for complex or high-dimensional data.

3. Runtime Deployment Suitability:

Edge-ready: AE, iForest, and OCSVM have acceptable latency and memory requirements for resource-constrained environments.
Resource-intensive: VAE, Deep SVDD, and LOF may require significant CPU/GPU and are better suited for cloud or offline processing.

This taxonomy supports the comparative analysis of OCC models presented in Section 5.6 and provides additional clarity for readers evaluating deployment strategies across different operational environments. Table 10 summarizes the OCC taxonomy.

5.2. Autoencoder One-Class for Anomaly Detection

The AE is a specialized neural network architecture widely used for tasks such as dimensionality reduction, feature extraction, and data compression [112]. Its ability to learn compact representations of input data makes it particularly useful for cybersecurity applications, especially for identifying anomalous behavior in networks [113]. An autoencoder consists of two main components: an encoder, which compresses the input data into a lower-dimensional latent space, and a decoder, which reconstructs the input from this compressed representation. During the encoding phase, the input vector $x = \{x_{1}, x_{2}, \dots, x_{n}\}$ is transformed into a latent representation $h = \{h_{1}, h_{2}, \dots, h_{m}\}$, where $m < n$. This transformation is defined as:

$h = f(x) = \phi(Wx + b), \qquad (4)$

where $W$ is the weight matrix, $b$ is the bias vector, and $\phi$ is the activation function (e.g., ReLU or Sigmoid) applied element-wise. In the decoding phase, the latent representation $h$ is mapped back into the original input space to produce a reconstruction $\hat{x} = \{\hat{x}_{1}, \hat{x}_{2}, \dots, \hat{x}_{n}\}$. This is performed via:

$\hat{x} = g(h) = \psi(W' h + b'), \qquad (5)$

where $W'$ and $b'$ are the decoder’s weights and biases, and $\psi$ is the decoder’s activation function. The goal of training the autoencoder is to minimize the difference between the original input $x$ and its reconstruction $\hat{x}$ [114]. The optimization is carried out using gradient descent methods, typically minimizing a loss function such as the binary cross-entropy (BCE), which is especially effective when the input data are binary or normalized in the [0, 1] range [115]:

$L_{BCE} = -\frac{1}{M} \sum_{i=1}^{M} \sum_{j=1}^{n} \left[ x_{j,i} \log(\hat{x}_{j,i}) + (1 - x_{j,i}) \log(1 - \hat{x}_{j,i}) \right], \qquad (6)$

where $M$ is the number of training samples, $x_{j,i}$ is the $j$-th feature of the $i$-th instance, and $\hat{x}_{j,i}$ is its reconstruction.

In cybersecurity, one-class autoencoders are trained exclusively on normal traffic, learning to reconstruct only legitimate network behavior [116]. During inference, any significant deviation between the input and its reconstruction indicates an anomaly or potential cyber threat [117]. In this setup, the AE operates as a one-class anomaly detection model, effectively identifying previously unseen attack patterns without requiring labeled malicious examples during training [118]. A summary diagram showing how we have implemented this technique in our work is included in Figure 7 below.
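
The inference rule described above, flagging a sample when its reconstruction deviates significantly from the input, can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this work: the per-sample error is the standard non-negative binary cross-entropy, the function names are hypothetical, the reconstructions are made-up vectors standing in for decoder outputs, and the threshold value is arbitrary.

```python
import math


def bce_reconstruction_error(x, x_hat, eps=1e-12):
    """Binary cross-entropy between an input vector and its reconstruction.
    Both vectors are assumed to be scaled to the [0, 1] range."""
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same length")
    total = 0.0
    for xj, xhj in zip(x, x_hat):
        xhj = min(max(xhj, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= xj * math.log(xhj) + (1.0 - xj) * math.log(1.0 - xhj)
    return total


def is_anomalous(x, x_hat, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return bce_reconstruction_error(x, x_hat) > threshold


# Hypothetical values: one input with a close and a distant reconstruction.
sample = [0.0, 1.0, 1.0, 0.0]
good_reconstruction = [0.05, 0.95, 0.90, 0.10]
poor_reconstruction = [0.60, 0.40, 0.30, 0.70]

# Assumed threshold; in practice it would be calibrated on the errors
# measured over held-out normal traffic (for example, a high percentile).
THRESHOLD = 1.0

flag_good = is_anomalous(sample, good_reconstruction, THRESHOLD)
flag_poor = is_anomalous(sample, poor_reconstruction, THRESHOLD)
```

Because the model sees only normal traffic during training, the threshold is the only quantity that has to be tuned at deployment time, which is one reason reconstruction-based detectors are attractive when attack labels are unavailable.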

5.3. Variational Autoencoder (VAE) for One-Class Anomaly Detection

The VAE is a generative model particularly well suited for learning the underlying distribution of input data [115]. It is widely employed in applications involving density estimation and probabilistic representation, making it especially relevant in the field of cybersecurity for one-class anomaly detection [109,119,120]. The architecture consists of two primary components: an encoder and a decoder. The encoder, denoted as $q_{\phi}(z \mid x)$, maps the input data $x$ into a latent variable $z$ described by a Gaussian distribution parameterized by a mean $\mu_{\phi}(x)$ and a standard deviation $\sigma_{\phi}(x)$. To allow backpropagation through stochastic sampling, the reparameterization trick is applied as follows:

$z = \mu_{\phi}(x) + \sigma_{\phi}(x) \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, I) \qquad (7)$

The decoder, expressed as $p_{\theta}(\hat{x} \mid z)$, reconstructs the input data $\hat{x}$ from the latent vector $z$. The training objective is to minimize the difference between the original input $x$ and the reconstruction $\hat{x}$, while also regularizing the latent space through variational inference. This is achieved by minimizing the Kullback–Leibler (KL) divergence between the approximate posterior $q_{\phi}(z \mid x)$ and a predefined prior, typically a standard normal distribution $p(z) = \mathcal{N}(0, I)$ [121,122]. The total VAE loss function is composed of two terms:

$L_{VAE}(x, \hat{x}) = \mathbb{E}_{q_{\phi}(z \mid x)}\left[-\log p_{\theta}(x \mid z)\right] + \mathrm{KL}\left(q_{\phi}(z \mid x) \,\|\, p(z)\right) \qquad (8)$

The first term is the negative log-likelihood, quantifying the reconstruction error [123], and the second term enforces a regularized and meaningful latent space [124]. In the context of one-class anomaly detection, the VAE is trained exclusively on normal data, learning a compressed probabilistic representation of typical behavior [125]. During inference, inputs that produce low reconstruction accuracy or deviate from the learned latent distribution are flagged as potential anomalies or threats [126]. Unlike traditional autoencoders, the VAE introduces regularization that avoids overfitting and enables better generalization [119,123]. Moreover, the structured latent space enhances both interpretability and generative capabilities [127]. These characteristics make VAE a powerful tool for detecting unknown or unseen cyber threats in real-world network environments, especially where labeled attack data are scarce or unavailable [128]. A summary diagram showing how we have implemented this technique in our work is included in Figure 8 below.
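
The two ingredients of Equations (7) and (8) can be made concrete with a short sketch. It assumes a diagonal Gaussian posterior, for which the KL divergence to the standard normal prior has a well-known closed form. The function names and the encoder outputs are hypothetical, and using the per-sample loss as an anomaly score is an assumption made for illustration, not a description of the scoring rule adopted in this work.

```python
import math
import random


def reparameterize(mu, sigma, rng=random):
    """Eq. (7): z = mu + sigma * eps, with eps drawn from N(0, I)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ); requires sigma > 0."""
    return 0.5 * sum(m * m + s * s - 1.0 - 2.0 * math.log(s)
                     for m, s in zip(mu, sigma))


def vae_anomaly_score(reconstruction_nll, mu, sigma):
    """Eq. (8) for one sample: reconstruction term plus KL term.
    `reconstruction_nll` is a single-sample estimate of the expectation."""
    return reconstruction_nll + kl_to_standard_normal(mu, sigma)


rng = random.Random(0)               # fixed seed for a reproducible illustration
mu, sigma = [0.2, -0.1], [0.9, 1.1]  # hypothetical encoder outputs
z = reparameterize(mu, sigma, rng)   # latent sample passed to the decoder
kl = kl_to_standard_normal(mu, sigma)
```

The sampling noise enters only through `eps`, so `mu` and `sigma` remain deterministic functions of the input, which is what allows gradients to flow through the sampling step during training.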

5.4. Isolation Forest for One-Class Anomaly Detection

iForest is an ensemble-based anomaly detection algorithm specifically designed to identify rare or anomalous patterns in large datasets [129,130]. Unlike traditional density-based or distance-based methods, which attempt to model normal behavior, Isolation Forest follows a fundamentally different approach by directly targeting how easily an observation can be isolated from the rest of the data [131]. The key intuition behind Isolation Forest is that anomalies are data points that are few and different. Hence, they are more susceptible to isolation by random partitioning [132]. In contrast, normal instances reside in dense regions and require more partitions to be isolated. An Isolation Forest model is constructed by building an ensemble of binary decision trees, known as isolation trees [133]. Each tree is built by randomly selecting a feature and a split value between the minimum and maximum value of that feature [134]. This process is repeated recursively, producing randomly partitioned subspaces. The number of splits required to isolate a point corresponds to the path length from the root node to the terminating node [135]. Anomalies, which are easier to isolate, tend to have shorter average path lengths across the ensemble [136]. The anomaly score for a given observation x is defined as:

s (x, n) = 2^{- \frac{E (h (x))}{c (n)}}

(9)

where:

$E (h (x))$ is the average path length of point x across all trees;
$c (n)$ is the average path length of unsuccessful searches in a binary search tree, used to normalize the score, and is approximated as:

c (n) = 2 H (n - 1) - \frac{2 (n - 1)}{n}

(10)

with $H (i)$ being the i-th harmonic number, approximated by $ln (i) + γ$, where $γ \approx 0.5772$ is the Euler–Mascheroni constant [137]. An anomaly score $s (x, n)$ close to 1 indicates a high likelihood of x being anomalous, while values near 0 suggest normality. In one-class anomaly detection, the Isolation Forest is trained exclusively on normal data. During inference, instances that yield high anomaly scores are flagged as outliers, as they diverge from the patterns learned from the normal class. Isolation Forest has several advantages in cybersecurity contexts: it is computationally efficient, scales well with high-dimensional data, does not require distance metrics, and is robust against irrelevant features [138]. These characteristics make it particularly suitable for real-time intrusion detection systems, especially when labeled attack data are scarce or unavailable [139]. A summary diagram showing how we have implemented this technique in our work is included in Figure 9 below.
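As an illustration of Equations (9) and (10), the following minimal sketch computes the normalization term and the resulting anomaly score from a given average path length. The subsample size `n = 256` and the two example path lengths are assumptions chosen for illustration; they are not values taken from our experiments.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler–Mascheroni constant


def harmonic_approx(i: float) -> float:
    """Approximate the i-th harmonic number as ln(i) + gamma."""
    return math.log(i) + EULER_GAMMA


def c(n: int) -> float:
    """Normalization term of Equation (10) for a sample of n points."""
    if n < 2:
        raise ValueError("c(n) is only evaluated here for n >= 2")
    return 2.0 * harmonic_approx(n - 1) - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length: float, n: int) -> float:
    """Equation (9): s(x, n) = 2 ** (-E(h(x)) / c(n))."""
    return 2.0 ** (-mean_path_length / c(n))


n = 256  # hypothetical subsample size
short_path = anomaly_score(2.0, n)   # point isolated after few splits
long_path = anomaly_score(12.0, n)   # point isolated after many splits
print(f"c({n}) = {c(n):.4f}")
print(f"score(short path) = {short_path:.3f}, score(long path) = {long_path:.3f}")
```

A point whose average path length equals $c (n)$ receives a score of exactly 0.5, so shorter paths push the score towards 1 and longer paths push it towards 0.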

5.5. One-Class Support Vector Machine (OCSVM) for Anomaly Detection

The One-Class Support Vector Machine (OCSVM) is a variation of the traditional Support Vector Machine (SVM) designed specifically for one-class classification problems, which are typical in anomaly detection scenarios [140,141,142]. The key idea behind OCSVM is to identify a hyperplane in the feature space that is maximally distant from the origin, under the assumption that only samples from the normal class are available during training [143,144]. The separating hyperplane is defined by a weight vector $w$ and a bias term b, and is described by the following equation:

w^{⊤} x - b = 0

(11)

Unlike classical SVM, which aims to maximize the margin between two classes, the goal in OCSVM is to find the decision boundary that best encompasses the majority of data points while remaining far from the origin [145]. Any data point that falls outside this learned boundary can be considered a potential anomaly. To handle nonlinear relationships in the data, OCSVM employs a kernel function to implicitly map the input data to a higher-dimensional feature space where linear separation is possible [146,147]. Let $ϕ (\cdot)$ denote the nonlinear mapping function, and let the training dataset be represented by $X = [x_{1}, x_{2}, \dots, x_{N}]$ with $x_{i} \in R^{D}$, where N is the number of samples and D is the dimensionality of the input space. The optimization problem for the OCSVM is formulated as:

min_{w, b, ξ} \frac{1}{2} {∥ w ∥}^{2} + C \sum_{i = 1}^{N} ξ_{i} - b

(12)

subject to the constraints:

w^{⊤} ϕ (x_{i}) \geq b - ξ_{i}, ξ_{i} \geq 0, \forall i = 1, \dots, N

(13)

Here, the slack variables $ξ_{i}$ allow for a small fraction of training samples to violate the boundary (i.e., to be considered outliers). The hyperparameter $C > 0$ controls the trade-off between the model complexity and the number of allowed violations. A large value of C imposes strict penalties on misclassifications, while a smaller value provides more flexibility [148,149]. The OCSVM is particularly well suited for cybersecurity applications, where it is common to have access only to normal (benign) traffic during training [150]. By learning the characteristics of legitimate behavior, the model is able to identify deviations or anomalies at inference time without requiring labeled attack data [151]. This makes it a powerful tool for detecting novel threats or zero-day attacks in real-world environments [152]. A summary diagram showing how we have implemented this technique in our work is included in Figure 10 below.
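To make the roles of the slack variables and of C concrete, the sketch below evaluates the objective in Equation (12) under the constraints in Equation (13) for a fixed, hand-picked hyperplane. It assumes a linear feature map $ϕ (x) = x$ and uses made-up values for `w`, `b`, `C`, and the samples; it does not solve the optimization problem and is not the solver used in our pipeline.

```python
def dot(u, v):
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def ocsvm_objective(w, b, C, samples):
    """Evaluate Eq. (12) for fixed (w, b), assuming phi(x) = x.

    For a fixed hyperplane, the smallest slack that satisfies
    Eq. (13) is xi_i = max(0, b - w^T x_i).
    """
    slacks = [max(0.0, b - dot(w, x)) for x in samples]
    value = 0.5 * dot(w, w) + C * sum(slacks) - b
    return value, slacks


def is_anomalous(w, b, x):
    """A point on the origin side of the hyperplane is flagged."""
    return dot(w, x) < b


# Hypothetical values, chosen only for illustration.
w, b, C = [1.0, 1.0], 1.5, 0.5
samples = [[1.0, 1.0], [2.0, 0.5], [0.2, 0.3]]

value, slacks = ocsvm_objective(w, b, C, samples)
print("slacks:", slacks)
print("objective:", value)
print("third sample anomalous:", is_anomalous(w, b, samples[2]))
```

Only the third sample violates the boundary, so it is the only one with a non-zero slack; increasing C makes that violation more expensive in the objective.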

5.6. Other One-Class Anomaly Detection Techniques: LOF, Deep SVDD, and kNN

In addition to the four core OCC techniques implemented and tested in our runtime setup, several other models are commonly adopted in anomaly detection research. This section provides a concise overview of three significant alternatives: Local Outlier Factor (LOF), Deep Support Vector Data Description (Deep SVDD), and k-Nearest Neighbor (kNN)-based OCC. These methods were excluded from the experimental validation due to runtime constraints, but they remain highly relevant in comparative surveys.

1.: Local Outlier Factor (LOF)
LOF is a density-based anomaly detection method that estimates the degree of isolation of a data point by comparing its local density with that of its neighbors. The local reachability density of a point $x_{i}$ is computed based on its k-nearest neighbors, and the LOF score is defined as:

${LOF}_{k} (x_{i}) = \frac{1}{| N_{k} (x_{i}) |} \sum_{x_{j} \in N_{k} (x_{i})} \frac{{lrd}_{k} (x_{j})}{{lrd}_{k} (x_{i})}$

(14)

where ${lrd}_{k} (x)$ denotes the local reachability density, and $N_{k} (x_{i})$ is the set of the k-nearest neighbors of $x_{i}$ . An LOF score significantly greater than 1 indicates a potential outlier. Although efficient and unsupervised, LOF is sensitive to the choice of k and may degrade in high-dimensional feature spaces.
2.: Deep Support Vector Data Description (Deep SVDD)
Deep SVDD [153] extends the classical SVDD method by leveraging a deep neural network to learn a nonlinear transformation $ϕ (x; θ)$ that maps inputs into a latent space where normal samples are enclosed in a minimal-radius hypersphere. The training objective is to minimize:

$min_{θ} \frac{1}{n} \sum_{i = 1}^{n} {∥ ϕ (x_{i}; θ) - c ∥}^{2}$

(15)

where c is the center of the hypersphere in the latent space. This approach avoids the need for input reconstruction, focusing instead on compressing normal behavior representations. Deep SVDD is particularly effective on structured data but incurs a high computational cost, making it less suitable for edge-level runtime deployment.
3.: kNN-Based One-Class Detection
This method assigns anomaly scores to new observations based on their distances from k-nearest neighbors in the training data [154]:

${score}_{avg} (x) = \frac{1}{k} \sum_{x_{i} \in N_{k} (x)} ∥ x - x_{i} ∥ \quad \text{or} \quad {score}_{min} (x) = min_{x_{i} \in N_{k} (x)} ∥ x - x_{i} ∥$

(16)

If the score exceeds a threshold $τ$ , the sample is flagged as anomalous. kNN-based OCC is simple and interpretable but not well suited to high-volume or high-dimensional data due to scalability issues.
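The kNN-based scores in Equation (16) can be sketched in a few lines. The example below is a brute-force illustration using Euclidean distance; the training points, the value of k, and the threshold `tau` are hypothetical and serve only to show how the two scores are obtained and compared with $τ$.

```python
import math


def knn_scores(x, train, k):
    """Return (score_avg, score_min) of Eq. (16) for a query point x."""
    # Distances to the k nearest training points, in ascending order.
    nearest = sorted(math.dist(x, t) for t in train)[:k]
    return sum(nearest) / len(nearest), nearest[0]


# Hypothetical "normal" training points and detection threshold.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tau = 2.0

inside_avg, inside_min = knn_scores((0.5, 0.5), train, k=2)
outside_avg, outside_min = knn_scores((5.0, 5.0), train, k=2)

print("inside :", inside_avg, inside_min, inside_avg > tau)
print("outside:", outside_avg, outside_min, outside_avg > tau)
```

Every query requires a distance computation against the whole training set, which is the scalability issue noted above.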

5.7. Discussion and Justification of Exclusion

Although LOF, Deep SVDD, and kNN are robust and well-established OCC methods, they were not included in our runtime testbed for the following reasons:

Lack of optimized support for real-time deployment in Python-based runtime environments (e.g., pyshark, joblib).
Inability to meet the latency constraints imposed by our edge-based system on Raspberry Pi hardware.
High sensitivity to hyperparameter tuning, which may hinder practical out-of-the-box integration in a plug-and-play GUI framework.

Nonetheless, these methods remain promising for future inclusion in our platform. Our current selections of AE, VAE, iForest, and OCSVM prioritized compatibility with real-time streaming, modular deployment, and literature adoption in edge-ready IDS pipelines [92]. Having presented the theoretical foundations, taxonomy, and core mechanisms of one-class classification (OCC) techniques, we now transition to the second part of this work: an experimental evaluation of these methods in a runtime anomaly detection context. The following sections describe the testbed implementation, preprocessing pipeline, and performance assessment under real-time constraints using NFSv4-based network communication. This dual approach is intended to bridge the gap between academic surveys and practical IDS deployment strategies.

6. Preprocessing Data and Setup

This section presents the experimental phase, focused on evaluating the runtime performance of OCC techniques within a live network testbed designed to emulate edge and industrial environments. Moving beyond offline benchmarks, the goal is to assess model behavior under real-time operational constraints. The analysis considers four widely adopted datasets: NSL-KDD 2009, UNSW-NB15, CICIDS 2017, and TON-IoT, selected for their relevance to one-class anomaly detection and their realistic traffic characteristics [155,156,157]. Preprocessing involved identifying and removing redundant, missing, or low-information features, followed by normalization to enhance model training. Evaluation metrics were defined to ensure consistent assessment of detection accuracy across normal and anomalous traffic patterns. The experimental setup, including system architecture and parameter configurations, is also detailed. Results are then analyzed to compare model performance across datasets, providing insights into their effectiveness under realistic runtime conditions.

6.1. Feature Preprocessing and Normalization

Regardless of the specific dataset to which the proposed workflow is applied, several preprocessing operations are necessary to ensure the robustness and consistency of the subsequent steps [158]. The first transformation consists of converting symbolic or categorical features into numerical representations. This applies, for example, to attributes such as Timestamp or Port Number, which may encode information in a symbolic or sequential form. To this end, a label encoding strategy is adopted, where each unique categorical value within a feature is mapped to a distinct integer. This approach is chosen to minimize the number of features and maintain a compact feature space, as opposed to one-hot encoding, which could lead to dimensionality explosion and increased computational cost in downstream tasks. After encoding, a normalization step is applied to mitigate the risk of computational distortions due to heterogeneous feature scales [159]. Specifically, min-max normalization is used to rescale all feature values to the interval [0, 1]. The normalization of a feature vector $f = \{f_{1}, f_{2}, \dots, f_{n}\}$ is defined as:

f_{i}^{'} = \frac{f_{i} - min (f)}{max (f) - min (f)} for i = 1, \dots, n

(17)

where $f_{i}$ is the original value of the i-th component of feature f, and $f_{i}^{'}$ is the normalized value. Once normalization is complete, all observations containing missing or undefined values (NaNs) are removed from the final dataset, as they carry no informative content and may compromise the integrity and performance of the learning process.
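The two transformations described in this subsection can be sketched as follows. This is a minimal illustration rather than our exact preprocessing code: the function names and sample values are hypothetical, and mapping a constant feature to zeros is an assumption added to avoid division by zero in Equation (17).

```python
def label_encode(values):
    """Map each unique categorical value to a distinct integer."""
    mapping = {}
    encoded = []
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)  # next unused integer
        encoded.append(mapping[v])
    return encoded, mapping


def min_max(values):
    """Rescale a feature to [0, 1] following Eq. (17)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature columns.
protocols = ["tcp", "udp", "tcp", "icmp"]
sizes = [10, 20, 40]

encoded, mapping = label_encode(protocols)
scaled = min_max(sizes)
print(encoded, mapping)
print(scaled)
```

Label encoding keeps one column per categorical feature, whereas one-hot encoding would add one column per distinct value.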

6.2. Feature Reduction Using Principal Component Analysis (PCA)

To preliminarily reduce the number of features in the datasets, feature reduction techniques are applied during both the data preparation and learning phases. In particular, we employ principal component analysis (PCA), a widely used linear method for projecting data into a lower-dimensional space while preserving as much information (i.e., variance) as possible [160,161,162]. Let us consider the dataset represented by a matrix $A \in R^{N \times n}$, where N is the number of observations and n is the number of features. The goal of PCA is to transform this dataset into a new representation with fewer dimensions, capturing most of the original information [163].

1.: Dataset Centering

Starting from the original matrix $A$, the mean of each column is computed:

μ_{j} = \frac{1}{N} \sum_{i = 1}^{N} a_{i j} for j = 1, \dots, n

(18)

A centered matrix $B \in R^{N \times n}$ is then obtained by subtracting the corresponding column mean from each element:

B_{i j} = A_{i j} - μ_{j}

(19)

2.: Covariance Matrix Computation

The covariance matrix $Σ \in R^{n \times n}$ is computed as:

Σ = \frac{1}{N - 1} B^{⊤} B

(20)

Each element $Σ_{i j}$ represents the covariance between features i and j.

3.: Eigenvalue and Eigenvector Decomposition

To determine the principal directions (principal components), the eigenvalues and eigenvectors of $Σ$ are computed by solving:

det (Σ - λ I) = 0

(21)

Solving $(Σ - λ_{k} I) v_{k} = 0$ for each eigenvalue $λ_{k}$ yields the matrix of eigenvectors $V = [v_{1}, \dots, v_{n}] \in R^{n \times n}$, with columns sorted by decreasing eigenvalue magnitude. By selecting the first m eigenvectors (with $m < n$) that account for the largest variance, we obtain the projection matrix:

V_{red} = [v_{1}, \dots, v_{m}] \in R^{n \times m}

(22)

4.: Projection into the Reduced Space

The reduced representation of the dataset is then obtained by projecting the centered data onto the new basis:

A_{new} = B \cdot V_{red} \in R^{N \times m}

(23)

This transformation projects the original data into a lower-dimensional space $R^{m}$, preserving the structure of feature correlations while improving the computational efficiency of the subsequent machine learning models [164,165]. After performing PCA and computing the corresponding correlation matrices, the cumulative variance is calculated and plotted. Cumulative variance refers to the total proportion of variance in the original dataset that is explained by a given number of principal components in PCA [166]. It is computed as the cumulative sum of the individual explained variance ratios of each principal component. Let $λ_{1}, λ_{2}, \dots, λ_{n}$ be the eigenvalues of the covariance matrix of the data, sorted in descending order, and let $λ_{total} = \sum_{i = 1}^{n} λ_{i}$ be the total variance. Then, the cumulative variance explained by the first k components is:

Cumulative {Variance}_{k} = \frac{\sum_{i = 1}^{k} λ_{i}}{\sum_{i = 1}^{n} λ_{i}} = \frac{\sum_{i = 1}^{k} λ_{i}}{λ_{total}}

This measure helps determine the smallest number of principal components needed to retain a desired proportion (e.g., 95%) of the original variance, which is crucial for effective dimensionality reduction while preserving information [167,168]. It is important to note that, during the PCA process, all features exhibiting binary or Boolean values were excluded from the analysis [169]. This decision was based on the assumption that such features provide limited informational variance and are therefore of low utility for training the algorithms effectively [170,171]. The PCA results for each dataset are shown graphically in Figures 11–18 and summarized in Table 11.
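The four PCA steps and the cumulative-variance criterion can be condensed into a short NumPy sketch. The function name `pca_reduce`, the 95% target, and the synthetic three-feature dataset are assumptions for illustration; `numpy.linalg.eigh` is used here because the covariance matrix is symmetric, which is an implementation choice and not necessarily the routine used in our pipeline.

```python
import numpy as np


def pca_reduce(A, variance_target=0.95):
    """Project A (N x n) onto the fewest components reaching the target."""
    A = np.asarray(A, dtype=float)
    N, n = A.shape
    mu = A.mean(axis=0)                        # Eq. (18): column means
    B = A - mu                                 # Eq. (19): centered data
    sigma = (B.T @ B) / (N - 1)                # Eq. (20): covariance matrix
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigen-decomposition of sigma
    order = np.argsort(eigvals)[::-1]          # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Smallest m whose cumulative variance reaches the target.
    m = min(int(np.searchsorted(cumulative, variance_target)) + 1, n)
    V_red = eigvecs[:, :m]                     # Eq. (22): projection matrix
    return B @ V_red, cumulative, m            # Eq. (23): reduced data


# Synthetic data: the second feature is almost a multiple of the first,
# and the third is low-amplitude noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
A = np.column_stack([
    x,
    2.0 * x + 0.01 * rng.normal(size=200),
    0.01 * rng.normal(size=200),
])

A_new, cumulative, m = pca_reduce(A)
print("reduced shape:", A_new.shape, "components kept:", m)
print("cumulative variance:", cumulative)
```

Because two of the three synthetic features are strongly correlated, a single component is enough to exceed the 95% target in this example.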

6.3. Performance Evaluation

To assess the effectiveness of the ML models applied in this study, a set of standard performance metrics was used, based on the construction of the confusion matrix, which is a widely adopted tool for evaluating the outcomes of OCC tasks [172,173]. In an OCC task, each instance is assigned to either the positive class (e.g., anomaly) or the negative class (e.g., normal) [174]. The confusion matrix is composed of the following elements:

True Positives (TP): number of instances correctly predicted as belonging to the positive class.
True Negatives (TN): number of instances correctly predicted as belonging to the negative class.
False Positives (FP): number of instances incorrectly predicted as positive when they actually belong to the negative class.
False Negatives (FN): number of instances incorrectly predicted as negative when they actually belong to the positive class.

Based on these quantities, the following performance indicators are computed:

Accuracy

It measures the proportion of total correct predictions among all predictions and is defined as:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

(24)

Precision

Also known as positive predictive value (PPV), it measures the proportion of true positives among all predicted positives:

Precision = \frac{TP}{TP + FP}

(25)

Recall (Sensitivity or True Positive Rate)

This metric quantifies the ability of the classifier to correctly identify positive instances and is defined as:

Recall = \frac{TP}{TP + FN}

(26)

F1-Score

The F1-score is the harmonic mean of precision and recall, offering a balanced measure when there is an uneven class distribution. It is computed as:

F1-Score = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}

(27)

These metrics collectively provide a comprehensive view of the classifier’s performance, especially in imbalanced classification problems such as anomaly detection, where the majority of data belong to the normal class and anomalies are relatively rare [175,176,177]. In such cases, relying solely on accuracy may be misleading, and greater emphasis is placed on precision, recall, and F1-score to evaluate model robustness and reliability [178,179,180].
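Equations (24)–(27) translate directly into code. The sketch below computes the four indicators from confusion-matrix counts; the counts in the example are invented to illustrate an imbalanced case and are not results from this study. Returning 0.0 when a denominator is zero is an assumption made to keep the function total.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from Eqs. (24)-(27)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 100 anomalies among 1000 instances.
metrics = classification_metrics(tp=90, tn=880, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example the accuracy is 0.97 while the F1-score is noticeably lower, which illustrates why accuracy alone can be misleading when anomalies are rare.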

6.4. Experimental Setup: NFSv4 Server with Runtime IDS and Multi-Client Architecture

To evaluate the proposed anomaly detection techniques in realistic conditions, a custom network demonstrator was implemented, consisting of a centralized NFSv4 server and three client nodes communicating over a TCP/UDP infrastructure, as illustrated in Figure 19.

The environment simulates both nominal and anomalous behavior through file-sharing operations. The NFSv4 server exports a shared directory accessed by clients via sequential mount and write operations, emulating enterprise or industrial file system interactions. All communications adhere to the NFSv4 stack, with core transactions over TCP and occasional auxiliary RPC messages over UDP. Each client initiates a mount, then performs randomized write operations to introduce variability in traffic and prevent overfitting. Variations in inter-arrival times and payloads simulate diverse usage patterns and workload conditions. On the server side, a runtime IDS monitors network- and system-level activity in real time. The detection engine incorporates multiple OCC techniques (OCSVM, VAE, iForest, and statistical reconstruction models) and analyzes the following:

Request frequency and inter-arrival time;
Packet size distribution;
Port and protocol usage (TCP/UDP);
System metrics (I/O, syscall anomalies).

Anomalies are flagged when deviations from the learned normal profile are detected, e.g., brute-force attempts, unauthorized file access, malformed payloads, or scanning activity. Traffic capture is performed at both endpoints using Wireshark for offline analysis. The testbed serves as a controlled, reproducible environment for validating IDS models in runtime, with constraints on latency, partial observability, and evolving traffic characteristics typical of real-world edge and industrial systems.

7. Experiment & Results

7.1. Experimental Procedure and Runtime Evaluation Setup

To validate the proposed OCC techniques under runtime conditions, an experimental client–server testbed was developed. The setup consists of a centralized NFS server and three client nodes interconnected via TCP/UDP, generating both nominal and anomalous traffic for runtime evaluation. All components were implemented in Python v13, with four custom scripts automating the test procedures. On the client side, Algorithm 1 generates legitimate NFS operations (e.g., mount, write), while Algorithm 2 injects anomalous behaviors such as high-frequency bursts, unauthorized port scanning, brute-force commands, malformed payloads, and low-size UDP floods. On the server side, Algorithm 3 passively captures incoming traffic, and Algorithm 4 classifies packets in real time using the selected OCC model (OCSVM, iForest, AE, or VAE). This framework enables direct performance assessment of anomaly detection methods in a live runtime environment.

Algorithm 1: Nominal Traffic Generation (Client Side)

All ML models were trained offline on normal data only, using label-free approaches tailored for OCC settings. The training output was embedded into Python pipeline objects as Selected-ML-pipeline.joblib, allowing seamless loading and deployment during runtime without the need for retraining. A dedicated graphical user interface (GUI) was implemented on the server side to ensure that the entire system is easily executable, fully automated, and accessible even to users outside the development team. This allows the proposed solution to be tested and deployed not only within the demonstrator environment but also in different operational contexts, facilitating broader experimentation and adaptation of the anomaly detection algorithms across various scenarios. This GUI allows the operator to:

Select the ML technique to be tested from a predefined list;
Specify the network interface or port to be monitored;
Launch the detection process and visualize outcomes in real time.

Network traffic is captured using pyshark, a Python wrapper for Wireshark, enabling detailed packet-level analysis directly within the Python runtime. Each incoming packet is processed and classified as either nominal or anomalous according to the behavior learned during the offline training phase. Additionally, the system automatically stores the following outputs:

A summary of performance metrics (accuracy, precision, recall, F1-score) in structured output;
A historical report saved in plain-text (.txt) format, logging each observed packet with timestamp, classification result, and basic flow metadata.

This fully packaged solution provides an effective environment for runtime evaluation and facilitates seamless integration into operational networks without manual tuning. The functioning of the graphical user interface (GUI) described in this section is summarized in the flowchart shown in Figure 20, while the configuration interface of the GUI is illustrated in Figure 21.

Algorithm 2: Multi-Type Attack Traffic Generator (Client Side)

Algorithm 3: TCP/UDP Request Listener on NFSv4 Simulated Server

Algorithm 4: Runtime Intrusion Detection on Server using ML model selected (Server Side)

7.2. Results of Experiment

The evaluation was conducted on a custom-designed testbed (Figure 22) replicating realistic industrial network conditions using an edge-based client–server architecture over NFSv4. The network comprised one server and three client nodes, implemented with Raspberry Pi 3 devices, simulating bidirectional TCP/UDP traffic flows.

During the runtime testing phase, a total of 250,192 packets were captured and processed. Of these, 110,741 were anomalous, emulating various attack patterns, including DoS, UDP floods, brute-force commands, port scanning, and malformed payloads, while the remaining 130,449 packets represented legitimate NFSv4 operations (e.g., MOUNT, WRITE) transmitted over TCP and UDP. All OCC models were pre-trained offline using the preprocessed datasets discussed previously. Hyperparameters for each algorithm are reported in Table 12, ensuring reproducibility and comparability.

All models were evaluated on the same traffic stream under identical runtime conditions, allowing direct comparison. Performance metrics (accuracy, precision, recall, and F1-score) were computed at the end of each run, and these are summarized in Table 13. Logging was automated, with each packet labeled and timestamped for post-analysis.

All models achieved their best performance on the UNSW-NB15 dataset, with F1-scores above 0.91, peaking at 0.93 with both AE and VAE. This confirms the dataset’s strong feature representation and class balance. Deep learning models (AE, VAE) showed higher recall, indicating better detection of subtle anomalies, particularly under complex temporal or statistical patterns. Performance on CICIDS2017 and TON_IoT was slightly lower, likely due to greater class imbalance and sparse attack representation. Results on NSL-KDD were consistently lower, aligning with known limitations in diversity and data aging. Average inference times per packet (Table 14) reflect model complexity. VAE had the highest latency (5.3 ms), followed by AE and OCSVM, while iForest was the fastest (1.3 ms), confirming its efficiency for real-time deployment (Figure 23).

Based on the results presented and discussed in this section, it can be concluded that the UNSW-NB15 dataset is the most effective training source across all evaluated models, enabling the development of accurate, precise, and robust anomaly detection systems. The strong performance of deep learning methods, particularly the Autoencoder and Variational Autoencoder, further confirms their suitability for real-time IDS deployment, especially in contexts requiring high sensitivity to subtle or low-profile anomalies. Although slightly less accurate, the Isolation Forest algorithm demonstrated the fastest inference times among all evaluated models, confirming its effectiveness as a lightweight and computationally efficient solution for scenarios where low latency is a critical constraint.
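Average per-packet inference time, as discussed above, can be measured with a simple timing loop. The sketch below is an assumption about how such a measurement might be taken, not the instrumentation used for Table 14: `predict` stands for any callable that classifies one feature vector, and the stand-in model and packet features are placeholders.

```python
import time


def mean_latency_ms(predict, packets):
    """Average wall-clock time per classified packet, in milliseconds."""
    start = time.perf_counter()
    for features in packets:
        predict(features)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(packets)


def dummy_predict(features):
    """Placeholder for a trained OCC model (hypothetical rule)."""
    return sum(features) > 1.0


# Hypothetical pre-extracted feature vectors, one per packet.
packets = [[0.1, 0.2, 0.3]] * 1000

latency = mean_latency_ms(dummy_predict, packets)
print(f"mean latency: {latency:.4f} ms per packet")
```

Averaging over many packets smooths out timer resolution and scheduling jitter, which matters on constrained devices such as the Raspberry Pi.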

7.3. Statistical Validation of Performance Differences

To validate that the observed performance differences between OCC models are not due to random variation, we performed a one-way ANOVA (analysis of variance) test on the F1-scores achieved by each model. The tests were conducted separately for each dataset. All proposed OCC models were evaluated through 10 repeated runs on each dataset, with F1-scores collected to assess performance consistency. The null hypothesis ($H_{0}$) states that the mean F1-score is equal across all four models. The alternative hypothesis ($H_{a}$) asserts that at least one model significantly differs from the others. The significance level was set at $α = 0.05$. A summary of the ANOVA test results is reported in Table 15. All datasets produced p-values well below the threshold, indicating that the observed differences in model performance are statistically significant.
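For reference, the one-way ANOVA F-statistic is the ratio of between-group to within-group mean squares. The sketch below computes it from scratch; the three groups of F1-scores are made-up placeholder values, not results from this study, and the p-value computation (which requires the F distribution) is intentionally left out.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder F1-scores for three hypothetical models (three runs each).
groups = [
    [0.91, 0.92, 0.93],
    [0.92, 0.93, 0.94],
    [0.95, 0.96, 0.97],
]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

With four models and 10 runs each, as in our setting, the degrees of freedom are (3, 36). In practice, a library routine such as `scipy.stats.f_oneway` returns both the F-statistic and the p-value.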

Across all datasets, ANOVA tests confirm that the performance differences among the OCC models are statistically significant. In particular:

The UNSW-NB15 dataset yielded the strongest statistical separation (F = 21.58), with VAE and AE showing consistently higher F1-scores than OCSVM and iForest.
NSL-KDD and CICIDS2017 showed similar trends, although the gap between the deep and classical models was narrower.
TON_IoT, being highly imbalanced, resulted in slightly lower F-statistics but still confirmed significance.

Pairwise comparisons using Tukey’s HSD test (not shown) revealed that:

AE and VAE significantly outperform iForest and OCSVM across all datasets;
The difference between AE and VAE was not statistically significant, supporting their comparable effectiveness;
OCSVM and iForest yielded overlapping performance in TON_IoT.

These results statistically support the empirical ranking observed in Table 13, confirming that deep-learning-based models offer superior robustness and generalization under real-time runtime conditions.

7.4. Benchmark Comparison with State-of-the-Art IDS

To contextualize the performance of the proposed OCC-based IDS, we compared it with several representative real-time IDS frameworks reported in the recent literature. Table 16 summarizes the key features and performance metrics (where available) for each system.

Compared to other recent IDS implementations, our OCC-based IDS demonstrates several strengths:

Real-time execution: All models are executed in streaming mode using PyShark capture, with average latency below 5.5 ms per packet.
Label-free training: Unlike supervised models such as DeepIDS, our framework does not require labeled attack data for training.
Modular GUI and configurability: The implemented interface allows model switching, real-time monitoring, and auto-logging—rarely found in academic prototypes.
Competitive accuracy: The best model (VAE with UNSW-NB15) achieves 96% accuracy and 93% F1-score, outperforming unsupervised baselines like LOF.

While supervised models may achieve higher accuracy in some cases, they require extensive labeled attack samples and retraining on domain-specific data. In contrast, our IDS remains effective and deployable in restricted environments such as military or industrial networks, where access to annotated threats is often limited or prohibited.

7.5. Limitations and Critical Analysis

Although the experimental results confirm the effectiveness of OCC models for runtime intrusion detection, several limitations must be acknowledged to ensure a realistic and complete interpretation of the findings.

Sensitivity to Training Data Quality: OCC models are trained exclusively on nominal data. Any biases, artifacts, or inconsistencies in this training set may lead to overfitting or misclassification of benign deviations as anomalies. Datasets such as NSL-KDD, which contain redundant or outdated traffic patterns, tend to reduce model generalization.
False Positive Rate (FPR): Due to the absence of labeled attack samples, OCC models construct tight boundaries around the nominal class. As shown in Table 17, this often results in non-negligible false positive rates—particularly when tested on complex or imbalanced datasets such as CICIDS2017 or TON_IoT. For instance, the AE model trained on CICIDS2017 yields an FPR above 11%.
Threshold Calibration: For models based on reconstruction error (e.g., AE and VAE), detection depends on a critical threshold $θ$ . If the reconstruction error $ϵ (x)$ exceeds $θ$ , the sample is flagged as anomalous:

$ϵ (x) = ∥ x - \hat{x} ∥_{2} > θ$

(28)

Improper selection of $θ$ can significantly impact the trade-off between precision and recall. Static thresholds may also degrade performance in the presence of runtime variability.
Concept Drift: Network traffic patterns and system behaviors evolve over time. Without periodic retraining or adaptation, OCC models may fail to detect new types of anomalies or may increasingly classify benign changes as malicious, thereby degrading long-term performance.
Lack of Interpretability: Deep-learning-based OCC models (e.g., VAE) tend to operate as black boxes. This impedes root cause analysis and limits the explainability of alarms—a critical requirement in industrial or military environments.

To mitigate these limitations, we recommend the following enhancements in future work:

1.: Dynamic or percentile-based thresholding to reduce sensitivity to global scale variations;
2.: Online learning or periodic retraining using updated nominal traffic samples;
3.: Hybrid semi-supervised models capable of incorporating occasional labeled anomalies;
4.: Explainable AI (XAI) layers or attribution techniques to improve transparency and operational trust.
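As an illustration of the first recommendation, a percentile-based threshold $θ$ for Equation (28) can be derived from the reconstruction errors observed on nominal validation traffic. The sketch below uses a nearest-rank percentile; the 95th percentile, the synthetic error values, and the function names are assumptions for illustration, not a calibrated configuration.

```python
import math


def reconstruction_error(x, x_hat):
    """Euclidean norm of x - x_hat, as in Eq. (28)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))


def percentile_threshold(errors, q=95.0):
    """Nearest-rank q-th percentile of nominal reconstruction errors."""
    ordered = sorted(errors)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


# Synthetic reconstruction errors measured on nominal traffic.
nominal_errors = [0.1 * i for i in range(1, 21)]
theta = percentile_threshold(nominal_errors, q=95.0)

# A new sample and its (hypothetical) reconstruction.
x, x_hat = [1.0, 2.0, 2.0], [1.0, 0.0, 0.0]
error = reconstruction_error(x, x_hat)
print(f"theta = {theta:.2f}, error = {error:.3f}, anomalous = {error > theta}")
```

Recomputing the threshold on a sliding window of recent nominal errors would turn this static rule into a dynamic one, at the cost of having to guard against attack traffic contaminating the window.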

8. Discussion and Future Work

The results of this study confirm the suitability of OCC approaches for real-time anomaly detection in networked environments, particularly where the structure and nature of potential threats are unknown. Traditional rule-based IDSs rely on predefined attack patterns and show limited adaptability against novel or evolving threats [185,186]. In contrast, OCC models are trained exclusively on nominal traffic: they learn a representation of legitimate behavior and flag deviations from it as anomalies [187,188]. A key advantage of OCC techniques is their simplified training process, as they require neither labeled attack samples nor prior knowledge of threat types. This makes them well suited for operational contexts such as industrial or defense networks, where acquiring representative malicious data is often infeasible or constrained by data sensitivity [189,190]. It also reduces the time and effort associated with dataset preparation and training [191,192,193].

The runtime evaluation further validated the practical applicability of the proposed system. During the test phase, the IDS processed over 250,000 packets, comprising both nominal and anomalous traffic, using pre-trained models integrated via modular pipelines. While the current setup targets a selected range of anomalies, future iterations will incorporate additional threat classes (e.g., reconnaissance, malware, data exfiltration) to evaluate model generalization under more diverse conditions. Initial experiments relied on widely used public datasets for training. However, given that several of these benchmarks are now outdated, future work will explore the integration of more recent and domain-specific datasets, particularly those relevant to industrial applications [194,195].

This work presents a dual contribution: a comparative survey of OCC techniques and a real-time evaluation conducted on an industrial-like testbed. The survey reviewed key public IDS datasets and organized OCC models within a unified taxonomy. Four representative algorithms (Autoencoder, Variational Autoencoder, Isolation Forest, and One-Class SVM) were selected based on their relevance in the literature and runtime feasibility. These models were trained on four benchmark datasets and deployed within a Python-based IDS framework with GUI support, running over an NFSv4 client–server architecture using Raspberry Pi devices to simulate realistic network behavior. The main findings obtained from this study are summarized as follows:

Models trained on UNSW-NB15 outperformed those trained on the other datasets, consistently yielding the highest precision and F1-scores across all OCC models.
AE and VAE showed superior recall and F1-score, confirming their effectiveness in capturing complex traffic patterns, albeit with increased inference latency.
Isolation Forest demonstrated the best latency/performance trade-off, making it suitable for resource-constrained, real-time deployments.
The runtime testbed revealed the practical impact of dataset bias, dimensionality, and preprocessing strategies on detection accuracy and system robustness.
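The one-class workflow discussed above (fit on nominal samples only, flag deviations, then report precision, recall, F1-score, and per-sample latency) can be illustrated with a minimal sketch. It is not the IDS developed in this study and does not implement any of the four evaluated models: the `ZScoreOneClassDetector` class, the 0.99 quantile, and the synthetic three-feature Gaussian data are all hypothetical choices made only to keep the example self-contained.

```python
import random
import statistics
import time


class ZScoreOneClassDetector:
    """Toy one-class detector (illustrative only, not a model from the study).

    It learns per-feature mean and standard deviation from nominal data and
    flags a sample when its largest absolute z-score exceeds a threshold
    taken from a quantile of the training scores.
    """

    def __init__(self, quantile=0.99):
        self.quantile = quantile  # assumed value, not taken from the paper

    def fit(self, nominal_rows):
        columns = list(zip(*nominal_rows))
        self.means = [statistics.fmean(c) for c in columns]
        # Fall back to 1.0 for constant features to avoid division by zero.
        self.stds = [statistics.pstdev(c) or 1.0 for c in columns]
        scores = sorted(self.score(r) for r in nominal_rows)
        idx = min(len(scores) - 1, int(self.quantile * len(scores)))
        self.threshold = scores[idx]
        return self

    def score(self, row):
        # Anomaly score: largest absolute z-score across features.
        return max(abs(x - m) / s for x, m, s in zip(row, self.means, self.stds))

    def predict(self, row):
        return self.score(row) > self.threshold  # True means "anomaly"


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 with 'anomaly' as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Synthetic stand-in for traffic features (assumption: 3 Gaussian features).
rng = random.Random(0)


def sample(mean, n):
    return [[rng.gauss(mean, 1.0) for _ in range(3)] for _ in range(n)]


train_nominal = sample(0.0, 2000)               # training uses nominal data only
test_rows = sample(0.0, 500) + sample(8.0, 100)
y_true = [False] * 500 + [True] * 100           # labels used for evaluation only

detector = ZScoreOneClassDetector().fit(train_nominal)

start = time.perf_counter()
y_pred = [detector.predict(row) for row in test_rows]
latency_us = (time.perf_counter() - start) / len(test_rows) * 1e6

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"mean inference latency: {latency_us:.1f} us/sample")
```

The same evaluation loop could wrap any detector that is fitted on nominal samples and returns a per-sample decision, which is how a latency/performance comparison between models of different complexity might be set up. The absolute numbers printed here depend on the synthetic data and the host machine and say nothing about the results reported in this study.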

By bridging the gap between theoretical analysis and practical deployment, this study provides both a structured taxonomy of OCC methods and a validated framework for their real-time evaluation. Future developments will focus on hybrid learning approaches, integration of real industrial datasets, and adaptive IDS architectures capable of responding to evolving threat landscapes.

Author Contributions

Conceptualization, P.D., E.S., and D.P.; methodology, P.D., E.S., and D.P.; software, P.D., E.S., and D.P.; validation, P.D., E.S., and D.P.; formal analysis, P.D.; investigation, P.D., E.S., and D.P.; resources, P.D., E.S., D.P., and S.S.; data curation, P.D., E.S., and D.P.; writing—original draft preparation, P.D., E.S., D.P., and S.S.; writing—review and editing, P.D., E.S., D.P., and S.S.; visualization, P.D., E.S., D.P., and S.S.; supervision, S.S. and P.D.; project administration, S.S. and P.D.; funding acquisition, S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Project PNRR CN 1 “Piano Nazionale di Ripresa e Resilienza del Centro Nazionale 1: Simulation, Calculation and Analysis of High-Performance Data”, CUP I53C22000690001 Spoke 6 “Multiscale Modeling and Engineering Applications”; in part by the European High-Performance Computing Joint Undertaking (JU) under Framework Partnership Agreement No 800928 and Specific Grant Agreement No 101036168 European Processor Initiative (EPI SGA2); and in part by the Italian Ministry of Education and Research (MIUR) in the framework of the FoReLab (Future-Oriented Research Laboratory) “Departments of Excellence”.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, P.D., upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

Acronym	Meaning	Description
OCC	One-Class Classification	ML technique trained only on normal data to detect anomalies as deviations.
IDS	Intrusion Detection System	System for identifying unauthorized or anomalous activity in a network.
ML	Machine Learning	Field of study focused on algorithms that improve from data.
AE	Autoencoder	Neural network that learns to reconstruct its input, used for anomaly detection.
VAE	Variational Autoencoder	Probabilistic autoencoder that models data distribution with latent variables.
OCSVM	One-Class Support Vector Machine	OCC algorithm that finds a boundary around normal data in feature space.
iForest	Isolation Forest	Tree-based anomaly detection method that isolates anomalies via random partitioning.
NFSv4	Network File System version 4	Protocol for remote file sharing over a network.
RPC	Remote Procedure Call	Protocol for executing a procedure on a remote host.
GUI	Graphical User Interface	Visual interface allowing users to interact with a system or application; used here to control runtime detection and training.
TCP	Transmission Control Protocol	Reliable transport-layer protocol for ordered data delivery.
UDP	User Datagram Protocol	Lightweight transport-layer protocol for low-latency communication.
PCA	Principal Component Analysis	Linear dimensionality reduction technique based on variance preservation.
PPV	Positive Predictive Value	Also called precision; proportion of true positives among predicted positives.
BCE	Binary Cross-Entropy	Loss function used in binary classification and reconstruction tasks.
DoS	Denial of Service	Attack that aims to disrupt service availability by flooding with traffic.
XSS	Cross-Site Scripting	Attack that injects malicious scripts into otherwise benign websites.
MITM	Man-In-The-Middle	Attack where an adversary intercepts and possibly alters communication.

References

Annapareddy, V.N.; Preethish Nanan, B.; Kommaragiri, V.B.; Gadi, A.L.; Kalisetty, S. Emerging Technologies in Smart Computing, Sustainable Energy, and Next-Generation Mobility: Enhancing Digital Infrastructure, Secure Networks, and Intelligent Manufacturing. SSRN Electron. J. 2022.
Nair, M.M.; Tyagi, A.K. AI, IoT, blockchain, and cloud computing: The necessity of the future. In Distributed Computing to Blockchain; Elsevier: Amsterdam, The Netherlands, 2023; pp. 189–206.
Lampropoulos, G.; Siakas, K.; Anastasiadis, T. Internet of things in the context of industry 4.0: An overview. Int. J. Entrep. Knowl. 2019, 7, 4–19.
Malik, A.; Om, H. Cloud computing and internet of things integration: Architecture, applications, issues, and challenges. In Sustainable Cloud and Energy Services: Principles and Practice; Springer: Berlin/Heidelberg, Germany, 2018; pp. 1–24.
Sharma, S.; Chang, V.; Tim, U.S.; Wong, J.; Gadia, S. Cloud and IoT-based emerging services systems. Clust. Comput. 2019, 22, 71–91.
Elhanashi, A.; Dini, P.; Saponara, S.; Zheng, Q. Integration of deep learning into the iot: A survey of techniques and challenges for real-world applications. Electronics 2023, 12, 4925.
Li, Y.; Liu, Q. A comprehensive review study of cyber-attacks and cyber security; Emerging trends and recent developments. Energy Rep. 2021, 7, 8176–8186.
Afaq, S.A.; Husain, M.S.; Bello, A.; Sadia, H. A critical analysis of cyber threats and their global impact. In Computational Intelligent Security in Wireless Communications; CRC Press: Boca Raton, FL, USA, 2023; pp. 201–220.
Rajasekharaiah, K.; Dule, C.S.; Sudarshan, E. Cyber security challenges and its emerging trends on latest technologies. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Warangal, India, 9–10 October 2020; Volume 981, p. 022062.
Galinec, D.; Možnik, D.; Guberina, B. Cybersecurity and cyber defence: National level strategic approach. J. Control Meas. Electron. Comput. Commun. 2017, 58, 273–286.
Jimmy, F. Emerging threats: The latest cybersecurity risks and the role of artificial intelligence in enhancing cybersecurity defenses. Val. Int. J. Digit. Libr. 2021, 1, 564–574.
Samaras, C.; Nuttall, W.J.; Bazilian, M. Energy and the military: Convergence of security, economic, and environmental decision-making. Energy Strategy Rev. 2019, 26, 100409.
Te Kulve, H.; Smit, W.A. Civilian–military co-operation strategies in developing new technologies. Res. Policy 2003, 32, 955–970.
Asharf, J.; Moustafa, N.; Khurshid, H.; Debie, E.; Haider, W.; Wahab, A. A review of intrusion detection systems using machine and deep learning in internet of things: Challenges, solutions and future directions. Electronics 2020, 9, 1177.
Kocher, G.; Kumar, G. Machine learning and deep learning methods for intrusion detection systems: Recent developments and challenges. Soft Comput. 2021, 25, 9731–9763.
Dini, P.; Elhanashi, A.; Begni, A.; Saponara, S.; Zheng, Q.; Gasmi, K. Overview on intrusion detection systems design exploiting machine learning for networking cybersecurity. Appl. Sci. 2023, 13, 7507.
Khan, S.S.; Madden, M.G. One-class classification: Taxonomy of study and review of techniques. Knowl. Eng. Rev. 2014, 29, 345–374.
Mahmud, J.S.; Lendak, I. Enhancing One-Class Anomaly Detection in Unlabeled Datasets Through Unsupervised Data Refinement. In Proceedings of the 2024 IEEE 22nd Jubilee International Symposium on Intelligent Systems and Informatics (SISY), Pula, Croatia, 19–21 September 2024; pp. 000497–000502.
Al-Haija, Q.A.; Altamimi, S.; AlWadi, M. Analysis of extreme learning machines (ELMs) for intelligent intrusion detection systems: A survey. Expert Syst. Appl. 2024, 253, 124317.
Camacho, N.G. The role of AI in cybersecurity: Addressing threats in the digital age. J. Artif. Intell. Gen. Sci. (JAIGS) 2024, 3, 143–154.
Khan, O.U.; Abdullah, S.M.; Olajide, A.O.; Sani, A.I.; Faisal, S.M.W.; Ogunola, A.A.; Lee, M.D. The Future of Cybersecurity: Leveraging Artificial Intelligence to Combat Evolving Threats and Enhance Digital Defense Strategies. J. Comput. Anal. Appl. 2024, 33, 776–787.
Rane, N.; Choudhary, S.; Rane, J. Machine learning and deep learning: A comprehensive review on methods, techniques, applications, challenges, and future directions. In Techniques, Applications, Challenges, and Future Directions; SSRN (Elsevier): Amsterdam, The Netherlands, 2024.
Diana, L.; Dini, P.; Paolini, D. Overview on Intrusion Detection Systems for Computers Networking Security. Computers 2025, 14, 87.
Mathews, S.M. Explainable artificial intelligence applications in NLP, biomedical, and malware classification: A literature review. In Intelligent Computing, Proceedings of the 2019 Computing Conference, Volume 2, London, UK, 16–17 July 2019; Springer: Cham, Switzerland, 2019; pp. 1269–1292.
Ozkan-Okay, M.; Akin, E.; Aslan, Ö.; Kosunalp, S.; Iliev, T.; Stoyanov, I.; Beloev, I. A comprehensive survey: Evaluating the efficiency of artificial intelligence and machine learning techniques on cyber security solutions. IEEE Access 2024, 12, 12229–12256.
Martins, I.; Resende, J.S.; Sousa, P.R.; Silva, S.; Antunes, L.; Gama, J. Host-based IDS: A review and open issues of an anomaly detection system in IoT. Future Gener. Comput. Syst. 2022, 133, 95–113.
Chukwunweike, J.N.; Yussuf, M.; Okusi, O.; Oluwatobi, T. The role of deep learning in ensuring privacy integrity and security: Applications in AI-driven cybersecurity solutions. World J. Adv. Res. Rev. 2024, 23, 1778–1790.
Osazuwa, O.M.C.; Mitchell, O.; Osazuwa, C. Confidentiality; Integrity, and Availability in Network Systems: A Review of Related Literature. Int. J. Innov. Sci. Res. Technol. 2023, 8, 1946–1953.
Mishra, N.; Pandya, S. Internet of things applications, security challenges, attacks, intrusion detection, and future visions: A systematic review. IEEE Access 2021, 9, 59353–59377.
Khraisat, A.; Alazab, A. A critical review of intrusion detection systems in the internet of things: Techniques, deployment strategy, validation strategy, attacks, public datasets and challenges. Cybersecurity 2021, 4, 18.
Dong, H.; Kotenko, I. Cybersecurity in the AI era: Analyzing the impact of machine learning on intrusion detection. In Knowledge and Information Systems; Springer: Berlin/Heidelberg, Germany, 2025; pp. 1–52.
Alkadi, S.; Al-Ahmadi, S.; Ben Ismail, M.M. Toward improved machine learning-based intrusion detection for internet of things traffic. Computers 2023, 12, 148.
Sun, N.; Ding, M.; Jiang, J.; Xu, W.; Mo, X.; Tai, Y.; Zhang, J. Cyber threat intelligence mining for proactive cybersecurity defense: A survey and new perspectives. IEEE Commun. Surv. Tutor. 2023, 25, 1748–1774.
Alsulami, B.; Almalawi, A.; Fahad, A. A Review on Machine Learning Based Approaches of Network Intrusion Detection Systems. Int. J. Curr. Sci. Res. Rev. 2022, 5, 2159–2177.
Priyanka, C.; Vivek, Y.; Ravi, V. Benchmarking One Class Classification in Banking, Insurance, and Cyber Security. In Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications, London, UK, 6–7 June 2024; pp. 1–14.
Lwakatare, L.E.; Raj, A.; Crnkovic, I.; Bosch, J.; Olsson, H.H. Large-scale machine learning systems in real-world industrial settings: A review of challenges and solutions. Inf. Softw. Technol. 2020, 127, 106368.
Aouedi, O.; Sacco, A.; Piamrat, K.; Marchetto, G. Handling privacy-sensitive medical data with federated learning: Challenges and future directions. IEEE J. Biomed. Health Inform. 2022, 27, 790–803.
Shen, M.; Ye, K.; Liu, X.; Zhu, L.; Kang, J.; Yu, S.; Li, Q.; Xu, K. Machine learning-powered encrypted network traffic analysis: A comprehensive survey. IEEE Commun. Surv. Tutor. 2022, 25, 791–824.
Khan, N.; Ahmad, K.; Tamimi, A.A.; Alani, M.M.; Bermak, A.; Khalil, I. Explainable AI-based Intrusion Detection System for Industry 5.0: An Overview of the Literature, associated Challenges, the existing Solutions, and Potential Research Directions. arXiv 2024, arXiv:2408.03335.
He, K.; Kim, D.D.; Asghar, M.R. Adversarial machine learning for network intrusion detection systems: A comprehensive survey. IEEE Commun. Surv. Tutor. 2023, 25, 538–566.
Seliya, N.; Abdollah Zadeh, A.; Khoshgoftaar, T.M. A literature review on one-class classification and its potential applications in big data. J. Big Data 2021, 8, 122.
Perera, P.; Oza, P.; Patel, V.M. One-class classification: A survey. arXiv 2021, arXiv:2101.03064.
Lee, W.; Stolfo, S.J. A framework for constructing features and models for intrusion detection systems. ACM Trans. Inf. Syst. Secur. (TiSSEC) 2000, 3, 227–261.
Afifi, H.; Pochaba, S.; Boltres, A.; Laniewski, D.; Haberer, J.; Paeleke, L.; Poorzare, R.; Stolpmann, D.; Wehner, N.; Redder, A.; et al. Machine learning with computer networks: Techniques, datasets, and models. IEEE Access 2024, 12, 54673–54720.
Wang, Y.; Zhang, B.; Ma, J.; Jin, Q. Artificial intelligence of things (AIoT) data acquisition based on graph neural networks: A systematical review. Concurr. Comput. Pract. Exp. 2023, 35, e7827.
Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Chen, S.; Liu, D.; Li, J. Performance comparison and current challenges of using machine learning techniques in cybersecurity. Energies 2020, 13, 2509.
Hettich, S.; Bay, S. The UCI KDD Archive; University of California, Department of Information and Computer Science: Irvine, CA, USA, 1999; Volume 152, Available online: http://kdd.ics.uci.edu/ (accessed on 10 July 2025).
Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6.
Siddique, K.; Akhtar, Z.; Khan, F.A.; Kim, Y. KDD cup 99 data sets: A perspective on the role of data sets in network intrusion detection research. Computer 2019, 52, 41–51.
Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20.
Cholakoska, A.; Shushlevska, M.; Todorov, Z.; Efnusheva, D. Analysis of machine learning classification techniques for anomaly detection with nsl-kdd data set. In Data Science and Intelligent Systems, Proceedings of 5th Computational Methods in Systems and Software 2021; Springer: Cham, Switzerland, 2021; Volume 2, pp. 258–267.
Protić, D.D. Review of KDD Cup ‘99, NSL-KDD and Kyoto 2006+ datasets. Vojnoteh. Glas. Tech. Cour. 2018, 66, 580–596.
Albayati, M.; Issac, B. Analysis of intelligent classifiers and enhancing the detection accuracy for intrusion detection system. Int. J. Comput. Intell. Syst. 2015, 8, 841–853.
Jaw, E.; Wang, X. Feature selection and ensemble-based intrusion detection system: An efficient and comprehensive approach. Symmetry 2021, 13, 1764.
Rani, M.; Gagandeep. Effective network intrusion detection by addressing class imbalance with deep neural networks multimedia tools and applications. Multimed. Tools Appl. 2022, 81, 8499–8518.
Yang, C. Anomaly network traffic detection algorithm based on information entropy measurement under the cloud computing environment. Clust. Comput. 2019, 22, 8309–8317.
Zhang, C.; Jia, D.; Wang, L.; Wang, W.; Liu, F.; Yang, A. Comparative research on network intrusion detection methods based on machine learning. Comput. Secur. 2022, 121, 102861.
Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSp 2018, 1, 108–116.
Thakkar, A.; Lohiya, R. A review of the advancement in intrusion detection datasets. Procedia Comput. Sci. 2020, 167, 636–645.
Kilincer, I.F.; Ertam, F.; Sengur, A. Machine learning methods for cyber security intrusion detection: Datasets and comparative study. Comput. Netw. 2021, 188, 107840.
Oyelakin, A.; Ameen, A.; Ogundele, T.; Salau-Ibrahim, T.; Abdulrauf, U.; Olufadi, H.; Ajiboye, I.; Muhammad-Thani, S.; Adeniji, I.A. Overview and exploratory analyses of CICIDS 2017 intrusion detection dataset. J. Syst. Eng. Inf. Technol. (JOSEIT) 2023, 2, 45–52.
Talukder, M.A.; Islam, M.M.; Uddin, M.A.; Hasan, K.F.; Sharmin, S.; Alyami, S.A.; Moni, M.A. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction. J. Big Data 2024, 11, 33.
Al Farsi, A.; Khan, A.; Bait-Suwailam, M.M.; Mughal, M.R. Comparative Performance Evaluation of Machine Learning Algorithms for Cyber Intrusion Detection. J. Cybersecur. Priv. 2024.
Mallidi, S.K.R.; Ramisetty, R.R. Advancements in training and deployment strategies for AI-based intrusion detection systems in IoT: A systematic literature review. Discov. Internet Things 2025, 5, 8.
Moustafa, N.; Slay, J. The evaluation of Network Anomaly Detection Systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf. Secur. J. Glob. Perspect. 2016, 25, 18–31.
Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 10–12 November 2015; pp. 1–6.
Singh, G.; Khare, N. A survey of intrusion detection from the perspective of intrusion datasets and machine learning techniques. Int. J. Comput. Appl. 2022, 44, 659–669.
Moustafa, N.; Slay, J.; Creech, G. Novel geometric area analysis technique for anomaly detection using trapezoidal area estimation on large-scale networks. IEEE Trans. Big Data 2017, 5, 481–494.
Moustafa, N.; Creech, G.; Slay, J. Big data analytics for intrusion detection system: Statistical decision-making using finite dirichlet mixture models. In Data Analytics and Decision Support for Cybersecurity: Trends, Methodologies and Applications; Springer: Berlin/Heidelberg, Germany, 2017; pp. 127–156.
Meftah, S.; Rachidi, T.; Assem, N. Network based intrusion detection using the UNSW-NB15 dataset. Int. J. Comput. Digit. Syst. 2019, 8, 478–487.
Oroian, D.; Bolboaca, R.; Roman, A.S.; Dobrota, V. Network Intrusion Detection System Using Anomaly Detection Techniques. In Proceedings of the 2024 IEEE 20th International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 17–19 October 2024; pp. 1–8.
Kamal, H.; Mashaly, M. Advanced Hybrid Transformer-CNN Deep Learning Model for Effective Intrusion Detection Systems with Class Imbalance Mitigation Using Resampling Techniques. Future Internet 2024, 16, 481.
Chou, D.; Jiang, M. Data-driven network intrusion detection: A taxonomy of challenges and methods. arXiv 2020, arXiv:2009.07352.
Moualla, S.; Khorzom, K.; Jafar, A. Improving the Performance of Machine Learning-Based Network Intrusion Detection Systems on the UNSW-NB15 Dataset. Comput. Intell. Neurosci. 2021, 2021, 5557577.
Aleesa, A.; Younis, M.; Mohammed, A.A.; Sahar, N. Deep-intrusion detection system with enhanced UNSW-NB15 dataset based on deep learning techniques. J. Eng. Sci. Technol. 2021, 16, 711–727.
Janarthanan, T.; Zargari, S. Feature selection in UNSW-NB15 and KDDCUP’99 datasets. In Proceedings of the 2017 IEEE 26th International Symposium on Industrial Electronics (ISIE), Edinburgh, UK, 19–21 June 2017; pp. 1881–1886.
Sarhan, M.; Layeghy, S.; Moustafa, N.; Portmann, M. Netflow datasets for machine learning-based network intrusion detection systems. In Proceedings of the Big Data Technologies and Applications: 10th EAI International Conference, BDTA 2020, and 13th EAI International Conference on Wireless Internet, WiCON 2020, Virtual Event, 11 December 2020; pp. 117–135.
Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Turnbull, B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset. Future Gener. Comput. Syst. 2019, 100, 779–796.
Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Slay, J. Towards developing network forensic mechanism for botnet activities in the IoT based on machine learning techniques. In Proceedings of the Mobile Networks and Management: 9th International Conference, MONAMI 2017, Melbourne, Australia, 13–15 December 2017; pp. 30–44.
Peterson, J.M.; Leevy, J.L.; Khoshgoftaar, T.M. A review and analysis of the bot-iot dataset. In Proceedings of the 2021 IEEE International Conference on Service-Oriented System Engineering (SOSE), Oxford, UK, 23–26 August 2021; pp. 20–27.
Mashaleh, A.S.; Ibrahim, N.F.; Alauthman, M.; AlKaraki, J.; Almomani, A.; Atalla, S.; Gawanmeh, A. Evaluation of machine learning and deep learning methods for early detection of internet of things botnets. Int. J. Electr. Comput. Eng. (IJECE) 2024, 14, 4732–4744.
Koroniotis, N.; Moustafa, N.; Sitnikova, E. A new network forensic framework based on deep learning for Internet of Things networks: A particle deep framework. Future Gener. Comput. Syst. 2020, 110, 91–106.
Leevy, J.L.; Hancock, J.; Khoshgoftaar, T.M.; Peterson, J. Detecting information theft attacks in the bot-iot dataset. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Pasadena, CA, USA, 13–16 December 2021; pp. 807–812.
Al-Haija, Q.A.; Droos, A. A comprehensive survey on deep learning-based intrusion detection systems in Internet of Things (IoT). Expert Syst. 2025, 42, e13726.
Kerrakchou, I.; Abou El Hassan, A.; Chadli, S.; Emharraf, M.; Saber, M. Selection of efficient machine learning algorithm on Bot-IoT dataset for intrusion detection in internet of things networks. Indones. J. Electr. Eng. Comput. Sci. 2023, 31, 1784–1793.
Pinto, D.A.P. HERA: Enhancing Network Security with a New Dataset Creation Tool. Master’s Thesis, University of Porto, Porto, Portugal, 2024.
Koroniotis, N.; Moustafa, N. Enhancing network forensics with particle swarm and deep learning: The particle deep framework. arXiv 2020, arXiv:2005.00722.
Peterson, J.M.; Khoshgoftaar, T.M.; Leevy, J.L. Composition analysis of the Bot-IoT dataset. Int. J. Internet Things Cyber-Assur. 2022, 2, 31–44.
Koroniotis, N.; Moustafa, N.; Schiliro, F.; Gauravaram, P.; Janicke, H. A holistic review of cybersecurity and reliability perspectives in smart airports. IEEE Access 2020, 8, 209802–209834.
Leevy, J.L.; Khoshgoftaar, T.M.; Hancock, J. Iot attack prediction using big bot-iot data. Int. J. Internet Things Cyber-Assur. 2022, 2, 45–61.
Koroniotis, N. Designing an Effective Network Forensic Framework for the Investigation of Botnets in the Internet of Things. Ph.D. Thesis, UNSW Sydney, Sydney, Australia, 2020.
Moustafa, N. A new distributed architecture for evaluating AI-based security systems at the edge: Network TON_IoT datasets. Sustain. Cities Soc. 2021, 72, 102994.
Booij, T.M.; Chiscop, I.; Meeuwissen, E.; Moustafa, N.; Den Hartog, F.T. ToN_IoT: The role of heterogeneity and the need for standardization of features and attack types in IoT network intrusion data sets. IEEE Internet Things J. 2021, 9, 485–496.
Moustafa, N.; Keshky, M.; Debiez, E.; Janicke, H. Federated TON_IoT Windows datasets for evaluating AI-based security applications. In Proceedings of the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 31 December 2020–1 January 2021; pp. 848–855.
Alsaedi, A.; Moustafa, N.; Tari, Z.; Mahmood, A.; Anwar, A. TON_IoT telemetry dataset: A new generation dataset of IoT and IIoT for data-driven intrusion detection systems. IEEE Access 2020, 8, 165130–165150.
Ashraf, J.; Keshk, M.; Moustafa, N.; Abdel-Basset, M.; Khurshid, H.; Bakhshi, A.D.; Mostafa, R.R. IoTBoT-IDS: A novel statistical learning-enabled botnet detection framework for protecting networks of smart cities. Sustain. Cities Soc. 2021, 72, 103041.
Moustafa, N. A systemic IoT–fog–cloud architecture for big-data analytics and cyber security systems: A review of fog computing. Secure Edge Comput. 2021, 41–50.
Zachos, G.; Essop, I.; Mantas, G.; Porfyrakis, K.; Ribeiro, J.C.; Rodriguez, J. Generating IoT edge network datasets based on the TON_IoT telemetry dataset. In Proceedings of the 2021 IEEE 26th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), Porto, Portugal, 25–27 October 2021; pp. 1–6.
Moustafa, N.; Ahmed, M.; Ahmed, S. Data analytics-enabled intrusion detection: Evaluations of ToN_IoT linux datasets. In Proceedings of the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 29 December 2020–1 January 2021; pp. 727–735.
Maasaoui, Z.; Merzouki, M.; Battou, A.; Lbath, A. A Scalable Framework for Real-Time Network Security Traffic Analysis and Attack Detection Using Machine and Deep Learning. Platforms 2025, 3, 7.
Moustafa, N. New generations of internet of things datasets for cybersecurity applications based machine learning: TON_IoT datasets. In Proceedings of the Research Australasia Conference, Brisbane, Australia, 21–25 October 2019; pp. 21–25.
Sarhan, M.; Layeghy, S.; Portmann, M. Towards a standard feature set for network intrusion detection system datasets. Mob. Netw. Appl. 2022, 27, 357–370.
Mutleg, M.L.; Mahmood, A.M.; Al-Nayar, M.M.J. A Comprehensive Review of Cyber-Attacks Targeting IoT Systems and Their Security Measures. Int. J. Saf. Secur. Eng. 2024, 14, 1073–1086.
Guo, G.; Pan, X.; Liu, H.; Li, F.; Pei, L.; Hu, K. An IoT intrusion detection system based on TON IoT network dataset. In Proceedings of the 2023 IEEE 13th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–11 March 2023; pp. 0333–0338.
Gad, A.R.; Nashat, A.A.; Barkat, T.M. Intrusion detection system using machine learning for vehicular ad hoc networks based on ToN-IoT dataset. IEEE Access 2021, 9, 142206–142217.
Inuwa, M.M.; Das, R. A comparative analysis of various machine learning methods for anomaly detection in cyber attacks on IoT networks. Internet Things 2024, 26, 101162.
Deng, H.; Li, X. Anomaly detection via reverse distillation from one-class embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9737–9746.
Pollastro, A.; Testa, G.; Bilotta, A.; Prevete, R. Semi-supervised detection of structural damage using variational autoencoder and a one-class support vector machine. IEEE Access 2023, 11, 67098–67112.
An, J.; Cho, S. Variational autoencoder based anomaly detection using reconstruction probability. Spec. Lect. IE 2015, 2, 1–18.
Li, K.L.; Huang, H.K.; Tian, S.F.; Xu, W. Improving one-class SVM for anomaly detection. In Proceedings of the 2003 International Conference on Machine Learning and Cybernetics (IEEE Cat. No. 03EX693), Xi’an, China, 5 November 2003; Volume 5, pp. 3077–3081.
Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation-based anomaly detection. ACM Trans. Knowl. Discov. Data (TKDD) 2012, 6, 1–39.
Li, P.; Pei, Y.; Li, J. A comprehensive survey on design and application of autoencoder in deep learning. Appl. Soft Comput. 2023, 138, 110176.
Song, Y.; Hyun, S.; Cheong, Y.G. Analysis of autoencoders for network intrusion detection. Sensors 2021, 21, 4294.
Wang, W.; Huang, Y.; Wang, Y.; Wang, L. Generalized autoencoder: A neural network framework for dimensionality reduction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA, 23–28 June 2014; pp. 490–497.
Berahmand, K.; Daneshfar, F.; Salehi, E.S.; Li, Y.; Xu, Y. Autoencoders and their applications in machine learning: A survey. Artif. Intell. Rev. 2024, 57, 28.
Alsaade, F.W.; Al-Adhaileh, M.H. Cyber attack detection for self-driving vehicle networks using deep autoencoder algorithms. Sensors 2023, 23, 4086.
Torabi, H.; Mirtaheri, S.L.; Greco, S. Practical autoencoder based anomaly detection by using vector reconstruction error. Cybersecurity 2023, 6, 1.
Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Network anomaly detection: Methods, systems and tools. IEEE Commun. Surv. Tutor. 2013, 16, 303–336.
Zavrak, S.; Iskefiyeli, M. Anomaly-based intrusion detection from network flow features using variational autoencoder. IEEE Access 2020, 8, 108346–108358.
Neloy, A.A.; Turgeon, M. A comprehensive study of auto-encoders for anomaly detection: Efficiency and trade-offs. Mach. Learn. Appl. 2024, 17, 100572.
Asperti, A.; Trentin, M. Balancing reconstruction error and kullback-leibler divergence in variational autoencoders. IEEE Access 2020, 8, 199440–199448. [Google Scholar] [CrossRef]
Mendonça, F.; Mostafa, S.S.; Morgado-Dias, F.; Ravelo-García, A.G. On the Use of Kullback–Leibler Divergence for Kernel Selection and Interpretation in Variational Autoencoders for Feature Creation. Information 2023, 14, 571. [Google Scholar] [CrossRef]
Böhm, V.; Seljak, U. Probabilistic autoencoder. arXiv 2020, arXiv:2006.05479. [Google Scholar]
Sinha, S.; Dieng, A.B. Consistency regularization for variational auto-encoders. Adv. Neural Inf. Process. Syst. 2021, 34, 12943–12954. [Google Scholar]
Nicolau, M.; McDermott, J. Learning neural representations for network anomaly detection. IEEE Trans. Cybern. 2018, 49, 3074–3087. [Google Scholar] [CrossRef] [PubMed]
Huang, H.; Wang, P.; Pei, J.; Wang, J.; Alexanian, S.; Niyato, D. Deep Learning Advancements in Anomaly Detection: A Comprehensive Survey. arXiv 2025, arXiv:2503.13195. [Google Scholar] [CrossRef]
Yang, H.; Kong, Q.; Mao, W. A deep latent space model for interpretable representation learning on directed graphs. Neurocomputing 2024, 576, 127342. [Google Scholar] [CrossRef]
Chaabouni, N.; Mosbah, M.; Zemmari, A.; Sauvignac, C.; Faruki, P. Network intrusion detection for IoT security based on learning techniques. IEEE Commun. Surv. Tutor. 2019, 21, 2671–2701. [Google Scholar] [CrossRef]
Li, C.; Guo, L.; Gao, H.; Li, Y. Similarity-measured isolation forest: Anomaly detection method for machine monitoring data. IEEE Trans. Instrum. Meas. 2021, 70, 1–12. [Google Scholar] [CrossRef]
Jain, P.; Jain, S.; Zaïane, O.R.; Srivastava, A. Anomaly detection in resource constrained environments with streaming data. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 6, 649–659. [Google Scholar] [CrossRef]
Barbariol, T.; Chiara, F.D.; Marcato, D.; Susto, G.A. A review of tree-based approaches for anomaly detection. In Control Charts and Machine Learning for Anomaly Detection in Manufacturing; Springer: Berlin/Heidelberg, Germany, 2022; pp. 149–185. [Google Scholar]
Xu, H.; Pang, G.; Wang, Y.; Wang, Y. Deep isolation forest for anomaly detection. IEEE Trans. Knowl. Data Eng. 2023, 35, 12591–12604. [Google Scholar] [CrossRef]
Lifandali, O.; Abghour, N.; Chiba, Z. Feature selection using a combination of ant colony optimization and random forest algorithms applied to isolation forest based intrusion detection system. Procedia Comput. Sci. 2023, 220, 796–805. [Google Scholar] [CrossRef]
Hasan, M.A.M.; Nasser, M.; Ahmad, S.; Molla, K.I. Feature selection for intrusion detection using random forest. J. Inf. Secur. 2016, 7, 129–140. [Google Scholar] [CrossRef]
Loh, W.Y.; Shih, Y.S. Split selection methods for classification trees. Stat. Sin. 1997, 7, 815–840. [Google Scholar]
Bandaragoda, T.R.; Ting, K.M.; Albrecht, D.; Liu, F.T.; Zhu, Y.; Wells, J.R. Isolation-based anomaly detection using nearest-neighbor ensembles. Comput. Intell. 2018, 34, 968–998. [Google Scholar] [CrossRef]
Lesouple, J.; Baudoin, C.; Spigai, M.; Tourneret, J.Y. Generalized isolation forest for anomaly detection. Pattern Recognit. Lett. 2021, 149, 109–119. [Google Scholar] [CrossRef]
He, M.; Chen, X. Anomaly detection algorithm for big data based on isolation forest algorithm. J. Comput. Methods Sci. Eng. 2025, 14727978251337984. [Google Scholar] [CrossRef]
Gao, J.; Ozbay, K.; Hu, Y. Real-time anomaly detection of short-term traffic disruptions in urban areas through adaptive isolation forest. J. Intell. Transp. Syst. 2025, 29, 269–286. [Google Scholar] [CrossRef]
Shang, W.; Zeng, P.; Wan, M.; Li, L.; An, P. Intrusion detection algorithm based on OCSVM in industrial control system. Secur. Commun. Netw. 2016, 9, 1040–1049. [Google Scholar] [CrossRef]
Maglaras, L.A.; Jiang, J.; Cruz, T. Integrated OCSVM mechanism for intrusion detection in SCADA systems. Electron. Lett. 2014, 50, 1935–1936. [Google Scholar] [CrossRef]
Maglaras, L.A.; Jiang, J. Ocsvm model combined with k-means recursive clustering for intrusion detection in scada systems. In Proceedings of the 10th International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness, Rhodes, Greece, 18–20 August 2014; pp. 133–134. [Google Scholar]
Shahid, N.; Naqvi, I.H.; Qaisar, S.B. One-class support vector machines: Analysis of outlier detection for wireless sensor networks in harsh environments. Artif. Intell. Rev. 2015, 43, 515–563. [Google Scholar] [CrossRef]
Stibor, T.; Timmis, J.; Eckert, C. A comparative study of real-valued negative selection to statistical anomaly detection techniques. In Proceedings of the Artificial Immune Systems: 4th International Conference, ICARIS 2005, Banff, AL, Canada, 14–17 August 2005; pp. 262–275. [Google Scholar]
Awad, M.; Khanna, R. Support vector machines for classification. In Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers; Springer: Berlin/Heidelberg, Germany, 2015; pp. 39–66. [Google Scholar]
Erfani, S.M.; Rajasegarar, S.; Karunasekera, S.; Leckie, C. High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recognit. 2016, 58, 121–134. [Google Scholar] [CrossRef]
Quadir, A.; Sajid, M.; Tanveer, M. One class restricted kernel machines. arXiv 2025, arXiv:2502.10443. [Google Scholar]
Bermúdez-Chacón, R.; Gonnet, G.H.; Smith, K. Automatic Problem-Specific Hyperparameter Optimization and Model Selection for Supervised Machine Learning; Technical Report; ETH: Zurich, Switzerland, 2015. [Google Scholar]
Lin, N.; Chen, Y.; Liu, H.; Liu, H. A comparative study of machine learning models with hyperparameter optimization algorithm for mapping mineral prospectivity. Minerals 2021, 11, 159. [Google Scholar] [CrossRef]
Kaliyaperumal, P.; Periyasamy, S.; Thirumalaisamy, M.; Balusamy, B.; Benedetto, F. A novel hybrid unsupervised learning approach for enhanced cybersecurity in the IoT. Future Internet 2024, 16, 253. [Google Scholar] [CrossRef]
Abdelli, K.; Cho, J.Y.; Azendorf, F.; Griesser, H.; Tropschug, C.; Pachnicke, S. Machine-learning-based anomaly detection in optical fiber monitoring. J. Opt. Commun. Netw. 2022, 14, 365–375. [Google Scholar] [CrossRef]
Otokwala, U.; Arifeen, M.; Petrovski, A. A comparative study of novelty detection models for zero day intrusion detection in industrial internet of things. In Proceedings of the UK Workshop on Computational Intelligence, Sheffield, UK, 7–9 September 2022; pp. 238–249. [Google Scholar]
Chen, X.; Cao, C.; Mai, J. Network anomaly detection based on deep support vector data description. In Proceedings of the 2020 5th IEEE International Conference on Big Data Analytics (ICBDA), Xiamen, China, 8–11 May 2020; pp. 251–255. [Google Scholar]
Pandey, P. A KNN-Based Intrusion Detection System for Enhanced Anomaly Detection in Industrial IoT Networks. Int. J. Innov. Res. Technol. Sci. 2024, 12, 1–7. [Google Scholar]
Zheng, M.; Robbins, H.; Chai, Z.; Thapa, P.; Moore, T. Cybersecurity research datasets: Taxonomy and empirical analysis. In Proceedings of the 11th USENIX Workshop on Cyber Security Experimentation and Test (CSET 18), Baltimore, MD, USA, 13 August 2018. [Google Scholar]
Larriva-Novo, X.A.; Vega-Barbas, M.; Villagrá, V.A.; Rodrigo, M.S. Evaluation of cybersecurity data set characteristics for their applicability to neural networks algorithms detecting cybersecurity anomalies. IEEE Access 2020, 8, 9005–9014. [Google Scholar] [CrossRef]
Boateng, E.A.; Bruce, J.W.; Talbert, D.A. Anomaly detection for a water treatment system based on one-class neural network. IEEE Access 2022, 10, 115179–115191. [Google Scholar] [CrossRef]
Haluška, R.; Brabec, J.; Komárek, T. Benchmark of data preprocessing methods for imbalanced classification. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17–20 December 2022; pp. 2970–2979. [Google Scholar]
Wang, L. Heterogeneous data and big data analytics. Autom. Control Inf. Sci. 2017, 3, 8–15. [Google Scholar] [CrossRef]
Parizad, A.; Hatziadoniu, C.J. Cyber-attack detection using principal component analysis and noisy clustering algorithms: A collaborative machine learning-based framework. IEEE Trans. Smart Grid 2022, 13, 4848–4861. [Google Scholar] [CrossRef]
Krithivasan, K.; Pravinraj, S.; Shankar Sriram, V.S. Detection of cyberattacks in industrial control systems using enhanced principal component analysis and hypergraph-based convolution neural network (EPCA-HG-CNN). IEEE Trans. Ind. Appl. 2020, 56, 4394–4404. [Google Scholar] [CrossRef]
Dini, P.; Begni, A.; Ciavarella, S.; De Paoli, E.; Fiorelli, G.; Silvestro, C.; Saponara, S. Design and testing novel one-class classifier based on polynomial interpolation with application to networking security. IEEE Access 2022, 10, 67910–67924. [Google Scholar] [CrossRef]
Hasan, B.M.S.; Abdulazeez, A.M. A review of principal component analysis algorithm for dimensionality reduction. J. Soft Comput. Data Min. 2021, 2, 20–30. [Google Scholar] [CrossRef]
Nanga, S.; Bawah, A.T.; Acquaye, B.A.; Billa, M.I.; Baeta, F.D.; Odai, N.A.; Obeng, S.K.; Nsiah, A.D. Review of dimension reduction methods. J. Data Anal. Inf. Process. 2021, 9, 189–231. [Google Scholar] [CrossRef]
Bhardwaj, A.; Ahluwalia, A.S.; Pant, K.K.; Upadhyayula, S. A principal component analysis assisted machine learning modeling and validation of methanol formation over Cu-based catalysts in direct CO₂ hydrogenation. Sep. Purif. Technol. 2023, 324, 124576. [Google Scholar] [CrossRef]
Tharwat, A. Principal component analysis-a tutorial. Int. J. Appl. Pattern Recognit. 2016, 3, 197–240. [Google Scholar] [CrossRef]
Ivosev, G.; Burton, L.; Bonner, R. Dimensionality reduction and visualization in principal component analysis. Anal. Chem. 2008, 80, 4933–4944. [Google Scholar] [CrossRef] [PubMed]
Cumming, J.A.; Wooff, D.A. Dimension reduction via principal variables. Comput. Stat. Data Anal. 2007, 52, 550–565. [Google Scholar] [CrossRef]
Apolloni, B.; Bassis, S.; Brega, A. Feature selection via Boolean independent component analysis. Inf. Sci. 2009, 179, 3815–3831. [Google Scholar] [CrossRef]
Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature selection: A data perspective. ACM Comput. Surv. (CSUR) 2017, 50, 1–45. [Google Scholar] [CrossRef]
Russo, D.; Zou, J. How much does your data exploration overfit? Controlling bias via information usage. IEEE Trans. Inf. Theory 2019, 66, 302–323. [Google Scholar] [CrossRef]
Hossin, M.; Sulaiman, M.N. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1. [Google Scholar]
Pommé, L.E.; Bourqui, R.; Giot, R.; Auber, D. Relative Confusion Matrix: An Efficient Visualization for the Comparison of Classification Models. In Artificial Intelligence and Visualization: Advancing Visual Knowledge Discovery; Springer: Berlin/Heidelberg, Germany, 2024; pp. 223–243. [Google Scholar]
Amin, F.; Mahmoud, M. Confusion matrix in binary classification problems: A step-by-step tutorial. J. Eng. Res. 2022, 6, 1. [Google Scholar]
Nassif, A.B.; Talib, M.A.; Nasir, Q.; Dakalbab, F.M. Machine learning for anomaly detection: A systematic review. IEEE Access 2021, 9, 78658–78700. [Google Scholar] [CrossRef]
Ma, X.; Wu, J.; Xue, S.; Yang, J.; Zhou, C.; Sheng, Q.Z.; Xiong, H.; Akoglu, L. A comprehensive survey on graph anomaly detection with deep learning. IEEE Trans. Knowl. Data Eng. 2021, 35, 12012–12038. [Google Scholar] [CrossRef]
Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. (CSUR) 2009, 41, 1–58. [Google Scholar] [CrossRef]
Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6. [Google Scholar] [CrossRef] [PubMed]
Nicora, G.; Rios, M.; Abu-Hanna, A.; Bellazzi, R. Evaluating pointwise reliability of machine learning prediction. J. Biomed. Inform. 2022, 127, 103996. [Google Scholar] [CrossRef] [PubMed]
Elhanashi, A.; Gasmi, K.; Begni, A.; Dini, P.; Zheng, Q.; Saponara, S. Machine learning techniques for anomaly-based detection system on CSE-CIC-IDS2018 dataset. In Proceedings of the International Conference on Applications in Electronics Pervading Industry, Environment and Society, Genova, Italy, 26–27 September 2022; pp. 131–140. [Google Scholar]
Racherla, S.; Sripathi, P.; Faruqui, N.; Kabir, M.A.; Whaiduzzaman, M.; Shah, S.A. Deep-IDS: A Real-Time Intrusion Detector for IoT Nodes Using Deep Learning. IEEE Access 2024, 12, 63584–63597. [Google Scholar] [CrossRef]
Disha, R.A.; Waheed, S. Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique. Cybersecurity 2022, 5, 1. [Google Scholar] [CrossRef]
Alghushairy, O.; Alsini, R.; Soule, T.; Ma, X. A review of local outlier factor algorithms for outlier detection in big data streams. Big Data Cogn. Comput. 2020, 5, 1. [Google Scholar] [CrossRef]
Mahadevappa, P.; Murugesan, R.K.; Al-Amri, R.; Thabit, R.; Al-Ghushami, A.H.; Alkawsi, G. A secure edge computing model using machine learning and IDS to detect and isolate intruders. MethodsX 2024, 12, 102597. [Google Scholar] [CrossRef] [PubMed]
Abdulganiyu, O.H.; Ait Tchakoucht, T.; Saheed, Y.K. A systematic literature review for network intrusion detection system (IDS). Int. J. Inf. Secur. 2023, 22, 1125–1162. [Google Scholar] [CrossRef]
Soe, Y.N.; Feng, Y.; Santosa, P.I.; Hartanto, R.; Sakurai, K. Rule generation for signature based detection systems of cyber attacks in iot environments. Bull. Netw. Comput. Syst. Softw. 2019, 8, 93–97. [Google Scholar]
Do, P.H.; Le, T.D.; Dinh, T.D.; Dai Pham, V. Classifying IoT Botnet Attacks With Kolmogorov-Arnold Networks: A Comparative Analysis of Architectural Variations. IEEE Access 2025, 13, 16072–16093. [Google Scholar] [CrossRef]
Marques, H.O.; Swersky, L.; Sander, J.; Campello, R.J.; Zimek, A. On the evaluation of outlier detection and one-class classification: A comparative study of algorithms, model selection, and ensembles. Data Min. Knowl. Discov. 2023, 37, 1473–1517. [Google Scholar] [CrossRef] [PubMed]
Acquaah, Y.T.; Kaushik, R. Normal-only Anomaly detection in environmental sensors in CPS: A comprehensive review. IEEE Access 2024, 12, 191086–191107. [Google Scholar] [CrossRef]
Sarhan, M.; Layeghy, S.; Moustafa, N.; Portmann, M. Cyber threat intelligence sharing scheme based on federated learning for network intrusion detection. J. Netw. Syst. Manag. 2023, 31, 3. [Google Scholar] [CrossRef]
Zhu, X.; Vondrick, C.; Fowlkes, C.C.; Ramanan, D. Do we need more training data? Int. J. Comput. Vis. 2016, 119, 76–92. [Google Scholar] [CrossRef]
Kotsiantis, S.B.; Kanellopoulos, D.; Pintelas, P.E. Data preprocessing for supervised leaning. Int. J. Comput. Sci. 2006, 1, 111–117. [Google Scholar]
Patcha, A.; Park, J.M. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Comput. Netw. 2007, 51, 3448–3470. [Google Scholar] [CrossRef]
Gaggero, G.B.; Armellin, A.; Portomauro, G.; Marchese, M. Industrial control system-anomaly detection dataset (ICS-ADD) for cyber-physical security monitoring in smart industry environments. IEEE Access 2024, 12, 64140–64149. [Google Scholar] [CrossRef]
Faramondi, L.; Flammini, F.; Guarino, S.; Setola, R. A hardware-in-the-loop water distribution testbed dataset for cyber-physical security testing. IEEE Access 2021, 9, 122385–122396. [Google Scholar] [CrossRef]

Figure 1. KDD Cup 1999 train and test data distribution.

Figure 2. NSL-KDD 2009 train and test data distribution.

Figure 3. CICIDS 2017 data distribution.

Figure 4. UNSW-NB15 2015 train and test data distribution.

Figure 5. BoT-IoT 2019 data distribution.

Figure 6. TON-IoT 2020 data distribution.

Figure 7. Structural flow diagram of the Autoencoder (AE) used for anomaly detection. The input feature vector is encoded into a latent representation, then reconstructed and compared to the original input to compute the reconstruction error. High reconstruction error indicates potential anomalies.

Figure 8. Architectural diagram of the Variational Autoencoder (VAE) process for one-class anomaly detection. The encoder maps input samples to a latent distribution, from which latent vectors are sampled and passed through the decoder to reconstruct the input. During inference, anomalies are identified based on high reconstruction error or low likelihood under the learned distribution of normal data.

Figure 9. Architectural diagram of the Isolation Forest (iForest) process for one-class anomaly detection. The algorithm constructs multiple binary trees by recursively partitioning the input data using randomly selected features and split values. Normal instances, which are densely clustered, require more partitions to isolate, resulting in longer average path lengths. In contrast, anomalies are isolated earlier due to their sparsity, yielding shorter path lengths.

Figure 10. Architectural diagram of the One-Class Support Vector Machine (OCSVM) process for anomaly detection. The model learns a decision boundary that best encloses the distribution of normal data in a high-dimensional feature space using a kernel function. During inference, samples that fall outside the learned boundary are considered anomalies.

Figure 11. Correlation matrix of the NSL-KDD 2009 dataset.

Figure 12. Feature reduction phase for the NSL-KDD 2009 dataset.

Figure 13. Correlation matrix of the UNSW-NB15 dataset after preprocessing. Color intensity indicates the strength and direction of linear correlation between pairs of features. Highly correlated features are candidates for removal or PCA transformation.

Figure 14. Feature reduction phase for the UNSW-NB15 2015 dataset.

Figure 15. Correlation matrix of the CICIDS 2017 dataset.

Figure 16. Feature reduction phase for the CICIDS 2017 dataset.

Figure 17. Correlation matrix of the TON-IoT dataset.

Figure 18. Feature reduction phase for the TON-IoT dataset.

Figure 19. Architecture of the experimental setup: NFSv4 server with runtime IDS and three clients.

Figure 20. Flowchart illustrating the runtime behavior of the IDS GUI. The user selects the anomaly detection model, starts packet capture, and visualizes classification results in real time. Detected anomalies are logged and reported.

Figure 21. The generic GUI displayed before starting the IDS is shown on the right; on the left, the system behavior in the absence of anomalies (A) and in the presence of anomalies (B).

Figure 22. Experimental edge network setup with NFSv4 and the tested ML models.

Figure 23. Boxplot of runtime inference time distribution per packet for each OCC algorithm tested.

Table 1. Overview of the most widely used cybersecurity datasets for enterprise network intrusion detection.

Dataset	Year	Organization	Repository URL (all accessed on 30 June 2025)
KDD Cup 1999	1999	UCI/DARPA	https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
NSL-KDD	2009	University of New Brunswick (UNB)	https://www.unb.ca/cic/datasets/nsl.html
UNSW-NB15	2015	Australian Centre for Cyber Security	https://research.unsw.edu.au/projects/unsw-nb15-dataset
CICIDS 2017	2018	CIC + Communications Security Establishment	https://www.unb.ca/cic/datasets/ids-2017.html
Bot-IoT	2018	UNSW/Cyber Range Lab	https://research.unsw.edu.au/projects/bot-iot-dataset
TON_IoT	2020	UNSW + CSCRC	https://research.unsw.edu.au/projects/toniot-datasets

Table 2. Characteristics of the KDD Cup 1999 dataset.

Number of Features	41
Types of Features	Basic features: duration, protocol, service, flags, etc.; Content features: number of failed login attempts, number of files created, etc.; Traffic features: number of connections in the last 2 s, percentage of connections with errors, etc.
Types of Attacks	DoS: smurf, neptune, back, teardrop, pod; Probe: ipsweep, nmap, portsweep, satan; R2L: guess_passwd, ftp_write, imap, multihop, phf, spy, warezclient, warezmaster; U2R: buffer_overflow, loadmodule, perl, rootkit
Additional Notes	Dataset generated in a simulated environment; high data redundancy, which may affect model performance; widely used as a standard benchmark for IDS evaluation

Table 3. Characteristics of the NSL-KDD dataset.

Number of Features	41
Types of Features	Basic features: duration, protocol, service, flags, etc.; Content features: number of failed login attempts, number of files created, etc.; Traffic features: number of connections in the last 2 s, percentage of connections with errors, etc.
Types of Attacks	DoS: smurf, neptune, back, teardrop, pod; Probe: ipsweep, nmap, portsweep, satan; R2L: guess_passwd, ftp_write, imap, multihop, phf, spy, warezclient, warezmaster; U2R: buffer_overflow, loadmodule, perl, rootkit
Additional Notes	Removal of duplicated instances from KDD Cup 1999; improved class balance; more suitable for machine learning model evaluation

Table 4. Characteristics of the CICIDS 2017 dataset.

Number of Features	80
Types of Features	Network traffic features: source and destination IPs, ports, protocols, flow duration, total packets, bytes transmitted; Statistical features: mean, variance, skewness of incoming and outgoing packets and bytes; Content features: state flags, count of previous connections, flow-behavior-related attributes
Types of Attacks	DoS/DDoS: DoS Hulk, DoS GoldenEye, DoS Slowloris, DoS Slow HTTP Test; Brute Force: SSH-Patator, FTP-Patator; Scanning: port scan; Web Attacks: brute force, XSS, SQL injection; Others: botnet, infiltration, Heartbleed
Additional Notes	Dataset generated in a controlled environment by CIC; wide coverage of realistic and recent attack scenarios; commonly used as a modern benchmark for evaluating IDS and ML models; potential imbalance in less-represented attack classes

Table 5. UNSW-NB15 dataset overview.

Number of Features	49
Types of Features	Flow: duration, packets, bytes, etc.; Basic: protocol, service, flags, etc.; Content: payload size, login attempts, etc.; Time: timestamps, intervals, etc.; Additional: derived statistics and labels
Types of Attacks	Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, Worms
Additional Notes	Generated using IXIA PerfectStorm; reflects realistic attack scenarios; rich in features for granular evaluation

Table 6. Bot-IoT Dataset Overview.

Number of Features	46
Types of Features	Flow: duration, packets, bytes, in/out flows; Time: timestamps, inter-arrival times; Additional: TCP flags, transport protocol, port, binary/numeric labels
Types of Attacks	DoS, DDoS, reconnaissance, information theft, data exfiltration
Additional Notes	Strong imbalance toward malicious traffic; generated using IoT devices (e.g., camera, fitness tracker, smart TV); available in CSV and PCAP with detailed labeling; suitable for large-scale IoT threat evaluations

Table 7. TON_IoT dataset overview.

Number of Features	Variable by data source: IoT telemetry, ∼10 features; Windows/Linux logs, ∼20+ features; network traffic, 49 features
Types of Features	Telemetry (e.g., temperature, humidity, battery level); system logs (e.g., processes, login attempts, events); network traffic (e.g., packets, bytes, protocol, TCP flags)
Types of Attacks	DoS, DDoS, ransomware, backdoor, injection, MITM, XSS, password cracking, scanning
Additional Notes	Multimodal dataset (IoT, host, and network data); includes timestamps and detailed labeling; suitable for detection, prediction, and response model evaluation

Table 8. Coverage of major attack types across considered and excluded datasets.

Attack Type	NSL-KDD	UNSW-NB15	CICIDS2017	TON_IoT	BoT-IoT	CSE-CIC-IDS2018
DoS/DDoS	✓	✓	✓	✓	✓	✓
Brute Force/Auth	✓	✓	✓	✓		✓
Port Scanning	✓	✓	✓	✓	✓	✓
Infiltration		✓	✓	✓	✓	✓
Botnet/Backdoor		✓		✓	✓	✓
Shellcode/Payload	✓	✓				✓
Data Exfiltration			✓	✓		✓
Reconnaissance	✓	✓		✓	✓	✓
Web Attacks (XSS, SQLi)			✓			✓

Table 9. Quantitative assessment of dataset bias: imbalance ratio and entropy metrics.

Dataset	Normal (%)	Imbalance Ratio (IR)	Entropy (Bits)	Skewness $δ_{H}$
NSL-KDD	51.9	1.08	0.999	0.001
UNSW-NB15	87.4	6.91	0.548	0.208
CICIDS2017	84.6	5.49	0.610	0.144
TON_IoT	96.3	26.0	0.253	0.633
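
The imbalance ratio and entropy columns of Table 9 can be approximated from the share of normal traffic alone. The sketch below assumes that IR is the majority-to-minority class ratio and that the entropy is the binary Shannon entropy of the normal/attack split; the function names are illustrative. This part of the article does not give the exact definitions (including that of $δ_{H}$, which is not computed here), so the results may differ slightly from the tabulated values.

```python
import math


def imbalance_ratio(normal_pct: float) -> float:
    """Majority-to-minority ratio for a two-class (normal vs. attack) split."""
    p = normal_pct / 100.0
    majority, minority = max(p, 1.0 - p), min(p, 1.0 - p)
    if minority == 0.0:
        raise ValueError("IR is undefined when one class is empty")
    return majority / minority


def binary_entropy_bits(normal_pct: float) -> float:
    """Shannon entropy (in bits) of the normal/attack label distribution."""
    p = normal_pct / 100.0
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# Normal-traffic percentages taken from Table 9.
normal_share = {"NSL-KDD": 51.9, "UNSW-NB15": 87.4, "CICIDS2017": 84.6, "TON_IoT": 96.3}

for name, pct in normal_share.items():
    print(f"{name}: IR={imbalance_ratio(pct):.2f}, H={binary_entropy_bits(pct):.3f} bits")
```

A perfectly balanced dataset gives IR = 1 and an entropy of 1 bit; both metrics move away from these values as one class dominates.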

Table 10. Taxonomy of OCC techniques according to detection type, learning model, and runtime suitability.

Method	Detection Type	Learning Model	Runtime Suitability
Autoencoder (AE)	Reconstruction	Deep	Good
Variational Autoencoder (VAE)	Reconstruction (probabilistic)	Deep	Moderate
One-Class SVM (OCSVM)	Boundary-based	Shallow	Good
Isolation Forest (iForest)	Isolation-based	Shallow (ensemble)	Excellent
Local Outlier Factor (LOF)	Density-based	Shallow	Poor
Deep SVDD	Boundary-based	Deep	Poor
kNN OCC	Distance-based	Shallow	Moderate

Table 11. Principal component analysis (PCA)—selected features and variance explained per dataset.

Dataset	PCA Components (95% Variance)	Selected Features from PCA
NSL-KDD	20	diff_srv_rate, duration, num_file_creations, num_shells, root_shell, num_access_files, urgent, dst_host_srv_diff_host_rate, num_failed_logins, srv_diff_host_rate, dst_bytes, dst_host_diff_srv_rate, wrong_fragment, land, dst_host_count, logged_in, protocol_type, su_attempted, count, num_root
UNSW-NB15	24	smean_sz, Djit, Sjit, res_bdy_len, Dload, Sload, service, stcpb, ct_flw_http_mthd, dttl, dtcpb, ct_srv_dst, ct_src_ltm, ct_srv_src, ct_state_ttl, ct_dst_ltm, trans_depth, Dintpkt, ct_dst_sport_ltm, state, is_sm_ips_ports, ct_ftp_cmd, ct_src_dport_ltm, sttl
TON_IoT	23	dns_rcode, conn_state, src_ip_bytes, http_response_body_len, missed_bytes, dns_qclass, dns_query, src_port, service, src_pkts, weird_name, duration, dst_port, dst_pkts, weird_addl, dst_ip, http_resp_mime_types, src_ip, proto, dst_ip_bytes, ts, dst_bytes, dns_qtype
CICIDS 2017	30	Down/Up Ratio, TotLen Bwd Pkts, Dst Port, Fwd Pkt Len Mean, Pkt Len Mean, Fwd PSH Flags, Init Bwd Win Byts, Fwd Act Data Pkts, Tot Fwd Pkts, Fwd Pkt Len Max, Flow IAT Min, Fwd IAT Min, TotLen Fwd Pkts, Tot Bwd Pkts, Fwd URG Flags, Init Fwd Win Byts, Fwd Pkt Len Min, Bwd IAT Min, Active Std, Bwd Pkt Len Min, URG Flag Cnt, Flow Pkts/s, Bwd Pkts/s, Fwd Header Len, Pkt Len Std, Bwd Header Len, Pkt Size Avg, Bwd Pkt Len Mean, Timestamp, Flow IAT Max
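
The component counts in Table 11 correspond to the smallest number of principal components whose cumulative explained variance reaches 95%. A minimal sketch of that selection rule follows, assuming standardized features and using NumPy's SVD on synthetic data; the helper name and the data are hypothetical, and the feature-ranking step that produces the "Selected Features" column is not reproduced here because its details are not given in this part of the article.

```python
import numpy as np


def components_for_variance(X, target=0.95):
    """Return (k, cumulative) where k is the smallest number of principal
    components whose cumulative explained-variance ratio is >= target."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # avoid division by zero on constant columns
    Z = (X - X.mean(axis=0)) / std           # standardize each feature
    singular_values = np.linalg.svd(Z, full_matrices=False, compute_uv=False)
    ratio = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(ratio)
    k = int(np.searchsorted(cumulative, target) + 1)
    return min(k, len(cumulative)), cumulative


# Synthetic example: 12 observed features driven by 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 12))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 12))

k, cumulative = components_for_variance(X, target=0.95)
print(f"components kept: {k}, variance retained: {cumulative[k - 1]:.3f}")
```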

Table 12. Hyperparameter values used for the anomaly detection algorithms.

Algorithm	Hyperparameter	Value/Choice
Autoencoder	Layers (enc/dec)	2 encoder, 2 decoder
	Units per layer	[400, 200]
	Activation	ReLU
	Loss	MSE
	Optimizer	Adam (lr = 0.001)
	Epochs	150
	Batch size	64
Variational Autoencoder	Latent size	6
	Activation	ReLU
	Loss	MSE + KL divergence
	Optimizer	Adam (lr = 0.001)
	Epochs	150
	Batch size	64
One-Class SVM	Kernel	RBF
	Gamma	$1/N_{\text{features}}$
	Nu	0.05
Isolation Forest	Trees	100
	Max samples	256
	Contamination	0.05
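
For the two shallow models, the settings in Table 12 map directly onto common library parameters. The sketch below assumes scikit-learn (this part of the article does not name the library used) and trains on synthetic "normal" data; in scikit-learn, `gamma="auto"` corresponds to 1/n_features. The data, the `random_state` values, and the variable names are illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(1000, 20))       # stand-in for benign, preprocessed traffic

# One-Class SVM: RBF kernel, gamma = 1 / n_features, nu = 0.05 (Table 12).
ocsvm = OneClassSVM(kernel="rbf", gamma="auto", nu=0.05).fit(X_normal)

# Isolation Forest: 100 trees, 256 samples per tree, contamination = 0.05 (Table 12).
iforest = IsolationForest(
    n_estimators=100, max_samples=256, contamination=0.05, random_state=0
).fit(X_normal)

# Five in-distribution samples followed by five strongly shifted samples.
X_new = np.vstack([rng.normal(size=(5, 20)), rng.normal(loc=8.0, size=(5, 20))])

ocsvm_pred = ocsvm.predict(X_new)            # +1 = normal, -1 = anomaly
iforest_pred = iforest.predict(X_new)
print("OCSVM:  ", ocsvm_pred)
print("iForest:", iforest_pred)
```

Both estimators are fitted on normal samples only, which matches the one-class setting described in the abstract.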

Table 13. Performance of OCC algorithms across datasets (best values are highlighted in green).

Model	Dataset	Accuracy	Precision	Recall	F1-Score
OCSVM	NSL-KDD	0.89	0.85	0.81	0.83
	UNSW-NB15	0.95	0.93	0.92	0.92
	TON_IoT	0.87	0.84	0.79	0.81
	CICIDS2017	0.88	0.86	0.80	0.83
AE	NSL-KDD	0.91	0.89	0.84	0.86
	UNSW-NB15	0.96	0.94	0.93	0.93
	TON_IoT	0.88	0.86	0.82	0.84
	CICIDS2017	0.89	0.87	0.83	0.85
iForest	NSL-KDD	0.86	0.83	0.78	0.80
	UNSW-NB15	0.94	0.92	0.91	0.91
	TON_IoT	0.85	0.82	0.77	0.79
	CICIDS2017	0.87	0.84	0.80	0.82
VAE	NSL-KDD	0.90	0.87	0.83	0.85
	UNSW-NB15	0.96	0.94	0.92	0.93
	TON_IoT	0.88	0.85	0.81	0.83
	CICIDS2017	0.89	0.86	0.82	0.84
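
The four metrics in Table 13 follow the standard confusion-matrix definitions. The sketch below assumes that the anomalous class is treated as the positive class; the confusion counts are hypothetical and only illustrate the arithmetic. The last lines check that an F1-score in the table is consistent with its precision and recall.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
print(classification_metrics(tp=80, fp=20, tn=880, fn=20))

# Consistency check against the OCSVM / NSL-KDD row (precision 0.85, recall 0.81).
print(round(f1_score(0.85, 0.81), 2))
```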

Table 14. Average runtime inference time per packet for each OCC technique (fastest technique is highlighted in green).

ML Technique	Inference Time (ms)
OCSVM	2.8
AE	3.1
VAE	5.3
iForest	1.3
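
Per-packet inference times such as those in Table 14 can be collected by timing each call to the model's prediction function. The following is a sketch under stated assumptions: `dummy_predict` is a hypothetical stand-in for a trained OCC model, the packets are synthetic feature vectors, and the warm-up count is an arbitrary choice; it does not reproduce the measurement procedure used on the Raspberry Pi platform.

```python
import statistics
import time


def time_per_packet_ms(predict, packets, warmup=10):
    """Return the inference latency in milliseconds for each packet."""
    for features in packets[:warmup]:        # warm-up calls, not recorded
        predict(features)
    latencies = []
    for features in packets:
        start = time.perf_counter()
        predict(features)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def dummy_predict(features):
    """Hypothetical model: flags a packet when its squared norm is large."""
    return sum(x * x for x in features) > 10.0


packets = [[float(i % 7)] * 20 for i in range(200)]
latencies = time_per_packet_ms(dummy_predict, packets)
print(f"mean={statistics.mean(latencies):.4f} ms, "
      f"median={statistics.median(latencies):.4f} ms, "
      f"max={max(latencies):.4f} ms")
```

Keeping the full list of latencies, rather than only the mean, is what allows a distribution plot such as the boxplot in Figure 23.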

Table 15. ANOVA results on F1-score across OCC models for each dataset.

Dataset	F-Statistic	p-Value	Significance
NSL-KDD	14.23	0.00009	Yes
UNSW-NB15	21.58	0.00001	Yes
CICIDS2017	11.41	0.00027	Yes
TON_IoT	8.67	0.00065	Yes
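
The F-statistics in Table 15 come from a one-way ANOVA, which compares the variance between model groups with the variance within them. A minimal standard-library sketch of that computation is given below; the per-run F1 values are hypothetical, since the individual run results are not listed in this part of the article.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F-statistic of a one-way ANOVA over a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical F1-scores from repeated runs of four models on one dataset.
f1_runs = {
    "OCSVM": [0.82, 0.83, 0.84, 0.83, 0.82],
    "AE": [0.86, 0.87, 0.85, 0.86, 0.86],
    "iForest": [0.80, 0.79, 0.81, 0.80, 0.80],
    "VAE": [0.85, 0.84, 0.86, 0.85, 0.85],
}
print(f"F = {one_way_anova_f(list(f1_runs.values())):.2f}")
```

The p-value is obtained from the F distribution with (k − 1, n − k) degrees of freedom; SciPy's `scipy.stats.f_oneway` returns both the statistic and the p-value.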

Table 16. Benchmark comparison with recent real-time IDS solutions.

System (Ref.)	Methodology	Accuracy (%)	F1-Score (%)	Runtime/Deployment
DeepIDS [181]	CNN + supervised	97.2	96.1	Offline (KDD, NSL-KDD)
CICFlowMeter + Random Forest [182]	Flow-based + RF	92.5	91.0	Real-time (CICIDS2017)
LOF-IDS [183]	Unsupervised LOF	87.3	85.2	Partial runtime, no GUI
EdgeML-IDS [184]	Lightweight DL	94.0	93.1	IoT gateway, online
Ours (OCC-IDS)	One-Class SVM/AE/VAE/iForest	96.0 (max)	93.0 (UNSW-NB15)	Real-time, modular GUI, TCP/UDP capture

Table 17. Critical instances of elevated false positive rate (FPR) are often associated with complex and imbalanced datasets.

Model	Training Dataset	FPR (%)	Comments
Autoencoder (AE)	UNSW-NB15	3.8	Balanced feature space yields low FPR.
Autoencoder (AE)	CICIDS2017	11.5	High variability in benign traffic increases FPR.
Variational Autoencoder (VAE)	TON_IoT	9.2	Sensitive to rare yet legitimate deviations.
OCSVM	NSL-KDD	12.0	Legacy dataset with limited behavioral diversity.
Isolation Forest (iForest)	UNSW-NB15	4.1	Fast and robust tree-based separation.
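
For reconstruction-based models, the false positive rate depends directly on where the anomaly threshold is placed on the reconstruction error. The sketch below assumes the threshold is set at a percentile of the errors observed on benign training traffic; this part of the article does not state how the thresholds were chosen, and the error values here are synthetic.

```python
import math
import random


def percentile_threshold(errors, pct):
    """Nearest-rank percentile of a list of reconstruction errors."""
    ordered = sorted(errors)
    rank = math.ceil(pct * len(ordered) / 100.0)
    return ordered[min(len(ordered) - 1, max(0, rank - 1))]


def false_positive_rate(benign_errors, threshold):
    """Fraction of benign samples whose error exceeds the threshold."""
    return sum(e > threshold for e in benign_errors) / len(benign_errors)


rng = random.Random(0)
train_errors = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]    # benign, training
holdout_errors = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]  # benign, unseen

threshold = percentile_threshold(train_errors, 95.0)
fpr_train = false_positive_rate(train_errors, threshold)
fpr_holdout = false_positive_rate(holdout_errors, threshold)
print(f"threshold={threshold:.3f}, FPR train={fpr_train:.3f}, FPR hold-out={fpr_holdout:.3f}")
```

Under this assumption, a 95th-percentile threshold yields roughly a 5% FPR on benign traffic drawn from the same distribution as the training data; benign traffic that is more variable than the training set pushes more errors above the threshold, which is in line with the comments in Table 17.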

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Paolini, D.; Dini, P.; Soldaini, E.; Saponara, S. One-Class Anomaly Detection for Industrial Applications: A Comparative Survey and Experimental Study. Computers 2025, 14, 281. https://doi.org/10.3390/computers14070281

