1. Introduction
Rapid technological and digital advancements have propelled society into an era of unprecedented hyperconnectivity, marking the beginning of a fundamental transformation in how we generate, share, and protect information. Since its conception in the 1960s, the internet has evolved from a simple communication network among researchers to a global infrastructure supporting most human activities [
1]. This evolution has enabled instantaneous access to massive volumes of information, connecting devices, automating processes, and facilitating a highly interdependent digital environment [
2]. However, the infrastructure enabling the benefits of the digital age also brings new and complex challenges regarding information security and privacy. Cybersecurity has evolved in tandem with technology, continuously adapting to increasingly sophisticated and frequent security threats [
3]. Initially, information security focused on protecting systems against physical intrusions and rudimentary malware [
4]. Yet, as the internet expanded and the number of connected devices grew exponentially, cybercriminal tactics likewise became more intricate, ranging from distributed denial of service (DDoS) attacks to complex social engineering techniques, mainly aimed at exfiltrating sensitive data [
5].
Today, in an environment where cybersecurity is essential to safeguard the integrity and confidentiality of information, one of the greatest challenges is the proliferation of cyber threats facilitated by the anonymity afforded by technologies such as virtual private networks (VPNs) and proxies [
6]. While these services protect user privacy on unsecured networks, they also equip cybercriminals with tools to carry out malicious activities such as identity theft, malware distribution, and DDoS attacks [
The opacity afforded by anonymised IP addresses makes these threats difficult to trace and mitigate, underscoring the critical need for precise detection of IP addresses associated with anonymity services to bolster digital security.
In this context, machine learning (ML) algorithms emerge as a viable solution to identify complex patterns indicative of VPN and proxy usage [
8]. However, the “black box” nature of many such algorithms poses significant challenges in terms of transparency and trustworthiness of their decisions [
9]. This paper explores two crucial dimensions for addressing these issues within cybersecurity: interpretability, which enables an understanding of why a model makes specific decisions [
10], and explainability, which elucidates which characteristics of the model influenced these decisions and how [
9].
For example, a decision tree is considered interpretable because the decision path can be traced through explicit rules at each node, allowing direct human understanding. Similarly, fuzzy models such as HFER are explainable because they use rule-based logic that describes how inputs are transformed into outputs, even when the boundaries are soft or hierarchical. In contrast, neural networks are non-interpretable, as their inner structure involves multiple layers and parameters that cannot be easily understood or traced by human observers without external tools. These distinctions guide our classification of the models throughout the study.
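For illustration, the following minimal sketch (using scikit-learn and a generic toy dataset, neither of which forms part of this study's pipeline) shows how a shallow decision tree exposes its decision logic as explicit, human-readable rules; no comparable textual trace is available for a trained neural network without external tools.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy illustration of interpretability: a shallow decision tree can be
# printed as explicit if/then rules that a human analyst can trace.
data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)
print(export_text(tree, feature_names=list(data.feature_names)))
```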
Traditional rule-based systems often struggle to detect VPN and proxy usage because these services are specifically designed to bypass standard filtering and masking techniques. Machine learning provides a powerful alternative by enabling the automatic identification of complex and non-obvious patterns in large-scale, high-dimensional traffic data. In the context of VPN/proxy detection, ML can uncover behavioural signals—such as unusual attack timing, repetitive origin patterns, or atypical autonomous system (AS) number distributions—that would be difficult to define explicitly through rules. Therefore, ML is employed in this study not only as a classifier but also as a tool for uncovering hidden structures in anonymised threat data that can inform early detection and proactive security measures. To guide our research, we formulate the following key research questions (RQs):
RQ1: To what extent do different ML algorithms balance predictive accuracy and interpretability when detecting anonymity service IPs?
RQ2: Which machine learning models offer the highest degree of explainability without compromising detection efficiency with the proposed dataset?
RQ3: Can a multi-criteria framework (e.g., a two-axis diagram) effectively categorise ML models based on their interpretability-performance trade-off?
In addressing these questions, we explore a variety of machine learning techniques—from decision trees to neural networks and support vector machines—evaluating them through the lens of both interpretability and explainability. Our methodology places particular emphasis on structuring a comparative analysis, including a visual representation to illustrate the interpretability-performance trade-off among the evaluated models clearly.
This paper is structured as follows:
Section 2 describes the state of the art;
Section 3 formulates the problem and analyses the dataset;
Section 4 details the methodology for developing predictive models;
Section 5 presents the algorithms employed and experimental results;
Section 6 discusses the findings and their relevance in the cybersecurity context; and finally,
Section 7 provides conclusions and suggests future research avenues in cybersecurity and machine learning.
2. State of the Art
The advent of ubiquitous digital connectivity and the proliferation of sophisticated cyberattacks have necessitated a fundamental shift in cybersecurity paradigms. Traditional rule-based detection mechanisms, while effective in static and known threat scenarios, have proven inadequate in the face of dynamic, obfuscated, and increasingly anonymised threats, such as those perpetrated via VPNs or proxy services. Consequently, the integration of machine learning (ML) into cybersecurity frameworks has emerged as a promising strategy for detecting subtle and complex threat patterns [
8].
Nonetheless, the deployment of ML in this domain introduces a critical trade-off between predictive performance and interpretability. While high-performing models—particularly deep neural networks—are adept at capturing intricate data patterns, their opaque decision-making processes raise serious concerns in contexts where transparency, auditability, and accountability are paramount [
9]. These concerns have catalysed the emergence of explainable artificial intelligence (XAI), which seeks to render ML models comprehensible to human stakeholders without significantly compromising performance.
Models such as decision trees and random forests have historically been favoured for their inherent interpretability and ability to yield structured, logical decision paths. Random forests, in particular, have demonstrated high robustness in high-dimensional and noisy environments, making them well-suited for intrusion detection systems [
11]. Likewise, probabilistic classifiers such as naïve Bayes offer computational efficiency and have proven effective in streaming and high-frequency data scenarios typical of cybersecurity environments.
Conversely, deep learning architectures (e.g., multilayer neural networks) dominate in terms of predictive accuracy, especially in complex classification tasks involving high-volume and temporally granular datasets [
12]. These models, however, are often criticised as “black boxes”, lacking the semantic transparency required for operational deployment in security-sensitive or regulated industries [
13].
In addition to these classical and post-hoc approaches, fuzzy logic-based models—such as the hierarchical fuzzy exception rules (HFER) framework—have gained prominence. These models combine rule-based transparency with the capacity to handle imprecise or ambiguous input data, thereby offering a middle ground between deterministic logic and statistical learning [
14]. Such hybrid systems are particularly beneficial in threat scenarios characterised by uncertainty and evolving attack signatures.
Recent contributions reinforce the centrality of explainability in security-related ML applications. For instance, the authors of [
15] present a lightweight and explainable ensemble classifier for real-time anomaly detection in IoT environments, addressing the dual concerns of resource efficiency and model transparency. Similarly, [
16] presents a comprehensive approach to Android malware detection using explainable machine learning techniques. The authors emphasise the importance of feature selection to enhance model interpretability, identifying key features that significantly contribute to malware classification. By reducing data dimensionality, the proposed method achieves high accuracy while maintaining transparency, facilitating trust and compliance in security applications.
To guide model selection, some studies advocate the use of multi-criteria taxonomies, such as two-dimensional maps plotting interpretability against performance. These frameworks enable security practitioners to align algorithm choice with specific operational requirements—balancing detection accuracy with the need for explainability and computational tractability [
9,
10].
Finally, the contemporary literature reflects a transition from purely performance-driven models toward hybrid, explainable, and context-aware approaches. This evolution is particularly salient in cybersecurity, where the legitimacy and effectiveness of automated systems hinge not only on their predictive prowess but also on their interpretability, auditability, and compliance with evolving regulatory standards.
3. Problem Formulation: The CrowdSec Challenge
The challenge presented by CrowdSec (Available online:
https://www.crowdsec.net/ (accessed on 24 July 2025)) addresses a prevalent and increasingly sophisticated issue in cybersecurity: the accurate detection of IP addresses associated with VPNs or proxy services, which are often used to conceal malicious online activity. VPNs and proxy services are regularly utilised by threat actors to obfuscate their identities and locations, undermining the ability of organisations to detect, attribute, and mitigate cyber threats effectively. This layer of anonymity not only conceals the origins of malicious actions but also exacerbates the complexity of preventing unauthorised intrusions and various cybercrimes.
This study proposes a predictive methodology for identifying connections originating from VPN or proxy services, using attack reports provided by CrowdSec as a representative dataset. It is important to clarify that our methodology does not attempt to uncover the real IP addresses of users behind VPNs, as these services are explicitly designed to prevent such tracing. Instead, our focus lies in identifying whether an observed IP—often the public-facing endpoint of a VPN or proxy—is likely associated with anonymisation services. This is accomplished by analysing indirect signals, such as autonomous system numbers (ASNs) known to host VPN infrastructure, unusual temporal patterns in attack reports, and mismatches in geographic metadata. By learning from these features, machine learning models can effectively flag traffic that exhibits characteristics typical of anonymised sources without violating user privacy or relying on packet content inspection. Rather than integrating the methodology directly into the CrowdSec platform, the goal is to explore and validate a robust approach to threat detection based on advanced preprocessing, feature engineering, and machine learning techniques. By analysing patterns in attack signals and time series behaviours, the methodology aims to improve the identification of anonymised malicious traffic. Although the framework is not tested in a real-time setting, it lays the groundwork for future deployment scenarios by demonstrating the potential of combining structured data processing with model-driven insights. Ultimately, the contribution lies in the development of a flexible and generalizable pipeline for enhancing cyber threat detection using publicly available security datasets.
The implications of successfully implementing a precise detection model extend beyond the immediate utility of identifying VPN and proxy connections; such a model also lays the foundation for future adaptive threat detection that evolves alongside changing adversarial tactics, thereby offering dynamic resilience. The inherent innovation of this project lies in the precision and efficiency of the detection model, which combines high computational efficiency with advanced pattern recognition, making it a critical tool for the cybersecurity landscape of tomorrow.
The dataset employed in this study is structured with essential metadata, including the type of detected attack, IP addresses of the perpetrators, timestamps, and data from the reporting entities. This limited, but crucial, information poses both challenges and opportunities for feature engineering. This study applies novel feature-engineering approaches, deriving nuanced time series features for each detected event that enable precise identification of anonymising connections (e.g., VPN or proxy) within the reported data.
To address the problem, we sourced an open dataset from Kaggle [
17], specifically curated to align with CrowdSec’s objectives of detecting VPN/proxy IP addresses associated with cyber threats.
A rigorous exploration of CrowdSec’s dataset was undertaken to enhance the interpretability and predictive power of each variable, given the importance of nuanced feature representation in threat detection (See
Table 1):
Each variable is described in detail below:
Attack timestamp: Represented in datetime64 format, capturing high-resolution timestamps for each attack. This temporal granularity supports fine-grained time series analyses, which are critical for uncovering attack frequency patterns—whether daily, weekly, or seasonal.
Monitor country: This categorical feature denotes the monitoring entity’s country of origin, enabling a spatial understanding of attack reporting distribution. Geographic categorisation assists in optimizing regionalised threat response strategies.
Monitor AS number: A float64 variable representing the Autonomous System (AS) number of the reporting monitor. Recognising AS numbers provides insights into the network architecture supporting attack detection.
Attacker country: The attacker’s country is instrumental in identifying geo-located patterns and regional threat vectors, particularly in cases involving organised or state-sponsored actors.
Attacker AS number: This float64 variable details the attacker’s AS number. Knowing these identifiers allows for the tracing of origin infrastructure, revealing behavioural clusters of certain ASs.
Attack type: A categorical classification of the attack type, this feature enables model differentiation based on threat modality, enhancing detection specificity for each attack vector.
Monitor ID and attacker IP ID: Unique identifiers for each monitoring entity and attacker IP, respectively, in int64 format. These variables enable precise tracking and correlation of monitoring and attack events across time and space.
label: This binary label acts as the supervised learning target, distinguishing IP addresses associated with VPN/proxy use (1) from those not using these services (0).
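As an illustration of the expected schema, the following sketch loads the dataset with pandas and checks the column types and class balance; the file name and column identifiers are hypothetical placeholders, since the Kaggle dataset may use different names.

```python
import pandas as pd

# Illustrative loading and inspection of the attack-report dataset.
# File path and column names are hypothetical placeholders.
df = pd.read_csv("crowdsec_attack_reports.csv", parse_dates=["attack_timestamp"])

expected_dtypes = {
    "attack_timestamp": "datetime64[ns]",   # high-resolution attack timestamp
    "monitor_country": "category",          # reporting entity's country
    "monitor_as_number": "float64",         # reporting monitor's AS number
    "attacker_country": "category",         # attacker's country
    "attacker_as_number": "float64",        # attacker's AS number
    "attack_type": "category",              # categorical attack classification
    "monitor_id": "int64",                  # monitoring entity identifier
    "attacker_ip_id": "int64",              # attacker IP identifier
    "label": "int64",                       # 1 = VPN/proxy, 0 = otherwise
}

print(df.dtypes)
print(df["label"].value_counts(normalize=True))  # reveals the class imbalance
```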
4. Methodology for Explainable, Interpretable, and Non-Interpretable Algorithms
To address the classification of anonymised network traffic through machine learning,
Section 4 is structured into two main subsections (see
Figure 1). First, in
Section 4.1, we describe the data preprocessing pipeline, which includes strategies for managing class imbalance, feature engineering, and normalization techniques designed to ensure the reliability and consistency of the input data. Second, in
Section 4.2, we present the suite of machine learning models employed—ranging from explainable and interpretable algorithms to non-interpretable architectures—along with the evaluation metrics used to assess their effectiveness in detecting VPN and proxy-related IP addresses.
4.1. Data Preprocessing
The dataset used in this study was obtained from a publicly available repository on Kaggle [
17]. Prior to training, a complete data preprocessing pipeline was applied to ensure quality input for the models and to support reliable and interpretable evaluation results.
For the construction of training and evaluation subsets, a hold-out method was applied. Since the evaluation labels were initially unavailable, the labelled portion of the dataset was partitioned following a stratified split approach inspired by [
18], allocating 70% for training and 30% for model evaluation. Although this split ratio is widely adopted in the literature, there is no empirical guarantee that it is universally optimal. Nonetheless, this strategy preserves the original class distribution and provides a practical balance between data availability for training and evaluation.
As a result, 61,629,685 tuples were divided into 43,140,780 for training and 18,488,905 for evaluation. This partitioning enables both the development and validation of models under consistent statistical conditions, maintaining the integrity of the original dataset structure. The stratified nature of the split also ensures proportional representation of both classes, which is essential when evaluating model performance in the context of significant class imbalance.
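A minimal sketch of this stratified hold-out split is shown below, assuming the dataframe loaded in the previous sketch and the label column described in Table 1.

```python
from sklearn.model_selection import train_test_split

# Stratified 70/30 hold-out split, preserving the original class distribution.
# `df` is assumed to be the preprocessed dataframe from the previous sketch.
X = df.drop(columns=["label"])
y = df["label"]

X_train, X_eval, y_train, y_eval = train_test_split(
    X, y,
    test_size=0.3,      # roughly 18.5 of the 61.6 million labelled tuples
    stratify=y,         # keep class proportions identical in both subsets
    random_state=42,    # illustrative seed for reproducibility
)
```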
Before proceeding with the analysis, extensive pre-processing of the data was carried out, including handling missing values, encoding categorical variables, feature selection, normalization and the application of techniques to correct class imbalances. This last aspect is particularly relevant due to the marked imbalance in the dataset: class 0 (non-VPN/Proxy) accounts for approximately 98.32% of the instances (60,594,448 records), while class 1 (VPN/Proxy) accounts for only about 1.68% (1,035,237 records). To mitigate this problem, the study implemented several strategies in the training set, such as undersampling and SMOTE (Synthetic Minority Over-sampling Technique), aimed at improving the sensitivity of the model towards the minority class without negatively affecting its overall performance. The decision on which technique to apply was guided by preliminary validation experiments to assess the trade-off between model stability and minority-class recall. As a result, the diagram shown in
Figure 1 is obtained. Although SMOTE proved useful in synthetically balancing the dataset, we acknowledge that oversampling alone may not fully capture the underlying variability and behavioural signatures associated with VPN/proxy traffic. The absence of alternative datasets or a controlled testbed setup limits the representativeness of minority class patterns, which could affect generalisation in real-world applications. These considerations are discussed further in the conclusions as avenues for future work.
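The two imbalance-correction strategies can be sketched as follows with the imbalanced-learn library, applied only to the training partition; the parameters shown are illustrative rather than the exact settings used per model (see Table 2).

```python
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

# Both strategies are applied only to the (already encoded, numeric) training
# split; the evaluation split keeps its natural class distribution.
X_train_sm, y_train_sm = SMOTE(random_state=42).fit_resample(X_train, y_train)
X_train_us, y_train_us = RandomUnderSampler(random_state=42).fit_resample(X_train, y_train)
```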
4.2. Machine Learning Models
This section presents the machine learning algorithms employed for classifying IP addresses associated with VPN or proxy services. Each model is briefly described, highlighting its operational principles and applicability to the detection task.
To evaluate model performance, we used four commonly accepted metrics [
12]:
Accuracy: The proportion of correctly predicted instances among all predictions. In imbalanced datasets, this metric may be misleading, as high accuracy can be achieved by consistently predicting the majority class.
Precision: The proportion of true positive predictions among all instances predicted as positive. It reflects how reliable the model is when it predicts a VPN/proxy connection.
Recall: The proportion of true positives identified out of all actual positive instances. It indicates the model’s ability to detect all VPN/proxy-related traffic.
Macro F1-score: The harmonic mean of precision and recall, calculated for each class and then averaged. It is particularly suitable for imbalanced datasets, as it gives equal weight to both classes.
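These metrics can be computed as in the sketch below, assuming `y_eval` holds the true labels of the evaluation split and `y_pred` the corresponding model predictions.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Evaluation metrics used throughout the study (positive class = VPN/proxy).
accuracy = accuracy_score(y_eval, y_pred)
precision = precision_score(y_eval, y_pred)
recall = recall_score(y_eval, y_pred)
macro_f1 = f1_score(y_eval, y_pred, average="macro")  # equal weight per class
```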
The following algorithms were evaluated:
Explainable: Explainable algorithms are those whose internal decision-making logic can be transparently represented and understood by humans, often through rule-based or instance-based mechanisms. These models provide direct insight into how predictions are made, making them suitable for applications requiring high levels of transparency and traceability.
- –
k-Nearest Neighbours (KNN): A non-parametric algorithm that assigns class labels based on the majority label of the
k nearest data points. Simplicity and interpretability make KNN a useful baseline model. It was evaluated with k = 3 and k = 5 [11].
- –
Chi-square + KNN (Chi): This model combines the feature selection approach proposed by Chi et al. [
19] with a fuzzy logic-based classification system. The method focuses on identifying the most relevant features through rule-based heuristics, enabling the construction of interpretable and effective fuzzy classification models.
- –
HFER (Hierarchical Fuzzy Exception Rules): A fuzzy rule-based approach that applies hierarchical logic to handle exceptions and borderline cases in classification tasks. It enhances interpretability while maintaining competitive performance [
14].
- –
Decision Tree: A tree-based model that recursively splits the dataset using features that result in the highest information gain. Decision trees are highly interpretable and useful for identifying decision logic explicitly [
20].
Interpretable: Interpretable algorithms offer a balance between predictive performance and understandability. While their internal workings may not be as directly explainable as rule-based systems, they maintain a structure that allows human analysts to interpret decisions through parameters, feature importance, or simplified representations.
- –
Random Forest: An ensemble method composed of multiple decision trees. Each tree is trained on a random subset of data and features, and their predictions are aggregated by majority voting. This improves robustness and generalization [
21].
- –
Naïve Bayes: A probabilistic model based on Bayes’ theorem and the assumption of conditional independence between features. Despite its simplicity, it often performs well in text and high-dimensional datasets [
22].
- –
Support Vector Machines (SVMs): A margin-based classifier that identifies the optimal hyperplane to separate classes. We evaluated SVM with linear, polynomial, and radial basis function (RBF) kernels to explore different types of boundaries [
23].
Non-interpretable: Non-interpretable algorithms, often referred to as black-box models, achieve high predictive accuracy by capturing complex, non-linear relationships in data. However, their internal mechanisms are opaque and difficult to interpret, which may limit their applicability in contexts requiring transparency, auditability, or regulatory compliance.
- –
Neural Networks: Deep learning models with multiple hidden layers capable of capturing complex, non-linear patterns in data. The implemented architecture included seven blocks, each consisting of
- ∗
Dense layer with 128 neurons and ReLU activation;
- ∗
Batch normalization for improved training stability;
- ∗
Dropout layer with 50% rate to reduce overfitting.
The final output layer uses softmax activation for binary classification [
13].
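For reference, the non-fuzzy classifiers listed above can be instantiated as in the following sketch; apart from the stated k values and kernels, all settings are scikit-learn defaults, and the Gaussian naïve Bayes variant is assumed here for illustration.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Illustrative instantiation of the non-fuzzy models evaluated in this study
# (the Chi and HFER fuzzy systems require dedicated rule-based implementations).
classifiers = {
    "KNN3": KNeighborsClassifier(n_neighbors=3),
    "KNN5": KNeighborsClassifier(n_neighbors=5),
    "DecisionTree": DecisionTreeClassifier(random_state=42),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=42),
    "NaiveBayes": GaussianNB(),
    "SVM-linear": SVC(kernel="linear"),
    "SVM-poly": SVC(kernel="poly"),
    "SVM-rbf": SVC(kernel="rbf"),
}
```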
For the following pseudocode Algorithm 1, we outline the process of training a traditional machine learning classifier. The steps involve selecting features, handling missing data, and choosing the appropriate sampling and classification techniques. Based on the type of classifier selected (such as KNN, decision trees, or SVM), the model is trained on the provided dataset to generate a trained classifier. Below is the pseudocode:
Algorithm 1 Simplified Classification Pipeline
1: Input: Training dataset D
2: Output: Trained classifier model
3: Handle missing data and encode categorical variables in D
4: If needed, normalize D
5: If needed, apply undersampling or SMOTE based on validation performance and model characteristics
6: If needed, apply feature selection
7: Select and configure classification algorithm (e.g., KNN, SVM, RF, NB)
8: Train classifier on D
9: return Trained model
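A possible realisation of Algorithm 1 with scikit-learn and imbalanced-learn is sketched below; the chosen scaler, selector, resampler, and classifier are illustrative, since the actual combination varies per model as reported in Table 2.

```python
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import MinMaxScaler
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.ensemble import RandomForestClassifier

# Steps 4-8 of Algorithm 1: normalization, resampling, feature selection,
# and classifier training, packaged so resampling only affects fitting.
pipeline = Pipeline(steps=[
    ("normalize", MinMaxScaler()),                       # step 4 (if needed)
    ("resample", SMOTE(random_state=42)),                # step 5 (or undersampling)
    ("select", SelectKBest(chi2, k=5)),                  # step 6 (if needed)
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42)),
])

model = pipeline.fit(X_train, y_train)                   # step 8
y_pred = model.predict(X_eval)
```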
The pseudocode Algorithm 2 illustrates the process of training a neural network model. It includes essential steps such as feature selection, data preprocessing, and the configuration of the neural network architecture. The architecture incorporates dense layers, batch normalization, and dropout techniques to improve model performance and prevent overfitting. Finally, the network is trained using the provided dataset to generate a fully trained neural network model.
Algorithm 2 Neural Network Training Pipeline
1: Input: Training dataset D
2: Output: Trained neural network classifier
3: Handle missing data and encode categorical variables in D
4: if normalization is needed then
5:   Normalize D
6: end if
7: if sampling is needed then
8:   Apply undersampling or SMOTE based on validation performance and model characteristics
9: end if
10: if feature selection is needed then
11:   Apply feature selection
12: end if
13: Initialize neural network:
14:   Add 7 blocks of Dense(128, relu), BatchNormalization, Dropout(0.5)
15:   Add output layer: Dense(1, softmax)
16: Train the neural network with D
17: return Trained model
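A corresponding sketch of Algorithm 2 using Keras is given below. The softmax output layer is realised here with two units (one per class) so that the network is trainable; the number of epochs, batch size, and optimiser are illustrative assumptions rather than the tuned values.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Steps 13-16 of Algorithm 2: seven Dense(128, ReLU) blocks with batch
# normalization and 50% dropout, followed by a softmax output layer.
def build_network(input_dim: int) -> tf.keras.Model:
    net = models.Sequential([layers.Input(shape=(input_dim,))])
    for _ in range(7):
        net.add(layers.Dense(128, activation="relu"))
        net.add(layers.BatchNormalization())
        net.add(layers.Dropout(0.5))
    net.add(layers.Dense(2, activation="softmax"))  # two units, one per class
    net.compile(optimizer="adam",
                loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
    return net

model = build_network(input_dim=X_train.shape[1])
model.fit(X_train, y_train, epochs=10, batch_size=4096, validation_split=0.1)
```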
The results of each algorithm are then presented, highlighting their strengths and weaknesses in terms of accuracy and computational efficiency. Furthermore, the implications of these results for the effective detection of VPN/Proxy services in the context of cybersecurity are discussed.
5. Analysis of Results
In this section, we conduct a comprehensive analysis of the results obtained from the explainable, interpretable, and non-interpretable algorithms evaluated for the classification of IP addresses associated with VPN or Proxy services. Furthermore, we provide recommendations based on the specific context of their implementation. In the realm of cybersecurity, it is imperative not only to identify threats accurately but also to do so in a manner that is understandable and efficient. The selection of the appropriate algorithm can significantly impact the ability to detect and mitigate attacks, particularly in contexts where adversaries employ advanced techniques to conceal their identities, such as through VPN and proxy services.
To effectively address this challenge, it is essential to evaluate the macro F1-scores of each algorithm, as this metric provides a balanced perspective on model performance in terms of precision and recall, without unduly favouring more frequent classes. The evaluation of macro F1-scores allows for an objective comparison of the inherent strengths and weaknesses of each algorithm. This metric is particularly valuable in scenarios characterised by class imbalance, such as the present study, where the majority of IP addresses are not associated with VPN or proxy services.
By comparing different algorithms, we can identify those models that offer higher precision, as well as those that excel in terms of interpretability and computational efficiency. This information enables us to formulate recommendations tailored to the project’s specific needs. For instance, in contexts where precision is paramount, such as in detecting sophisticated attacks, algorithms with superior macro F1-scores will be prioritised. Conversely, in environments where interpretability and ease of implementation are of greater importance, explainable algorithms will be deemed more suitable.
Moreover, contextualised recommendations are essential to ensure that cybersecurity solutions are effective and practical. An algorithm that delivers high precision but proves difficult to interpret may not be the best choice in all cases, particularly when thorough auditing of the decision-making process is required to comply with security and privacy regulations.
The results highlight a clear distinction in algorithm effectiveness, as shown in
Figure 2 and in the scatter plot depicted in
Figure 3. In this figure, neural networks achieve the highest performance, approaching the maximum value of 1.0, suggesting their ability to capture complex patterns and relationships in the data. Nevertheless, it is important to consider that complex data structures can often be transformed into simpler representations through dimensionality reduction or embedding techniques (e.g., PCA, autoencoders). Such transformations may allow less complex or more interpretable models to achieve higher performance by making key patterns more accessible. While this study did not explore these techniques, their integration could potentially bridge the performance gap between transparent and opaque models and thus remains a promising avenue for future work. Other models, such as decision trees, random forest, naïve Bayes, and linear SVM, perform moderately well, indicating they may capture simpler patterns effectively but might struggle with more intricate structures. On the other hand, polynomial and radial SVM show the lowest performance, close to 0.4, potentially due to suboptimal parameter settings, insufficient flexibility, or data characteristics that are poorly suited to these models. This comparison emphasises the importance of algorithm selection and configuration when addressing classification tasks, as performance can vary significantly depending on the method and its suitability to the dataset.
Although
Figure 4 shows that most algorithms achieve high accuracy scores—often exceeding 0.9—this metric must be interpreted with caution, particularly in imbalanced datasets. In such cases, high accuracy can be misleading, as it may mask poor performance on minority classes. While neural networks appear to deliver the highest accuracy, approaching 1.0 in some instances, this alone does not necessarily reflect superior classification performance. Therefore, complementary metrics such as precision, recall, or F1-score are essential for a more nuanced and reliable evaluation of model effectiveness, especially in tasks where class imbalance is a concern.
In addition to neural networks, other algorithms such as HFER3.5, random forest, naïve Bayes, and linear support vector machines (SVM) also show commendable performance, achieving relatively high accuracy scores. These algorithms display significant effectiveness in handling the data and task requirements. Their strong results suggest they are well-equipped to deal with diverse data patterns and underlying complexities.
Given the scale of the dataset used in this study—over 61 million labelled instances—computational efficiency becomes a critical factor in the practical deployment of threat detection models. Therefore, we provide a theoretical analysis of the computational complexity and empirical feasibility of the evaluated algorithms.
Explainable algorithms such as KNN and decision trees exhibit relatively low training complexity. KNN has negligible training time but incurs a prediction cost of O(n · d) per query (with n training samples and d features), which may hinder real-time scalability. Decision trees, on the other hand, typically require O(n · d · log n) for training and O(log n) for prediction, making them suitable for rapid inference tasks. HFER, a fuzzy rule-based system, introduces moderate computational overhead due to rule generation but remains tractable with hierarchical pruning mechanisms.
Interpretable algorithms like naïve Bayes and linear SVM offer efficient training and prediction. Naïve Bayes has linear time complexity O(n · d) for both phases, and SVMs with linear kernels scale better than their polynomial or RBF counterparts, which become impractical in large-scale settings due to their higher training complexity (between O(n²) and O(n³)).
Neural networks, while achieving the highest predictive performance, exhibit the highest computational cost. Their training complexity typically lies in the order of O(n · h · e), where n is the number of training samples, h the number of neurons per layer, and e the number of epochs. Training on over 40 million samples required high memory availability and prolonged processing times, making them less suitable for real-time or low-resource scenarios without GPU acceleration.
In conclusion, explainable and interpretable models such as decision trees, HFER and naïve Bayes provide a favourable trade-off between detection quality and resource consumption, which supports the resource efficiency claim made in the summary.
On the other hand, polynomial SVM, although performing slightly less well than the aforementioned methods, still maintains competitive accuracy levels. Its performance is respectable, reflecting its ability to capture some level of complexity, but it falls short of matching the top performers. Meanwhile, Radial SVM demonstrates the weakest performance across the board, with an accuracy score approaching 0.8. While still functional, this result indicates that Radial SVM struggles with the task at hand, potentially due to its limitations in handling complex data structures or the nature of the problem being addressed. To ensure reproducibility and facilitate further experimentation, all code and artefacts related to model development and hyperparameter tuning have been made publicly available (Available online:
https://github.com/jrtrillo/TFM-ciberseguridad (accessed on 24 July 2025)).
To ensure a rigorous analysis, we systematically explored a selection of hyperparameters for each algorithm, evaluating every candidate configuration in the resulting search space. The optimal hyperparameter selection for each algorithm, determined through extensive experimentation and fine-tuning, is detailed in
Table 2. Although
Table 2 reports only the best-performing preprocessing configuration for each model, it is important to note that all combinations of the considered techniques (i.e., SMOTE, undersampling, feature selection, and normalization) were tested independently and jointly during experimentation. The configurations shown are those that achieved the highest macro F1-score in each case. Including all possible results would have significantly increased the length and complexity of the manuscript, so only the optimal setting per model is presented for clarity. Nevertheless, a complete ablation-style breakdown may be considered in future work to quantify the impact of each individual preprocessing component.
Table 2 summarises the combination of preprocessing techniques applied to each model. The columns represent the following operations:
Feature Selection: Identifies and retains only the most relevant features to reduce dimensionality and improve model generalisation.
SMOTE (Synthetic Minority Over-sampling Technique): A strategy to synthetically generate new instances of the minority class in order to balance the dataset.
Undersampling: Reduces the number of majority class samples to mitigate class imbalance by equalising class proportions.
Data Normalization: Transforms feature values to a common scale (typically [0,1]) to ensure consistency across different models, especially those sensitive to feature magnitudes.
Following the identification of the optimal configurations, we proceeded to evaluate the performance of the algorithms across multiple metrics. In addition to the primary evaluation metric, the
macro F1-score, we incorporated several other important performance measures, including
accuracy,
recall, and
precision for each class. By considering these complementary metrics, we were able to gain a more comprehensive understanding of how well the algorithms performed in various aspects of the classification task. Each metric provides unique insights into different facets of model performance:
accuracy reflects overall correctness,
precision assesses the relevance of positive predictions, and
recall evaluates the model’s ability to identify all relevant instances. This multifaceted evaluation approach, detailed in
Table 3, allowed for a more informed and holistic assessment of the algorithms’ effectiveness, ensuring that our conclusions were based on a well-rounded analysis of their performance. To further enhance reproducibility,
Table 4 summarises the best-performing hyperparameter configurations used for each algorithm after tuning.
This section addresses
RQ1, as the analysis—particularly the results presented in
Table 2 and
Figure 2—offers a comprehensive comparative evaluation of diverse machine learning algorithms, emphasizing the inherent trade-off between predictive accuracy and interpretability in the context of detecting IP addresses linked to anonymity services.
6. Discussion
In this study, we conducted a comparative analysis of explainable and interpretable algorithms based on their macro F1-scores, evaluating their advantages and drawbacks depending on the application context. Explainable algorithms like KNN, fuzzy logic models, and decision trees, together with interpretable random forests, offer transparency in decision-making processes, which is valuable in applications where interpretability is a priority. Among these, random forest stands out with a macro F1-score of 0.5972, indicating a balance of accuracy and interpretability that makes it suitable for real-world deployment where model transparency is required. KNN variants, despite being resource-efficient and straightforward, showed lower scores, such as KNN3’s 0.5680 and KNN5’s 0.5447, suggesting that while useful for resource-constrained environments, they may fall short in more demanding scenarios. Fuzzy logic models, especially HFER with a 0.6106 score, demonstrated robust interpretability and moderate performance, aligning well with applications that demand transparent yet reasonably accurate threat detection.
Beyond the comparative analysis of performance metrics, it is important to connect these results back to the core problem that motivates this study: identifying VPN/proxy-based anonymised IP traffic in real-world cybersecurity scenarios. Our findings suggest that models like neural networks, despite their limited interpretability, can effectively flag traffic patterns typically associated with anonymity services due to their high sensitivity to complex behavioural indicators such as anomalous AS numbers or atypical geolocation dynamics. On the other hand, explainable models like HFER and decision trees, though less accurate, are better suited for environments where traceability and operational transparency are essential, such as real-time security audits or compliance-driven monitoring. This highlights that model selection is not only a technical decision but a strategic one, depending on the specific needs of the cybersecurity infrastructure.
On the other hand, interpretable and non-interpretable algorithms such as naïve Bayes, SVMs, and neural networks demonstrated a significant edge in terms of performance, especially in complex data structures, where interpretability alone is less critical. Neural networks achieved the highest macro F1-score of 0.8786, outperforming all other models and emphasizing their capability to capture intricate, non-linear data relationships critical in cybersecurity. Naïve Bayes, with a macro F1-score of 0.6288, exhibited strong interpretability and adaptability for dynamic data, adding value in fast-paced cybersecurity environments. While SVMs performed comparably lower, with scores ranging from 0.4878 to 0.5172, they retain their strength in high-dimensional scenarios, which are common in security analytics, but with limited transparency due to the complexity of kernel-based decision boundaries.
When prioritizing accuracy, especially for scenarios demanding sophisticated pattern recognition, non-interpretable algorithms, particularly neural networks, clearly excel, though they require considerable computational resources and offer limited insight into the prediction rationale. This characteristic can restrict their applicability in cases where decision transparency is crucial for compliance or trust. In contrast, explainable and interpretable models such as decision trees and random forests offer a compelling balance for implementations that demand clarity, efficiency, and ease of interpretation. Decision trees, scoring 0.5668, and random forest, with a slightly higher 0.5972, provide straightforward interpretability while maintaining sufficient accuracy, making them advantageous for scenarios where decisions need to be easily understandable and verifiable by human analysts. Fuzzy logic models, specifically HFER, maintain a level of flexibility and performance suitable for applications where ambiguity in data exists, providing a middle ground between complexity and transparency.
The findings of this study indicate that a flexible approach combining both types of algorithms could maximize the effectiveness of cybersecurity solutions. Hybrid systems could employ explainable algorithms for continuous monitoring, where transparency is key and interpretable algorithms for detailed analysis in response to specific incidents. Such an adaptive framework, aligning with the dynamic and diverse threat landscape of cybersecurity, allows for optimal model selection based on operational demands, enhancing both the robustness and responsiveness of threat detection systems. Ultimately, these findings directly answer RQ2 by identifying the models that offer the best explainability–performance balance. In particular, the HFER model (macro F1-score = 0.6106) emerges as the most effective explainable algorithm, combining rule-based transparency with acceptable detection capability. Among interpretable models, Random Forest (macro F1-score = 0.5972) also demonstrates high detection efficiency while maintaining a moderate level of interpretability. Therefore, both models represent suitable options when explainability is required without severely compromising classification performance. Furthermore,
RQ3 is addressed through the conceptual framework articulated in the Introduction and substantiated by the bidimensional representation in
Figure 3, which serves as a cogent tool for classifying machine learning models according to the inherent trade-offs between their interpretability and predictive performance.
7. Conclusions
This study addresses the challenge of enhancing threat detection in cybersecurity through the use of explainable and interpretable machine learning algorithms, focusing on the accurate classification of IP addresses associated with VPN and Proxy services. These addresses are commonly exploited by malicious actors to conceal their identities and facilitate harmful activities. This work extends to applications in next-generation firewalls (NGFWs), which integrate traditional firewall capabilities with advanced technologies like deep packet inspection (DPI), intrusion prevention systems (IPS), and threat intelligence, allowing for efficient identification and blocking of both known and unknown threats. Additionally, NGFWs support SSL/TLS inspection to decrypt and analyse encrypted traffic, providing an added layer of security by revealing hidden threats in encrypted communications. They also incorporate zero-trust access models, where access is granted based on user verification and contextual factors rather than solely perimeter security.
Our analysis highlights the role of specific algorithms: KNN, though simplistic, proved effective in balanced data scenarios, while decision trees provided high interpretability but required pruning to avoid overfitting. Fuzzy logic algorithms excelled in managing ambiguity in complex datasets, and random forests offered robust accuracy with substantial feature importance interpretation. SVMs, while challenging to interpret, demonstrated efficacy in high-dimensional spaces, essential in cybersecurity contexts, while naïve Bayes showed agility in real-time data updates, and neural networks, though opaque, yielded insights through feature importance analyses.
While foundational, this study opens avenues for future work in combining explainable and interpretable algorithms for more robust threat detection. Priority areas include advancing feature engineering to better capture attacker behaviour dynamics, incorporating ensemble techniques like stacking and blending to bolster model accuracy and interpretability and developing real-time deployment solutions in intrusion detection systems. Enhancing neural network transparency with post-hoc explainability methods such as LIME (local interpretable model-agnostic explanations) or SHAP (Shapley additive explanations) could significantly improve understanding of their decision-making processes. Although not used in this study, these techniques are highly recommended for use in future work aiming to make deep learning models more interpretable in cybersecurity environments. Furthermore, embedding real-time adaptive learning models would enable quicker responses to evolving threats, enhancing overall resilience. Additionally, future work should consider the collection of enriched datasets through controlled testbeds or the integration of heterogeneous sources with annotated VPN/proxy traffic. This would address the limitations of relying solely on oversampling techniques such as SMOTE and enhance the robustness and generalisability of the proposed models.
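As a pointer for such future work, the sketch below illustrates how model-agnostic SHAP values could be obtained for any trained classifier (here assumed as `model`) over the evaluation features `X_eval`; this was not part of the experiments reported in this study.

```python
import shap

# Illustrative post-hoc explanation (not used in this study): model-agnostic
# SHAP values computed for a trained classifier on a sample of the data.
background = shap.sample(X_eval, 100)      # reference set for the explainer
explainer = shap.KernelExplainer(model.predict, background)

sample = shap.sample(X_eval, 50)           # instances to explain
shap_values = explainer.shap_values(sample)
shap.summary_plot(shap_values, sample)     # global feature attribution plot
```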
Finally, the integration of NGFWs with DPI, IPS, threat intelligence, SSL/TLS inspection, and zero-trust models lays a strong defence against sophisticated threats. Emphasizing digital forensics and automated AI-driven responses further bolsters incident response capabilities. This study underscores the importance of ethical considerations in data privacy, proposing compliance with standards like GDPR to ensure responsible handling of cybersecurity data.