Figure 1.
Layers of an Intelligent Edge Environment.
Figure 2.
SDN controller in an Intelligent Edge Environment.
Figure 3.
ML process for SDN Model development.
Figure 4.
Confusion matrices for binary and multi-class classification on the InSDN dataset. XGBoost gives the most consistent multi-class results, accurately classifying both majority and minority attack categories.
Figure 5.
Hyperparameter-Tuned Results for Binary Classification.
Figure 6.
Hyperparameter-Tuned Results for Multi-Class Classification.
Figure 7.
Confusion matrices after hyperparameter tuning (binary classification) on the InSDN dataset (Benign vs. Attack). Top row: raw counts illustrating absolute misclassification numbers. Bottom row: row-normalized confusion matrices (per-class recall), showing 100% recall for both classes across models after tuning.
Figure 8.
Confusion matrices after hyperparameter tuning (multi-class classification) on the InSDN dataset. Top row: raw counts for each model (Logistic Regression, Random Forest, XGBoost). Bottom row: row-normalized confusion matrices (per-class recall). The row-normalized plots emphasize per-class detection performance and reveal that Random Forest and XGBoost achieve near-perfect per-class recall after tuning.
Figure 9.
Binary classification confusion-matrix results after tuning.
Figure 10.
Multi-class confusion-matrix results after tuning.
Figure 11.
Combined ROC curves for all models (binary and multi-class). (a) Binary: combined ROC curves for all models; (b) Multi-class: combined ROC curves (micro-average) for all models.
Figure 12.
Kafka-enabled Machine Learning framework in SDN architecture.
Figure 13.
Execution of the Docker Compose command in the terminal.
Figure 14.
Successful installation of Zookeeper, Grafana, PostgreSQL, and Kafka.
Figure 15.
Output of the docker images command.
Figure 16.
Docker Desktop status.
Figure 17.
Kafka producer reads the SDN dataset.
Figure 18.
Kafka producer reads from the SDN dataset and streams the data into defined Kafka topics for downstream processing.
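The producer step described above can be sketched as follows. This is a minimal sketch, not the paper's implementation: the topic name, broker address, and dataset path are placeholders, and kafka-python is assumed as the client library.

```python
import csv
import json

# Hypothetical topic name and dataset path; adjust to the actual deployment.
TOPIC = "sdn-traffic"
DATASET = "InSDN_merged.csv"

def row_to_message(row: dict) -> bytes:
    """Serialize one CSV row of the SDN dataset as a JSON-encoded Kafka message."""
    return json.dumps(row).encode("utf-8")

def stream_dataset(path: str = DATASET, topic: str = TOPIC) -> None:
    """Read the SDN dataset row by row and publish each record to a Kafka topic."""
    # kafka-python is assumed here; confluent-kafka would work similarly.
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            producer.send(topic, row_to_message(row))
    producer.flush()
```

Serializing each row as JSON keeps the downstream consumer schema-agnostic: it can recover the original column names from the message itself.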
Figure 19.
Streamlit command execution.
Figure 20.
SDN live analysis: Streamlit web interface (localhost:8501) visualizes the live output of the Kafka consumer and real-time anomaly detection model.
Figure 21.
Kafka consumer processes incoming messages.
Figure 22.
Kafka consumer logs results into PostgreSQL: The system dynamically creates PostgreSQL tables using column headers from the SDN dataset and uploads streaming outputs for structured querying.
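The dynamic table creation described above can be sketched as follows. The column-name sanitization rule and the TEXT-only typing are assumptions, and a psycopg2-style connection object is expected; casting to numeric types can be done later in SQL.

```python
import csv
import re

def make_create_table_sql(table: str, headers: list[str]) -> str:
    """Build a CREATE TABLE statement from the SDN dataset's column headers.

    Column names are sanitized (spaces, slashes, etc. become underscores and
    are lowercased); every column is stored as TEXT for simplicity.
    """
    cols = []
    for h in headers:
        name = re.sub(r"\W+", "_", h.strip()).strip("_").lower()
        cols.append(f"{name} TEXT")
    return f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)});"

def ensure_table(conn, table: str, csv_path: str) -> None:
    """Create the PostgreSQL table for a dataset file if it does not exist."""
    # conn is assumed to be a psycopg2 connection: conn = psycopg2.connect(...)
    with open(csv_path, newline="") as f:
        headers = next(csv.reader(f))
    with conn.cursor() as cur:
        cur.execute(make_create_table_sql(table, headers))
    conn.commit()
```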
Figure 23.
Grafana dashboards visualize key metrics such as label distribution.
Figure 24.
Grafana dashboards visualize key metrics such as label distribution, anomaly counts, and real-time data trends from PostgreSQL.
Figure 25.
Kafka-enabled Machine Learning framework status.
Figure 26.
Implementation scenario: MEC at the core endpoint.
Table 1.
Summary of Strengths and Gaps in Data Processing Approaches.
Category | Strengths | Gaps |
---|---|---|
Traditional Methods | Simple deployment | Static, low adaptability |
ML-based methods | High accuracy, adaptable | Not real-time, offline updates |
Streaming (Kafka) | Low latency, scalable | Limited SDN-focused ML integrations |
Edge-based ML | Fast, local decisions | Rarely applied in SDN/NFV pipelines |
Table 2.
InSDN dataset composition.
File Name | Data Source | Attack Types | Record Count |
---|---|---|---|
Normal_data.csv | Benign traffic | FTP, DNS, HTTPS, etc. | 68,424 |
OVS.csv | Open vSwitch testbed | dos, ddos, probe, brute force, web attack, botnet | 138,722 |
Metasploitable-2.csv | Metasploitable-2 VM | dos, ddos, probe, brute force, U2R | 136,743 |
Merged Total | All 3 files | | 343,889 |
Table 3.
SDN-specific attack categories and severity levels.
Attack Type | Total Instances | Severity Level |
---|---|---|
ddos (Distributed Denial-of-Service) | 121,942 | Critical |
probe (Network Reconnaissance) | 98,129 | Medium |
dos (Denial of Service) | 53,616 | High |
BFA (Brute force attack) | 1,405 | High |
web attack | 192 | Medium |
botnet | 164 | Critical |
U2R (User-to-Root Escalation) | 17 | Critical |
Table 4.
Top features ranked by Mutual Information.
Feature | Mutual Information Score |
---|---|
Bwd Header Len | 1.247 |
Dst Port | 1.126 |
Bwd IAT Tot | 1.034 |
Bwd IAT Max | 1.026 |
Bwd IAT Mean | 0.997 |
Init Bwd Win Byts | 0.978 |
Flow Pkts/s | 0.945 |
Bwd Pkts/s | 0.945 |
Flow Duration | 0.944 |
Flow IAT Max | 0.939 |
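A ranking like the one in Table 4 can be produced with scikit-learn's mutual_info_classif. The sketch below uses synthetic data and illustrative feature names; it is not the paper's feature-selection code.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def rank_features(X: np.ndarray, y: np.ndarray, names: list[str]) -> list[tuple[str, float]]:
    """Score each feature's mutual information with the label, ranked descending."""
    scores = mutual_info_classif(X, y, random_state=42)
    return sorted(zip(names, scores), key=lambda t: -t[1])

# Synthetic illustration: one feature tracks the label, one is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = np.column_stack([y + 0.01 * rng.standard_normal(200), rng.standard_normal(200)])
ranked = rank_features(X, y, ["informative", "noise"])
```

Mutual information is non-negative and captures non-linear dependence, which is why it suits flow features such as inter-arrival times that relate to the label non-monotonically.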
Table 5.
Multi-class label distribution in the InSDN Dataset.
Class Label | Instance Count |
---|---|
probe | 98,129 |
ddos | 121,942 |
Normal | 68,423 |
dos | 53,616 |
brute force (BFA) | 1405 |
web attack | 192 |
botnet | 164 |
U2R | 17 |
Table 6.
Binary classification class distribution before and after SMOTE.
Class | Before SMOTE, count (%) | After SMOTE, count (%) |
---|---|---|
Benign (0) | 54,738 (19.9%) | 220,372 (50.0%) |
Attack (1) | 220,372 (80.1%) | 220,372 (50.0%) |
Table 7.
Multi-class classification distribution before and after SMOTE.
Class | Before SMOTE, count (%) | After SMOTE, count (%) |
---|---|---|
Brute force | 1124 (0.41%) | 97,553 (12.5%) |
botnet | 131 (0.05%) | 97,553 (12.5%) |
ddos | 97,553 (35.46%) | 97,553 (12.5%) |
dos | 42,893 (15.59%) | 97,553 (12.5%) |
normal | 54,738 (19.90%) | 97,553 (12.5%) |
probe | 78,503 (28.54%) | 97,553 (12.5%) |
U2R | 14 (0.01%) | 97,553 (12.5%) |
web attack | 154 (0.06%) | 97,553 (12.5%) |
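A quick sanity check on the SMOTE tables: SMOTE oversamples every minority class up to the majority class size, so the balanced shares are uniform at 100/n_classes percent. A minimal sketch using the multi-class training counts from Table 7:

```python
# Multi-class counts before SMOTE (training split), from Table 7.
before = {
    "brute force": 1124, "botnet": 131, "ddos": 97553, "dos": 42893,
    "normal": 54738, "probe": 78503, "U2R": 14, "web attack": 154,
}

def smote_target_distribution(counts: dict[str, int]) -> dict[str, float]:
    """SMOTE raises every minority class to the majority class size,
    so the balanced distribution is uniform: 100 / n_classes percent each."""
    majority = max(counts.values())          # 97,553 (ddos)
    total = majority * len(counts)           # 8 classes after balancing
    return {c: round(100 * majority / total, 1) for c in counts}
```

With eight classes, each ends at 97,553 samples, i.e. 12.5% of the balanced set, matching the "After SMOTE" column.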
Table 8.
Binary classification results on the InSDN Dataset.
Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
---|---|---|---|---|---|
LR | 0.9986 | 0.9986 | 0.9986 | 0.9986 | 0.9994 |
RF | 0.9999 | 0.9999 | 0.9999 | 0.9999 | 0.99996 |
XGB | 0.9999 | 0.9999 | 0.9999 | 0.9999 | 0.999998 |
Table 9.
Multiclass classification results on the InSDN Dataset.
Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
---|---|---|---|---|---|
LR | 0.9778 | 0.9919 | 0.9778 | 0.9838 | 0.9991 |
RF | 0.9997 | 0.9997 | 0.9997 | 0.9997 | 0.99999 |
XGB | 0.9998 | 0.9998 | 0.9998 | 0.9998 | 0.9999999 |
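The support-weighted scores in Tables 8 and 9 can be recomputed directly from a confusion matrix. A minimal sketch (rows are true classes, columns are predictions); note that weighted recall always equals accuracy:

```python
def weighted_metrics(cm: list[list[int]]) -> dict[str, float]:
    """Accuracy and support-weighted precision/recall/F1 from a
    confusion matrix cm[true][pred]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    precision = recall = f1 = 0.0
    for i in range(n):
        support = sum(cm[i])                      # true instances of class i
        tp = cm[i][i]
        pred_i = sum(cm[r][i] for r in range(n))  # predictions of class i
        p = tp / pred_i if pred_i else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        w = support / total                       # class weight by support
        precision += w * p
        recall += w * r
        f1 += w * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```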
Table 10.
Winner per metric combined results.
Index | Task | Metric | Best Model | Score |
---|---|---|---|---|
0 | Binary | Accuracy | XGBoost | 0.999913 |
1 | Binary | F1 | XGBoost | 0.999913 |
2 | Binary | Precision | XGBoost | 0.999913 |
3 | Binary | ROC-AUC | XGBoost | 0.999998 |
4 | Binary | Recall | XGBoost | 0.999913 |
5 | Multi-class | Accuracy | XGBoost | 0.999826 |
6 | Multi-class | F1 | XGBoost | 0.999826 |
7 | Multi-class | Precision | XGBoost | 0.999826 |
8 | Multi-class | ROC-AUC | XGBoost | 1.00000 |
9 | Multi-class | Recall | XGBoost | 0.999826 |
Table 11.
Prototype requirements and associated components.
Name | Details |
---|---|
Apache Kafka 7.0.1 | Real-time data streaming and event processing |
Docker | Containerized environment for the proposed Kafka-driven framework |
Azure Cloud | Scalable cloud-based storage for large datasets |
Grafana (latest) | Real-time visualization and interactive dashboards |
SDN Dataset | Dataset used for anomaly detection in software-defined networks |
PostgreSQL 14 | Time-series and relational data storage backend |
Table 12.
Mapping between SDN dataset features and VANET Kafka streams.
Feature Type | SDN Dataset (InSDN) | VANET Kafka Streams |
---|---|---|
Flow Duration | Yes | Derived from sensor timestamps |
Packet/Byte Count | Yes | Vehicle telemetry stream |
Source/Destination IPs | Yes | OBU/RSU identifiers |
Port Numbers | Yes | Network channel mapping |
Protocol Type | Yes | Vehicle communication protocol |
Timestamps | Yes | Streaming log time |
Attack Labels | Yes | Predicted in real-time using trained models |
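The mapping in Table 12 can be encoded as a simple rename table so a VANET record is translated into the InSDN schema before scoring. The VANET field names below are hypothetical, and the InSDN column names are assumed to match the dataset headers:

```python
# Hypothetical VANET stream field names on the left; assumed InSDN column
# names on the right, following the Table 12 feature types.
VANET_TO_INSDN = {
    "sensor_ts_delta": "Flow Duration",
    "channel": "Dst Port",
    "comm_protocol": "Protocol",
    "log_time": "Timestamp",
}

def to_insdn_record(vanet_record: dict) -> dict:
    """Rename VANET stream fields to InSDN feature names so the trained
    models can score vehicular traffic; unmapped fields are dropped."""
    return {VANET_TO_INSDN[k]: v for k, v in vanet_record.items() if k in VANET_TO_INSDN}
```

Keeping the mapping in one dictionary makes the adaptation to new stream schemas a configuration change rather than a model change.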