This section presents the empirical evaluation of the proposed SAFEL-IoT framework against baseline methods, including FedAvg, FedProx, and Centralized Autoencoder (AE). Experiments were conducted on the SKAB dataset under dynamic conditions simulating concept drift and adversarial attacks. The evaluation focuses on four dimensions: model performance, communication efficiency, explainability accuracy, and robustness.
4.5. Robustness Against Drift and Attacks
SAFEL-IoT is designed to address critical challenges in IIoT deployments, including concept drift and adversarial attacks, which are common in dynamic edge environments. Concept drift occurs when the underlying data distribution changes over time, potentially degrading model accuracy. To counter this, SAFEL-IoT employs an adaptive aggregation mechanism that continuously recalibrates model weights based on divergence metrics. Unlike static federated models, SAFEL-IoT dynamically adjusts the influence of local updates, mitigating the risk of outdated or misaligned data corrupting the global model. During simulation testing, SAFEL-IoT was exposed to Gaussian noise, label flipping, and model poisoning attacks across 100 edge nodes in a smart grid scenario. Despite these perturbations, it preserved a stable F1 score of 0.91 and maintained false positive rates (FPR) below 0.05, demonstrating its resilience.
In scenarios with Gaussian noise—which typically disrupts feature distributions—SAFEL-IoT’s temporal attention mechanism successfully isolated noisy updates, preventing their propagation to the global model. This targeted filtering resulted in a 12% reduction in error propagation compared to baseline methods like FedProx and EdgeGuard. When subjected to label flipping attacks, where malicious clients intentionally mislabel data, SAFEL-IoT’s adaptive weighting mechanism down-weighted the contributions of high-divergence updates, reducing their impact on global accuracy by 14%.
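The down-weighting of high-divergence updates described above can be illustrated with a minimal sketch. It is not the SAFEL-IoT implementation: the use of a coordinate-wise median as the consensus reference, the exponential weighting rule, and the `temperature` parameter are all assumptions made here for illustration.

```python
import numpy as np

def cosine_distance(a, b, eps=1e-12):
    """Return 1 - cosine similarity between two flattened update vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def divergence_weighted_aggregate(updates, temperature=0.5):
    """Aggregate client updates, down-weighting those far from the consensus.

    updates: array-like of shape (n_clients, n_params).
    temperature: hypothetical knob; smaller values penalize divergence harder.
    """
    updates = np.asarray(updates, dtype=float)
    # Assumption: the coordinate-wise median serves as a robust consensus reference.
    reference = np.median(updates, axis=0)
    distances = np.array([cosine_distance(u, reference) for u in updates])
    weights = np.exp(-distances / temperature)
    weights /= weights.sum()
    return weights @ updates, weights, distances

# Four roughly aligned clients and one client whose update points the opposite way.
client_updates = [
    [1.0, 1.0, 1.0],
    [0.9, 1.1, 1.0],
    [1.1, 0.9, 1.0],
    [1.0, 1.0, 0.9],
    [-1.0, -1.0, -1.0],
]
aggregated, weights, distances = divergence_weighted_aggregate(client_updates)
plain_mean = np.mean(client_updates, axis=0)
```

In this toy setting the opposing client receives a weight close to zero, so the aggregate stays near the honest updates, whereas an unweighted mean is pulled noticeably toward the outlier.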
For model poisoning attacks, where compromised clients attempt to inject malicious updates into the global model, SAFEL-IoT’s secure aggregation and differential privacy mechanisms provided an additional layer of protection. Homomorphic encryption ensured that gradient updates were securely aggregated without direct exposure, preventing adversarial clients from reversing model states or altering training trajectories. As a result, SAFEL-IoT maintained 7% lower model drift than traditional federated models under attack scenarios.
Additionally, future iterations of SAFEL-IoT are planned to include secure enclave processing, which will shield local models during aggregation, providing enhanced protection against model inversion attacks and backdoor poisoning. This enhancement aims to fortify SAFEL-IoT’s robustness by isolating sensitive computations within trusted execution environments (TEEs), further reducing the attack surface during federated updates. Such advances are expected to push SAFEL-IoT’s anomaly detection reliability even further, solidifying its application for 6G-enabled IIoT ecosystems where data security and model integrity are critical.
Summary: The results validate SAFEL-IoT’s readiness for secure and scalable deployment in 6G-enabled smart industrial ecosystems. Its adaptive aggregation mechanism and robust explainability not only improve anomaly detection accuracy but also ensure resilience against concept drift and adversarial threats. These capabilities highlight its superiority over existing models in terms of communication efficiency, interpretability, and security, making it an ideal choice for next-generation IIoT applications.
4.6. SAFEL-IoT: Explainable Federated Learning for Anomaly Detection in 6G Smart Industry 5.0
We evaluate SAFEL-IoT’s performance in diverse 6G-enabled smart industrial scenarios across multiple critical aspects, including robustness under drift, model convergence, security-performance trade-offs, training efficiency, and explainability. Each outcome is supported with figures and tabular summaries for clarity.
- (1) Dynamic Aggregation Stability
Figure 6 and Table 6 illustrate the dynamic aggregation stability of SAFEL-IoT compared to FedAvg by tracking model divergence across communication rounds, measured using cosine distance on a log scale. SAFEL-IoT maintains significantly lower divergence throughout the training process, demonstrating the effectiveness of its adaptive weighting mechanism in suppressing client deviations. Specifically, Table 6 shows that SAFEL-IoT consistently achieves lower divergence values (e.g., 0.22 at round 5 and 0.09 at round 20) compared to FedAvg, highlighting improved aggregation stability and smoother convergence under non-i.i.d. conditions.
- (2) Security vs. Performance Trade-Off
Figure 7 and Table 7 present the trade-off between encryption strength and model performance in terms of F1 score and encryption overhead. The baseline with no encryption achieves the highest F1 score (0.94) but lacks security protection. Applying only differential privacy (DP) slightly reduces the F1 score to 0.89 with minimal overhead (15 ms), while using only homomorphic encryption (HE) further degrades performance (F1 score 0.85) and incurs the highest overhead (120 ms). SAFEL-IoT, which combines HE and DP, strikes an optimal balance, achieving a strong F1 score of 0.92 while maintaining a reasonable overhead of 80 ms, thereby ensuring both security and operational efficiency in federated industrial deployments.
The differential privacy (DP) mechanism in SAFEL-IoT uses noise parameters ε and δ to manage privacy-accuracy trade-offs. Empirical analysis revealed that reducing ε from 1.0 to 0.5 enhances privacy but decreases the F1 score by 0.02. Meanwhile, δ adjustments had minimal impact unless set below 0.1, where model accuracy dropped by 3%. These findings validate that DP can be fine-tuned to optimize both privacy and detection reliability in high-risk IIoT applications.
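To make the role of ε and δ concrete, the following sketch applies the classical Gaussian mechanism to a clipped model update. The text does not state which DP mechanism SAFEL-IoT uses, so the Gaussian mechanism, the clipping norm, and the value δ = 1e-5 are assumptions for illustration only. The noise calibration used here is the textbook bound, which is formally stated for ε < 1.

```python
import math
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Noise scale of the classical Gaussian mechanism for (epsilon, delta)-DP."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def privatize_update(update, clip_norm, epsilon, delta, rng):
    """Clip an update to a bounded L2 norm, then add calibrated Gaussian noise."""
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    # Assumption: L2 sensitivity equals clip_norm (add/remove-one-client adjacency).
    sigma = gaussian_sigma(epsilon, delta, sensitivity=clip_norm)
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)

rng = np.random.default_rng(0)
delta = 1e-5  # hypothetical value, not taken from the paper
sigma_eps_1 = gaussian_sigma(1.0, delta)
sigma_eps_half = gaussian_sigma(0.5, delta)
noisy = privatize_update([0.3, -0.4, 1.2], clip_norm=1.0, epsilon=0.5, delta=delta, rng=rng)
```

Under this calibration, halving ε from 1.0 to 0.5 doubles the noise standard deviation, which is one plausible source of the small F1 decrease reported above.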
- (3) Training Efficiency Analysis
Figure 8 and Table 8 demonstrate the multi-dimensional training efficiency comparison across different models based on accuracy (F1 score), training time, communication cost, and energy consumption. SAFEL-IoT is positioned on the efficiency frontier, achieving the best balance between high accuracy (0.93), low training time (63.7 s), minimal communication overhead (95.1 MB), and lowest energy consumption (12.1 J). SAFEL-IoT outperforms the baseline models in all four efficiency dimensions, confirming its superiority for scalable and resource-optimized industrial anomaly detection under federated settings.
- (4) Convergence Efficiency
Figure 9 and Table 9 highlight the convergence efficiency of SAFEL-IoT compared to FedAvg and FedProx. SAFEL-IoT reaches the target training loss threshold (loss < 0.1) in just 10 communication rounds, outperforming FedProx (13 rounds) and FedAvg (15 rounds). This accelerated convergence is attributed to SAFEL-IoT’s adaptive temporal aggregation mechanism, which effectively stabilizes model updates and minimizes training loss fluctuations across rounds, leading to faster optimization in distributed environments.
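The convergence criterion above (first round with training loss below 0.1) can be expressed as a small helper. This is a sketch of how such a count might be computed; the function name and the toy loss curve are hypothetical, and only the round counts 10, 13, and 15 come from the reported results.

```python
def rounds_to_threshold(losses, threshold=0.1):
    """Return the 1-based round at which the loss first drops below threshold, or None."""
    for round_idx, loss in enumerate(losses, start=1):
        if loss < threshold:
            return round_idx
    return None

# Synthetic, purely illustrative loss curve that halves every round.
toy_losses = [0.5 ** k for k in range(1, 8)]
toy_round = rounds_to_threshold(toy_losses)

# Rounds saved by SAFEL-IoT, using the round counts reported above.
reported_rounds = {"SAFEL-IoT": 10, "FedProx": 13, "FedAvg": 15}
rounds_saved = {
    name: rounds - reported_rounds["SAFEL-IoT"]
    for name, rounds in reported_rounds.items()
    if name != "SAFEL-IoT"
}
```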
- (5) Energy Efficiency with 6G Slicing
Figure 10 and Table 10 present a comparative analysis of energy consumption under 5G (without slicing) and 6G (with network slicing) conditions during federated learning communication rounds. Under traditional 5G settings without slicing, the system incurs an energy cost of approximately 120 millijoules per round. In contrast, deploying 6G with network slicing reduces the energy consumption to 65 millijoules per round, a 45.8% reduction in per-round energy consumption.
This substantial reduction is attributed to 6G’s ability to dynamically allocate network resources through slicing, allowing edge nodes to operate with optimized bandwidth and minimal idle communication overhead. Network slicing ensures that SAFEL-IoT clients can transmit updates more efficiently without unnecessary protocol overhead, thus conserving battery life and reducing operational costs in real-world IIoT environments.
These results validate the strategic integration of 6G slicing into the SAFEL-IoT framework, emphasizing not only faster data transmission but also sustainable energy management critical for large-scale, long-duration industrial deployments.
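The 45.8% figure can be checked directly from the two per-round energy values reported above, read as a relative reduction in energy consumption. The symbols E_5G and E_6G are introduced here only for this check.

```latex
\[
\frac{E_{5G} - E_{6G}}{E_{5G}}
  = \frac{120\,\mathrm{mJ} - 65\,\mathrm{mJ}}{120\,\mathrm{mJ}}
  = \frac{55}{120}
  \approx 0.458 = 45.8\%
\]
```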
- (6) Explainability: SHAP vs. LIME
Figure 11 and Table 11 compare feature importance scores generated by two explainability methods: SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations). Both methods consistently identify “Vibration” as the most influential feature for anomaly detection, with SAFEL-IoT’s SHAP analysis assigning the highest absolute importance score (−0.70), compared to LIME (−0.50). This indicates strong agreement across different interpretability techniques, confirming that vibration sensor anomalies are the primary indicators of system faults in the industrial dataset. Additionally, pressure and temperature features show secondary contributions, while humidity exhibits minimal impact.
The higher absolute feature attribution scores observed with SHAP demonstrate its finer granularity and stability in explanation, which is critical for providing actionable insights in safety-critical IIoT deployments. These results validate the effectiveness of the explainability module integrated into SAFEL-IoT.
The explainability pipeline in SAFEL-IoT integrates SHAP (SHapley Additive exPlanations) with temporal attention mechanisms to provide real-time interpretability during anomaly detection. The process begins with the extraction of input features from federated edge devices. These features are analyzed using SHAP, which assigns contribution values to each feature, explaining their influence on model predictions. Unlike traditional black-box models, SHAP enables local interpretability by quantifying how each feature pushes a prediction toward an anomaly or normal state. These SHAP values are then passed through a temporal attention mechanism, which dynamically adjusts their importance based on historical data patterns. This step ensures that time-sensitive anomalies are weighted more heavily, improving detection accuracy in IIoT environments. Following this, the aggregated insights flow into the Anomaly Detection stage, where high-confidence anomalies are flagged for immediate action. Finally, the results are sent to the Root Cause Identification module, where SAFEL-IoT automatically highlights contributing factors, enabling rapid response and corrective measures. This entire pipeline is optimized for edge processing, allowing real-time interpretability without sacrificing latency or computational efficiency.
Figure 12 illustrates the explainability pipeline for SAFEL-IoT, highlighting the flow from Input Features through SHAP Analysis to Temporal Attention, followed by Anomaly Detection and culminating in Root Cause Identification. Each stage builds upon the previous one, enabling transparent and interpretable decision-making for real-time anomaly detection in IIoT environments.
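The first two pipeline stages can be sketched in a few lines. This is an illustration under stated assumptions, not the SAFEL-IoT module: Shapley values are computed exactly by enumerating coalitions (feasible only for a handful of features), the scoring function is a hypothetical linear model with made-up weights, and the temporal attention is replaced by a simple recency-based softmax, since the text does not specify the attention architecture.

```python
from itertools import combinations
from math import factorial
import numpy as np

FEATURES = ["vibration", "pressure", "temperature", "humidity"]

def exact_shapley(score_fn, x, baseline):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = baseline.copy()
                for j in subset:
                    without_i[j] = x[j]
                with_i = without_i.copy()
                with_i[i] = x[i]
                phi[i] += weight * (score_fn(with_i) - score_fn(without_i))
    return phi

def recency_attention(shap_over_time, decay=0.5):
    """Softmax weights favoring recent steps (stand-in for temporal attention)."""
    shap_over_time = np.asarray(shap_over_time, dtype=float)
    logits = decay * np.arange(shap_over_time.shape[0])
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()
    return attn @ shap_over_time, attn

# Hypothetical linear anomaly score; the weights are illustrative, not from the paper.
toy_weights = np.array([2.0, 1.0, 0.5, 0.1])
score_fn = lambda v: float(toy_weights @ v)

x = np.array([1.5, 0.2, 0.4, 0.1])
baseline = np.zeros(4)
phi = exact_shapley(score_fn, x, baseline)

# Three consecutive time steps of attributions, most recent last.
weighted_phi, attn = recency_attention(np.stack([0.5 * phi, phi, 2.0 * phi]))
```

For a linear score the Shapley value of each feature reduces to its weight times its deviation from the baseline, which gives a convenient sanity check; the attributions also sum to the difference between the score at `x` and at the baseline.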
- (7) Drift Resilience (Revisit)
For completeness, we also revisit the concept drift robustness. The performance drop is much less severe in SAFEL-IoT compared to traditional methods, reaffirming its stability in volatile environments.
Figure 13 and Table 12 evaluate the robustness of anomaly detection models under simulated concept drift conditions across different time windows. The results show that SAFEL-IoT consistently maintains high F1 scores (ranging from 0.91 to 0.92) despite environmental shifts, demonstrating strong resistance to concept drift. In contrast, both Centralized AE and FedAvg models experience notable performance degradation, with F1 scores declining more sharply across windows.
These findings highlight the advantage of SAFEL-IoT’s adaptive aggregation and temporal regularization mechanisms, which enable the model to remain stable even as data distributions evolve over time. This robustness is critical for real-world IIoT environments where system dynamics and operating conditions frequently change, ensuring reliable and uninterrupted anomaly detection performance.
- (8) Adversarial Attack Resilience Comparison
Figure 14 illustrates the false positive rates (FPR) of FedAvg and SAFEL-IoT under four adversarial scenarios: no attack, label flipping, Gaussian noise, and model poisoning. SAFEL-IoT consistently maintains FPR below the critical threshold (0.5), demonstrating its robustness against data and model manipulation.
Figure 14 and Table 13 present a comparison of adversarial attack resilience between SAFEL-IoT and FedAvg under various attack scenarios, including label flipping, Gaussian noise injection, and model poisoning. SAFEL-IoT consistently achieves lower false positive rates (FPR) across all attack types, maintaining FPR values well below the critical threshold (e.g., 0.08 under no attack, 0.18 under model poisoning), whereas FedAvg suffers significantly higher FPRs (e.g., 0.43 for label flip, 0.62 for model poisoning).
These results demonstrate that SAFEL-IoT’s integration of adaptive aggregation and privacy-preserving encryption mechanisms not only improves anomaly detection but also significantly strengthens model robustness against adversarial manipulations. This ensures higher trustworthiness and operational security for industrial IoT systems deployed in hostile or noisy environments.
SAFEL-IoT outperforms FedAvg in all scenarios, offering significantly lower false positive rates. Especially under model poisoning, the FPR of FedAvg peaks at 0.62, whereas SAFEL-IoT effectively suppresses it to 0.18—a 71.0% relative reduction.
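For reference, the FPR metric and the relative reduction quoted above can be reproduced as follows. The confusion counts in the example are hypothetical and chosen only so that the resulting rate matches the reported 0.18; the two FPR values 0.62 and 0.18 are the ones stated in the text.

```python
def false_positive_rate(false_positives, true_negatives):
    """FPR = FP / (FP + TN): the share of normal samples wrongly flagged as anomalies."""
    total_normal = false_positives + true_negatives
    if total_normal == 0:
        raise ValueError("FPR is undefined without normal samples")
    return false_positives / total_normal

def relative_reduction(baseline, improved):
    """Fractional decrease of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# Hypothetical counts: 18 false alarms among 100 normal samples.
example_fpr = false_positive_rate(false_positives=18, true_negatives=82)

# Reported FPR under model poisoning: FedAvg 0.62, SAFEL-IoT 0.18.
poisoning_reduction = relative_reduction(0.62, 0.18)
print(f"{poisoning_reduction:.1%}")  # prints 71.0%
```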
- (9) Scalability: Latency vs. Network Size
Scalability is critical for industrial deployments.
Figure 15 shows how detection latency increases with the number of edge nodes. SAFEL-IoT remains under the real-time threshold (50 ms) even with 100 nodes, while both centralized AE and FedAvg exceed this threshold after 60 nodes.
Figure 15 and Table 14 evaluate the scalability of SAFEL-IoT compared to baseline models in terms of detection latency as the number of edge nodes increases. SAFEL-IoT demonstrates superior scalability, maintaining the lowest detection latency across different network sizes. At 100 edge nodes, SAFEL-IoT achieves a latency of only 10.3 ms, significantly outperforming Centralized AE (87.6 ms) and FedAvg (32.1 ms).
The reduced latency is attributed to SAFEL-IoT’s optimized federated aggregation and efficient communication strategies, enabling real-time anomaly detection even as the network scales. This performance ensures that SAFEL-IoT remains well below the critical real-time operational thresholds, making it highly suitable for large-scale, delay-sensitive industrial IoT deployments.
These results confirm SAFEL-IoT’s ability to scale effectively across increasing IIoT deployments while meeting the latency demands of real-time anomaly detection.
SAFEL-IoT integrates partially homomorphic encryption (PHE) and differential privacy (DP) to secure model updates without compromising latency. PHE supports encrypted arithmetic operations during aggregation, reducing the risk of data leakage. Although PHE introduces a 70 ms per round latency overhead, its scalability across 100 edge nodes remains efficient, maintaining sub-50 ms latency through 6G slicing. DP further anonymizes model gradients, ensuring that individual data contributions remain confidential while optimizing bandwidth usage. Empirical analysis demonstrated that DP reduces synchronization bandwidth by 18%, balancing security and communication efficiency for large-scale IIoT deployments.
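To show what "encrypted arithmetic during aggregation" means in practice, the sketch below uses a toy Paillier cryptosystem, a well-known additively homomorphic PHE scheme. The text does not name the scheme SAFEL-IoT uses, so Paillier is an assumption here, as are the tiny fixed primes (insecure, chosen only to keep the example readable), the fixed-point scale, and the client gradients. It requires Python 3.8 or later for modular inverses via `pow`.

```python
import math
import random

SCALE = 1000  # fixed-point scale for float gradients (illustrative assumption)

def keygen(p=1789, q=1861):
    """Toy Paillier key pair from tiny fixed primes. Insecure; illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # mu = L(g^lam mod n^2)^(-1) mod n, where L(u) = (u - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m, rng):
    """Encrypt integer m (reduced modulo n) with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(g, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def encode(value):
    return int(round(value * SCALE))

def decode(residue, n):
    """Map residues above n // 2 back to negative values, then undo the scaling."""
    signed = residue - n if residue > n // 2 else residue
    return signed / SCALE

rng = random.Random(0)
pub, priv = keygen()
client_updates = [[0.5, -0.25], [0.1, 0.2], [-0.3, 0.05]]  # hypothetical gradients

encrypted = [[encrypt(pub, encode(v), rng) for v in upd] for upd in client_updates]
aggregate = encrypted[0]
for enc in encrypted[1:]:
    aggregate = [add_encrypted(pub, a, b) for a, b in zip(aggregate, enc)]

summed = [decode(decrypt(pub, priv, c), pub[0]) for c in aggregate]
print(summed)  # [0.3, 0.0]
```

The aggregator only ever multiplies ciphertexts; the individual client updates are never decrypted, and only the holder of the private key can read the summed result.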
- (10) Industry Feedback and Pilot Deployment
Early discussions with manufacturing partners confirmed the importance of real-time anomaly explanations and secure aggregation mechanisms in IIoT environments. Based on preliminary feedback, a pilot deployment of SAFEL-IoT is planned at a semiconductor manufacturing facility. The initial focus will be on anomaly detection during wafer inspection processes, where precise and explainable fault identification is critical to production quality. This real-world deployment will help validate SAFEL-IoT’s scalability, latency performance, and explainability effectiveness under live operational conditions.
4.7. Rigorous Comparison with State-of-the-Art Methods
The comparison evaluates SAFEL-IoT against key baseline methods: FedPer, FedProx, FedML-Secure, and EdgeGuard. The analysis covers key performance metrics, including F1 score, false positive rate (FPR), training time, explanation error (EE), and communication cost.
Table 15 presents a comparative analysis of SAFEL-IoT against baseline methods—FedPer, FedProx, FedML-Secure, and EdgeGuard. SAFEL-IoT achieves the highest F1 score (0.93) and the lowest false positive rate (0.04), indicating superior anomaly detection accuracy. It also reduces training time and communication overhead by up to 21.3% and 35%, respectively, compared to traditional methods. Furthermore, SAFEL-IoT’s explanation error (0.15) is significantly lower, enhancing model interpretability and transparency. These improvements highlight its scalability and robustness for 6G-enabled IIoT environments.
Statistical Validation
To substantiate the claims of improved performance, a statistical analysis was conducted.
The significant p-values indicate that SAFEL-IoT’s improvements are statistically robust, particularly in terms of explainability (lower EE) and communication efficiency.
The evaluation of SAFEL-IoT against FedPer, FedProx, FedML-Secure, and EdgeGuard demonstrates its superior performance in anomaly detection within IIoT environments. SAFEL-IoT achieved the highest F1 score of 0.93, representing a significant improvement over baseline methods, along with the lowest false positive rate (0.04), highlighting its reliability in critical monitoring scenarios. Additionally, it reduced training time by 21.3% compared to FedPer, and minimized communication costs by 35%, making it highly efficient for bandwidth-constrained IIoT networks. Its explanation error (EE) of 0.15 is the lowest among all models, attributed to its SHAP-based interpretability and temporal attention mechanisms. Statistical analysis with p-values confirms the improvements are significant, positioning SAFEL-IoT as a robust solution for secure and transparent anomaly detection in 6G-enabled environments.
4.8. Discussion
The comprehensive experimental evaluation of SAFEL-IoT reveals multiple performance advantages across critical metrics. Firstly, in terms of detection accuracy (Figure 3), SAFEL-IoT significantly outperforms FedAvg and FedProx, achieving the highest F1 score while maintaining the lowest training time and explanation error, as confirmed in Table 2.
Communication efficiency (Figure 4) is markedly improved, with SAFEL-IoT reducing communication overhead by up to 70.3% compared to Centralized AE. This is particularly crucial for bandwidth-constrained IIoT deployments.
In Figure 6, dynamic aggregation results in minimal model divergence over communication rounds, validating the effectiveness of the adaptive weighting strategy in SAFEL-IoT.
From an explainability perspective, SAFEL-IoT provides more stable and interpretable outputs (Figure 11), with vibration emerging as the most impactful anomaly feature. The explanation fidelity, as shown in Table 3, confirms reduced explanation error, enhancing model trust. Real-world deployment faces hardware constraints, including limited compute capacity, memory, and battery at edge devices. SAFEL-IoT optimizes communication and computation, but future extensions must explore lightweight explainability modules and energy-aware scheduling for ultra-low power nodes.
Security-performance trade-off analysis (Figure 7) highlights that the hybrid encryption strategy in SAFEL-IoT maintains a strong F1 score (0.92) with moderate encryption overhead (80 ms), balancing protection and performance better than single-layer methods.
Robustness under concept drift (Figure 13) further demonstrates SAFEL-IoT’s ability to resist performance degradation, while adversarial resilience (Figure 14) confirms a significantly lower false positive rate under multiple attack vectors.
Moreover, scalability tests (Figure 15) show that SAFEL-IoT maintains real-time detection latency across increasing edge nodes, unlike FedAvg or centralized baselines, which breach the 50 ms latency threshold.
Lastly, the convergence comparison (Figure 9) indicates SAFEL-IoT converges five rounds faster than FedAvg, reflecting its learning efficiency.
In summary, SAFEL-IoT exhibits high accuracy, strong security, low latency, and interpretability, making it a robust solution for next-generation anomaly detection in smart Industry 5.0 powered by 6G.
Ethical considerations include potential biases in anomaly scoring, especially across heterogeneous device behaviors. Industrial adoption challenges include trust establishment, regulatory compliance, and edge system certification for critical infrastructure.
Limitations Against Adversarial Attacks
Despite its robustness against concept drift, Gaussian noise, label flipping, and model poisoning, SAFEL-IoT still faces challenges in addressing model inversion attacks and backdoor poisoning. Model inversion attacks attempt to reconstruct sensitive training data by analyzing gradients during federated learning rounds. While SAFEL-IoT’s homomorphic encryption and differential privacy provide strong protection, fine-grained inversion attempts can still infer partial data representations, posing a risk to privacy. Similarly, backdoor poisoning allows malicious clients to inject hidden triggers during local updates, subtly influencing global model behavior. Although SAFEL-IoT’s adaptive aggregation mechanism mitigates some of this risk by down-weighting high-divergence updates, it may not fully eliminate backdoor traces, especially if the triggers are well crafted.
To enhance its resilience, future iterations of SAFEL-IoT will integrate trusted execution environments (TEEs), such as Intel SGX or AMD SEV, enabling secure enclave processing of sensitive gradient computations. This approach isolates local models from direct memory access, preventing inversion attempts during federated rounds. Additionally, the introduction of a differential backdoor detection (DBD) mechanism is proposed, which will compare local updates against a backdoor signature database to flag and exclude potentially malicious updates. To further solidify its robustness, certified robustness verification techniques will be employed to mathematically validate the integrity of local models against gradient manipulation.
A planned real-world pilot deployment in a smart grid monitoring facility aims to evaluate these enhancements, targeting a 28% reduction in data exposure during inversion attempts and a 40% decrease in backdoor activation through enclave processing and differential detection. Preliminary simulations estimate an 8 ms overhead for secure enclave processing with a minimal 3.2% impact on synchronization time, highlighting the practicality of these defenses for 6G-enabled IIoT environments. These improvements are expected to elevate SAFEL-IoT’s security standards, making it more resilient against sophisticated adversarial threats in real-world deployments.