AI-Driven Anomaly Detection for Securing IoT Devices in 5G-Enabled Smart Cities
Abstract
1. Introduction
1.1. Background and Motivation
1.2. Importance of Cybersecurity in IoT and 5G Networks
1.3. Role of AI in Enhancing Cybersecurity
1.4. Challenges in AI-Driven Anomaly Detection for IoT Security
- Device and Data Heterogeneity: IoT networks consist of highly diverse devices with varying capabilities and protocols, leading to inconsistent data patterns and vulnerabilities [6].
- Real-Time Requirements: Cybersecurity systems must process vast data streams quickly and with minimal latency to ensure an effective threat response [7].
1.5. Major Contributions of This Paper
- A novel hybrid intrusion detection system (IDS) that integrates autoencoders, LSTMs, and CNNs to improve anomaly classification accuracy and robustness.
- A federated learning and edge AI architecture enabling decentralized model training across IoT nodes, which enhances scalability and preserves data privacy.
- An experimental validation using real-world datasets (CICIDS2017, TON_IoT, UNSW-NB15) and synthetic attack data, demonstrating high accuracy (97.5% precision, 96.8% F1 score) and low inference latency (<310 ms) suitable for edge deployment.
- A comparative evaluation, showing that the hybrid model outperforms standalone models and rule-based IDS in terms of detection performance, efficiency, and adaptability.
- Integration of autonomous threat mitigation mechanisms, providing real-time security enforcement based on anomaly detection.
- Use of explainable AI (XAI) methods, such as SHAP values, to support interpretability and trust in AI-driven decisions.
1.6. Structure of the Paper
2. Related Work
2.1. AI-Driven Anomaly Detection in IoT Networks
2.2. AI and IoT Integration for Enhanced Security
2.3. Challenges in AI-Based Anomaly Detection
- Data Quality and Diversity: The heterogeneity of IoT devices leads to diverse data types and quality, complicating the development of universal anomaly detection models [18,19]. Abusitta et al. addressed this issue by utilizing denoising autoencoders to extract robust features from corrupted IoT data [14].
- Real-Time Processing: Achieving real-time anomaly detection requires efficient algorithms that are capable of processing large data streams with minimal latency [20,21]. Liu et al. discussed the application of unsupervised deep learning techniques for IoT time series analysis, highlighting the importance of efficient models for real-time anomaly detection [22].
- Scalability: Deploying AI models across extensive IoT networks necessitates scalable solutions that maintain performance as the network grows [23,24]. The clustered federated learning approach by Sáez-de-Cámara et al. offers a scalable solution by addressing heterogeneity and scalability in large IoT networks [17].
- Explainability and Interpretability: Many deep learning models function as black boxes, making it difficult for security teams to interpret AI-generated threat detection results. Explainable AI (XAI) techniques, such as SHAP values and attention mechanisms, have been proposed to increase model transparency and reduce false positives in cybersecurity applications [25]. Subasi et al. critically assessed interpretable and explainable machine learning models for intrusion detection, emphasizing the need for transparency in AI-driven security systems [25].
2.4. Recent Developments and Future Directions
2.5. Summary of Prior Work and Research Gaps
- Key Findings:
  - AI-based models significantly outperform traditional IDS in terms of accuracy and adaptability.
  - Edge AI and FL frameworks are crucial for scalability and privacy preservation.
  - Real-time security systems require low-latency models with autonomous capabilities.
  - Explainability remains an active challenge in AI-based cybersecurity.
- Research Gaps:
  - Hybrid AI Models: Limited exploration of deep hybrid frameworks combining temporal, spatial, and generative insights.
  - Real-World Testing: Most prior work lacks deployment in city-scale or production-like environments.
  - Scalability and Energy Efficiency: High-complexity models are rarely optimized for edge deployment.
  - Lack of Integrated Response Systems: Few systems incorporate AI-driven autonomous mitigation mechanisms.
3. Proposed Methodology
3.1. Overview of the Proposed AI-Driven Anomaly Detection System
- Data collection and preprocessing: Captures traffic from IoT devices (e.g., sensors, meters, and cameras) and applies cleaning, normalization, and dimensionality reduction to ensure high-quality input.
- AI-driven anomaly detection: A hybrid deep learning model composed of autoencoders (for pattern reconstruction), LSTM networks (for temporal dependencies), and CNNs (for spatial feature extraction). This model distinguishes between normal behavior and sophisticated attack patterns.
- Federated learning and edge AI processing: Distributes model training across edge devices. Using federated learning, local models are trained on-site, with only the model parameters shared for global aggregation. This ensures data privacy and low-latency threat detection.
- Adaptive threat mitigation: Incorporates AI-based autonomous response mechanisms. When anomalies are detected, the system triggers actions like firewall rule updates or device isolation. Explainable AI (XAI) modules provide interpretability using SHAP values and attention mechanisms.
3.2. Dataset Selection and Preprocessing
3.2.1. Synthetic Data Generation for Attack Simulation
- IoT botnet attack simulations:
  - These involve simulated Mirai and Bashlite malware variants, replicating large-scale botnet attacks on IoT infrastructures.
  - Attack traffic was crafted using real-world botnet behavior patterns to mimic infected IoT devices launching coordinated attacks.
- Adversarial attack injection: To evaluate the vulnerability of the anomaly detection system to adversarial evasion, adversarial perturbations were generated using state-of-the-art adversarial machine learning techniques.
  - The fast gradient sign method (FGSM):
    - The FGSM is a white-box attack method that introduces small perturbations to network traffic feature representations in order to mislead the anomaly detection system.
    - Given an input $X$ and a loss function $J(\theta, X, y)$, the FGSM computes an adversarial example by adjusting $X$ in the direction of the gradient: $X_{\mathrm{adv}} = X + \epsilon \cdot \mathrm{sign}(\nabla_X J(\theta, X, y))$.
    - This ensures that the perturbed sample remains nearly indistinguishable to a human observer, while the model misclassifies it as benign.
  - Projected gradient descent (PGD):
    - PGD extends the FGSM by iteratively applying small perturbations and projecting the modified sample back into a bounded region to maximize misclassification.
    - The attack can be written as $X^{(t+1)} = \Pi_{\mathcal{B}_\epsilon(X)}\big(X^{(t)} + \alpha \cdot \mathrm{sign}(\nabla_X J(\theta, X^{(t)}, y))\big)$, where $\Pi_{\mathcal{B}_\epsilon(X)}$ projects the sample back onto the $\epsilon$-ball around $X$.
    - Unlike the FGSM, PGD iteratively refines the perturbations, making it a stronger adversarial attack that is more challenging to detect.

By incorporating FGSM and PGD adversarial perturbations into the dataset, the system's resilience against evasion techniques was assessed, ensuring that the deep learning models maintain robust performance even when confronted with adversarially modified inputs (a minimal code sketch of these perturbations follows this list).
- Randomized data perturbations:
  - These introduce controlled noise variations into normal network traffic patterns to test generalization capabilities.
  - They also ensure that the model learns robust features, preventing overfitting to static attack signatures.
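The following is a minimal, self-contained PyTorch sketch of how such FGSM and PGD perturbations can be generated; the model, loss function, step size, and epsilon budget shown here are illustrative assumptions rather than the exact configuration used in this study.

```python
import torch

def fgsm_perturb(model, loss_fn, x, y, epsilon=0.05):
    """One-step FGSM: move x in the gradient-sign direction that
    increases the loss, pushing the detector to misclassify."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

def pgd_perturb(model, loss_fn, x, y, epsilon=0.05, alpha=0.01, steps=10):
    """Iterative PGD: repeated FGSM-style steps, each projected back
    into the epsilon-ball around the original sample."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss_fn(model(x_adv), y).backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Projection: clamp the total perturbation to the epsilon-ball.
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)
        x_adv = x_adv.detach()
    return x_adv
```

In the dataset-generation step, perturbed samples produced this way would be appended to the traffic corpus with their original labels preserved, so that evasion attempts appear as "benign-looking" attack records.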
3.2.2. Data Preprocessing
- Data cleaning and handling missing values: Missing values are handled using mean imputation for numerical features and mode imputation for categorical attributes. Duplicate records and corrupted logs are removed to ensure a cleaner dataset.
- Feature engineering and selection: Chi-square statistical tests are used to retain only the most significant features correlated with anomaly detection. Recursive feature elimination (RFE) with a random forest classifier is used to filter out low-impact features. Principal component analysis (PCA) is employed for dimensionality reduction while preserving key data variance.
- Feature normalization and encoding: Min–Max normalization is applied to numerical features to scale values between 0 and 1. One-hot encoding is used to convert categorical variables (e.g., protocol types and attack labels) into numerical representations.
- Data splitting strategy: 70% training set, used for model learning; 15% validation set, used for hyperparameter tuning and model optimization; 15% test set, used for a final performance evaluation to assess generalizability on unseen data. Additionally, cross-validation (e.g., 5-fold cross-validation) is applied to ensure that the model performance is not biased by a specific data split (a code sketch of this pipeline follows).
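A compact scikit-learn sketch of this preprocessing and splitting pipeline is shown below; the file name, column names, and label column are illustrative placeholders, since the actual dataset schemas (CICIDS2017, TON_IoT, UNSW-NB15) differ.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.model_selection import train_test_split

# Hypothetical schema; real datasets use their own feature names.
numeric_cols = ["flow_duration", "fwd_packet_count", "payload_size_var"]
categorical_cols = ["protocol_type"]

df = pd.read_csv("iot_traffic.csv").drop_duplicates()
# Mean/mode imputation, as described above.
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

preprocess = ColumnTransformer([
    ("scale", MinMaxScaler(), numeric_cols),                      # values into [0, 1]
    ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])
X = preprocess.fit_transform(df[numeric_cols + categorical_cols])
y = df["label"].values

# 70/15/15 split: hold out 30%, then split that half-and-half.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
```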
3.2.3. Justification for Dataset Choices and Preprocessing Steps
- Reduce computational complexity while preserving key information for anomaly detection.
- Improve model robustness by eliminating redundant or noisy features.
- Ensure data consistency across different IoT network traffic logs.
3.3. Model Architecture
3.3.1. Hybrid AI Model Components
- Autoencoder:
  - Structure: 3 hidden layers (128, 64, 32 neurons)
  - Activation: ReLU (encoder), sigmoid (decoder output)
  - Dropout: 0.3 in each hidden layer
  - Purpose: To learn compressed representations of normal behavior and detect anomalies by reconstruction error
  - Output: Latent embedding + reconstruction error vector

Mathematically, given an input $x$, the autoencoder learns a compressed representation $z = f_{\mathrm{enc}}(x)$, followed by a reconstruction $\hat{x} = f_{\mathrm{dec}}(z)$. The reconstruction loss is $\mathcal{L}_{AE} = \lVert x - \hat{x} \rVert^{2}$; anomalies are flagged when the reconstruction error exceeds a learned threshold.
- LSTM:
  - Structure: 2 stacked LSTM layers (64 and 32 units)
  - Activation: tanh (cell state), sigmoid (gates)
  - Dropout: 0.5 between layers
  - Purpose: To capture sequential and temporal dependencies in the latent space
  - Input: Latent features from the AE
  - Output: Time-series representations

An LSTM cell updates its state through the standard gate equations: $f_t = \sigma(W_f[h_{t-1}, x_t] + b_f)$, $i_t = \sigma(W_i[h_{t-1}, x_t] + b_i)$, $\tilde{C}_t = \tanh(W_C[h_{t-1}, x_t] + b_C)$, $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$, $o_t = \sigma(W_o[h_{t-1}, x_t] + b_o)$, and $h_t = o_t \odot \tanh(C_t)$.
- CNN:
  - Structure: 2 convolutional layers, followed by max pooling and 2 dense layers
  - Activation: ReLU in the convolutional layers, with softmax at the output
  - Dropout: 0.4 between dense layers
  - Purpose: To extract spatial/local features from the LSTM outputs (reshaped into 2D form) for final classification

The three components operate in sequence:
- The AE learns a compact representation of normal network traffic and its reconstruction error.
- The LSTM learns patterns in these latent signals over time, capturing the sequential evolution of anomalies.
- The CNN detects localized and structural patterns in the temporal embeddings from the LSTM, enabling more accurate classification.

The key design elements of the hybrid architecture are:
- The sequential integration of reconstruction, temporal, and spatial learning, where each model enhances the output space of the previous stage.
- End-to-end training across components, enabling joint optimization and faster convergence.
- The use of shared latent representations, which improves robustness and allows the system to generalize better to zero-day attacks. (A compact code sketch of this pipeline follows.)
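As a concrete reference, here is a PyTorch sketch of the AE–LSTM–CNN pipeline using the layer sizes and dropout rates listed above; the input feature count, window length, kernel sizes, and channel counts are illustrative assumptions, since those details are not fully specified here.

```python
import torch
import torch.nn as nn

class HybridAnomalyDetector(nn.Module):
    """AE -> LSTM -> CNN pipeline sketch. AE sizes (128/64/32) and LSTM
    sizes (64/32) follow the text; other dimensions are assumptions."""

    def __init__(self, n_features=78, seq_len=16, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(64, 32), nn.ReLU(), nn.Dropout(0.3),
        )
        self.decoder = nn.Sequential(
            nn.Linear(32, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, n_features), nn.Sigmoid(),  # inputs scaled to [0, 1]
        )
        self.lstm1 = nn.LSTM(32, 64, batch_first=True)
        self.lstm2 = nn.LSTM(64, 32, batch_first=True)
        self.drop = nn.Dropout(0.5)                    # between LSTM layers
        self.cnn = nn.Sequential(                      # spatial feature extraction
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (seq_len // 4) * 8, 64), nn.ReLU(), nn.Dropout(0.4),
            nn.Linear(64, n_classes),  # softmax is applied by the loss function
        )

    def forward(self, x):
        # x: (batch, seq_len, n_features) windows of scaled traffic features
        b, t, f = x.shape
        z = self.encoder(x.reshape(b * t, f))            # latent codes
        x_hat = self.decoder(z).reshape(b, t, f)         # reconstructions
        recon_err = ((x - x_hat) ** 2).mean(dim=2)       # per-step error
        h, _ = self.lstm1(z.reshape(b, t, -1))           # temporal modeling
        h, _ = self.lstm2(self.drop(h))
        logits = self.head(self.cnn(h.unsqueeze(1)))     # 2D map: (t, 32)
        return logits, x_hat, recon_err

# Usage: logits, x_hat, err = HybridAnomalyDetector()(torch.rand(8, 16, 78))
```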
3.3.2. Federated Learning and Edge AI for Scalability
- Each IoT edge device trains a local anomaly detection model using the AE–LSTM–CNN pipeline.
- After a training round, each device sends its updated weights $w_i$, along with its sample count $n_i$, to a central aggregator.
- The global model is computed via federated averaging: $w_{\mathrm{global}} = \sum_{i=1}^{K} \frac{n_i}{\sum_{j=1}^{K} n_j}\, w_i$ (a code sketch of this step follows the list below).
- The updated global model is redistributed to all edge devices for the next round.
- Synchronous aggregation is used in the baseline setup, where all selected clients must complete their training before aggregation occurs.
- Each federated round includes one local epoch per node, with global aggregation after every five rounds.
- For large-scale deployments, we support asynchronous updates (optional), where straggling nodes do not block global updates, improving resilience in unstable networks.
- Clustered aggregation is implemented to minimize communication overhead. Nodes are grouped by proximity (e.g., same subnet/district) and send updates to local edge aggregators, which forward the averaged models to the central server.
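A minimal sketch of the federated averaging step above, expressed over PyTorch state dicts; the helper name and the surrounding training loop are assumptions for illustration.

```python
import torch

def federated_average(client_states, client_sizes):
    """FedAvg: weight each client's parameters by its sample count n_i.
    `client_states` is a list of model state_dicts from the edge nodes."""
    total = float(sum(client_sizes))
    global_state = {}
    for key in client_states[0]:
        global_state[key] = sum(
            (n / total) * state[key].float()
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

# After each round, the aggregator loads the result into the global model:
# global_model.load_state_dict(federated_average(states, sizes))
```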
3.3.3. Explainable AI (XAI) for Interpretability
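As a small illustration of how SHAP values can attribute an anomaly decision to individual traffic features, the sketch below uses the `shap` library's model-agnostic KernelExplainer; the scoring function, feature names, and data are toy stand-ins, not the system's actual components.

```python
import numpy as np
import shap

# Illustrative feature names; in deployment, anomaly_score would wrap
# the hybrid model's output for a traffic-feature vector.
feature_names = ["udp_entropy", "dst_port_rarity", "payload_var", "login_failures"]
rng = np.random.default_rng(0)
background = rng.random((50, len(feature_names)))   # reference distribution

def anomaly_score(X):
    return X @ np.array([0.6, 0.2, 0.15, 0.05])     # toy linear scorer

explainer = shap.KernelExplainer(anomaly_score, background)
shap_values = explainer.shap_values(rng.random((1, len(feature_names))))
print(dict(zip(feature_names, shap_values[0])))     # per-feature attribution
```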
3.4. Model Training and Evaluation
3.4.1. Training Methodology
- Autoencoder reconstruction loss: Measures the discrepancy between the original input and the reconstructed output, defined as $\mathcal{L}_{AE} = \frac{1}{N}\sum_{i=1}^{N} \lVert x_i - \hat{x}_i \rVert^{2}$.
- LSTM and CNN cross-entropy loss: Used for anomaly classification, defined as $\mathcal{L}_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c} \log \hat{y}_{i,c}$.
- The total loss function for training is a weighted combination of these individual losses, $\mathcal{L}_{\mathrm{total}} = \lambda_{1}\mathcal{L}_{AE} + \lambda_{2}\mathcal{L}_{CE}$, where the weights $\lambda_{1}$ and $\lambda_{2}$ balance reconstruction fidelity against classification accuracy (see the training-step sketch below).
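A minimal sketch of one optimization step with this combined loss, assuming the `HybridAnomalyDetector` sketch from Section 3.3.1; the loss weights shown are illustrative, not the paper's tuned values.

```python
import torch
import torch.nn.functional as F

def training_step(model, batch, optimizer, lam_ae=0.5, lam_ce=1.0):
    """One step with L_total = lam_ae * L_AE + lam_ce * L_CE."""
    x, y = batch
    logits, x_hat, _ = model(x)                  # hybrid model forward pass
    loss_ae = F.mse_loss(x_hat, x)               # reconstruction term
    loss_ce = F.cross_entropy(logits, y)         # classification term
    loss = lam_ae * loss_ae + lam_ce * loss_ce   # weighted combination
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```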
3.4.2. Hyperparameter Tuning
- Grid search: We explored discrete parameter combinations, including:
  - Learning rate: from $10^{-3}$ to $10^{-5}$.
  - Batch size: 32, 64, or 128.
  - Dropout rate: regularization between 0.2 and 0.5 to prevent overfitting.
  - Number of hidden layers: 2, 3, or 4.
- Bayesian optimization: This probabilistic method modeled the objective function and efficiently navigated the hyperparameter space, selecting promising candidates based on performance in prior trials (see the sketch after this list).
- Validation strategy: We applied 5-fold cross-validation on the training set to evaluate model robustness and generalization. Each configuration was assessed using the average F1 score and AUC across folds.
- Convergence criteria: Optimization stopped when no significant improvement in F1 score was observed over 5 consecutive iterations.
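As one possible realization of this search, the sketch below uses Optuna, a common Bayesian-style hyperparameter optimizer; the library choice and the `train_and_eval_cv` stub are assumptions, not the paper's stated tooling.

```python
import optuna

def train_and_eval_cv(params):
    """Placeholder for 5-fold cross-validated training of the hybrid
    model; would return the mean F1 score across folds."""
    return 0.9  # stub value so the sketch runs as-is

def objective(trial):
    params = {
        "lr": trial.suggest_float("lr", 1e-5, 1e-3, log=True),
        "batch_size": trial.suggest_categorical("batch_size", [32, 64, 128]),
        "dropout": trial.suggest_float("dropout", 0.2, 0.5),
        "n_hidden": trial.suggest_int("n_hidden", 2, 4),
    }
    return train_and_eval_cv(params)

study = optuna.create_study(direction="maximize")  # maximize mean F1
study.optimize(objective, n_trials=50)
print(study.best_params)
```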
3.4.3. Performance Evaluation Metrics
- Detection Performance Metrics
- Precision (P): Measures the fraction of correctly identified anomalies among all detected anomalies: $P = \frac{TP}{TP + FP}$.
- Recall (R): Measures the fraction of actual anomalies correctly identified by the model: $R = \frac{TP}{TP + FN}$.
- F1 Score: The harmonic mean of precision and recall, ensuring a balanced evaluation of detection accuracy: $F_1 = 2 \cdot \frac{P \cdot R}{P + R}$.
- AUC-ROC (area under the receiver operating characteristic curve): Measures the trade-off between the true positive rate, $TPR = \frac{TP}{TP + FN}$, and the false positive rate, $FPR = \frac{FP}{FP + TN}$, across classification thresholds.
- Computational Efficiency Metrics
- Inference time: Measures how long the model takes to classify a new network sample.
- Memory usage: Evaluates the resource consumption of the deployed model on IoT edge devices.
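For reference, the detection metrics above map directly onto standard scikit-learn calls, and per-sample latency can be measured with a simple timing loop; the arrays and the stand-in model below are placeholders for the hybrid model's test-set outputs.

```python
import time
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

# Placeholder labels and scores; in practice these come from the test set.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.1, 0.4, 0.9, 0.7, 0.2, 0.6])   # anomaly scores
y_pred = (y_prob >= 0.5).astype(int)                 # hard decisions

print("P   =", precision_score(y_true, y_pred))
print("R   =", recall_score(y_true, y_pred))
print("F1  =", f1_score(y_true, y_pred))
print("AUC =", roc_auc_score(y_true, y_prob))

def stand_in_model(x):       # placeholder for the deployed detector
    return x.sum()

times = []
for _ in range(100):         # median per-sample inference time
    t0 = time.perf_counter()
    stand_in_model(np.random.rand(78))
    times.append(time.perf_counter() - t0)
print("median inference:", 1000 * np.median(times), "ms/sample")
```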
3.4.4. Federated Learning Model Aggregation
3.5. Deployment in a Real-Time IoT Environment
3.5.1. Scalability Considerations
- Edge-based processing: Distributed AI inference at IoT gateways to minimize latency.
- Federated learning updates: Periodic model aggregation ensures improved detection without the need for centralized data storage.
- Cloud-assisted model retraining: While inference is performed at the edge, periodic updates are sent to a cloud-based server for long-term model adaptation to emerging threats.
3.5.2. AI-Driven Adaptive Threat Response Mechanism
- IoT network traffic collection:
  - Network packets, device activity logs, and system events are continuously monitored.
- Edge AI-based anomaly detection:
  - Local AI models (autoencoder, LSTM, and CNN) process real-time data streams and flag anomalies at the edge.
  - Only anomaly metadata (not raw data) is transmitted to the central security dashboard.
- Federated learning for model optimization:
  - Edge nodes train their local models on site.
  - Model updates (not raw data) are periodically sent to a central FL aggregator for global model updates.
- Cloud-based adaptive model retraining:
  - The global model is updated using aggregated knowledge from multiple edge nodes.
  - The updated AI model is redistributed to all IoT nodes for enhanced anomaly detection.
3.5.3. AI-Driven Adaptive Threat Mitigation
- Threat categorization: Identified anomalies are categorized based on their risk level (e.g., low, moderate, and critical).
- Automated security policies: The system applies pre-configured security rules, such as firewall modifications, device isolation, and rate-limiting measures, in response to specific cyber threats.
- Incident reporting and human oversight: Alerts are generated for network administrators, providing detailed insights into the detected anomaly and recommended mitigation actions.
- Continuous learning and model adaptation: The system continuously updates its anomaly detection model by integrating threat intelligence feeds and retraining AI models to adapt to emerging cyberattack patterns.
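As an illustration of how pre-configured policies can be wired to risk levels in this flow, here is a small Python sketch; the thresholds, action names, and policy table are hypothetical, not the system's actual rule set.

```python
from dataclasses import dataclass

@dataclass
class MitigationAction:
    name: str
    requires_human_review: bool

# Hypothetical policy table: risk level -> pre-configured action.
POLICY = {
    "low":      MitigationAction("log_and_monitor", False),
    "moderate": MitigationAction("rate_limit_source", False),
    "critical": MitigationAction("isolate_device_and_alert", True),
}

def respond(anomaly_score: float) -> MitigationAction:
    """Map an anomaly score to a risk level and its mitigation action.
    The 0.5/0.8 thresholds are illustrative assumptions."""
    level = ("low" if anomaly_score < 0.5
             else "moderate" if anomaly_score < 0.8
             else "critical")
    action = POLICY[level]
    if action.requires_human_review:
        print(f"ALERT: {action.name} triggered; administrator notified.")
    return action
```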
4. Experimental Setup and Results
- Evaluating hybrid deep learning models (autoencoder, LSTM, and CNN) on real-world datasets.
- Assessing federated learning for scalable AI-based security.
- Validating AI-driven autonomous mitigation strategies for real-time threat response.
- Section 4.1: Details the experimental setup, including hardware, datasets, and training configurations.
- Section 4.2: Defines the performance evaluation metrics used to assess the effectiveness of the proposed system.
- Section 4.3: Presents the experimental results and a comparative analysis against existing security methods.
4.1. Experimental Setup
4.1.1. Hardware and Computational Environment
- Edge devices: Raspberry Pi 4 Model B (4 GB RAM, quad-core Cortex-A72)
- Local server (FL aggregator): Intel Xeon E5-2697 v4 (2.3 GHz, 16 cores, 64 GB RAM, NVIDIA RTX 3090 GPU)
- Cloud infrastructure: Google Cloud AI platform with TPU acceleration
4.1.2. Dataset Utilization and Preprocessing
- CICIDS2017: Used for general network intrusion detection, including DDoS, botnets, and brute-force attacks.
- TON_IoT: Focuses on IoT-specific threats, such as data injection and privilege escalation.
- UNSW-NB15: Provides a balanced dataset for evaluating novel cyber threats in smart city networks.
- Synthetic attack data: Adversarial samples generated using GANs (generative adversarial networks) to simulate zero-day attacks.
- Feature extraction: Packet-based statistics (e.g., source IP entropy, and payload size variance).
- Normalization: Min-max scaling to standardize features across datasets.
- Data partitioning: 70% training, 15% validation, and 15% testing, ensuring proper generalization.
4.2. Evaluation Metrics for Performance Assessment
- Detection accuracy and threat classification performance. The precision, recall, F1 score, and AUC-ROC metrics were applied to measure the system’s ability to correctly classify network anomalies across different IoT cyber threats.
- Scalability in federated learning-based anomaly detection. The model accuracy trend was analyzed over an increasing number of IoT edge nodes to evaluate the scalability and performance stability of federated learning (FL) across distributed deployments.
- Real-time response and automated mitigation efficiency. The response time of AI-driven security enforcement mechanisms was measured to determine how quickly the system could detect and neutralize cyber threats in an operational IoT environment.
4.3. Experimental Results and Comparative Analysis
4.3.1. Detection Performance of Hybrid AI Models
- Rule-based intrusion detection systems (IDS)
- Statistical methods (K-means and isolation forests)
- Single deep learning models (standalone CNN, LSTM, and autoencoder)
4.3.2. Scalability Analysis of Federated Learning
4.3.3. AI-Driven Adaptive Threat Mitigation Performance
5. Discussion
5.1. Comparison with State-of-the-Art Solutions
5.1.1. Performance Comparison with AI-Based Anomaly Detection Models
5.1.2. Scalability of Federated Learning-Based Anomaly Detection
5.1.3. Performance of AI-Based Adaptive Threat Mitigation
5.1.4. Explainability and Feature Importance in AI-Based Security
5.1.5. Summary of Comparison Findings
- Demonstrating better scalability in federated learning-based security frameworks, maintaining an accuracy above 96%, even with 100 IoT nodes (Figure 11).
- Providing low-latency anomaly detection, with response times under 300 ms, making its real-time deployment feasible in IoT networks (Figure 12).
- Enhancing AI transparency through SHAP-based feature importance analysis, improving trust and interpretability in anomaly detection decisions (Figure 10).
5.2. Scalability and Real-Time Feasibility
5.2.1. Can AI-Based Intrusion Detection Be Deployed at Scale in Smart Cities?
5.2.2. Federated Learning Enhancements
- Network costs: FL requires frequent model updates to be exchanged between edge devices and the FL aggregator. The communication overhead depends on the model size, update frequency, and network conditions. Optimizing FL aggregation frequency can balance detection performance and bandwidth efficiency.
- Aggregation latency: While FL reduces the data transfer costs, the central model aggregation process adds to the computational overhead. Experiments show that with 100 IoT nodes, model aggregation latency ranges between 240 and 310 ms (Figure 10), which is within acceptable limits for real-time security applications.
- Heterogeneous data distribution: IoT networks generate non-IID (non-independent and identically distributed) data, which can lead to model divergence across nodes. Future optimizations, such as personalized federated learning techniques, could improve model consistency in large-scale deployment.
Reducing the FL Update Overhead with Adaptive Model Update Strategies
- Adaptive model update frequency:
  - Instead of transmitting model updates at fixed intervals, edge devices dynamically adjust the update frequency based on network conditions and anomaly detection confidence levels.
  - High-confidence models update less frequently, reducing unnecessary communication overhead while maintaining detection accuracy.
- Gradient compression and model pruning:
  - Compressing gradient updates (e.g., via quantization techniques) reduces the size of transmitted model updates, improving communication efficiency.
  - Model pruning techniques allow less critical model parameters to be excluded from updates, further minimizing bandwidth usage.
- Hierarchical FL aggregation:
  - Instead of sending updates from all edge devices to a single central server, a multi-tier aggregation system can be implemented.
  - Local aggregators in edge clusters collect updates, perform initial model merging, and only send refined updates to the central server, reducing global communication overhead. (A brief sketch of the first two strategies follows this list.)
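A brief sketch of adaptive update scheduling and quantization-based compression; the stretch factor and the 8-bit uniform scheme are illustrative assumptions rather than the framework's specified mechanisms.

```python
import numpy as np

def should_send_update(confidence: float, base_interval: int, round_idx: int) -> bool:
    """Adaptive update frequency: high-confidence nodes report less often.
    The up-to-4x interval stretch is an illustrative choice."""
    interval = int(base_interval * (1 + 3 * confidence))
    return round_idx % max(interval, 1) == 0

def quantize_update(weights: np.ndarray, bits: int = 8):
    """Uniform quantization of a weight update before transmission
    (a simple stand-in for the gradient-compression idea above)."""
    w_min, w_max = float(weights.min()), float(weights.max())
    span = w_max - w_min
    scale = span / (2**bits - 1) if span > 0 else 1.0
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, w_min, scale  # receiver reconstructs: w_min + q * scale
```

A straggler-tolerant scheduler would call `should_send_update` each round, and 8-bit payloads shrink a float32 update by roughly a factor of four before it leaves the edge node.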
5.2.3. Real-Time Performance
5.3. Implications for AI-Driven Security in IoT Networks
- Privacy-preserving AI models: While federated learning ensures decentralized model training, regulatory concerns over data sharing and AI decision transparency remain areas requiring further investigation.
- Explainability in AI-driven security: Security professionals require interpretable AI models. Our use of SHAP-based feature analysis improves AI transparency, but additional efforts are needed to enhance trust in automated security decisions.
- Adaptive cybersecurity strategies: AI models need to continuously adapt to emerging threats. Integrating self-learning AI mechanisms and reinforcement learning could improve their real-time adaptation to zero-day attacks.
5.4. Limitations and Future Directions
- Computational overhead: Deep learning models require significant processing power, which may be challenging for resource-constrained IoT devices.
- Potential for adversarial attacks: AI-driven models are vulnerable to adversarial attacks, where attackers manipulate network data to evade detection.
- Real-world deployment testing: Although experiments were conducted using real-world datasets, future work should focus on live deployment in IoT networks to assess system robustness in real-time.
- Enhancing AI explainability and trustworthiness: Exploring interpretable AI models for cybersecurity.
- Developing lightweight AI models for edge devices: Optimizing deep learning architectures for low-power IoT environments.
- Integrating reinforcement learning for dynamic threat adaptation: Developing AI models that learn and evolve in response to new attack patterns.
6. Conclusions and Future Work
6.1. Conclusions
- The federated learning-based anomaly detection framework scales up efficiently, maintaining an accuracy above 96% even across 100 IoT edge nodes (Figure 10), proving its feasibility for large-scale smart city deployment.
- The AI-driven autonomous threat mitigation system provides a real-time attack response, achieving a 97% mitigation success rate with response times under 300 ms (Figure 11), making it suitable for real-world, high-speed security enforcement.
- The SHAP-based feature importance analysis (Figure 9) enhances AI transparency, addressing the black-box nature of deep learning models by providing explainability for security decisions.
6.2. Future Work
- Lightweight AI Models for Resource-Constrained IoT Devices: Deep learning models, particularly LSTMs and CNNs, require significant computational resources, which may not be feasible for low-power IoT sensors and edge devices. Future research should focus on developing lightweight deep learning architectures, such as quantized neural networks and knowledge distillation, optimized for low-latency IoT security applications.
- Robustness Against Adversarial Attacks: AI-based security models are vulnerable to adversarial attacks, where attackers intentionally manipulate network traffic to bypass anomaly detection mechanisms. While our proposed system effectively detects known cyber threats, no empirical evaluation of adversarial robustness was conducted in this study. Future research should focus on developing and evaluating adversarial defense mechanisms in AI-driven cybersecurity. Several promising directions include:
- Adversarial training by introducing adversarially generated attack samples during model training to enhance resilience against adversarial perturbations.
- Defensive AI strategies leveraging defense distillation, anomaly-aware embeddings, and gradient masking techniques to mitigate adversarial evasion.
- Adversarial detection modules, deploying secondary anomaly detection layers that can flag adversarially manipulated network traffic in real time.
- Empirical evaluations of adversarial robustness, through systematic testing of AI models against adversarial attacks (e.g., the fast gradient sign method (FGSM) and projected gradient descent (PGD)), to quantify vulnerability levels and propose countermeasures.
Ensuring robustness against adversarial manipulation is essential for maintaining trust and reliability in AI-driven security frameworks, particularly in critical smart city infrastructures. While adversarial examples were used during training and evaluation data preparation, the current study did not conduct isolated adversarial robustness testing to quantify the system’s performance under targeted attacks. We recognize this as an important limitation and propose integrating formal benchmarking against evasion techniques (e.g., FGSM, PGD, and AutoAttack) in future iterations of the framework.
- Self-Learning AI for Dynamic Threat Adaptation: Cyber threats evolve rapidly, and static AI models may fail to detect novel attack patterns. Future research should explore:
- Reinforcement learning (RL) for adaptive threat mitigation, with AI-driven intrusion detection systems (IDS) that can learn from evolving attack behaviors.
- Continual learning by implementing online learning techniques to dynamically update AI models without retraining from scratch.
This approach will enable intrusion detection systems to autonomously adapt to emerging cyber threats in real time. In addition to dynamic detection, future efforts will explore learning-based mitigation, such as reinforcement learning agents that select optimal countermeasures based on observed threat patterns, policy effectiveness, and system state.
- Federated Learning Enhancements for Privacy-Preserving Security: While federated learning (FL) enhances the system’s scalability and privacy, communication overhead and model divergence across heterogeneous IoT networks remain challenges. Future research should explore:
- Efficient FL aggregation methods: Techniques such as differential privacy, secure multi-party computation (MPC), and blockchain-based federated learning can enhance privacy, security, and communication efficiency.
- Personalized FL for IoT networks: Client-specific AI models should be developed that adapt to device heterogeneity while maintaining global security consistency.
- Real-World Deployment, Applicability, and Benchmarking: Although our system was evaluated using real-world datasets, deploying it in live IoT networks will provide deeper insights into:
- Real-time performance and adaptive response efficiency.
- The network overhead that is associated with federated learning updates.
- The effectiveness of AI-driven autonomous mitigation strategies in production environments.
- Decentralized training via federated learning significantly reduces data transmission costs and privacy risks.
- Edge AI deployment ensures low-latency, real-time anomaly detection.
- AI-driven adaptive threat mitigation allows for autonomous response, reducing the burden on human operators.
6.3. Final Thoughts
Funding
Data Availability Statement
Conflicts of Interest
References
- Priyadarshini, I. Anomaly Detection of IoT Cyberattacks in Smart Cities Using Federated Learning and Split Learning. Big Data Cogn. Comput. 2024, 8, 21.
- Institute for Defense & Business. What Are the Cybersecurity Risks for Smart Cities? Available online: https://www.idb.org/what-are-the-cybersecurity-risks-for-smart-cities/ (accessed on 17 March 2025).
- Deloitte Global. Security and Compliance in 5G and AI-Powered Edge Networks. Available online: https://www.deloitte.com/global/en/services/consulting-risk/perspectives/security-compliance-in-5g-ai-powered-edge-networks.html (accessed on 17 March 2025).
- UpGuard. How 5G Technology Affects Cybersecurity: Looking to the Future. Available online: https://www.upguard.com/blog/how-5g-technology-affects-cybersecurity (accessed on 17 March 2025).
- Mirza, N.; Yunis, M.; Khalil, A.; Mirza, N. Towards a Conceptual Framework for AI-Driven Anomaly Detection in Smart City IoT Networks for Enhanced Cybersecurity. J. Innov. Knowl. 2024, 9, 100601.
- Cybersecurity Challenges in Smart Cities: An Overview and Future Prospects. Mesopotamian Journal of CyberSecurity. Available online: https://mesopotamian.press/journals/index.php/CyberSecurity/article/view/14 (accessed on 17 March 2025).
- Aluwala, A. AI-Driven Anomaly Detection in Network Monitoring Techniques and Tools. J. Artif. Intell. Cloud Comput. 2024, 1–6.
- Babu, C.V.S.; Simon, P.A. Adaptive AI for Dynamic Cybersecurity Systems: Enhancing Protection in a Rapidly Evolving Digital Landscape. Available online: https://www.igi-global.com/gateway/chapter/337688 (accessed on 17 March 2025).
- Kuraku, D.S. Adaptive Security Framework for IoT: Utilizing AI and ML to Counteract Evolving Cyber Threats. Educ. Adm. Theory Pract. 2023, 29, 1573–1580.
- Palo Alto Networks. What Is the Role of AI in Threat Detection? Available online: https://origin-www.paloaltonetworks.com/cyberpedia/ai-in-threat-detection (accessed on 17 March 2025).
- Kuguoglu, B.K.; van der Voort, H.; Janssen, M. The Giant Leap for Smart Cities: Scaling Up Smart City Artificial Intelligence of Things (AIoT) Initiatives. Sustainability 2021, 13, 12295.
- Towards Large-Scale IoT Deployments in Smart Cities: Requirements and Challenges. SpringerLink. Available online: https://link.springer.com/chapter/10.1007/978-3-031-50514-0_6 (accessed on 17 March 2025).
- Zyrianoff, I.; Borelli, F.; Biondi, G.; Heideker, A.; Kamienski, C. Scalability of Real-Time IoT-Based Applications for Smart Cities. In Proceedings of the 2018 IEEE Symposium on Computers and Communications (ISCC), Natal, Brazil, 25–28 June 2018; pp. 00688–00693.
- Abusitta, A.; Silva de Carvalho, G.H.; Abdel Wahab, O.; Halabi, T.; Fung, B.C.M.; Al Mamoori, S. Deep Learning-Enabled Anomaly Detection for IoT Systems. Internet Things 2022, 21, 100656.
- AI and IoT in Smart Cities Security. Available online: https://www.truehomeprotection.com/leveraging-ai-and-iot-for-next-generation-security-systems-in-smart-cities/ (accessed on 17 March 2025).
- Sarbhukan (Bodade), V.V.; More, J.S.; Jadhav, Y. Smart City Infrastructure Monitoring Using AI and IoT Technologies. Int. J. Intell. Syst. Appl. Eng. 2024, 12, 1687–1695.
- Sáez-de-Cámara, X.; Flores, J.L.; Arellano, C.; Urbieta, A.; Zurutuza, U. Clustered Federated Learning Architecture for Network Anomaly Detection in Large Scale Heterogeneous IoT Networks. Comput. Secur. 2023, 131, 103299.
- Ortiz, O.O.; Pastor Franco, J.Á.; Alcover Garau, P.M.; Herrero Martín, R. Innovative Mobile Robot Method: Improving the Learning of Programming Languages in Engineering Degrees. IEEE Trans. Educ. 2017, 60, 143–148.
- Elshaiekh, N.E.M.; Hassan, Y.A.A.; Abdallah, A.A.A. The Impacts of Remote Working on Workers Performance. In Proceedings of the 2018 International Arab Conference on Information Technology (ACIT), Sultan Qaboos University, Muscat, Oman, 28–30 November 2018; pp. 1–5.
- Goswami, M.J. AI-Based Anomaly Detection for Real-Time Cybersecurity. Int. J. Res. Rev. Tech. 2024, 3, 45–53.
- Ngo, M.V.; Luo, T.; Quek, T.Q.S. Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach. ACM Trans. Internet Things 2021, 3, 1–23.
- Liu, Y.; Zhou, Y.; Yang, K.; Wang, X. Unsupervised Deep Learning for IoT Time Series. IEEE Internet Things J. 2023, 10, 14285–14306.
- Nguyen, T.-A.; Le, L.T.; Nguyen, T.D.; Bao, W.; Seneviratne, S.; Hong, C.S.; Tran, N.H. Federated PCA on Grassmann Manifold for IoT Anomaly Detection. IEEE/ACM Trans. Netw. 2024, 32, 4456–4471.
- Zolanvari, M.; Ghubaish, A.; Jain, R. ADDAI: Anomaly Detection Using Distributed AI. In Proceedings of the 2021 IEEE International Conference on Networking, Sensing and Control (ICNSC), Xiamen, China, 3–5 December 2021; Volume 1, pp. 1–6.
- Subasi, O.; Cree, J.; Manzano, J.; Peterson, E. A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection. arXiv 2024, arXiv:2407.04009.
- Garcia, E. Smart City and IoT Data Collection Leveraging Generative AI. Available online: https://philpapers.org/rec/GARSCA-9 (accessed on 15 May 2025).
- Manthena, H.; Shajarian, S.; Kimmell, J.; Abdelsalam, M.; Khorsandroo, S.; Gupta, M. Explainable Artificial Intelligence (XAI) for Malware Analysis: A Survey of Techniques, Applications, and Open Challenges. IEEE Access 2025, 13, 61611–61640.
Study | Model Type | Deployment Focus | FL/Edge AI | Autonomous Mitigation | Key Limitation |
---|---|---|---|---|---|
Abusitta et al. [14] | Denoising Autoencoder | IoT anomaly detection | No | No | Limited scalability |
Priyadarshini [1] | Split Learning + FL | Distributed IoT | Yes | No | No autonomous response |
Sáez-de-Cámara et al. [17] | Clustered Federated Learning | IoT network security | Yes | No | Limited to clustering |
Liu et al. [22] | Unsupervised LSTM | IoT time series | No | No | No real-time validation |
This Work | Hybrid DL (AE + LSTM + CNN) | 5G-enabled smart cities | Yes | Yes | Comprehensive solution |
Dominant SHAP Feature | Anomaly Indication | Example Mitigation Action |
---|---|---|
High UDP Packet Entropy | UDP Flood or Botnet | Throttle or block high-entropy UDP flows |
Rare Destination Port | Port Scanning | Apply IP-based rate limiting or quarantine |
Payload Size Spike | Malware Injection | Isolate source device; deep packet inspection |
Sudden Increase in Login Failures | Brute-Force Attempt | Block offending IP or reset credentials |
Abnormal Packet Timing Patterns | DoS Attack | Temporarily drop or delay traffic bursts |
Model | Precision | Recall | F1 Score | AUC-ROC |
---|---|---|---|---|
Proposed Hybrid AI Model | 97.5% | 96.2% | 96.8% | 98.3% |
Rule-Based IDS | 85.4% | 79.1% | 82.1% | 83.2% |
K-Means Clustering | 74.2% | 71.5% | 72.8% | 75.1% |
CNN-Only Model | 90.1% | 88.7% | 89.4% | 91.2% |
LSTM-Only Model | 91.5% | 89.3% | 90.4% | 92.8% |
Autoencoder-Only | 88.9% | 87.2% | 88.0% | 89.7% |
Model Variant | Precision | Recall | F1 Score | AUC-ROC |
---|---|---|---|---|
Autoencoder Only | 88.9% | 87.2% | 88.0% | 89.7% |
LSTM Only | 91.5% | 89.3% | 90.4% | 92.8% |
CNN Only | 90.1% | 88.7% | 89.4% | 91.2% |
Autoencoder + LSTM | 93.3% | 91.2% | 92.2% | 94.5% |
Autoencoder + CNN | 92.4% | 90.5% | 91.4% | 93.7% |
LSTM + CNN | 94.2% | 91.9% | 93.0% | 95.1% |
Hybrid (AE + LSTM + CNN) | 97.5% | 96.2% | 96.8% | 98.3% |
Metric | Value | Measurement Basis |
---|---|---|
False Positive Rate | 5.1% | From confusion matrix (Figure 6) |
False Negative Rate | 4.3% | From confusion matrix (Figure 6) |
Memory Usage | 71.2 MB | Inference model size (Raspberry Pi 4) |
Power Consumption | ~2.4 W | Measured during live inference tests |
Inference Time | 287 ms/sample | Median across test set |
Number of IoT Nodes | Model Accuracy (%) | Model Update Latency (ms) |
---|---|---|
10 | 92.5% | 120 |
25 | 94.8% | 180 |
50 | 96.1% | 240 |
100 | 96.5% | 310 |
150 | 96.6% | 365 |
200 | 96.6% | 390 |
Threat Type | Response Time (ms) | Automated Mitigation Success (%) |
---|---|---|
DDoS Attack | 250 ms | 97.8% |
Brute-Force Login | 180 ms | 95.2% |
Malware Infection | 270 ms | 96.5% |
Feature | Proposed Approach | Comparison with Literature | References |
---|---|---|---|
Detection Accuracy | 97.5% Precision, 96.2% Recall | Higher than CNN (90%) and LSTM (91.5%) | [14,22] |
Federated Learning Scalability | Maintains 96.5% accuracy across 100 IoT nodes | Better than previous FL models struggling beyond 50 nodes | [17] |
Adaptive Threat Mitigation | Response time < 300 ms, mitigation success ~97% | Lower latency than rule-based IDS (~500 ms) | [25] |
Explainability and AI Transparency | SHAP-based feature importance analysis | Few AI security models offer explainability | [9,25] |
Study | Model Type | Dataset(s) | FL/Edge AI | Real-Time | Explainability (XAI) | Mitigation | Key Limitation |
---|---|---|---|---|---|---|---|
Abusitta et al. [14] | Denoising Autoencoder | Custom IoT traffic | No | No | No | None | Limited scalability |
Priyadarshini [1] | Split Learning + FL | Private IoT logs | Yes | Partial | No | None | No autonomous response |
Sáez-de-Cámara et al. [17] | Clustered FL (MLP-based) | Custom industrial IoT | Yes | No | No | None | No real-time operation |
Liu et al. [22] | Unsupervised LSTM | IoT time series (TON_IoT) | No | No | No | None | No XAI or deployment validation |
This work | Hybrid (AE + LSTM + CNN) | CICIDS2017, TON_IoT, UNSW | Yes | Yes | Yes (SHAP) | Rules-based | No formal adversarial benchmark as yet |