A Comparative Study of Federated Learning and Amino Acid Encoding with IoT Malware Detection as a Case Study
Abstract
1. Introduction
- We present a systematic and controlled evaluation of federated learning for IoT malware detection, explicitly comparing centralized and federated training under both IID and Non-IID data distributions, thereby quantifying the practical performance gap between these learning paradigms in realistic IoT settings.
- We provide a detailed empirical analysis of amino acid-based feature encoding within federated learning environments, demonstrating its consistent impact on detection accuracy, false positive rate, and Matthews correlation coefficient across multiple architectures and data distributions.
- We investigate the interaction between model complexity and learning paradigm by comparing a lightweight multi-layer perceptron (MLP) with a deep residual neural network, showing that increased architectural depth yields only limited performance gains and does not compensate for information loss introduced by feature encoding.
- We establish experimentally that federated learning achieves performance close to centralized training even under severe data heterogeneity, indicating that feature representation and task formulation exert a stronger influence on intrusion detection performance than the choice between centralized and federated learning.
2. Materials and Methods
2.1. Experimental Setup
- Centralized training on the complete dataset.
- Federated learning with equal (IID) data distribution across 5 clients.
- Federated learning with skewed (Non-IID) data distribution across 5 clients.
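
The IID and Non-IID partitions described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the Dirichlet-based label-skew protocol is our assumption (a common choice, cf. Hsu et al. in the references), and `alpha` controls the severity of the skew.

```python
import random
from collections import defaultdict

def split_iid(indices, n_clients=5, seed=0):
    """Shuffle sample indices and deal them evenly across clients."""
    rng = random.Random(seed)
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::n_clients] for i in range(n_clients)]

def split_noniid(labels, n_clients=5, alpha=0.5, seed=0):
    """Label-skewed split: for each class, draw client proportions
    from a Dirichlet(alpha) and partition that class accordingly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    clients = [[] for _ in range(n_clients)]
    for _cls, idx in by_class.items():
        rng.shuffle(idx)
        # Dirichlet sample via normalized Gamma draws (stdlib only)
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(gammas)
        props = [g / total for g in gammas]
        # cumulative cut points over this class's samples
        cuts, acc = [], 0.0
        for p in props[:-1]:
            acc += p
            cuts.append(int(acc * len(idx)))
        start = 0
        for c, stop in enumerate(cuts + [len(idx)]):
            clients[c].extend(idx[start:stop])
            start = stop
    return clients
```

Smaller `alpha` concentrates each class on fewer clients; `alpha → ∞` recovers an approximately IID split.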
2.2. Dataset
2.3. Subset Construction and Splitting Protocol
2.4. Amino Acid Encoding
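
The encoding itself follows the authors' earlier work (Ibaisi et al., 2023, *Electronics*): numeric flow features are mapped to amino-acid residues and protein-level descriptors are then computed (e.g., via Biopython's ProtParam). The sketch below is an illustrative stand-in, not the paper's exact mapping: the equal-width binning is hypothetical, while the Kyte-Doolittle hydropathy scale is the one ProtParam uses for GRAVY.

```python
# Illustrative only: the exact feature-to-residue mapping used in the paper
# follows Ibaisi et al. (2023); the 20-bin scheme below is a stand-in.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues

# Kyte-Doolittle hydropathy scale, as used by ProtParam's GRAVY descriptor.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def encode_flow(features, lo, hi):
    """Map each min-max-scaled numeric feature to one of 20 residues."""
    seq = []
    for x, a, b in zip(features, lo, hi):
        scaled = 0.0 if b == a else (x - a) / (b - a)
        bin_ = min(int(scaled * 20), 19)  # 20 equal-width bins
        seq.append(AMINO_ACIDS[bin_])
    return "".join(seq)

def gravy(seq):
    """Grand average of hydropathy, as ProtParam defines it."""
    return sum(KD[aa] for aa in seq) / len(seq)
```

A flow thus becomes a short "peptide" whose biochemical descriptors (GRAVY, and in ProtParam also aromaticity, instability index, etc.) form the encoded feature vector.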
2.5. Feature Representation and Potential Overfitting Concerns
2.6. Experiment 1: Simple Multi-Layer Perceptron Architecture
| Layer | Output Size | Activation | Parameters |
|---|---|---|---|
| Input | 11 or 10 | - | - |
| Hidden 1 | 64 | ReLU + BatchNorm + Dropout(0.3) | 704 |
| Hidden 2 | 32 | ReLU + BatchNorm + Dropout(0.3) | 2112 |
| Output | 2 | Softmax | 66 |
| Total Trainable Parameters | 2882 | ||
2.7. Experiments 2 and 3: Complex Residual Neural Network Architecture
| Layer | Output Channels | Residual Blocks | Parameters |
|---|---|---|---|
| Input | 11 or 10 | - | - |
| Hidden 1 | 128 | 2 | 15,488 |
| Hidden 2 | 64 | 2 | 41,600 |
| Hidden 3 | 32 | 2 | 10,688 |
| Hidden 4 | 16 | 2 | 2688 |
| Output | 4 | - | 68 |
| Total Trainable Parameters | 70,532 | ||
2.8. Experimental Design Summary
3. Results
3.1. Experiment 1 Results: Simple MLP Architecture
| Features | Paradigm | Accuracy | Precision | Recall | F1 | AUC | MCC |
|---|---|---|---|---|---|---|---|
| Raw | Centralized | 0.9859 ± 0.0026 | 0.9838 ± 0.0022 | 0.9819 ± 0.0027 | 0.9855 ± 0.0026 | 0.9929 ± 0.0016 | 0.9710 ± 0.0072 |
| Raw | Fed-IID | 0.9856 ± 0.0026 | 0.9886 ± 0.0009 | 0.9870 ± 0.0014 | 0.9872 ± 0.0029 | 0.9943 ± 0.0036 | 0.9736 ± 0.0045 |
| Raw | Fed-NonIID | 0.9901 ± 0.0026 | 0.9833 ± 0.0030 | 0.9833 ± 0.0059 | 0.9874 ± 0.0025 | 0.9856 ± 0.0039 | 0.9696 ± 0.0083 |
| Amino acid-encoded | Centralized | 0.9292 ± 0.0073 | 0.9278 ± 0.0023 | 0.9247 ± 0.0038 | 0.9298 ± 0.0044 | 0.9706 ± 0.0042 | 0.8447 ± 0.0148 |
| Amino acid-encoded | Fed-IID | 0.9323 ± 0.0034 | 0.9287 ± 0.0024 | 0.9347 ± 0.0101 | 0.9298 ± 0.0041 | 0.9748 ± 0.0012 | 0.8632 ± 0.0098 |
| Amino acid-encoded | Fed-NonIID | 0.9247 ± 0.0061 | 0.9304 ± 0.0045 | 0.9256 ± 0.0047 | 0.9211 ± 0.0043 | 0.9689 ± 0.0015 | 0.8448 ± 0.0213 |
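
The MCC column above is the Matthews correlation coefficient between predictions and labels; unlike accuracy, it remains informative under class imbalance because it uses all four cells of the confusion matrix. A minimal binary implementation:

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: undefined denominator (a degenerate classifier) -> 0
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

MCC ranges from −1 (total disagreement) through 0 (chance level) to +1 (perfect prediction); the multi-class generalization (Gorodkin, cited in the references) is what Experiment 3 reports.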
3.2. Experiments 2 and 3 Results: Complex Residual Network Architecture
| Features | Configuration | Accuracy | Precision | Recall | F1-Score | AUC-ROC | MCC |
|---|---|---|---|---|---|---|---|
| Raw | Centralized | 0.9798 ± 0.0032 | 0.9802 ± 0.0070 | 0.9791 ± 0.0039 | 0.9773 ± 0.0043 | 0.9927 ± 0.0039 | 0.9589 ± 0.0026 |
| Raw | Federated IID | 0.9740 ± 0.0082 | 0.9723 ± 0.0012 | 0.9739 ± 0.0042 | 0.9726 ± 0.0035 | 0.9819 ± 0.0079 | 0.9461 ± 0.0056 |
| Raw | Federated Non-IID | 0.7996 ± 0.0165 | 0.8200 ± 0.0039 | 0.7956 ± 0.0113 | 0.7921 ± 0.0057 | 0.8846 ± 0.0031 | 0.6055 ± 0.0151 |
| Amino acid-encoded | Centralized | 0.9760 ± 0.0024 | 0.9772 ± 0.0020 | 0.9772 ± 0.0024 | 0.9777 ± 0.0046 | 0.9890 ± 0.0019 | 0.9513 ± 0.0067 |
| Amino acid-encoded | Federated IID | 0.9714 ± 0.0010 | 0.9692 ± 0.0062 | 0.9646 ± 0.0061 | 0.9682 ± 0.0019 | 0.9807 ± 0.0042 | 0.9412 ± 0.0057 |
| Amino acid-encoded | Federated Non-IID | 0.8146 ± 0.0080 | 0.8250 ± 0.0086 | 0.8117 ± 0.0141 | 0.8035 ± 0.0089 | 0.8833 ± 0.0076 | 0.6337 ± 0.0081 |
3.3. Comparative Analysis
3.4. Statistical Significance of Differences Across All Experiments
4. Discussion
4.1. Impact of Feature Representation
4.2. Federated Learning Under Data Heterogeneity
4.3. Role of Model Complexity
4.4. Practical Implications
4.5. Limitations and Future Directions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| DoS | Denial of Service |
| FL | Federated Learning |
| IID | Independent and Identically Distributed |
| IoT | Internet of Things |
| MCC | Matthews Correlation Coefficient |
| MLP | Multi-Layer Perceptron |
| NIDS | Network Intrusion Detection System |
| PQS-FP | Parameter Quantity Shifting–Fitting Performance |
References
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A.y. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282.
- Hernandez-Ramos, J.L.; Karopoulos, G.; Chatzoglou, E.; Kouliaridis, V.; Marmol, E.; Gonzalez-Vidal, A.; Kambourakis, G. Intrusion Detection Based on Federated Learning: A Systematic Review. ACM Comput. Surv. 2025, 57, 1–36.
- Yang, H.; Wang, Z.; Chou, B.; Xu, S.; Wang, H.; Wang, J.; Zhang, Q. An Empirical Study of the Impact of Federated Learning on Machine Learning Model Accuracy. arXiv 2025.
- Li, T.; Sahu, A.K.; Zaheer, M.; Sanjabi, M.; Talwalkar, A.; Smith, V. Federated Optimization in Heterogeneous Networks. In Proceedings of the Machine Learning and Systems (MLSys), Austin, TX, USA, 2–4 March 2020; Volume 2, pp. 429–450.
- Karimireddy, S.P.; Kale, S.; Mohri, M.; Reddi, S.; Stich, S.; Suresh, A.T. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning. In Proceedings of the 37th International Conference on Machine Learning (ICML), Virtual Event, 13–18 July 2020; Volume 119, pp. 5132–5143.
- Wang, J.; Liu, Q.; Liang, H.; Joshi, G.; Poor, H.V. Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual Event, 6–12 December 2020; Volume 33, pp. 7611–7623.
- Yurdem, B.; Kuzlu, M.; Gullu, M.K.; Catak, F.O.; Tabassum, M. Federated learning: Overview, strategies, applications, tools and future directions. Heliyon 2024, 10, e38137.
- Bo, L.; Huang, H.; Gu, S.; Chen, Y. Federated Learning: From Algorithms To System Implementation; World Scientific Publishing Company: Singapore, 2024.
- Garst, S.; Dekker, J.; Reinders, M. A comprehensive experimental comparison between federated and centralized learning. Database 2025, 2025, baaf016.
- Selvam, P.; Karthikeyan, P.; Manochitra, S.; Sujith, A.V.L.N.; Ganesan, T.; Ayyasamy, R.; Shuaib, M.; Alam, S.; Rajendran, A. Federated learning-based hybrid convolutional recurrent neural network for multi-class intrusion detection in IoT networks. Discov. Internet Things 2025, 5, 39.
- Lu, Z.; Pan, H.; Dai, Y.; Si, X.; Zhang, Y. Federated Learning With Non-IID Data: A Survey. IEEE Internet Things J. 2024, 11, 19188–19209.
- Zhu, H.; Xu, J.; Liu, S.; Jin, Y. Federated learning on non-IID data: A survey. Neurocomputing 2021, 465, 371–390.
- Bilal, M.A.; Ul Islam, I.; Idrees, S.; Qasim, M.; Khan, M.J.; Khan, J. Dataset-centric evaluation of federated intrusion detection models in IoT networks. Sci. Rep. 2026, 16, 2683.
- Khraisat, A.; Talukder, M.A.; Uddin, M.A.; Alazab, A. RF-FedAvg: Federated learning-based random forest model for intrusion detection in wireless sensor networks. Clust. Comput. 2025, 28, 873.
- Rehman, M.U.; Abrar, M.; Khalid, S.; Kazim, M.; Singh, V.K. Metaheuristically Enhanced ANN-Based Intrusion Detection System with Explainable AI Integration. In Proceedings of the 2025 International Joint Conference on Neural Networks (IJCNN), Rome, Italy, 6–11 July 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 1–8.
- García-Teodoro, P.; Díaz-Verdejo, J.; Maciá-Fernández, G.; Vázquez, E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Comput. Secur. 2009, 28, 18–28.
- Arun Joseph, A.; Ranjani, P.; Suresh Kumar, V. Federated Deep Learning-Based Intrusion Detection System for Multi-Attack Detection in MANETs. Int. J. Res. Appl. Sci. Eng. Technol. 2025, 13, 1057–1065.
- Ibaisi, T.A.; Kuhn, S.; Kaiiali, M.; Kazim, M. Network Intrusion Detection Based on Amino Acid Sequence Structure Using Machine Learning. Electronics 2023, 12, 4294.
- Rashid, O.F.; Othman, Z.A.; Zainudin, S.; Samsudin, N.A. DNA Encoding and STR Extraction for Anomaly Intrusion Detection Systems. IEEE Access 2021, 9, 31892–31907.
- Arnob, A.K.B.; Chowdhury, R.R.; Chaiti, N.A.; Saha, S.; Roy, A. A comprehensive systematic review of intrusion detection systems: Emerging techniques, challenges, and future research directions. J. Edge Comput. 2025, 4, 73–104.
- Cho, H.; Lim, S.; Belenko, V.; Kalinin, M.; Zegzhda, D.; Nuralieva, E. Application and improvement of sequence alignment algorithms for intrusion detection in the Internet of Things. In Proceedings of the 2020 IEEE Conference on Industrial Cyberphysical Systems (ICPS), Tampere, Finland, 10–12 June 2020; Volume 1, pp. 93–97.
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 7–9 July 2015; Bach, F., Blei, D., Eds.; PMLR: Cambridge, MA, USA, 2015; Volume 37, pp. 448–456.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017.
- Ibaisi, T.A. FedAminoIoT: Federated Learning Framework for IoT Intrusion Detection with Amino Acid Encoding. Available online: https://github.com/stefhk3/federated-learning-and-IoT.git (accessed on 15 January 2026).
- Hsu, T.M.H.; Qi, H.; Brown, M. Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification. arXiv 2019.
- Iqbal, M.F. CTU-IoT-Malware IoT Network Traffic. Available online: https://www.kaggle.com/datasets/agungpambudi/network-malware-detection-connection-analysis/data (accessed on 5 December 2025).
- Garcia, S.; Parmisano, A.; Erquiaga, M.J. IoT-23: A Labeled Dataset with Malicious and Benign IoT Network Traffic (Version 1.0.0) [Data Set]. 2020. Available online: https://zenodo.org/records/4743746 (accessed on 5 January 2026).
- Biopython. Analyzing Protein Sequences with the ProtParam Module. Available online: https://biopython.org/wiki/ProtParam (accessed on 10 March 2026).
- PyTorch. PyTorch: An Imperative Style, High-Performance Deep Learning Library. 2024. Available online: https://pytorch.org/ (accessed on 15 January 2026).
- Jarrett, K.; Kavukcuoglu, K.; Ranzato, M.; LeCun, Y. What is the best multi-stage architecture for object recognition? In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 27 September–4 October 2009; pp. 2146–2153.
- Gorodkin, J. Comparing two K-category assignments by a K-category correlation coefficient. Comput. Biol. Chem. 2004, 28, 367–374.
- Xiang, Q.; Wang, X.; Lei, L.; Song, Y. Dynamic bound adaptive gradient methods with belief in observed gradients. Pattern Recognit. 2025, 168, 111819.
- Xiang, Q.; Wang, X.; Lai, J.; Lei, L.; Song, Y.; He, J.; Li, R. Quadruplet depth-wise separable fusion convolution neural network for ballistic target recognition with limited samples. Expert Syst. Appl. 2024, 235, 121182.






Experiment 3 results: complex residual network, multi-class classification (mean ± standard deviation over runs).

| Features | Configuration | Accuracy | Precision | Recall | F1-Score | AUC-ROC | MCC |
|---|---|---|---|---|---|---|---|
| Raw | Centralized | 0.9770 ± 0.0014 | 0.9750 ± 0.0042 | 0.9795 ± 0.0067 | 0.9794 ± 0.0025 | 0.9889 ± 0.0053 | 0.9569 ± 0.0069 |
| Raw | Federated IID | 0.9779 ± 0.0060 | 0.9761 ± 0.0033 | 0.9754 ± 0.0008 | 0.9735 ± 0.0009 | 0.9881 ± 0.0015 | 0.9497 ± 0.0070 |
| Raw | Federated Non-IID | 0.9696 ± 0.0051 | 0.9717 ± 0.0036 | 0.9746 ± 0.0028 | 0.9721 ± 0.0051 | 0.9883 ± 0.0032 | 0.9465 ± 0.0031 |
| Amino acid-encoded | Centralized | 0.9294 ± 0.0067 | 0.9308 ± 0.0062 | 0.9316 ± 0.0028 | 0.9298 ± 0.0068 | 0.9709 ± 0.0032 | 0.8606 ± 0.0083 |
| Amino acid-encoded | Federated IID | 0.9332 ± 0.0037 | 0.9335 ± 0.0071 | 0.9304 ± 0.0043 | 0.9371 ± 0.0056 | 0.9738 ± 0.0056 | 0.8665 ± 0.0127 |
| Amino acid-encoded | Federated Non-IID | 0.9295 ± 0.0042 | 0.9226 ± 0.0102 | 0.9275 ± 0.0032 | 0.9294 ± 0.0051 | 0.9685 ± 0.0010 | 0.8532 ± 0.0091 |
Per-class precision (P), recall (R), and F1-score for the multi-class task (Benign, DoS, Probe, Web Attack).

| Features | Config. | Benign P | Benign R | Benign F1 | DoS P | DoS R | DoS F1 | Probe P | Probe R | Probe F1 | Web P | Web R | Web F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Raw | Centralized | 0.987 | 0.991 | 0.989 | 0.965 | 0.958 | 0.961 | 0.972 | 0.968 | 0.970 | 0.988 | 0.997 | 0.992 |
| Raw | Fed-IID | 0.985 | 0.989 | 0.987 | 0.961 | 0.954 | 0.958 | 0.968 | 0.965 | 0.967 | 0.986 | 0.996 | 0.991 |
| Raw | Fed-NonIID | 0.983 | 0.987 | 0.985 | 0.958 | 0.951 | 0.954 | 0.965 | 0.962 | 0.964 | 0.990 | 0.996 | 0.993 |
| Amino acid-encoded | Centralized | 0.952 | 0.965 | 0.958 | 0.901 | 0.885 | 0.893 | 0.918 | 0.902 | 0.910 | 0.949 | 0.972 | 0.960 |
| Amino acid-encoded | Fed-IID | 0.954 | 0.967 | 0.960 | 0.905 | 0.889 | 0.897 | 0.920 | 0.905 | 0.912 | 0.951 | 0.974 | 0.962 |
| Amino acid-encoded | Fed-NonIID | 0.949 | 0.963 | 0.956 | 0.898 | 0.881 | 0.889 | 0.915 | 0.899 | 0.907 | 0.950 | 0.972 | 0.961 |
Cross-experiment accuracy summary (binary classification), with model size and communication budget.

| Configuration | Raw, Experiment 1 | Raw, Experiment 2 | Encoded, Experiment 1 | Encoded, Experiment 2 |
|---|---|---|---|---|
| Centralized | 98.6% | 98.0% | 92.9% | 97.6% |
| Federated IID | 98.6% | 97.4% | 93.2% | 97.1% |
| Federated Non-IID | 99.0% | 80.0% | 92.5% | 81.5% |
| Parameters | 2882 | 70,532 | 2882 | 70,532 |
| Communication Rounds | 50 | 100 | 50 | 100 |
| Experiment | Comparison | t-Statistic | p-Value | Significance |
|---|---|---|---|---|
| Experiment 1: Simple MLP (Binary) | Centralized vs. Fed-IID (Raw) | −0.207 | 0.846 | ns |
| | Centralized vs. Fed-NonIID (Raw) | −2.228 | 0.090 | ns |
| | Fed-IID vs. Fed-NonIID (Raw) | −3.537 | 0.024 | * |
| | Raw vs. Encoded (Centralized) | 19.261 | <0.001 | *** |
| | Raw vs. Encoded (Fed-IID) | 17.617 | <0.001 | *** |
| Experiment 2: Complex Residual Network (Binary) | Centralized vs. Fed-IID (Raw) | 2.484 | 0.068 | ns |
| | Centralized vs. Fed-NonIID (Raw) | 31.023 | <0.001 | *** |
| | Fed-IID vs. Fed-NonIID (Raw) | 27.169 | <0.001 | *** |
| | Raw vs. Encoded (Centralized) | 2.780 | 0.050 | * |
| Experiment 3: Complex Residual Network (Multi-class) | Centralized vs. Fed-IID (Raw) | −1.027 | 0.362 | ns |
| | Centralized vs. Fed-NonIID (Raw) | 2.176 | 0.095 | ns |
| | Fed-IID vs. Fed-NonIID (Raw) | 2.468 | 0.069 | ns |
| | Raw vs. Encoded (Centralized) | 18.355 | <0.001 | *** |
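
The p-values in the table are consistent with a two-tailed paired t-test over the 5 independent runs (df = 4): for example, |t| = 2.228 at df = 4 gives p ≈ 0.090. The pairing assumption is our inference from those degrees of freedom, not an explicit statement in this excerpt; the t-statistic itself is simple to recompute:

```python
import math

def paired_t(a, b):
    """Paired t-statistic: mean per-run difference over its standard error."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)
```

The two-tailed p-value at df = n − 1 is then read from a t-table; in practice `scipy.stats.ttest_rel` returns both the statistic and the p-value directly.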
| Architecture | Parameters | Model Size | Comm. per Round | Total Comm. (per Client) | Rounds | Training Time |
|---|---|---|---|---|---|---|
| Simple MLP | 2882 | 11.3 KB | 11.3 KB | ∼1.1 MB | 50 | 45–90 min |
| Complex ResNet | 70,532 | 275.5 KB | 275.5 KB | ∼53.8 MB | 100 | 4–6 h |
| Ratio (ResNet/MLP) | 24.5× | 24.5× | 24.5× | ∼49× | 2× | 3–8× |
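
The sizes in the table above follow from 32-bit (4-byte) parameters, with the per-client total assuming one full-model upload and one download per round; both are assumptions consistent with the listed numbers rather than stated protocol details. The arithmetic checks out:

```python
def model_kb(params, bytes_per_param=4):
    """Serialized model size in KB for float32 parameters."""
    return params * bytes_per_param / 1024

def total_comm_mb(params, rounds, bytes_per_param=4):
    """Upload + download of the full model, every round, per client."""
    return 2 * model_kb(params, bytes_per_param) * rounds / 1024

print(round(model_kb(2882), 1))            # 11.3 KB  (simple MLP)
print(round(model_kb(70532), 1))           # 275.5 KB (complex ResNet)
print(round(total_comm_mb(2882, 50), 1))   # 1.1 MB over 50 rounds
print(round(total_comm_mb(70532, 100), 1)) # 53.8 MB over 100 rounds
```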
Binary-task accuracy by architecture, feature representation, and training paradigm.

| Architecture | Raw, Centralized | Raw, Fed-IID | Raw, Fed-NonIID | Encoded, Centralized | Encoded, Fed-IID | Encoded, Fed-NonIID |
|---|---|---|---|---|---|---|
| Simple MLP | 98.59% | 98.56% | 98.01% | 92.92% | 93.23% | 92.47% |
| Complex ResNet | 97.98% | 97.40% | 79.96% | 97.60% | 97.14% | 81.46% |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Ibaisi, T.A.; Kuhn, S.; Kazim, M.; Kara, I.; Altindag, T.; Rehman, M.U. A Comparative Study of Federated Learning and Amino Acid Encoding with IoT Malware Detection as a Case Study. Big Data Cogn. Comput. 2026, 10, 111. https://doi.org/10.3390/bdcc10040111
