Search Results (25)

Search Parameters:
Keywords = Federated Averaging (FedAvg)

22 pages, 2678 KiB  
Article
Federated Semi-Supervised Learning with Uniform Random and Lattice-Based Client Sampling
by Mei Zhang and Feng Yang
Entropy 2025, 27(8), 804; https://doi.org/10.3390/e27080804 - 28 Jul 2025
Abstract
Federated semi-supervised learning (Fed-SSL) has emerged as a powerful framework that leverages both labeled and unlabeled data distributed across clients. To reduce communication overhead, real-world deployments often adopt partial client participation, where only a subset of clients is selected in each round. However, under non-i.i.d. data distributions, the choice of client sampling strategy becomes critical, as it significantly affects training stability and final model performance. To address this challenge, we propose a novel federated averaging semi-supervised learning algorithm, called FedAvg-SSL, that considers two sampling approaches, uniform random sampling (standard Monte Carlo) and a structured lattice-based sampling, inspired by quasi-Monte Carlo (QMC) techniques, which ensures more balanced client participation through structured deterministic selection. On the client side, each selected participant alternates between updating the global model and refining the pseudo-label model using local data. We provide a rigorous convergence analysis, showing that FedAvg-SSL achieves a sublinear convergence rate with linear speedup. Extensive experiments not only validate our theoretical findings but also demonstrate the advantages of lattice-based sampling in federated learning, offering insights into the interplay among algorithm performance, client participation rates, local update steps, and sampling strategies. Full article
(This article belongs to the Special Issue Number Theoretic Methods in Statistics: Theory and Applications)
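For readers comparing the two client-sampling strategies described above, a minimal Python/numpy sketch is given below; the Korobov-style rank-1 lattice rule, the generator value, and the per-round shift are illustrative assumptions rather than the paper's exact construction.

    import numpy as np

    def uniform_sampling(num_clients, m, rng):
        # Standard Monte Carlo: draw m distinct clients uniformly at random.
        return rng.choice(num_clients, size=m, replace=False)

    def lattice_sampling(num_clients, m, round_idx, generator=21):
        # Quasi-Monte Carlo style selection: a rank-1 (Korobov-type) lattice
        # produces low-discrepancy points in [0, 1), which are mapped to client
        # indices, spreading participation more evenly across rounds.
        shift = (round_idx * 0.5 ** 0.5) % 1.0             # deterministic per-round shift
        points = (np.arange(m) * generator / m + shift) % 1.0
        clients = np.unique((points * num_clients).astype(int))
        return clients                                      # duplicates after rounding are dropped

    rng = np.random.default_rng(0)
    print(uniform_sampling(100, 10, rng))                   # scattered, possibly clustered
    print(lattice_sampling(100, 10, round_idx=3))           # evenly spread over the client range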

22 pages, 1359 KiB  
Article
Fall Detection Using Federated Lightweight CNN Models: A Comparison of Decentralized vs. Centralized Learning
by Qasim Mahdi Haref, Jun Long and Zhan Yang
Appl. Sci. 2025, 15(15), 8315; https://doi.org/10.3390/app15158315 - 25 Jul 2025
Abstract
Fall detection is a critical task in healthcare monitoring systems, especially for elderly populations, for whom timely intervention can significantly reduce morbidity and mortality. This study proposes a privacy-preserving and scalable fall-detection framework that integrates federated learning (FL) with transfer learning (TL) to train deep learning models across decentralized data sources without compromising user privacy. The pipeline begins with data acquisition, in which annotated video-based fall-detection datasets formatted in YOLO are used to extract image crops of human subjects. These images are then preprocessed, resized, normalized, and relabeled into binary classes (fall vs. non-fall). A stratified 80/10/10 split ensures balanced training, validation, and testing. To simulate real-world federated environments, the training data is partitioned across multiple clients, each performing local training using pretrained CNN models including MobileNetV2, VGG16, EfficientNetB0, and ResNet50. Two FL topologies are implemented: a centralized server-coordinated scheme and a ring-based decentralized topology. During each round, only model weights are shared, and federated averaging (FedAvg) is applied for global aggregation. The models were trained using three random seeds to ensure result robustness and stability across varying data partitions. Among all configurations, decentralized MobileNetV2 achieved the best results, with a mean test accuracy of 0.9927, F1-score of 0.9917, and average training time of 111.17 s per round. These findings highlight the model’s strong generalization, low computational burden, and suitability for edge deployment. Future work will extend evaluation to external datasets and address issues such as client drift and adversarial robustness in federated environments. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
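For reference, the FedAvg aggregation step used in both topologies above reduces to a data-size-weighted average of client weights; a minimal numpy sketch with toy values follows (the weighting by local sample count is the standard FedAvg rule, not a detail specific to this paper).

    import numpy as np

    def fedavg_aggregate(client_weights, client_sizes):
        # FedAvg: w_global = sum_k (n_k / n) * w_k, where n_k is the number of
        # local samples held by client k and n is the total across clients.
        total = float(sum(client_sizes))
        agg = np.zeros_like(client_weights[0])
        for w, n in zip(client_weights, client_sizes):
            agg += (n / total) * w
        return agg

    # toy example: three clients, flattened model weights
    clients = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.0, 1.0])]
    sizes = [100, 50, 50]
    print(fedavg_aggregate(clients, sizes))   # -> [1.25 1.25]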

18 pages, 1005 KiB  
Article
FedEach: Federated Learning with Evaluator-Based Incentive Mechanism for Human Activity Recognition
by Hyun Woo Lim, Sean Yonathan Tanjung, Ignatius Iwan, Bernardo Nugroho Yahya and Seok-Lyong Lee
Sensors 2025, 25(12), 3687; https://doi.org/10.3390/s25123687 - 12 Jun 2025
Abstract
Federated learning (FL) is a decentralized approach that aims to establish a global model by aggregating updates from diverse clients without sharing their local data. However, the approach becomes complicated when Byzantine clients, referred to as malicious clients, join with arbitrary manipulation. Classical techniques, such as Federated Averaging (FedAvg), are insufficient to incentivize reliable clients and discourage malicious ones. Other existing Byzantine FL schemes either only incentivize reliable clients or require server-labeled data as a public validation dataset, which increases time complexity. This study introduces a federated learning framework with an evaluator-based incentive mechanism (FedEach) that offers robustness with no dependency on server-labeled data. In this framework, we introduce evaluators and participants. Unlike existing approaches, the server selects the evaluators and participants among the clients using model-based performance evaluation criteria such as test score and reputation. The evaluators then assess whether each participant is reliable or malicious, and the server exclusively aggregates models from the identified reliable participants and the evaluators for global model updates. After this aggregation, the server calculates each client’s contribution, ensuring fair recognition of high-quality updates and penalizing malicious clients based on their contributions. Empirical evidence from human activity recognition (HAR) datasets highlights FedEach’s effectiveness, especially in environments with a high proportion of malicious clients. In addition, FedEach maintains computational efficiency, making it suitable for efficient FL applications such as sensor-based HAR with wearable devices and mobile sensing. Full article
(This article belongs to the Special Issue Wearable Devices for Physical Activity and Healthcare Monitoring)
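The evaluator/participant flow described in the abstract can be sketched as follows; the scoring, the reliability test, and the helper names below are placeholders for the paper's model-based criteria, so this is only an illustration of the overall mechanism.

    import numpy as np

    def fedeach_round(updates, scores, reputation, n_eval=2, threshold=0.5):
        # Illustrative flow: the highest-scoring clients act as evaluators; a
        # participant is kept only if a majority of evaluators rate it reliable;
        # the server then aggregates evaluators plus reliable participants.
        rank = np.argsort(-(np.array(scores) + np.array(reputation)))
        evaluators, participants = rank[:n_eval], rank[n_eval:]

        def rate(evaluator, participant):
            # placeholder reliability check: do the two updates point the same way?
            return np.dot(updates[evaluator], updates[participant]) > 0

        reliable = [p for p in participants
                    if np.mean([rate(e, p) for e in evaluators]) >= threshold]
        keep = list(evaluators) + reliable
        return np.mean([updates[k] for k in keep], axis=0), reliable

    updates = [np.array([1.0, 1.0]), np.array([0.9, 1.1]),
               np.array([-5.0, 4.0]), np.array([1.1, 0.9])]
    print(fedeach_round(updates, scores=[0.9, 0.8, 0.2, 0.7], reputation=[1, 1, 0, 1]))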

24 pages, 1347 KiB  
Article
SecFedDNN: A Secure Federated Deep Learning Framework for Edge–Cloud Environments
by Roba H. Alamir, Ayman Noor, Hanan Almukhalfi, Reham Almukhlifi and Talal H. Noor
Systems 2025, 13(6), 463; https://doi.org/10.3390/systems13060463 - 12 Jun 2025
Abstract
Cyber threats that target Internet of Things (IoT) and edge computing environments are growing in scale and complexity, which necessitates the development of security solutions that are both robust and scalable while also protecting privacy. Edge scenarios require new intrusion detection solutions because traditional centralized intrusion detection systems (IDSs) fall short in protecting data privacy, create excessive communication overhead, and show limited contextual adaptation capabilities. This paper introduces the SecFedDNN framework, which applies federated deep learning (FDL) to protect edge–cloud environments from cyberattacks such as Distributed Denial of Service (DDoS), Denial of Service (DoS), and injection attacks. SecFedDNN performs edge-level pre-aggregation filtering through Layer-Adaptive Sparsified Model Aggregation (LASA) for anomaly detection while supporting balanced multi-class evaluation across federated clients. A Deep Neural Network (DNN) forms the main model and is trained concurrently across multiple clients through the Federated Averaging (FedAvg) protocol while keeping raw data local. We utilized Google Cloud Platform (GCP) along with Google Colaboratory (Colab) to create five federated clients for simulating attacks on the TON_IoT dataset, which we balanced across selected attack types. Initial tests showed DNN outperformed Long Short-Term Memory (LSTM) and SimpleNN in centralized environments by providing higher accuracy at lower computational costs. Following federated training, the SecFedDNN framework achieved an average accuracy and precision above 84% and recall and F1-score above 82% across all clients with suitable response times for real-time deployment. The study shows that FDL can strengthen intrusion detection across distributed edge networks without compromising data privacy guarantees. Full article

28 pages, 1638 KiB  
Article
Sign-Entropy Regularization for Personalized Federated Learning
by Koffka Khan
Entropy 2025, 27(6), 601; https://doi.org/10.3390/e27060601 - 4 Jun 2025
Abstract
Personalized Federated Learning (PFL) seeks to train client-specific models across distributed data silos with heterogeneous distributions. We introduce Sign-Entropy Regularization (SER), a novel entropy-based regularization technique that penalizes excessive directional variability in client-local optimization. Motivated by Descartes’ Rule of Signs, we hypothesize that frequent sign changes in gradient trajectories reflect complexity in the local loss landscape. By minimizing the entropy of gradient sign patterns during local updates, SER encourages smoother optimization paths, improves convergence stability, and enhances personalization. We formally define a differentiable sign-entropy objective over the gradient sign distribution and integrate it into standard federated optimization frameworks, including FedAvg and FedProx. The regularizer is computed efficiently and applied post hoc per local round. Extensive experiments on three benchmark datasets (FEMNIST, Shakespeare, and CIFAR-10) show that SER improves both average and worst-case client accuracy, reduces variance across clients, accelerates convergence, and smooths the local loss surface as measured by Hessian trace and spectral norm. We also present a sensitivity analysis of the regularization strength ρ and discuss the potential for client-adaptive variants. Comparative evaluations against state-of-the-art methods (e.g., Ditto, pFedMe, momentum-based variants, Entropy-SGD) highlight that SER introduces an orthogonal and scalable mechanism for personalization. Theoretically, we frame SER as an information-theoretic and geometric regularizer that stabilizes learning dynamics without requiring dual-model structures or communication modifications. This work opens avenues for trajectory-based regularization and hybrid entropy-guided optimization in federated and resource-constrained learning settings. Full article
(This article belongs to the Section Information Theory, Probability and Statistics)
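A minimal sketch of the quantity SER penalizes, the entropy of the gradient sign distribution accumulated over local steps, is given below (numpy, illustrative only; the paper's differentiable formulation and its exact placement in the local objective may differ).

    import numpy as np

    def sign_entropy(grad_history, eps=1e-12):
        # Entropy of the empirical sign distribution of each coordinate's gradient
        # across local steps, averaged over coordinates. Frequent sign flips give
        # p close to 0.5, so the entropy approaches its maximum, log 2.
        signs = np.sign(grad_history)                        # shape: (steps, dims)
        p_pos = np.clip((signs > 0).mean(axis=0), eps, 1 - eps)
        ent = -(p_pos * np.log(p_pos) + (1 - p_pos) * np.log(1 - p_pos))
        return ent.mean()

    # coordinate 0 oscillates, coordinate 1 is consistently positive
    hist = np.array([[ 1.0, 0.3],
                     [-1.0, 0.2],
                     [ 1.0, 0.4],
                     [-1.0, 0.1]])
    print(sign_entropy(hist))
    # a local objective would then look like: task_loss + rho * sign_entropy(recent_grads)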

37 pages, 4457 KiB  
Article
Enhancing Privacy in IoT-Enabled Digital Infrastructure: Evaluating Federated Learning for Intrusion and Fraud Detection
by Amogh Deshmukh, Peplluis Esteva de la Rosa, Raul Villamarin Rodriguez and Sandeep Dasari
Sensors 2025, 25(10), 3043; https://doi.org/10.3390/s25103043 - 12 May 2025
Abstract
Challenges in implementing machine learning (ML) include expanding data resources within the finance sector. Banking data with significant financial implications are highly confidential, and combining user information from different institutions for banking purposes can lead to breaches and privacy violations. To address these issues, federated learning (FL) using the Flower framework is utilized to protect the privacy of individual organizations while still collaborating through separate models to create a unified global model. However, joint training on datasets with diverse distributions can lead to suboptimal learning and additional privacy concerns. To mitigate this, solutions using federated averaging (FedAvg), federated proximal (FedProx), and federated optimization (FedOpt) methods have been proposed. These methods preserve data locality during training at local clients without exposing data, while maintaining global convergence to enhance the privacy of local models within the framework. In this analysis, the UNSW-NB15 and credit datasets were employed, using precision, recall, accuracy, F1-score, ROC, and AUC as performance indicators to demonstrate the effectiveness of the proposed strategy with FedAvg, FedProx, and FedOpt. The proposed algorithms were subjected to an empirical study, which revealed significant performance benefits when using the Flower framework. Experiments were conducted over 50 rounds using the UNSW-NB15 dataset, achieving accuracies of 99.87% for both FedAvg and FedProx and 99.94% for FedOpt. Similarly, with the credit dataset under the same conditions, FedAvg and FedProx achieved accuracies of 99.95% and 99.94%, respectively. These results indicate that the proposed framework is highly effective and can be applied in real-world applications across various domains for secure and privacy-preserving collaborative machine learning. Full article
(This article belongs to the Section Internet of Things)
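Among the three methods compared, FedProx differs from FedAvg only in the local objective, which adds a proximal term that keeps client updates near the current global model; a minimal sketch with a toy quadratic client loss follows (the models and hyperparameters are illustrative, not those of the paper).

    import numpy as np

    def fedprox_local_step(w_local, w_global, grad_fn, lr=0.1, mu=0.5):
        # One local gradient step on f_k(w) + (mu/2) * ||w - w_global||^2.
        # Setting mu = 0 recovers plain FedAvg local training.
        g = grad_fn(w_local) + mu * (w_local - w_global)
        return w_local - lr * g

    # toy client loss f_k(w) = 0.5 * ||w - target||^2, so grad f_k(w) = w - target
    target = np.array([2.0, -1.0])
    grad_fn = lambda w: w - target
    w_global = np.zeros(2)
    w = w_global.copy()
    for _ in range(100):
        w = fedprox_local_step(w, w_global, grad_fn)
    print(w)   # pulled toward 'target' but shrunk toward w_global by the proximal term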

20 pages, 2041 KiB  
Article
Top-k Shuffled Differential Privacy Federated Learning for Heterogeneous Data
by Di Xiao, Xinchun Fan and Lvjun Chen
Sensors 2025, 25(5), 1441; https://doi.org/10.3390/s25051441 - 26 Feb 2025
Abstract
Federated learning (FL) has emerged as a promising framework for training shared models across diverse participants, ensuring data remains securely stored on local devices. Despite its potential, FL still faces some critical challenges, including data heterogeneity, privacy risks, and substantial communication overhead. Current privacy-preserving FL research frequently fails to tackle complexities posed by heterogeneous data adequately, hence increasing communication expenses. To tackle these issues, we propose a top-k shuffled differential privacy FL (TopkSDP-FL) framework tailored to heterogeneous data environments. To address the model drift issue effectively, we design a novel regularization for local training, drawing inspiration from contrastive learning. To enhance efficiency, we propose a bidirectional top-k communication mechanism that reduces uplink and downlink overhead while strengthening privacy protection through double amplification with the shuffle model. Additionally, we shuffle all local gradient parameters at the layer level to address privacy budget concerns associated with high-dimensional aggregation and repeated iterations. Finally, a formal privacy analysis confirms the privacy amplification effect of TopkSDP-FL. The experimental results further demonstrate its superiority over other state-of-the-art FL methods, with an average accuracy improvement of 3% compared to FedAvg and other leading algorithms under the non-IID scenario, while also reducing communication costs by over 90%. Full article
(This article belongs to the Special Issue Federated and Distributed Learning in IoT)
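The bidirectional top-k mechanism keeps only the largest-magnitude coordinates in each direction of communication; the sketch below shows just the sparsification step (the layer-level shuffling and differential-privacy noise of TopkSDP-FL are omitted, and the function name is an assumption).

    import numpy as np

    def top_k_sparsify(vec, k):
        # Keep the k largest-magnitude entries and zero the rest; only the kept
        # indices and values would actually be transmitted.
        idx = np.argpartition(np.abs(vec), -k)[-k:]
        sparse = np.zeros_like(vec)
        sparse[idx] = vec[idx]
        return sparse, idx

    g = np.array([0.02, -1.5, 0.3, 0.0, 2.1, -0.4])
    sparse, kept = top_k_sparsify(g, k=2)
    print(sparse)   # [ 0.  -1.5  0.   0.   2.1  0. ]  -> uplink payload
    # the server would apply the same rule to the global update for the downlink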

18 pages, 3819 KiB  
Article
Robust Client Selection Strategy Using an Improved Federated Random High Local Performance Algorithm to Address High Non-IID Challenges
by Pramote Sittijuk, Narin Petrot and Kreangsak Tamee
Algorithms 2025, 18(2), 118; https://doi.org/10.3390/a18020118 - 19 Feb 2025
Abstract
This paper introduces an improved version of the Federated Random High Local Performance (Fed-RHLP) algorithm, specifically aimed at addressing the difficulties posed by Non-IID (Non-Independent and Identically Distributed) data within the context of federated learning. The refined Fed-RHLP algorithm implements a more targeted client selection approach, emphasizing clients based on the size of their datasets, the diversity of labels, and the performance of their local models. It employs a biased roulette wheel mechanism for selecting clients, which improves the aggregation of the global model. This approach ensures that the global model is primarily influenced by high-performing clients while still permitting contributions from those with lower performance during the model training process. Experimental findings indicate that the improved Fed-RHLP algorithm significantly surpasses existing methodologies, including FederatedAveraging (FedAvg), Power of Choice (PoC), and FedChoice, by achieving superior global model accuracy, accelerated convergence rates, and decreased execution times, especially under conditions of high Non-IID data. Furthermore, the improved Fed-RHLP algorithm exhibits resilience even when the number of clients participating in local model updates and aggregation is diminished in each communication round. This characteristic positively influences the conservation of limited communication and computational resources. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
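The biased roulette-wheel selection in Fed-RHLP amounts to sampling clients with probability proportional to a composite score built from dataset size, label diversity, and local performance; the scores below are placeholders, and only the selection mechanism itself is illustrated.

    import numpy as np

    def roulette_select(scores, m, rng):
        # Sample m distinct clients with probability proportional to their score,
        # so high-performing clients dominate but low scorers can still be picked.
        p = np.asarray(scores, dtype=float)
        p = p / p.sum()
        return rng.choice(len(scores), size=m, replace=False, p=p)

    scores = [0.9, 0.2, 0.7, 0.1, 0.5, 0.8]   # placeholder composite score per client
    rng = np.random.default_rng(42)
    print(roulette_select(scores, m=3, rng=rng))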

25 pages, 747 KiB  
Article
Convergence Analysis for Differentially Private Federated Averaging in Heterogeneous Settings
by Yiwei Li, Shuai Wang and Qilong Wu
Mathematics 2025, 13(3), 497; https://doi.org/10.3390/math13030497 - 2 Feb 2025
Abstract
Federated learning (FL) has emerged as a prominent approach for distributed machine learning, enabling collaborative model training while preserving data privacy. However, the presence of non-i.i.d. data and the need for robust privacy protection introduce significant challenges in theoretically analyzing the performance of FL algorithms. In this paper, we present a novel theoretical analysis of typical differentially private federated averaging (DP-FedAvg), judiciously considering the impact of non-i.i.d. data on convergence and privacy guarantees. Our contributions are threefold: (i) We introduce a theoretical framework for analyzing the convergence of the DP-FedAvg algorithm that accounts for different client sampling and data sampling strategies, privacy amplification, and non-i.i.d. data. (ii) We explore the privacy–utility tradeoff and demonstrate how client strategies interact with differential privacy to affect learning performance. (iii) We provide extensive experimental validation using real-world datasets to verify our theoretical findings. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
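The DP-FedAvg mechanism analyzed here clips each client's update and adds calibrated Gaussian noise around aggregation; a minimal sketch of that step follows, with the noise multiplier left as a free parameter rather than derived from a particular (epsilon, delta) budget.

    import numpy as np

    def dp_fedavg_aggregate(client_updates, clip_norm=1.0, noise_mult=0.8, rng=None):
        # Clip each update to L2 norm <= clip_norm, average the clipped updates,
        # then add Gaussian noise scaled to the clipping bound and cohort size.
        if rng is None:
            rng = np.random.default_rng()
        clipped = []
        for u in client_updates:
            norm = np.linalg.norm(u)
            clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
        avg = np.mean(clipped, axis=0)
        std = noise_mult * clip_norm / len(client_updates)
        return avg + rng.normal(0.0, std, size=avg.shape)

    updates = [np.array([0.5, 3.0]), np.array([-0.2, 0.1]), np.array([0.4, -0.3])]
    print(dp_fedavg_aggregate(updates))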

18 pages, 30485 KiB  
Article
Federated Learning for Extreme Label Noise: Enhanced Knowledge Distillation and Particle Swarm Optimization
by Chengtian Ouyang, Jihong Mao, Yehong Li, Taiyong Li, Donglin Zhu, Changjun Zhou and Zhenyu Xu
Electronics 2025, 14(2), 366; https://doi.org/10.3390/electronics14020366 - 17 Jan 2025
Abstract
Federated learning, with its unique privacy protection mechanisms and distributed model training capabilities, provides an effective solution for data security by addressing the challenges associated with the inability to directly share private data due to privacy concerns. It exhibits broad application potential across various fields, particularly in scenarios such as autonomous vehicular networks, where collaborative learning is required from data sources distributed across different clients, thus optimizing and enhancing model performance. Nevertheless, in complex real-world environments, challenges such as data poisoning and labeling errors may cause some clients to introduce label noise that significantly exceeds ordinary levels, severely impacting model performance. The following conclusions are drawn from research on extreme label noise: highly polluted data severely affect the generalization capability of the global model and the stability of the training process, while the reweighting strategy can improve model performance. Based on these research conclusions, we propose a method named Enhanced Knowledge Distillation and Particle Swarm Optimization for Federated Learning (FedDPSO) to deal with extreme label noise. In FedDPSO, the server dynamically identifies extremely noisy clients based on uncertainty. It then uses the particle swarm optimization algorithm to determine client model weights for global model aggregation. In subsequent rounds, the identified extremely noisy clients construct an interpolation loss combining pseudo-label loss and knowledge distillation loss, effectively mitigating the negative impact of label noise overfitting on the local model. We carried out experiments on the CIFAR10/100 datasets to validate the effectiveness of FedDPSO. At the highest noise ratio under Beta = (0.1, 0.1), experiments show that FedDPSO improves the average accuracy on CIFAR10 by 15% compared to FedAvg and by 11% compared to the more powerful FOCUS. On CIFAR100, it outperforms FedAvg by 8% and FOCUS by 5%. Full article
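The interpolation loss applied on identified noisy clients mixes a pseudo-label term with a distillation term toward the global model's soft outputs; a minimal numpy sketch follows, where the mixing weight alpha and the absence of a temperature are simplifying assumptions rather than FedDPSO's exact formulation.

    import numpy as np

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def interpolation_loss(local_logits, global_logits, alpha=0.5):
        # alpha * cross-entropy against pseudo-labels (argmax of the global model)
        # + (1 - alpha) * KL distillation toward the global model's soft outputs.
        p_local = softmax(local_logits)
        p_global = softmax(global_logits)
        pseudo = p_global.argmax(axis=-1)
        ce = -np.log(p_local[np.arange(len(pseudo)), pseudo] + 1e-12).mean()
        kl = (p_global * (np.log(p_global + 1e-12) - np.log(p_local + 1e-12))).sum(-1).mean()
        return alpha * ce + (1 - alpha) * kl

    local_logits = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
    global_logits = np.array([[1.8, 0.4, 0.2], [0.1, 2.0, 0.4]])
    print(interpolation_loss(local_logits, global_logits))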

23 pages, 1526 KiB  
Article
CLDP-pFedAvg: Safeguarding Client Data Privacy in Personalized Federated Averaging
by Wenquan Shen, Shuhui Wu and Yuanhong Tao
Mathematics 2024, 12(22), 3630; https://doi.org/10.3390/math12223630 - 20 Nov 2024
Abstract
The personalized federated averaging algorithm integrates a federated averaging approach with a model-agnostic meta-learning technique. In real-world heterogeneous scenarios, it is essential to implement additional privacy protection techniques for personalized federated learning. We propose a novel differentially private federated meta-learning scheme, CLDP-pFedAvg, which achieves client-level differential privacy guarantees for federated learning involving a large number of heterogeneous clients. The client-level differentially private meta-based FedAvg algorithm enables clients to securely upload local model parameters for aggregation. Furthermore, we provide a convergence analysis of the clipping-enabled differentially private meta-based FedAvg algorithm. The proposed strategy is evaluated across various datasets, and the findings indicate that our approach offers improved privacy protection while maintaining model accuracy. Full article
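The meta-based FedAvg with client-level clipping can be pictured as a MAML-style local update whose resulting parameter change is clipped before upload; the sketch below shows only this generic pattern (first-order inner/outer steps, a toy quadratic loss, and no noise addition), not the paper's exact algorithm.

    import numpy as np

    def meta_local_update(w_global, grad_fn, alpha=0.05, beta=0.1, clip_norm=1.0):
        # MAML-style step: adapt with one inner gradient step, take a (first-order)
        # outer step at the adapted point, then clip the resulting parameter delta
        # for client-level differential privacy before uploading (noise omitted).
        w_adapted = w_global - alpha * grad_fn(w_global)      # inner adaptation
        w_new = w_global - beta * grad_fn(w_adapted)          # outer update
        delta = w_new - w_global
        norm = np.linalg.norm(delta)
        return delta * min(1.0, clip_norm / (norm + 1e-12))   # clipped upload

    target = np.array([1.0, -2.0])
    grad_fn = lambda w: w - target        # toy quadratic client loss
    print(meta_local_update(np.zeros(2), grad_fn))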

25 pages, 6241 KiB  
Article
Privacy-Preserving Federated Learning-Based Intrusion Detection Technique for Cyber-Physical Systems
by Syeda Aunanya Mahmud, Nazmul Islam, Zahidul Islam, Ziaur Rahman and Sk. Tanzir Mehedi
Mathematics 2024, 12(20), 3194; https://doi.org/10.3390/math12203194 - 12 Oct 2024
Abstract
The Internet of Things (IoT) has revolutionized various industries, but the increased dependence on all kinds of IoT devices and the sensitive nature of the data accumulated by them pose a formidable threat to privacy and security. While traditional IDSs have been effective in securing critical infrastructures, the centralized nature of these systems raises serious data privacy concerns as sensitive information is sent to a central server for analysis. This research paper introduces a Federated Learning (FL) approach for detecting intrusions in diverse IoT networks, addressing data privacy by ensuring that sensitive information remains on the individual IoT devices during model training. Our framework utilizes the Federated Averaging (FedAvg) algorithm, which aggregates model weights from distributed devices to refine the global model iteratively. The proposed model achieves above 90% across various metrics, including precision, recall, and F1 score, while maintaining low computational demands. The results show that the proposed system successfully identifies various types of cyberattacks, including Denial-of-Service (DoS), Distributed Denial-of-Service (DDoS), data injection, ransomware, and several others, showcasing its robustness. This research advances IDSs by providing an efficient and reliable solution that is more scalable and privacy-friendly than existing models. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)

18 pages, 1649 KiB  
Article
Research on Data Quality Governance for Federated Cooperation Scenarios
by Junxin Shen, Shuilan Zhou and Fanghao Xiao
Electronics 2024, 13(18), 3606; https://doi.org/10.3390/electronics13183606 - 11 Sep 2024
Abstract
Exploring the data quality problems in the context of federated cooperation and adopting corresponding governance countermeasures can facilitate the smooth progress of federated cooperation and obtain high-performance models. However, previous studies have rarely focused on quality issues in federated cooperation. To this end, this paper analyzes the quality problems in the federated cooperation scenario and innovatively proposes a two-stage data quality governance framework for federated collaboration scenarios. The first stage focuses on local data quality assessment and optimization: evaluation is performed by constructing a metrics scoring formula, and corresponding optimization measures are applied at the same time. In the second stage, an outlier processing mechanism is introduced, and the Data Quality Federated Averaging (DQ-FedAvg) aggregation method is proposed for model quality problems, so as to train a high-quality global model along with strong client-specific local models. Finally, experiments on real datasets compare model performance before and after quality governance and validate the advantages of the data quality governance framework in a federated learning scenario, so that it can be widely applied across domains. The governance framework detects and governs quality problems in the federated learning process and improves model accuracy. Full article

15 pages, 1507 KiB  
Article
A Personalized Federated Learning Method Based on Clustering and Knowledge Distillation
by Jianfei Zhang and Yongqiang Shi
Electronics 2024, 13(5), 857; https://doi.org/10.3390/electronics13050857 - 23 Feb 2024
Abstract
Federated learning (FL) is a distributed machine learning paradigm that preserves privacy. However, data heterogeneity among clients means the shared global model obtained after training cannot fit the distribution of each client’s dataset, and model performance degrades. To address this problem, we propose a personalized federated learning method based on clustering and knowledge distillation, called pFedCK. In this algorithm, each client has an interactive model that participates in global training and a personalized model that is trained only locally. The two models perform knowledge distillation with each other through the feature representations of the middle layers and the models’ soft predictions. In addition, so that an interactive model obtains information only from clients with similar data distributions and avoids interference from other heterogeneous information, the server clusters the clients in every training round according to the similarity of the parameter variations uploaded by the interactive models. By clustering clients, interactive models with similar data distributions can cooperate to better fit the local dataset distribution, and the performance of the personalized model is improved by indirectly obtaining more valuable information. Finally, we conduct simulation experiments on three benchmark datasets under different data heterogeneity scenarios. Compared to single-model algorithms, the accuracy of pFedCK improves by an average of 23.4% and 23.8% over FedAvg and FedProx, respectively; compared to typical personalization algorithms, it improves by an average of 0.8% and 1.3%, and a maximum of 1.0% and 2.9%, over FedDistill and FML. Full article
(This article belongs to the Special Issue Deep Learning for Data Mining: Theory, Methods, and Applications)
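The mutual distillation between the interactive and personalized models combines a soft-prediction term with a mid-layer feature term; a minimal numpy sketch is below, where the temperature, the loss weight, and the choice of feature layer are illustrative assumptions.

    import numpy as np

    def softened(z, temp=2.0):
        z = z / temp
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def mutual_distill_loss(student_logits, teacher_logits,
                            student_feat, teacher_feat, lam=0.5, temp=2.0):
        # KL between softened predictions plus MSE between mid-layer features;
        # in pFedCK each of the two models takes a turn as the "teacher".
        p_t = softened(teacher_logits, temp)
        p_s = softened(student_logits, temp)
        kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(-1).mean()
        mse = np.mean((student_feat - teacher_feat) ** 2)
        return kl + lam * mse

    s_logits = np.array([[1.0, 0.2, -0.3]]); t_logits = np.array([[0.8, 0.4, -0.2]])
    s_feat = np.ones((1, 4)); t_feat = np.full((1, 4), 0.9)
    print(mutual_distill_loss(s_logits, t_logits, s_feat, t_feat))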

22 pages, 1588 KiB  
Article
Federated Learning for IoT Intrusion Detection
by Riccardo Lazzarini, Huaglory Tianfield and Vassilis Charissis
AI 2023, 4(3), 509-530; https://doi.org/10.3390/ai4030028 - 24 Jul 2023
Abstract
The number of Internet of Things (IoT) devices has increased considerably in the past few years, resulting in a large growth of cyber attacks on IoT infrastructure. As part of a defense in depth approach to cybersecurity, intrusion detection systems (IDSs) have acquired a key role in attempting to detect malicious activities efficiently. Most modern approaches to IDS in IoT are based on machine learning (ML) techniques. The majority of these are centralized, which implies the sharing of data from source devices to a central server for classification. This presents potentially crucial issues related to privacy of user data as well as challenges in data transfers due to their volumes. In this article, we evaluate the use of federated learning (FL) as a method to implement intrusion detection in IoT environments. FL is an alternative, distributed method to centralized ML models, which has seen a surge of interest in IoT intrusion detection recently. In our implementation, we evaluate FL using a shallow artificial neural network (ANN) as the shared model and federated averaging (FedAvg) as the aggregation algorithm. The experiments are completed on the ToN_IoT and CICIDS2017 datasets in binary and multiclass classification. Classification is performed by the distributed devices using their own data. No sharing of data occurs among participants, maintaining data privacy. When compared against a centralized approach, results have shown that a collaborative FL IDS can be an efficient alternative, in terms of accuracy, precision, recall and F1-score, making it a viable option as an IoT IDS. Additionally, with these results as baseline, we have evaluated alternative aggregation algorithms, namely FedAvgM, FedAdam and FedAdagrad, in the same setting by using the Flower FL framework. The results from the evaluation show that, in our scenario, FedAvg and FedAvgM tend to perform better compared to the two adaptive algorithms, FedAdam and FedAdagrad. Full article
(This article belongs to the Special Issue Feature Papers for AI)
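FedAvgM, one of the better-performing aggregators in this comparison, is FedAvg plus server-side momentum on the averaged client update; a minimal sketch of the server step follows, where the learning rate, momentum coefficient, and the convention that client updates are deltas to be added are assumptions for illustration.

    import numpy as np

    class FedAvgM:
        # Server keeps a momentum buffer over the averaged client update:
        #   v <- beta * v + avg_update;  w <- w + lr * v
        # Setting beta = 0 reduces this to plain FedAvg.
        def __init__(self, w_init, lr=1.0, beta=0.9):
            self.w = np.asarray(w_init, dtype=float)
            self.v = np.zeros_like(self.w)
            self.lr, self.beta = lr, beta

        def step(self, client_updates, client_sizes):
            total = float(sum(client_sizes))
            avg = sum((n / total) * u for u, n in zip(client_updates, client_sizes))
            self.v = self.beta * self.v + avg
            self.w = self.w + self.lr * self.v
            return self.w

    server = FedAvgM(np.zeros(2))
    print(server.step([np.array([0.1, -0.2]), np.array([0.3, 0.0])], [10, 10]))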