Article

Efficient Adversarial Training for Federated Image Systems: Crafting Client-Specific Defenses with Robust Trimmed Aggregation

Faculty of Humanities and Arts, Macau University of Science and Technology, Macao 999078, China
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(8), 1541; https://doi.org/10.3390/electronics14081541
Submission received: 6 March 2025 / Revised: 8 April 2025 / Accepted: 9 April 2025 / Published: 10 April 2025
(This article belongs to the Special Issue Security and Privacy in Distributed Machine Learning)

Abstract

Federated learning offers a powerful approach for training models across decentralized datasets, enabling the creation of machine learning models that respect data privacy. However, federated learning faces significant challenges due to its vulnerability to adversarial attacks, especially when clients have diverse and potentially malicious data distributions. These challenges can lead to severe degradation in the global model’s performance and generalization. In this paper, we present a novel federated image adversarial training framework that combines client-specific adversarial example generation with a robust trimmed aggregation technique. By creating adversarial examples tailored to each client’s local data, our method strengthens individual model defenses against adversarial attacks. Meanwhile, the trimmed aggregation strategy ensures the global model’s robustness by mitigating the impact of harmful or low-quality updates during the model aggregation process. This framework effectively addresses both the issue of data heterogeneity and adversarial threats in federated learning settings. Our experimental results on standard image classification datasets show that the proposed approach significantly enhances model robustness, surpassing existing methods in defending against various adversarial attacks while maintaining high classification accuracy. This framework holds strong promise for real-world applications, particularly in privacy-sensitive domains where both security and model reliability are essential.

1. Introduction

Federated learning (FL) has emerged as a powerful approach to distributed machine learning, enabling collaborative model training across decentralized data sources without the need for raw data to leave individual devices [1]. By keeping data local and only sharing model updates with a central server, FL addresses significant privacy concerns, making it suitable for privacy-sensitive domains such as healthcare, finance, and mobile applications [2]. Furthermore, it helps alleviate issues related to limited network bandwidth, which are common in many real-world scenarios.
However, while FL offers clear advantages in terms of privacy preservation, it also presents challenges that are not typically encountered in traditional centralized machine learning. One of the most prominent issues is the heterogeneous data distribution across clients. In practical settings, data are often non-i.i.d. (non-independent and identically distributed) and unbalanced, which can impair the convergence and performance of the global model [3]. Moreover, the decentralized nature of FL opens the door to adversarial attacks [4]. Malicious clients or external attackers may inject adversarial data or manipulate model updates, leading to the degradation of the global model’s performance.
Studies have shown that even small adversarial perturbations can significantly affect the accuracy and robustness of deep neural networks [5]. While adversarial training has been extensively studied in centralized environments, directly applying these techniques in federated settings presents several additional challenges. These include the non-i.i.d. nature of client data and the absence of centralized control, which complicates the detection and mitigation of adversarial behaviors [6]. Malicious clients can exploit the federated setup to send manipulated updates, potentially skewing the aggregation process and damaging the global model [4].
To tackle these challenges, we introduce a novel federated adversarial training framework for image data that incorporates two main components: client-specific adversarial example generation and robust trimmed aggregation. This approach is aimed at improving the resilience of both local client models and the global model against adversarial attacks.
Unlike traditional adversarial training approaches that apply a uniform perturbation strategy across all clients, our method customizes the adversarial generation process based on the unique characteristics of each client’s dataset. Each client generates adversarial examples by adjusting perturbation parameters based on local data statistics, such as the mean and standard deviation of input samples. This client-specific approach improves the local robustness of individual models while ensuring that adversarial examples are neither too weak nor excessively strong. In addition to local adversarial training, protecting the global aggregation process is crucial in federated learning. Traditional aggregation methods, such as Federated Averaging (FedAvg), can be vulnerable to the influence of outlier updates from malicious clients. To address this, we employ a robust aggregation mechanism based on the trimmed mean, which discards extreme updates before averaging. This approach helps protect the global model from adversarial manipulations and ensures more stable updates during the aggregation process.
The motivation for our work arises from the need to develop robust machine learning models for adversarial and privacy-sensitive environments. While FL holds great promise for decentralized data processing, its vulnerability to adversarial attacks remains a significant obstacle to its widespread adoption. Our contributions can be summarized as follows:
  • Client-Specific Adversarial Generation: We introduce a novel mechanism that customizes adversarial example generation based on local data statistics, ensuring that the adversarial examples are appropriately scaled and effectively challenge the model.
  • Robust Trimmed Aggregation: We propose a robust aggregation strategy that uses the trimmed mean to mitigate the effects of malicious or low-quality updates during global model aggregation. This protects the global model from adversarial interference.
  • Comprehensive Evaluation: We evaluate our framework through experiments on widely used datasets like CIFAR-10 and MNIST. Our results show significant improvements in adversarial robustness, with the model maintaining high performance on clean data as well.
The rest of this paper is organized as follows. Section 2 reviews related works on federated learning and adversarial training, highlighting existing challenges and solutions. Section 3 describes the proposed methodology, including the client-specific adversarial generation and robust trimmed aggregation techniques. In Section 4, we present the experimental setup, evaluation metrics, and results, demonstrating the effectiveness of our approach. Finally, Section 5 concludes this paper and discusses potential future research directions, including adaptive parameter tuning and extending the framework to other data types and adversarial settings.

2. Related Works and Preliminary

The fields of federated learning (FL) and adversarial robustness have advanced considerably in recent years, shedding light on key challenges and potential solutions. In this section, we review related works on both federated learning and adversarial training, focusing on methods for robust aggregation and client-specific adaptations, which serve as the foundation for our proposed framework.

2.1. Federated Learning

Federated learning (FL) was introduced as a distributed machine learning paradigm that enables model training across multiple clients while preserving data privacy [7]. The pioneering Federated Averaging (FedAvg) algorithm demonstrated that distributed model training using local updates can achieve competitive performance compared to centralized training. However, FedAvg and similar methods assume that client data are independent and identically distributed (i.i.d.), an assumption that is often violated in real-world applications. Recent works have therefore focused on addressing challenges arising from non-i.i.d. data, communication constraints, and limited computational resources on client devices [8,9,10].

2.2. Adversarial Training in Centralized and Distributed Settings

Adversarial training was first popularized as a technique to defend against adversarial examples in centralized machine learning systems [11,12]. By incorporating adversarially perturbed examples into the training data, models become more robust to adversarial attacks [13]. While adversarial training has been effective in centralized settings, applying it directly to federated learning environments presents additional challenges [14]. In federated learning, adversarial training must address data heterogeneity across clients and potential adversarial behavior during both local training and global model aggregation.
In adversarial training, adversarial examples are generated by perturbing the input x with a carefully crafted noise δ that maximizes the model's loss. Formally, the perturbation is computed as
$$\delta = \arg\max_{\delta} L\big(f(x + \delta),\, y\big),$$
where $f(\cdot)$ denotes the model, $L$ is the loss function, and $y$ is the true label. In federated learning, this process is performed locally on each client, where adversarial examples are generated using the client's specific data distribution. This client-specific adversarial training enables local models to become robust to localized threats and contributes to a globally aggregated model that is resilient across heterogeneous, non-independent and identically distributed (non-i.i.d.) client datasets.
Several studies have attempted to incorporate adversarial training within federated frameworks [15,16]. However, many of these approaches rely on uniform adversarial example generation strategies, which may not effectively represent the diverse data distributions across clients. This gap motivates the development of client-specific adversarial training methods, where adversarial perturbations are tailored to the local data distribution of each client.

2.3. Robust Aggregation Techniques

The aggregation of local model updates is a critical aspect of federated learning [17]. Traditional methods such as FedAvg are vulnerable to adversarial attacks, where malicious clients inject harmful updates into the aggregation process. To mitigate this, robust aggregation techniques have been proposed, including methods based on the median, Krum, and trimmed means [18,19,20]. These methods aim to reduce the influence of outlier updates by either selecting a subset of trustworthy updates or discarding extreme values during aggregation.
Among these methods, the trimmed mean approach has shown particular promise. It works by discarding extreme values from both ends of the update distribution before averaging. While these aggregation methods are effective in mitigating the influence of malicious updates, they do not address vulnerabilities in local training processes that are susceptible to adversarial examples. Thus, a more integrated approach is required to tackle adversarial threats at both local and global levels.

2.4. Client-Specific Strategies

The heterogeneity of client data is a fundamental characteristic of federated learning, and various strategies have been proposed to address this challenge [21]. Some studies focus on personalization strategies that adapt the global model to each client’s data distribution [22,23]. However, adapting adversarial training to the unique characteristics of client data has been less explored [24]. Client-specific adversarial generation tailors the adversarial perturbations based on local data statistics, such as the mean and standard deviation, thus generating more effective adversarial examples [25]. This method improves local model robustness, which contributes to a more resilient global model after aggregation.

2.5. Integration of Robust Aggregation and Client-Specific Adversarial Training

Our work builds upon these previous studies by integrating robust aggregation techniques with client-specific adversarial training. While robust aggregation helps protect the global model from malicious updates during the aggregation process, client-specific adversarial generation strengthens the local models against adversarial examples. This combination of strategies enhances overall system robustness in federated environments. To the best of our knowledge, this integrated approach has not been fully explored, and our framework represents a significant contribution to developing secure and resilient federated learning systems.
In summary, the related works on federated learning, adversarial training, robust aggregation, and client-specific adaptations provide a solid foundation for our approach. Our contributions extend the existing works by proposing a comprehensive framework that addresses vulnerabilities at both the local and global levels of federated learning under adversarial conditions.

3. Methodology

Our framework consists of two main components: (1) client-specific adversarial generation, and (2) robust trimmed aggregation. We also describe how these components are integrated into a federated training procedure, including considerations for non-i.i.d. data distributions, communication constraints, and hyperparameter choices.

3.1. Client-Specific Adversarial Generation

A central challenge in federated adversarial training is dealing with non-i.i.d. data across clients. In a typical centralized adversarial training setup, a single adversarial example generation procedure (e.g., standard Projected Gradient Descent (PGD) with fixed hyperparameters) is applied to a large, unified dataset. However, in federated scenarios, each client may have data drawn from different underlying distributions. For instance, one client may primarily hold images of a single class, while another client’s data may be more diverse. Consequently, a one-size-fits-all adversarial generation strategy can fail to capture the heterogeneity in local data.
To address this, we propose a client-specific adversarial generation approach, where each client dynamically adjusts adversarial perturbations based on its own data statistics, computational resources, and performance objectives. This localized approach not only improves the relevance of adversarial examples to each client’s data distribution but also ensures that the perturbations are neither too weak nor overly strong.
For each client i, we compute the empirical estimates of the mean μ i and the standard deviation σ i of the input samples (e.g., pixel intensities in image data). These statistics can be updated once per communication round or after a fixed number of local iterations. By normalizing the perturbation budget using these local statistics, each client scales its adversarial strength to match its data distribution. Formally,
$$\epsilon_i = \alpha \cdot \sigma_i,$$
where α is a hyperparameter that controls the severity of perturbations. When σ i is relatively large (e.g., high-contrast images), ϵ i will be larger, allowing for more pronounced perturbations. Conversely, for a client with lower-variance data, ϵ i is reduced to prevent excessive distortion.
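As a concrete illustration of this scaling rule, the minimal PyTorch sketch below estimates the local mean and standard deviation from a client's data loader and derives the perturbation budget. It assumes a `DataLoader` named `local_loader` yielding images in [0, 1]; the helper name and interface are illustrative rather than taken from our released implementation.

```python
import torch

def compute_client_epsilon(local_loader, alpha=1.0):
    """Estimate local statistics (mu_i, sigma_i) and return epsilon_i = alpha * sigma_i."""
    total, total_sq, count = 0.0, 0.0, 0
    for x, _ in local_loader:                      # x: batch of input images
        total += x.sum().item()
        total_sq += (x ** 2).sum().item()
        count += x.numel()
    mu = total / count                             # local mean mu_i
    sigma = (total_sq / count - mu ** 2) ** 0.5    # local standard deviation sigma_i
    return alpha * sigma                           # perturbation budget epsilon_i
```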
This normalization extends beyond simple pixel-level statistics. If clients have different data modalities or if some store grayscale images while others use color images, the corresponding local μ i and σ i can differ substantially. Moreover, when the class distribution is skewed (e.g., a client predominantly holds samples from a single class), generating adversarial examples that are appropriately challenging becomes critical. Such clients may require tailored scaling to ensure that adversarial examples are meaningful and not overly disruptive.
Once ϵ i is determined, each client generates adversarial examples using an iterative projected gradient method. For a given input ( x , y ) in a local mini-batch B i , the adversarial example is defined as
$$x_{\mathrm{adv}} = \arg\max_{\|x' - x\|_p \le \epsilon_i} L\big(f_{\theta_i}(x'),\, y\big),$$
where $\theta_i$ denotes the local model parameters and, typically, $p = \infty$ is chosen for simplicity. The iterative update is given by
$$x^{(j)} = \Pi_S\Big(x^{(j-1)} + \eta_i \cdot \mathrm{sign}\big(\nabla_x L(f_{\theta_i}(x^{(j-1)}),\, y)\big)\Big),$$
where $\Pi_S(\cdot)$ projects onto the set $\{x' : \|x' - x\|_\infty \le \epsilon_i\}$ and $\eta_i$ is a client-specific step size. After $n_i$ iterations, the adversarial example $x_{\mathrm{adv}} = x^{(n_i)}$ is produced. This procedure, while standard in form, benefits from the client-specific adjustments in $\epsilon_i$ and $\eta_i$, ensuring that each client generates adversarial examples that are well suited to its data characteristics.
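For illustration, the sketch below shows one way to implement this client-specific PGD step in PyTorch, assuming inputs in [0, 1] and a cross-entropy loss; it is a minimal sketch rather than the exact training code, with the projection realized as clipping onto the $\ell_\infty$ ball of radius $\epsilon_i$ around the clean input.

```python
import torch
import torch.nn.functional as F

def client_pgd_attack(model, x, y, eps_i, eta_i, n_i):
    """Craft l_inf-bounded adversarial examples with client-specific
    budget eps_i, step size eta_i, and number of iterations n_i."""
    x_adv = x.clone().detach()
    for _ in range(n_i):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # gradient-ascent step, then projection onto the l_inf ball around x
        x_adv = x_adv.detach() + eta_i * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps_i), x - eps_i)
        x_adv = x_adv.clamp(0.0, 1.0)              # keep inputs in a valid pixel range
    return x_adv.detach()
```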
Rather than fixing the number of PGD steps n i uniformly across all clients, our approach allows each client to adapt n i based on its computational capacity or local validation performance. Resource-constrained clients may opt for a lower n i (e.g., a single-step attack akin to the Fast Gradient Sign Method (FGSM)), while clients with more computational power can afford more iterations to craft stronger adversarial examples. Additionally, a client can monitor its validation performance to adjust n i dynamically: if validation accuracy remains high, increasing n i may further improve robustness; if accuracy drops significantly, n i may be reduced to preserve clean performance.
Exclusive focus on adversarial examples may lead to a phenomenon known as catastrophic overfitting, where the model loses accuracy on clean data. To mitigate this, each client trains on a mix of clean and adversarial samples. A predetermined fraction of the mini-batch is kept unperturbed, ensuring that the model remains competent on clean inputs while still learning robust features from adversarial examples. Empirically, even a modest proportion (e.g., 20% clean samples) can significantly help maintain overall accuracy.
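A minimal sketch of such a mixed-batch update is given below, reusing the `client_pgd_attack` helper from the previous sketch; the 20% clean fraction is the illustrative value mentioned above, and the function and argument names are our own rather than part of a published API.

```python
import torch
import torch.nn.functional as F

def mixed_batch_step(model, optimizer, x, y, eps_i, eta_i, n_i, clean_frac=0.2):
    """One local training step on a mix of clean and adversarial samples."""
    n_clean = max(1, int(clean_frac * x.size(0)))   # samples left unperturbed
    x_adv = client_pgd_attack(model, x[n_clean:], y[n_clean:], eps_i, eta_i, n_i)
    x_mix = torch.cat([x[:n_clean], x_adv], dim=0)  # clean part first, adversarial part second
    y_mix = torch.cat([y[:n_clean], y[n_clean:]], dim=0)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_mix), y_mix)
    loss.backward()
    optimizer.step()
    return loss.item()
```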
Local data distributions may evolve over time, especially as clients encounter new data. Thus, parameters such as μ i , σ i , ϵ i , η i , and  n i can be updated periodically. This dynamic updating allows each client to continuously recalibrate its adversarial generation process in response to changes in its data. For instance, if a client’s data distribution shifts due to seasonal effects or new classes being introduced, recomputing the local statistics ensures that the adversarial perturbations remain appropriately scaled.
The main advantage of this client-specific adversarial generation is the improved alignment of adversarial examples with local data characteristics, leading to enhanced robustness on a per-client basis. However, this approach also introduces additional complexity in hyperparameter tuning, as each client may require individualized settings for ϵ i , η i , and  n i . In practice, clients may start with a default configuration and then adapt these parameters over time based on local performance metrics.
While our discussion focuses on $\ell_\infty$-bounded PGD, the client-specific framework is flexible enough to accommodate other attack norms such as $\ell_2$ or $\ell_1$ perturbations. Additionally, more advanced attack strategies (e.g., momentum-based attacks) can be integrated into this framework, with the core idea remaining the same: adjust adversarial parameters based on local data statistics and resource constraints.

3.2. Robust Trimmed Aggregation

Once clients finish their local adversarial training, they upload their model updates to the server. Traditional federated learning algorithms like FedAvg [7] simply average these updates. However, under adversarial conditions, a small number of compromised or malicious clients can dramatically skew the global model if we only perform naive averaging. To mitigate this risk, our approach employs a robust aggregation mechanism based on the trimmed mean.
We adopt a trimmed mean approach [18] that discards extreme update values before computing the average. For each coordinate j of the model update, we first sort the values $\{\Delta\theta_{1,j}, \ldots, \Delta\theta_{K,j}\}$ and then remove the smallest and largest b values. The remaining $K - 2b$ values are averaged to compute the aggregated update for coordinate j:
$$\Delta\theta_j = \frac{1}{K - 2b} \sum_{i=b+1}^{K-b} \Delta\theta_{(i),j},$$
where $\Delta\theta_{(i),j}$ denotes the i-th smallest value among the j-th coordinates from all client updates. Finally, the global model is updated as follows:
$$\theta_{t+1} \leftarrow \theta_t + \Delta\theta.$$
This coordinate-wise trimmed mean effectively reduces the influence of extreme outliers, making the global update more robust to adversarial manipulations.
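A compact PyTorch sketch of this coordinate-wise trimmed mean is shown below; it assumes each client update has been flattened into a one-dimensional tensor and that the same b is applied to every coordinate.

```python
import torch

def trimmed_mean_aggregate(updates, b):
    """Coordinate-wise trimmed mean over a list of K flattened client updates,
    discarding the b smallest and b largest values per coordinate."""
    stacked = torch.stack(updates, dim=0)            # shape (K, d)
    sorted_vals, _ = torch.sort(stacked, dim=0)      # sort each coordinate independently
    trimmed = sorted_vals[b : stacked.size(0) - b]   # keep the K - 2b middle values
    return trimmed.mean(dim=0)                       # aggregated update, shape (d,)
```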
The trimming parameter b represents the number of outlier updates to be removed from both ends of the sorted list for each coordinate. In choosing b, there is an inherent trade-off: a larger b provides stronger protection by eliminating more potential outliers, but it also risks discarding too many legitimate updates, which could slow down convergence. In practice, b is typically set as a fraction of K (e.g., 5% to 10% of K), based on prior estimates or assumptions about the proportion of compromised clients. Adaptive methods to select b during training are also a potential research direction.
While the trimmed mean approach enhances robustness, it introduces additional computational complexity. For a model with d parameters, sorting the K values for each coordinate requires O ( K log K ) time, leading to an overall cost of O ( d · K log K ) for each aggregation step. However, this cost is still manageable in most federated settings, especially when d is relatively small compared to K or when parallel sorting techniques are employed.

3.3. Federated Training Procedure

At the start of each communication round, the server broadcasts the current global model parameters, θ t , to all participating clients. Once selected—based on random choice, availability, resource constraints, or performance metrics—each client downloads the global model and initializes its local model θ i with these parameters. The client then performs local adversarial training for E epochs. During each epoch, a mini-batch B i of size B is sampled from the local dataset. For each sample ( x , y ) in the mini-batch, the client generates an adversarial example using a modified Projected Gradient Descent (PGD) method. Specifically, the input x is perturbed iteratively for n i steps using a client-specific step size η i and a perturbation-bound ϵ i , which is computed based on local data statistics and scaled by a factor α . The generated adversarial examples, along with a fraction of clean samples, form a mixed batch that is used to update the local model via gradient descent. After completing the local training, each client computes its model update Δ θ i as the difference between its locally updated model and the initial global model θ t , and then transmits only these updates back to the server, thereby reducing communication overhead.
Upon receiving the updates from all participating clients, the server aggregates them using a robust trimmed mean procedure. For each coordinate j of the model updates, the server first sorts the set $\{\Delta\theta_{1,j}, \ldots, \Delta\theta_{K,j}\}$ and then discards the smallest and largest b values. The remaining $K - 2b$ values are averaged to compute the aggregated update for coordinate j, which is then applied to update the global model:
$$\theta_{t+1} \leftarrow \theta_t + \Delta\theta.$$
This robust aggregation mechanism effectively limits the influence of extreme outlier updates that may result from adversarial clients, ensuring that the global model remains resilient to such perturbations.
Under typical federated learning assumptions, such as random client participation and sufficient update frequency, the combined use of client-specific adversarial training and robust trimmed aggregation facilitates reliable convergence of the global model, even in adversarial settings; convergence analysis for the robust aggregation step can be extended from existing work on Byzantine-resilient federated learning [18]. Although detailed convergence proofs are beyond the scope of this work, our empirical evaluations on standard benchmarks confirm that the proposed method converges effectively in the presence of adversarial clients while maintaining both high adversarial robustness and competitive clean performance.
Algorithm 1 provides a detailed step-by-step outline of the entire training procedure, encompassing both the client-side operations (local training and adversarial example generation) and the server-side robust aggregation process.
Algorithm 1 Federated image adversarial training with client-specific generation and robust trimmed aggregation
1: Input: global model parameters θ_0, number of communication rounds T, number of clients K, trimming parameter b, local epochs E, local batch size B, adversarial parameters {ε_i, η_i, n_i} for i = 1, ..., K
2: for each round t = 0, 1, ..., T − 1 do
3:   Server: broadcast θ_t to all clients.
4:   for each client i ∈ {1, ..., K} in parallel do
5:     Initialize local model: θ_i ← θ_t.
6:     for local epoch e = 1 to E do
7:       Sample a mini-batch B_i of size B from the local dataset.
8:       For each (x, y) ∈ B_i, generate an adversarial example:
9:         Set x^(0) ← x.
10:        for j = 1 to n_i do
11:          Compute gradient: g ← ∇_x L(f_{θ_i}(x^(j−1)), y).
12:          Update: x^(j) = x^(j−1) + η_i · sign(g).
13:          Project x^(j) onto {x′ : ‖x′ − x‖_∞ ≤ ε_i}.
14:        end for
15:        Set x_adv ← x^(n_i).
16:      Form a mixed batch of clean and adversarial examples.
17:      Update local model parameters θ_i using gradient descent on this batch.
18:    end for
19:    Compute local update Δθ_i = θ_i − θ_t.
20:    Send Δθ_i to the server.
21:  end for
22:  Server: for each coordinate j, sort {Δθ_{1,j}, ..., Δθ_{K,j}} and remove the smallest and largest b values.
23:  Compute aggregated update: Δθ_j = (1 / (K − 2b)) · Σ_{i=b+1}^{K−b} Δθ_{(i),j}.
24:  Update global model: θ_{t+1} ← θ_t + Δθ.
25: end for
26: Output: final global model parameters θ_T.
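To connect the pieces, the sketch below outlines one communication round in Python using the helpers defined in the earlier sketches; `client.local_adv_train` is a hypothetical method standing in for the local adversarial training loop of Algorithm 1 (lines 5–20), and the model parameters are treated as a single flattened tensor for brevity.

```python
def federated_round(global_params, clients, b):
    """One round: broadcast, local adversarial training, robust trimmed aggregation."""
    deltas = []
    for client in clients:
        local_params = client.local_adv_train(global_params.clone())  # returns updated theta_i
        deltas.append(local_params - global_params)                   # Delta theta_i
    agg_delta = trimmed_mean_aggregate(deltas, b)                     # Section 3.2 sketch
    return global_params + agg_delta                                  # theta_{t+1}
```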

4. Experiments

4.1. Experimental Setup

To evaluate the effectiveness of our proposed framework in enhancing adversarial robustness in federated learning, we conducted experiments on seven widely used benchmark datasets: CIFAR-10, CIFAR-100, MNIST, SVHN, Fashion-MNIST, EMNIST, and STL-10. These datasets, extensively used in the computer vision community, present a diverse range of challenges in terms of image complexity, resolution, and class granularity. To simulate real-world federated scenarios, we partitioned each dataset among 20 clients using a Dirichlet distribution-based strategy. This partitioning method produced non-identically distributed (non-i.i.d.) data across clients, introducing significant heterogeneity in local data distributions, characteristic of practical applications. Moreover, we simulated various threat models by varying the proportion of adversarial (or malicious) clients, thereby assessing the resilience of our framework under different adversarial conditions.
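For reproducibility of this setup, the sketch below shows a common way to realize Dirichlet-based non-i.i.d. partitioning; the concentration parameter `beta` is an assumed illustrative value (smaller values yield more skewed client distributions), not necessarily the exact setting used in our experiments.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=20, beta=0.5, seed=0):
    """Assign sample indices to clients by drawing, for each class,
    a Dirichlet(beta) vector of per-client proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet([beta] * num_clients)
        cut_points = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, shard in enumerate(np.split(idx, cut_points)):
            client_indices[client_id].extend(shard.tolist())
    return client_indices
```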
Table 1 provides an overview of the datasets used in our experiments. CIFAR-10 comprises 60,000 color images of size 32 × 32 pixels distributed over 10 classes, with 50,000 images used for training and 10,000 for testing. CIFAR-100 shares the same image size and total number of images as CIFAR-10 but is more challenging, as it contains 100 classes. MNIST consists of 70,000 grayscale images of handwritten digits (28 × 28 pixels), partitioned into 60,000 training samples and 10,000 testing samples. The SVHN dataset includes approximately 99,000 color images (32 × 32 pixels) of street view house numbers, with a typical split of around 73,257 images for training and 26,032 for testing. Fashion-MNIST, which serves as a more complex alternative to MNIST, also contains 70,000 grayscale images (28 × 28 pixels) depicting various fashion items. EMNIST (Balanced) is an extended version of MNIST featuring 131,600 images across 47 classes, with roughly 112,800 samples for training and 18,800 for testing. Lastly, STL-10 consists of 13,000 color images with a higher resolution of 96 × 96 pixels, divided into 5000 training samples and 8000 testing samples across 10 classes.
The datasets selected for our experiments—CIFAR-10, CIFAR-100, MNIST, SVHN, Fashion-MNIST, EMNIST, and STL-10—were chosen for their wide adoption in benchmarking image classification tasks and their ability to simulate key challenges in real-world federated learning scenarios. These datasets collectively span varying image complexities, resolutions, and class granularities, reflecting the heterogeneity commonly observed in practical deployments of federated systems. For example, CIFAR-100 and STL-10 offer fine-grained classification tasks, while MNIST and EMNIST simulate low-resolution, digit-based applications akin to mobile or embedded devices. The use of a Dirichlet distribution for partitioning ensures non-i.i.d. data across clients, closely modeling the personalized and imbalanced data distributions encountered in decentralized environments such as smartphones, hospitals, or IoT systems. This selection provides a comprehensive testbed to evaluate the robustness, generalization, and fairness of federated learning methods under diverse, realistic conditions.
Our experimental comparisons involved several baseline methods. The first baseline was FedAvg [7], which aggregates client updates using simple averaging without incorporating any adversarial defenses. The second baseline, denoted as FedAdv (no robust agg.), applies adversarial training at the client level through client-specific adversarial example generation but uses naive averaging during global aggregation. In contrast, our proposed method integrates client-specific adversarial generation with robust trimmed aggregation to mitigate the influence of malicious updates and enhance the overall resilience of the global model.
All experiments were implemented using PyTorch 2.6 and executed on a GPU cluster to ensure computational efficiency. In our experimental protocol, each client conducted local training for five epochs per communication round with a batch size of 64. Adversarial examples were generated using a Projected Gradient Descent (PGD) procedure, where the number of PGD steps ranged from 1 to 5 depending on client resources. The step size was fixed at 0.01, and the maximum perturbation ( ϵ i ) was dynamically computed based on local data statistics and scaled by a factor α . For the robust aggregation step, we employed a trimmed mean with a trimming parameter set to 5% of the total number of clients, effectively discarding extreme updates prior to computing the final average.
To evaluate adversarial robustness, we considered both white-box and black-box attack scenarios. In the white-box setting, adversarial examples were generated using PGD and the Fast Gradient Sign Method (FGSM) under the assumption that the attacker had full knowledge of the model architecture and parameters. In the black-box setting, we simulated conditions where the adversary had limited access to the model, and we evaluated the transferability of adversarial examples generated on surrogate models.
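Adversarial accuracy in these evaluations can be measured with a simple loop such as the one below, where `attack_fn` is any white-box attack (e.g., a single-step FGSM or the PGD sketch from Section 3.1); this is an illustrative evaluation harness under those assumptions, not the exact protocol code.

```python
import torch

def adversarial_accuracy(model, loader, attack_fn):
    """Fraction of test samples classified correctly after a white-box attack.
    `attack_fn(model, x, y)` must return the perturbed inputs."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x_adv = attack_fn(model, x, y)             # attack needs gradients, so no no_grad here
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.size(0)
    return correct / total
```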

4.2. Results

Table 2 presents the average performance of various federated learning methods across four benchmark datasets: CIFAR-10, CIFAR-100 [26], MNIST [27], and SVHN [28]. In our experiments, we compared five approaches: the standard FedAvg, a federated adversarial training variant that employs client-specific adversarial generation with naive averaging (FedAdv without robust aggregation), FedProx [29], MOON [30], and our proposed method that integrates client-specific adversarial generation with robust trimmed aggregation. These comparisons were designed to assess the effectiveness of each method under adversarial attacks and highlight the benefits of our integrated approach.
As shown in Table 2, FedAvg, which lacks any adversarial defenses, consistently exhibited high clean accuracy but suffered from notably low adversarial accuracy. For example, on CIFAR-10, FedAvg achieved a clean accuracy of 85.2% but only a 43.5% adversarial accuracy, underscoring its susceptibility to adversarial perturbations. In contrast, FedAdv (without robust aggregation) showed an improvement in adversarial accuracy (48.7% on CIFAR-10), although with a slight reduction in clean accuracy (83.0%). Methods such as FedProx and MOON, which incorporate mechanisms to better handle data heterogeneity and improve model personalization, further enhanced adversarial performance to 50.1% and 52.0% on CIFAR-10, respectively.
Our proposed approach, however, consistently outperformed these baselines. On CIFAR-10, our method achieved 82.5% clean accuracy and boosted adversarial accuracy to 55.3%, yielding an 11.8% robustness gain relative to FedAvg. Similar improvements were observed across the other datasets. On CIFAR-100, MNIST, and SVHN, our method achieved robustness gains of 9.8%, 9.7%, and 11.0%, respectively, compared to the baseline methods. These results indicate that the integration of client-specific adversarial generation with robust trimmed aggregation not only provides substantial improvements in adversarial accuracy but does so consistently across different data distributions and task complexities.
Figure 1 illustrates the adversarial accuracy of the five methods across the four datasets using a grouped bar chart. The chart visually reinforces our observations: while FedAvg consistently lagged in adversarial performance, FedAdv (without robust aggregation) offered only modest improvements. FedProx and MOON provided incremental gains, but our method achieved the highest adversarial accuracy across all datasets, highlighting its superior robustness.
These comparative experiments underscore the advantages of our approach. By integrating client-specific adversarial generation with robust trimmed aggregation, our method effectively mitigates the impact of adversarial perturbations and malicious updates. This comprehensive evaluation across multiple datasets and baselines demonstrates that our approach not only improves adversarial robustness but also maintains competitive performance on clean data.

4.2.1. Parameter Analysis

To ensure the reproducibility of our experiments, we detail the process used to select the initial hyperparameter ranges. For the adversarial training component, we set the step size η i within the range [ 0.005 , 0.02 ] and the number of PGD steps n i { 1 , 3 , 5 } based on prior studies in both centralized and federated adversarial training. These values were chosen to strike a balance between adversarial strength and computational efficiency across clients with varying resources. The perturbation bound ϵ i was dynamically computed for each client as ϵ i = α · σ i , where σ i is the local standard deviation of input data and α was fixed at 1.0 after a grid search over [ 0.5 , 1.5 ] . For robust aggregation, the trimming parameter b was set to 5% of the total number of participating clients, following empirical findings in robust federated learning literature. All hyperparameters were validated on a held-out subset of the training data at each client, and sensitivity analysis was conducted to ensure stable performance across the selected ranges.
To thoroughly evaluate the robustness of our framework, we conducted an extensive analysis of its key hyperparameters. In our experiments, we systematically varied the number of PGD steps ( n i ), the step size ( η i ), the scaling factor for the maximum perturbation ( α ), and the trimming parameter (b) employed in the robust aggregation process. Our goal was to understand the sensitivity of our method to these parameters and identify optimal settings that balance clean and adversarial accuracy.
Our analysis reveals that the number of PGD steps ( n i ) plays a critical role in adversarial training. Generally, an increase in n i leads to the generation of stronger adversarial examples, which, in turn, improves the model’s robustness. However, our experiments indicate that this improvement is subject to diminishing returns; beyond a certain threshold, further increasing n i results in only marginal gains and may even degrade clean accuracy due to overfitting to adversarial examples. For example, as shown in Table 3 and Figure 2, increasing n i from 1 to 3 resulted in a substantial improvement in adversarial accuracy, while increasing n i further from 3 to 5 provided only a slight additional gain.
The step size ( η i ) and scaling factor ( α ) together determine the magnitude of the adversarial perturbations. Our findings indicated that moderate adjustments to η i and α can help achieve an optimal balance between preserving clean accuracy and enhancing adversarial robustness. Larger values for these parameters tend to generate more potent adversarial examples, thereby improving robustness; however, if set too high, they may produce overly aggressive perturbations that compromise performance on clean data. Our experiments suggest that a step size of 0.01, in conjunction with an appropriately tuned α , provides a good trade-off between these competing objectives.
The trimming parameter (b) used in robust aggregation is particularly sensitive. This parameter determines the proportion of extreme update values that are discarded before averaging. Our analysis shows that setting b too high can inadvertently remove valuable information from benign clients, reducing the overall quality of the aggregated model, whereas a b value that is too low may fail to sufficiently filter out adversarial updates. We found that a b value corresponding to approximately 5% of the total number of clients strikes an effective balance, ensuring that the aggregation process is robust to outliers without sacrificing important contributions from non-malicious updates.
Overall, our parameter analysis confirms that our proposed framework maintains stable performance over a reasonable range of hyperparameter values. Although the optimal settings may vary slightly across different datasets, the method consistently exhibits strong adversarial robustness. These findings highlight the practical viability of our approach, as it can be effectively tuned to achieve a desirable trade-off between clean and adversarial performance in various federated learning scenarios.
Figure 2 illustrates the effect of varying the number of PGD steps ( n i ) on adversarial accuracy for different step sizes ( η i ). The plot indicates that for all step sizes tested, increasing n i from 1 to 3 yielded a substantial improvement in adversarial accuracy. However, further increasing n i from 3 to 5 resulted in only marginal gains, thereby supporting our observation of diminishing returns with excessive iterations. Moreover, the results reveal that a step size of 0.01 achieved the best overall performance, as lower and higher step sizes tended to underperform in comparison. These findings underscore the importance of carefully tuning both n i and η i to achieve an optimal balance between adversarial robustness and clean accuracy.

4.2.2. Ablation Studies

To further validate the contributions of the individual components of our framework, we performed a series of ablation studies. We systematically removed key modules to assess their individual impact on adversarial robustness. First, we compared our client-specific adversarial generation strategy against a uniform adversarial generation approach. The results indicate that adapting perturbation parameters based on local data statistics significantly improves the robustness of individual client models. Next, we evaluated the effectiveness of the robust trimmed aggregation technique by omitting this component. Without robust aggregation, adversarial accuracy noticeably declines, highlighting its critical role in mitigating the influence of malicious updates during global model aggregation. We also examined the scenario where both components were removed. Table 4 summarizes the experimental findings on the CIFAR-10 dataset.
The ablation studies clearly underscore that both client-specific adversarial generation and robust trimmed aggregation play critical roles in enhancing adversarial robustness. Removing either component leads to a significant drop in adversarial accuracy, and omitting both results in the most pronounced degradation. These results validate the design of our integrated approach and confirm that the combined use of these techniques is essential for achieving high robustness in federated learning environments.

4.3. Discussion

The experimental results provide compelling evidence that our integrated approach substantially improves adversarial robustness in federated learning environments. Our comprehensive evaluations across multiple benchmark datasets demonstrate that our method consistently outperforms conventional baselines—such as FedAvg, FedAdv (without robust aggregation), FedProx, and MOON—particularly under adversarial attack scenarios. The significant improvements in adversarial accuracy, achieved with only minor sacrifices in clean accuracy, underscore the effectiveness of combining client-specific adversarial generation with robust trimmed aggregation.
Our parameter analysis further reveals that optimal performance is highly dependent on careful tuning of key hyperparameters. For instance, a moderate number of PGD steps, paired with an appropriately chosen step size and scaling factor, is crucial for balancing the trade-off between clean and adversarial performance. Additionally, the robust aggregation mechanism, with a trimming parameter set at approximately 5% of client updates, plays a pivotal role in filtering out extreme, potentially malicious updates while preserving essential information from benign clients.
Our proposed framework is designed with practical federated learning applications in mind, particularly in privacy-sensitive domains such as healthcare, finance, and mobile edge environments. The integration of client-specific adversarial training allows each client to generate locally tailored perturbations, enhancing robustness without centralized data access. However, this also introduces additional computational overhead on the client side, especially for resource-constrained devices, due to iterative adversarial example generation. To mitigate this, we allow flexible configurations for attack strength and PGD steps based on device capabilities. On the server side, the robust trimmed aggregation strategy incurs moderate additional complexity ( O ( d · K log K ) ) but remains scalable with parallelization. Despite these considerations, our results show that the method maintains high adversarial robustness and clean accuracy under realistic conditions. For deployment, challenges include ensuring consistent hyperparameter tuning across heterogeneous devices and managing communication costs. Future work may explore lightweight approximations and adaptive strategies to further improve scalability and deployment efficiency in large-scale federated environments.
An important extension of our work involves addressing fairness in federated learning (FL), particularly as adversarial training and privacy preservation may inadvertently introduce disparities among clients with heterogeneous data. Since client-specific adversarial generation can lead to uneven robustness, especially for clients with underrepresented or noisy data, it is essential to ensure equitable treatment during both local training and global aggregation. Robust trimmed aggregation helps mitigate the influence of extreme or malicious updates, but fairness-aware aggregation strategies may further improve inclusivity. Emerging privacy-preserving technologies, such as blockchain, offer promising avenues for enhancing fairness and accountability in FL. For example, the framework proposed by Chen et al. [31] demonstrates how blockchain can be leveraged to audit updates and promote transparent participation, thereby supporting both fairness and security objectives. As a future research direction, we plan to explore fairness-aware adversarial training strategies—both blockchain-integrated and standalone—that adaptively balance robustness and fairness across clients in decentralized and privacy-sensitive environments.
Despite these promising results, several avenues for future research remain. One promising direction is to develop adaptive hyperparameter tuning strategies that dynamically adjust the PGD steps, step size, and trimming parameter based on real-time performance metrics. Such adaptive methods could further enhance robustness while reducing the reliance on manual tuning. Moreover, extending our framework to address a broader range of adversarial attacks—including more sophisticated or targeted strategies—would provide valuable insights into its generalizability in complex adversarial settings. Another interesting direction is to explore dynamic adversary models in which the fraction of malicious clients varies over time, as this would more closely mimic real-world scenarios. Additionally, applying our framework to non-vision tasks, such as natural language processing or time-series forecasting, could broaden the scope of its applicability and further establish its versatility across different domains.
On the other hand, recent studies have emphasized the growing threat of data poisoning in large-scale models, particularly in federated learning. This has highlighted the importance of integrating robust unlearning mechanisms to mitigate such adversarial influences. The authors of [32] provided a comprehensive survey of machine unlearning techniques, including strategies to counter poisoning, while Jiang et al. [33] proposed efficient and certified recovery methods tailored for federated settings. Incorporating these perspectives helps frame our work within the broader context of secure and resilient learning systems.
In conclusion, our integrated approach represents a significant step forward in enhancing the adversarial robustness of federated learning systems. By effectively combining client-specific adversarial generation with robust trimmed aggregation, our method not only mitigates the adverse effects of malicious updates and adversarial perturbations but also maintains competitive performance on clean data. The insights gleaned from our extensive experiments lay a strong foundation for future research aimed at further refining and expanding this approach to meet the challenges of increasingly complex and dynamic federated learning environments.

5. Conclusions

In this paper, we introduced a novel framework for federated image adversarial training that integrates client-specific adversarial example generation with robust trimmed aggregation. Our approach enhances both local and global model robustness by tailoring adversarial perturbations to the unique characteristics of each client’s data and mitigating the influence of malicious updates during global aggregation. Extensive experimental evaluations on multiple benchmark datasets demonstrate that our method significantly improves adversarial accuracy while maintaining competitive performance on clean data. Our results validate that the combination of client-specific adversarial training and robust aggregation is effective in countering adversarial threats in federated learning environments. Furthermore, our comprehensive analyses, including parameter sensitivity and ablation studies, highlight the practical viability of our approach and underscore the importance of carefully tuning key hyperparameters.
Future work will focus on developing adaptive hyperparameter tuning strategies, exploring a broader range of adversarial attack scenarios, and extending the framework to other data modalities. We believe that our proposed method represents a significant step toward more secure and reliable federated learning systems in real-world applications.

Author Contributions

Conceptualization, J.C.; Methodology, S.Z. and J.C.; Software, J.C.; Validation, J.C. and S.Z.; Formal analysis, J.C.; Investigation, J.C.; Resources, X.Z.; Data curation, S.Z. and X.Z.; Writing—original draft, S.Z., X.Z. and J.C.; Writing—review and editing, S.Z., X.Z. and J.C.; Visualization, J.C.; Supervision, S.Z. and J.C.; Project administration, X.Z. and J.C.; Funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Ma, C.; Li, J.; Shi, L.; Ding, M.; Wang, T.; Han, Z.; Poor, H.V. When federated learning meets blockchain: A new distributed learning paradigm. IEEE Comput. Intell. Mag. 2022, 17, 26–33. [Google Scholar] [CrossRef]
  2. Rauniyar, A.; Hagos, D.H.; Jha, D.; Håkegård, J.E.; Bagci, U.; Rawat, D.B.; Vlassov, V. Federated learning for medical applications: A taxonomy, current trends, challenges, and future research directions. IEEE Internet Things J. 2023, 11, 7374–7398. [Google Scholar] [CrossRef]
  3. Lu, Z.; Pan, H.; Dai, Y.; Si, X.; Zhang, Y. Federated learning with non-iid data: A survey. IEEE Internet Things J. 2024, 11, 19188–19209. [Google Scholar] [CrossRef]
  4. Kumar, K.N.; Mohan, C.K.; Cenkeramaddi, L.R. The impact of adversarial attacks on federated learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 46, 2672–2691. [Google Scholar] [CrossRef] [PubMed]
  5. Amini, S.; Ghaemmaghami, S. Towards improving robustness of deep neural networks to adversarial perturbations. IEEE Trans. Multimed. 2020, 22, 1889–1903. [Google Scholar] [CrossRef]
  6. Rodríguez-Barroso, N.; Jiménez-López, D.; Luzón, M.V.; Herrera, F.; Martínez-Cámara, E. Survey on federated learning threats: Concepts, taxonomy on attacks and defences, experimental study and challenges. Inf. Fusion 2023, 90, 148–173. [Google Scholar] [CrossRef]
  7. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the Artificial Intelligence and Statistics, PMLR, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282. [Google Scholar]
  8. Li, X.; Huang, K.; Yang, W.; Wang, S.; Zhang, Z. On the convergence of fedavg on non-iid data. arXiv 2019, arXiv:1907.02189. [Google Scholar]
  9. Konečný, J. Federated Learning: Strategies for Improving Communication Efficiency. arXiv 2016, arXiv:1610.05492. [Google Scholar]
  10. Xiang, M.; Ioannidis, S.; Yeh, E.; Joe-Wong, C.; Su, L. Efficient federated learning against heterogeneous and non-stationary client unavailability. arXiv 2024, arXiv:2409.17446. [Google Scholar]
  11. Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
  12. Pan, Z.; Ying, Z.; Wang, Y.; Zhang, C.; Zhang, W.; Zhou, W.; Zhu, L. Feature-Based Machine Unlearning for Vertical Federated Learning in IoT Networks. IEEE Trans. Mob. Comput. 2025, 1–14. [Google Scholar] [CrossRef]
  13. Morris, J.X.; Lifland, E.; Yoo, J.Y.; Grigsby, J.; Jin, D.; Qi, Y. Textattack: A framework for adversarial attacks, data augmentation, and adversarial training in nlp. arXiv 2020, arXiv:2005.05909. [Google Scholar]
  14. Beltrán, E.T.M.; Pérez, M.Q.; Sánchez, P.M.S.; Bernal, S.L.; Bovet, G.; Pérez, M.G.; Pérez, G.M.; Celdrán, A.H. Decentralized federated learning: Fundamentals, state of the art, frameworks, trends, and challenges. IEEE Commun. Surv. Tutorials 2023, 25, 2983–3013. [Google Scholar] [CrossRef]
  15. Bagdasaryan, E.; Veit, A.; Hua, Y.; Estrin, D.; Shmatikov, V. How to backdoor federated learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics, PMLR, Online, 26–28 August 2020; pp. 2938–2948. [Google Scholar]
  16. Lyu, L.; Yu, H.; Ma, X.; Chen, C.; Sun, L.; Zhao, J.; Yang, Q.; Philip, S.Y. Privacy and robustness in federated learning: Attacks and defenses. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 8726–8746. [Google Scholar] [CrossRef]
  17. Qi, P.; Chiaro, D.; Guzzo, A.; Ianni, M.; Fortino, G.; Piccialli, F. Model aggregation techniques in federated learning: A comprehensive survey. Future Gener. Comput. Syst. 2024, 150, 272–293. [Google Scholar] [CrossRef]
  18. Yin, D.; Chen, Y.; Kannan, R.; Bartlett, P. Byzantine-robust distributed learning: Towards optimal statistical rates. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 5650–5659. [Google Scholar]
  19. Blanchard, P.; El Mhamdi, E.M.; Guerraoui, R.; Stainer, J. Machine learning with adversaries: Byzantine tolerant gradient descent. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11. [Google Scholar]
  20. Li, Y.; Sani, A.S.; Yuan, D.; Bao, W. Enhancing federated learning robustness through clustering non-IID features. In Proceedings of the Asian Conference on Computer Vision, Macau, China, 4–8 December 2022; pp. 41–55. [Google Scholar]
  21. Gallus, S. Federated Learning in Practice: Addressing Efficiency, Heterogeneity, and Privacy. TechRxiv 2025. [Google Scholar] [CrossRef]
  22. Smith, V.; Chiang, C.K.; Sanjabi, M.; Talwalkar, A.S. Federated multi-task learning. Adv. Neural Inf. Process. Syst. 2017, 30, 4427–4437. [Google Scholar]
  23. Fallah, A.; Mokhtari, A.; Ozdaglar, A. Personalized federated learning: A meta-learning approach. arXiv 2020, arXiv:2002.07948. [Google Scholar]
  24. Tang, J.; Du, X.; He, X.; Yuan, F.; Tian, Q.; Chua, T.S. Adversarial training towards robust multimedia recommender system. IEEE Trans. Knowl. Data Eng. 2019, 32, 855–867. [Google Scholar] [CrossRef]
  25. Kim, T.; Singh, S.; Madaan, N.; Joe-Wong, C. Characterizing internal evasion attacks in federated learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics, PMLR, Valencia, Spain, 25–27 April 2023; pp. 907–921. [Google Scholar]
  26. Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features From Tiny Images. 2009. Available online: https://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf (accessed on 8 April 2025).
  27. Deng, L. The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 2012, 29, 141–142. [Google Scholar] [CrossRef]
  28. Netzer, Y.; Wang, T.; Coates, A.; Bissacco, A.; Wu, B.; Ng, A.Y. Reading digits in natural images with unsupervised feature learning. In Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain, 12–17 December 2011; Volume 2011, p. 4. [Google Scholar]
  29. Tang, M.; Wang, Y.; Zhang, J.; DiValentin, L.; Ding, A.; Hass, A.; Chen, Y.; Li, H. FedProphet: Memory-Efficient Federated Adversarial Training via Theoretic-Robustness and Low-Inconsistency Cascade Learning. arXiv 2024, arXiv:2409.08372. [Google Scholar]
  30. Li, Q.; He, B.; Song, D. Model-contrastive federated learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 10713–10722. [Google Scholar]
  31. Chen, L.; Zhao, D.; Tao, L.; Wang, K.; Qiao, S.; Zeng, X.; Tan, C.W. A credible and fair federated learning framework based on blockchain. IEEE Trans. Artif. Intell. 2024, 6, 301–316. [Google Scholar] [CrossRef]
  32. Nguyen, T.T.; Huynh, T.T.; Ren, Z.; Nguyen, P.L.; Liew, A.W.C.; Yin, H.; Nguyen, Q.V.H. A survey of machine unlearning. arXiv 2022, arXiv:2209.02299. [Google Scholar]
  33. Jiang, Y.; Shen, J.; Liu, Z.; Tan, C.W.; Lam, K.Y. Towards efficient and certified recovery from poisoning attacks in federated learning. IEEE Trans. Inf. Forensics Secur. 2025, 20, 2632–2647. [Google Scholar] [CrossRef]
Figure 1. Relative adversarial robustness improvement over FedAvg across four datasets. Our method consistently provided the highest gain, highlighting its effectiveness.
Figure 2. Effect of varying the number of PGD steps ( n i ) on adversarial accuracy for different step sizes ( η i ). Increasing n i from 1 to 3 significantly improves adversarial accuracy, while further increases yield only marginal gains. A step size of 0.01 provides the best overall performance.
Table 1. Overview of datasets used in experiments.

| Dataset | Total Samples | Training Samples | Testing Samples | Image Size (Channels) |
|---|---|---|---|---|
| CIFAR-10 | 60,000 | 50,000 | 10,000 | 32 × 32 (RGB) |
| CIFAR-100 | 60,000 | 50,000 | 10,000 | 32 × 32 (RGB) |
| MNIST | 70,000 | 60,000 | 10,000 | 28 × 28 (Grayscale) |
| SVHN | 99,000 | 73,257 | 26,032 | 32 × 32 (RGB) |
| Fashion-MNIST | 70,000 | 60,000 | 10,000 | 28 × 28 (Grayscale) |
| EMNIST | 131,600 | 112,800 | 18,800 | 28 × 28 (Grayscale) |
| STL-10 | 13,000 | 5000 | 8000 | 96 × 96 (RGB) |
Table 2. Average performance of different federated learning methods across various datasets.

| Dataset | Method | Clean Acc. (%) | Adv. Acc. (%) | Robustness Gain (%) |
|---|---|---|---|---|
| CIFAR-10 | FedAvg | 85.2 | 43.5 | - |
| CIFAR-10 | FedAdv (no robust) | 83.0 | 48.7 | +5.2 |
| CIFAR-10 | FedProx | 84.0 | 50.1 | +6.1 |
| CIFAR-10 | MOON | 83.5 | 52.0 | +8.5 |
| CIFAR-10 | Ours | 82.5 | 55.3 | +11.8 |
| CIFAR-100 | FedAvg | 60.0 | 30.2 | - |
| CIFAR-100 | FedAdv (no robust) | 58.0 | 35.1 | +5.0 |
| CIFAR-100 | FedProx | 59.0 | 36.0 | +6.0 |
| CIFAR-100 | MOON | 58.5 | 37.8 | +7.8 |
| CIFAR-100 | Ours | 57.0 | 40.0 | +9.8 |
| MNIST | FedAvg | 99.0 | 70.5 | - |
| MNIST | FedAdv (no robust) | 98.5 | 75.0 | +4.5 |
| MNIST | FedProx | 98.8 | 76.0 | +5.5 |
| MNIST | MOON | 98.7 | 77.5 | +7.0 |
| MNIST | Ours | 98.0 | 80.2 | +9.7 |
| SVHN | FedAvg | 92.0 | 55.0 | - |
| SVHN | FedAdv (no robust) | 91.0 | 60.5 | +5.5 |
| SVHN | FedProx | 91.5 | 62.0 | +6.0 |
| SVHN | MOON | 91.0 | 63.0 | +7.0 |
| SVHN | Ours | 90.5 | 66.0 | +11.0 |
Table 3. Impact of varying the number of PGD steps (n_i) on model performance.

| n_i | Clean Accuracy (%) | Adversarial Accuracy (%) | Robustness Gain (%) |
|---|---|---|---|
| 1 | 84.5 | 47.0 | +2.5 |
| 3 | 82.5 | 55.3 | +11.8 |
| 5 | 81.0 | 56.0 | +12.5 |
Table 4. Ablation study results on CIFAR-10, illustrating the impact of removing key components from our framework.

| Configuration | Adversarial Accuracy (%) | Robustness Gain (%) |
|---|---|---|
| Full Model (Ours) | 55.3 | +11.8 |
| Without Robust Aggregation | 49.1 | +5.6 |
| Without Client-Specific Adv. Generation | 50.3 | +6.8 |
| Without Both Components | 45.0 | +2.0 |
