Article

SecureFedGuard: Authenticated and Backdoor-Resilient Federated Learning with Dual-View Gradient Forensics

1 School of Management, Guangdong University of Science and Technology, Dongguan 523070, China
2 Faculty of Data Science, City University of Macau, Taipa, Macau
* Author to whom correspondence should be addressed.
Electronics 2026, 15(5), 1010; https://doi.org/10.3390/electronics15051010
Submission received: 22 January 2026 / Revised: 12 February 2026 / Accepted: 13 February 2026 / Published: 28 February 2026
(This article belongs to the Special Issue Security and Privacy in Distributed Machine Learning)

Abstract

Federated learning (FL) enables collaborative model training without centralizing raw data, yet practical deployments remain vulnerable to security threats such as Byzantine model poisoning, stealthy backdoor implantation, and integrity attacks that exploit the opacity of client updates. This paper presents SecureFedGuard, a security-centric FL framework that introduces a novel combination of (i) dual-view update authentication that binds each client update to a lightweight stochastic gradient fingerprint, enabling server-side integrity screening without accessing client data, and (ii) backdoor-resilient aggregation driven by cross-round spectral forensics and adaptive coordinate-wise trimming guided by an estimated benign subspace. SecureFedGuard is designed to be compatible with secure aggregation and does not require trusted hardware, public datasets for pretraining, or expensive per-client verification. We provide a simple robustness analysis that clarifies when benign updates dominate the estimated subspace under mixed benign/malicious participation. Experiments on real FL benchmarks (vision and language) under diverse threat models show that SecureFedGuard substantially reduces backdoor attack success rate while maintaining or improving clean accuracy compared with strong baselines, and adds only modest communication and computation overhead. These results suggest a practical path toward integrity-preserving and backdoor-resistant FL without weakening the privacy boundary between clients and the server.

1. Introduction

Federated learning (FL) enables many clients (e.g., mobile devices or organizations) to collaboratively train a shared model without centralizing raw training data [1,2]. This design mitigates direct data exposure, and is often combined with cryptographic secure aggregation so that the server observes only aggregated updates rather than individual client deltas [3]. Despite these advantages, practical FL deployments remain highly vulnerable to security threats: malicious clients can inject Byzantine updates that derail training, or implant stealthy backdoors that preserve clean utility while enforcing attacker-chosen behaviors at inference time [4,5,6]. The core difficulty is that the server must coordinate learning from updates it cannot fully trust, under data heterogeneity and limited observability.
A large body of work addresses Byzantine robustness by replacing naïve averaging with robust aggregation rules, such as coordinate-wise robust estimators and distance-based selection mechanisms [4,5,7]. However, recent evidence suggests that (i) non-IID client distributions and adaptive attackers can degrade the effectiveness of classic robust rules, and (ii) designing yet another aggregation heuristic may be insufficient without additional security signals [8,9,10]. In parallel, backdoor attacks in FL have become increasingly stealthy by concentrating poisoning into critical layers or carefully scaling updates (model replacement), which can bypass both naïve defenses and several robust aggregators [11]. Existing backdoor defenses often assume extra capabilities (e.g., clean reference data, strong client-side validation, or plaintext access to all updates), or operate myopically per round and thus struggle against persistent, cross-round attackers [12,13,14,15].
This paper targets an increasingly relevant deployment regime: FL systems that must remain privacy-preserving, scalable, and secure simultaneously. Privacy mechanisms such as secure aggregation can prevent the server from inspecting individual updates, which complicates traditional outlier filtering; meanwhile, integrity threats demand that malicious contributions be detected or attenuated without violating the privacy boundary [3,16,17]. Moreover, backdoor defenses must keep clean accuracy high under heterogeneous data, while suppressing targeted misbehavior under strong, coordinated attackers [6,18,19]. These constraints motivate a defense that goes beyond a single robust estimator and instead couples lightweight integrity screening with geometry-aware, cross-round anomaly suppression.
Our approach. We propose SecureFedGuard, a security-centric FL framework that combines two orthogonal mechanisms. First, we introduce dual-view update authentication (DVA), a lightweight trajectory-consistency test based on compact linear sketches: each client transmits a sketch of its final update and a sketch of its cumulative local gradient trace. The server uses these two views to detect integrity-evasive attacks that craft arbitrary vectors without following a plausible local optimization path. Second, to counter optimization-based poisoning and stealthy backdoors that may pass such consistency checks, we introduce cross-round spectral forensics with residual clipping: the server estimates a benign subspace each round, measures residual energies of client updates outside that subspace, and maintains a cross-round persistence memory to down-weight clients that repeatedly deviate. These signals are then integrated into an adaptive robust aggregation rule that preserves benign learning while suppressing persistent anomalous directions.
Key contributions. SecureFedGuard contributes: (1) a new sketch-based integrity screening mechanism (DVA) that is data-free and adds only low-bandwidth overhead, addressing integrity-evasive Byzantine behaviors; (2) a cross-round geometric defense that sanitizes updates by clipping only the suspicious residual component, improving robustness to stealthy backdoors and coordinated poisoning; and (3) a design that is compatible with secure aggregation by operating on sketches and scalar weights, aligning privacy requirements with integrity needs [3,16,17]. Empirically, SecureFedGuard substantially reduces backdoor attack success while maintaining clean accuracy across vision and language FL benchmarks, outperforming strong baselines including robust aggregators and recent backdoor defenses [7,12,15].

2. Related Work

2.1. Byzantine Robustness and Robust Aggregation

The standard cross-device FL pipeline popularized by FedAvg aggregates client updates by averaging, which is efficient but brittle when a subset of clients is malicious or faulty [1,2]. Byzantine-robust learning therefore studies aggregation rules that tolerate adversarial updates while preserving convergence. Early work introduced principled robust estimators for distributed gradients, motivating distance-based selection and coordinate-wise robust summaries [4,5]. In FL, robust aggregation via the geometric median (often referred to as robust federated aggregation) provides strong empirical robustness and can be implemented efficiently with Weiszfeld-type iterations [7].
Recent research highlights that practical FL security requires robustness under non-IID client data and potentially unknown corruption rates. Methods that incorporate auxiliary signals (e.g., public-data-based screening) can improve robustness in heterogeneous settings but may introduce assumptions that do not always hold [8]. Complementary directions leverage incentives or ensembles to reduce the effective influence of low-quality or adversarial participants [20]. In decentralized variants, aggregation must also tolerate dynamic topologies and local neighbor corruption, leading to new robust filters designed beyond the centralized-server setting [21]. At the same time, the security community has emphasized that strengthening classic aggregators (e.g., trimmed mean/median) with additional mechanisms can outperform designing ever more complex rules in some regimes, and provides new threat models and evaluation lenses [9,10]. These trends motivate our design choice to retain robust, scalable primitives while adding orthogonal signals that are rare in prior work: (i) trajectory-consistency checks that test whether an update is realizable by a plausible local optimization trajectory, and (ii) cross-round persistence that accumulates evidence over time rather than relying on a single-round outlier detector.

2.2. Backdoor Attacks and Defenses in Federated Learning

Backdoor (targeted poisoning) attacks in FL aim to implant a trigger-target behavior while maintaining high clean accuracy, making them harder to detect than untargeted Byzantine disruptions. Modern backdoor attacks can concentrate poisoning into model subsets or layers to improve stealthiness; for example, poisoning backdoor-critical layers can bypass multiple defenses with a small malicious fraction [11]. Defenses therefore increasingly exploit non-trivial structure beyond update magnitude, including hidden-layer behaviors, directional patterns, or parameter-importance disparities.
A representative line of work uses client-side validation signals or neuron-level statistics to detect poisoned updates, exemplified by CrowdGuard, which introduces feedback-driven inspection and pruning mechanisms [12]. Other defenses attempt to distinguish benign heterogeneity from malicious triggers via parameter-importance signals; FDCR uses Fisher-information-based discrepancies to cluster and rescale client updates under heterogeneous distributions [13]. Complementary approaches focus on model purification: FLSAD eliminates backdoor influence via trigger reconstruction and self-attention distillation without assuming a clean reference dataset [14]. Lightweight client-side modifications can also reduce backdoor success by patching or neutralizing trigger effects during local training [22]. In 2025, direction-based screening gained traction; AlignIns inspects multi-granularity directional alignment and couples filtering with clipping to resist stealthy backdoors under non-IID data [15]. Beyond standard FL, the literature also expands to vertical/split settings; UBD synthesizes latent trigger cues and uses label-consistent clustering with constrained probing to mitigate VFL backdoors [19]. Specialized domains such as federated graph learning introduce additional attack surfaces (diverse trigger structures and injection locations), prompting defenses like FedTGE that use topological graph energy for selection and reweighting [18]. Surveys summarize the rapidly growing landscape and expose recurring gaps between benchmark robustness and deployment constraints [6]. Overall, existing defenses often either (i) require extra assumptions (clean validation data, access to plaintext updates, or trusted components) or (ii) treat rounds largely independently using one-round statistics. 
As a result, many methods do not explicitly verify whether an update is realizable by a plausible local training trajectory, and they do not explicitly exploit temporal persistence of anomalous directions across rounds. SecureFedGuard is designed to fill both gaps via DVA (realizability) and persistence-aware spectral forensics (temporal evidence).

2.3. Secure Aggregation, Privacy, and Integrity Verification

Secure aggregation is a key primitive for FL privacy: the server learns only an aggregate of client updates, not individual contributions [3]. However, privacy can hinder many server-side defenses because robust outlier detection typically assumes access to individual updates in plaintext. This tension has spurred work on secure aggregation protocols with stronger dropout tolerance and cryptographic efficiency, including homomorphic-encryption-based constructions and variants tailored to practical deployment constraints [17]. In parallel, privacy-preserving robust aggregation has emerged to mitigate poisoning while maintaining confidentiality, e.g., PEAR performs similarity-based weighting under encrypted gradients to address Byzantine threats in realistic non-IID scenarios [16].
A central limitation in this space is that privacy alone does not guarantee integrity: malicious clients may still contribute adversarial updates, and an untrusted server could potentially deviate from the intended aggregation procedure. Recent overviews emphasize that robust FL must account for Sybil behavior, secure-aggregation interactions, and adaptive backdoor threats, motivating defenses that introduce verification or auxiliary signals with minimal privacy leakage [10]. SecureFedGuard follows this direction: it is designed to remain compatible with secure aggregation by relying on compact sketches and low-dimensional statistics for screening, while retaining a robust aggregation core for scalability and non-IID tolerance.

3. Preliminaries

3.1. Federated Learning Setup and Notation

We consider cross-device federated learning with a central server coordinating N clients indexed by $i \in \{1, \dots, N\}$. Client i holds a private dataset $D_i$ drawn from a client-specific distribution. Let $w \in \mathbb{R}^d$ denote the model parameters and $\ell(w; z)$ the per-example loss for data point z. The global objective is
$$\min_{w \in \mathbb{R}^d} F(w) \triangleq \sum_{i=1}^{N} p_i F_i(w), \qquad F_i(w) \triangleq \mathbb{E}_{z \sim D_i}\, \ell(w; z), \tag{1}$$
where $p_i \ge 0$ and $\sum_i p_i = 1$ are client weights (typically proportional to dataset size).
Training proceeds in communication rounds $t = 0, 1, \dots, T-1$. At round t, the server holds the global model $w^t$ and samples a subset of clients $S_t \subseteq \{1, \dots, N\}$ with $|S_t| = K$. Each selected client initializes local parameters $w_{i,t}^{(0)} \leftarrow w^t$ and performs E steps of a first-order optimizer (e.g., mini-batch SGD):
$$w_{i,t}^{(e+1)} = w_{i,t}^{(e)} - \eta_t\, g_{i,t}^{(e)}, \qquad g_{i,t}^{(e)} \triangleq \mathrm{OptDir}\big(w_{i,t}^{(e)}; \xi_{i,t}^{(e)}\big), \tag{2}$$
where $\eta_t$ is the learning rate and $\xi_{i,t}^{(e)}$ is a mini-batch sampled from $D_i$. Here, $\mathrm{OptDir}(\cdot)$ denotes the first-order step direction applied by the local optimizer (e.g., the stochastic gradient for plain SGD, or the momentum velocity for SGD with momentum). The client sends an update vector
$$u_i^t \triangleq w_{i,t}^{(E)} - w^t \in \mathbb{R}^d, \tag{3}$$
(or equivalently the post-training model $w_{i,t}^{(E)}$). The server aggregates received updates to form the next global model:
$$w^{t+1} = w^t + \mathrm{Agg}\big(\{u_i^t\}_{i \in S_t}\big), \tag{4}$$
where $\mathrm{Agg}(\cdot)$ is an aggregation rule. For convenience, we denote the matrix of stacked updates by $U^t \in \mathbb{R}^{K \times d}$ whose i-th row corresponds to $(u_i^t)^\top$ (with an arbitrary but fixed ordering of clients in $S_t$). We use $\|\cdot\|$ for the $\ell_2$ norm and $\langle a, b \rangle$ for the standard inner product.
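The round structure above can be sketched in a few lines of numpy. This is a minimal illustration with a weighted-mean $\mathrm{Agg}(\cdot)$ (i.e., plain FedAvg); the function name `fedavg_round` is ours, not from the paper, and the robust aggregation rules that SecureFedGuard actually uses are introduced later.

```python
import numpy as np

def fedavg_round(w_t, client_updates, weights=None):
    """One server round: aggregate client deltas u_i^t by a (weighted) mean.

    client_updates: list of u_i^t = w_i^{(E)} - w^t, one vector per client.
    weights: optional client weights p_i (defaults to uniform).
    """
    U = np.stack(client_updates)                 # K x d matrix of stacked updates
    if weights is None:
        weights = np.full(len(client_updates), 1.0 / len(client_updates))
    weights = np.asarray(weights, dtype=float) / np.sum(weights)
    delta = weights @ U                          # Agg({u_i^t}) = sum_i p_i u_i^t
    return w_t + delta                           # w^{t+1} = w^t + Agg(...)
```

Replacing the weighted mean in `fedavg_round` with a robust rule is exactly the design space that Sections 3.3 and 4 explore.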

3.2. Adversary Model and Security Objectives

We assume an open FL setting in which a subset of participating clients may be malicious. Let $A_t \subseteq S_t$ denote the set of adversarial clients at round t, and let $B_t = S_t \setminus A_t$ denote benign clients. We consider a strong adversary that can (i) control any local training procedure, (ii) craft arbitrary vectors $u_i^t$ to send to the server, and (iii) coordinate across compromised clients and across rounds. We bound the per-round corruption rate by
$$|A_t| \le \alpha K, \qquad \alpha \in [0, 1). \tag{5}$$
The adversary may aim for:
  • Byzantine model poisoning: degrade overall model utility by sending arbitrary or optimized updates.
  • Backdoor attacks: enforce a targeted misclassification behavior triggered by a pattern while maintaining high clean accuracy.
  • Integrity evasion: craft updates that appear statistically similar to benign ones under common defenses.
To align with practical deployments, we do not assume the server can access raw client data. Our defense targets two coupled objectives: (i) update integrity—detect or down-weight malicious updates without inspecting $D_i$; and (ii) backdoor resilience—maintain high clean performance while suppressing attack success.
We evaluate learning quality by Clean Accuracy on standard test sets. Backdoor robustness is measured by Attack Success Rate (ASR), defined as the fraction of triggered test inputs classified into the adversary’s target label. Lower ASR at similar clean accuracy indicates stronger backdoor resistance.
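The ASR metric is a simple fraction, sketched below for concreteness; the function name `attack_success_rate` is our own illustrative choice.

```python
import numpy as np

def attack_success_rate(pred_labels, target_label):
    """ASR: fraction of triggered test inputs classified into the
    adversary's target label (lower ASR at similar clean accuracy
    indicates stronger backdoor resistance)."""
    return float(np.mean(np.asarray(pred_labels) == target_label))
```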
Threat boundaries. SecureFedGuard provides statistical screening rather than cryptographic integrity guarantees. DVA is most effective when an update is not produced by the claimed local SGD trajectory (e.g., arbitrary vectors or in-transit tampering). A fully adaptive attacker who can fabricate both views to satisfy $f_i^t \approx -\eta_t h_i^t$ may bypass DVA; in that case, robustness relies on cross-round forensics and robust aggregation. Spectral forensics can be weakened if an attack direction lies largely inside the estimated benign subspace or if the attacker poisons very intermittently, typically trading off ASR or requiring a higher attacker budget. We clarify these defensive boundaries in Section 4.4.

3.3. Robust Aggregation and Spectral Primitives

Robust FL defenses often rely on estimating a representative update direction in the presence of outliers. We briefly summarize two primitives used throughout the paper.
Coordinate-wise robust summaries. Given vectors $\{u_i\}_{i=1}^{K} \subset \mathbb{R}^d$, coordinate-wise trimmed mean removes a fraction of extreme values in each coordinate and averages the remainder. Let $u_{i,j}$ denote coordinate j. For a trimming level $\rho \in [0, 0.5)$, define $I_j$ as the indices after removing the largest $\rho K$ and smallest $\rho K$ values among $\{u_{i,j}\}_{i=1}^{K}$, and set
$$\mathrm{TrimMean}_\rho(\{u_i\})_j \triangleq \frac{1}{|I_j|} \sum_{i \in I_j} u_{i,j}. \tag{6}$$
This operation is efficient and provides robustness when the fraction of corrupted entries is bounded.
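A minimal numpy sketch of the coordinate-wise trimmed mean (the function name is ours; we trim $\lfloor \rho K \rfloor$ values at each tail, one common convention):

```python
import numpy as np

def trimmed_mean(U, rho):
    """Coordinate-wise trimmed mean of K stacked updates (rows of U).

    For each coordinate j, drop the largest and smallest floor(rho*K)
    values and average the rest; rho in [0, 0.5).
    """
    K = U.shape[0]
    k = int(np.floor(rho * K))            # number trimmed at each tail
    S = np.sort(U, axis=0)                # sort each coordinate independently
    kept = S[k:K - k] if k > 0 else S
    return kept.mean(axis=0)
```

With `rho = 0` this reduces to plain averaging, which is consistent with the near-FedAvg behavior discussed in Section 4.2 when no corruption is detected.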
Benign subspace estimation. Many poisoning strategies introduce update directions that deviate from the dominant benign variation. Let $\bar{u}^t$ denote a center estimate (e.g., coordinate-wise median or trimmed mean) of $\{u_i^t\}_{i \in S_t}$. Define centered updates $\tilde{u}_i^t \triangleq u_i^t - \bar{u}^t$ and the empirical covariance
$$C^t \triangleq \frac{1}{K} \sum_{i \in S_t} \tilde{u}_i^t \big(\tilde{u}_i^t\big)^\top \in \mathbb{R}^{d \times d}. \tag{7}$$
Let $V^t \in \mathbb{R}^{d \times r}$ contain the top-r eigenvectors of $C^t$ (orthonormal columns), defining the rank-r projector $P^t \triangleq V^t (V^t)^\top$. For any update $u_i^t$, we define its spectral residual energy as
$$\mathrm{Res}\big(u_i^t; P^t\big) \triangleq \big\| (I - P^t)\big(u_i^t - \bar{u}^t\big) \big\|^2, \tag{8}$$
which measures deviation from the estimated principal subspace. In later sections, we use residual statistics across rounds to flag persistent anomalous directions and to adapt the trimming strength in aggregation.
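The subspace and residual primitives can be sketched as follows (a simplified illustration with a coordinate-wise median center and unweighted covariance; the weighted variant used by SecureFedGuard appears in Section 4.3, and `residual_energy` is our own name):

```python
import numpy as np

def residual_energy(U, r):
    """Spectral residual energies Res(u_i; P) = ||(I - P)(u_i - u_bar)||^2.

    U: K x d stacked updates. Uses the coordinate-wise median as the
    center estimate and the top-r eigenvectors of the empirical
    covariance as the estimated benign subspace.
    """
    u_bar = np.median(U, axis=0)
    X = U - u_bar                           # centered updates u_tilde
    C = X.T @ X / U.shape[0]                # empirical covariance C^t
    _, eigvecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    V = eigvecs[:, -r:]                     # top-r eigenvectors (d x r)
    resid = X - X @ V @ V.T                 # (I - P) u_tilde
    return np.sum(resid ** 2, axis=1)
```

In the toy test below, five benign updates vary along one shared direction while an outlier carries energy orthogonal to it, so its residual dominates.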
Lightweight sketching and fingerprints. To support integrity screening with minimal overhead, we use a randomized sketch map $\phi: \mathbb{R}^d \to \mathbb{R}^m$ with $m \ll d$. A common choice is a sign-random projection
$$\phi(u) \triangleq \frac{1}{\sqrt{m}} R u, \qquad R \in \{-1, +1\}^{m \times d}, \tag{9}$$
generated from a public seed. Such sketches approximately preserve norms and inner products when m is sufficiently large. We will employ $\phi(\cdot)$ to form compact gradient fingerprints that are inexpensive to transmit and compare, and that can be combined with robust statistics without revealing raw training data.
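A minimal sign-random-projection sketch might look as follows (the helper name `make_sketch` is ours; note that for large models one would generate $R$ blockwise from the seed rather than materializing an $m \times d$ matrix):

```python
import numpy as np

def make_sketch(d, m, seed):
    """Build phi(u) = R u / sqrt(m) with R in {-1,+1}^{m x d}, derived
    from a public seed so the server and all clients share the same phi."""
    rng = np.random.default_rng(seed)
    R = rng.choice([-1.0, 1.0], size=(m, d))
    return lambda u: (R @ u) / np.sqrt(m)
```

Two properties matter for later sections: $\phi$ is exactly linear (the basis of the DVA consistency test), and $\|\phi(u)\|^2$ concentrates around $\|u\|^2$ as $m$ grows.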

4. Methodology

4.1. Design Goals and End-to-End Protocol

SecureFedGuard is built for the practical FL regime where the server must learn from untrusted updates while respecting the privacy boundary implied by local data isolation and, optionally, secure aggregation. The methodology is guided by three design goals. (G1) Integrity without data access: the server should reject or down-weight obviously non-realizable updates without inspecting any client data $D_i$. (G2) Backdoor resilience under non-IID: defenses must not confuse benign heterogeneity with attacks, and must suppress targeted triggers while maintaining high clean accuracy. (G3) Deployment compatibility: the defense should remain compatible with secure aggregation and incur modest overhead.
To meet these goals, SecureFedGuard uses two orthogonal security signals, each addressing a different failure mode of robust aggregation. First, trajectory consistency checks whether a reported update $u_i^t$ is consistent with a plausible local optimization trace, which is effective against integrity-evasive attacks that craft arbitrary vectors. Second, geometric persistence tracks whether a client repeatedly deviates from the dominant benign update geometry across rounds, which is effective against optimization-based poisoning and stealthy backdoors that can mimic local SGD dynamics in a single round.
Protocol overview. At the beginning of round t, the server broadcasts the current model $w^t$ and a public seed that determines a linear sketch map $\phi(\cdot)$ (Equation (9)) shared by all participants. Each selected client $i \in S_t$ performs local training starting from $w^t$ and produces the model delta $u_i^t$. In addition to $u_i^t$, the client computes and transmits two compact fingerprints: an update sketch $f_i^t = \phi(u_i^t)$ and a cumulative gradient-trace sketch $h_i^t = \sum_{e=0}^{E-1} \phi(g_{i,t}^{(e)})$. These sketches are low-dimensional ($m \ll d$) and are designed to be inexpensive, while still enabling meaningful consistency checks at the server.
On the server, SecureFedGuard proceeds in three stages (Section 4.2, Section 4.3 and Section 4.4). First, it computes a DVA gate $a_i^t \in (0, 1]$ from the discrepancy between $f_i^t$ and $-\eta_t h_i^t$, producing a soft integrity score per client. Second, it performs cross-round spectral forensics: using the current set of updates, it estimates a robust center $\bar{u}^t$ and a low-rank benign subspace projector $P^t$, measures the residual energy $r_i^t = \|(I - P^t)(u_i^t - \bar{u}^t)\|^2$, and updates a persistence memory $M_i^t$ via an EMA. These quantities determine a residual shrink factor $\lambda_i^t$ and a sanitized update $\hat{u}_i^t$ that preserves within-subspace learning while attenuating anomalous residual components. Third, SecureFedGuard aggregates sanitized updates with a security-aware robust rule: it combines DVA and persistence into a scalar weight $\omega_i^t$ and applies an adaptive, coordinate-wise trimmed mean whose trimming level depends on an estimated corruption indicator $\hat{\alpha}_t$, yielding the global update $\Delta^t$ and $w^{t+1} = w^t + \Delta^t$.
Secure-aggregation (SA) mode. In cross-device deployments, we consider a standard secure aggregation protocol in which the server learns only an aggregate of masked updates. SecureFedGuard operates in two phases each round. First, each selected client sends plaintext sketches $(f_i^t, h_i^t)$ (Equation (10)); the server computes DVA gates and sketch-space forensics scores and then returns a per-client scalar multiplier $s_i^t$ (and optionally an allowlist) over an authenticated channel. Second, only allowlisted clients participate in secure aggregation and contribute the masked, locally scaled update $s_i^t u_i^t$, so the server observes only $\sum_i s_i^t u_i^t$ and updates with a weighted mean. Dropout is handled by the underlying secure aggregation protocol: if a client drops after sending sketches, its contribution is absent from the final aggregate and the server normalizes using the surviving set. Algorithm 1 provides the concrete message flow and threat assumptions.
Leakage from sketches. The server observes per-client m-dimensional randomized linear projections of both the final update and the gradient trace. These sketches can leak coarse information about the update direction and norm but are substantially lower-dimensional than $u_i^t$; we treat them as defense metadata rather than a privacy mechanism. If stronger privacy guarantees are required, sketches can be noise-perturbed or securely aggregated as well, which is complementary and beyond the scope of this paper.
Deployment notes. Under secure aggregation, the server does not observe per-client $u_i^t$, so coordinate-wise trimming and residual-only clipping in update space cannot be applied on plaintext updates. In SA mode, we therefore (i) compute DVA and persistence weights from plaintext sketches, (ii) optionally remove clients with very small weights before aggregation, and (iii) apply a per-client scalar shrink $s_i^t$ (derived from sketch residual energy and persistence) that clients apply locally to their update before masking, yielding a weighted-mean secure-aggregation update. If encrypted robust aggregation (e.g., trimmed mean under MPC/HE) is required, it is complementary to SecureFedGuard and not assumed here.
In the remainder of this section, we reorganize the methodology into three technical subsections: (i) DVA and its scoring/tuning, (ii) cross-round spectral forensics with residual clipping and persistence memory, and (iii) the final security-aware robust aggregation and protocol summary.
Algorithm 1 SecureFedGuard with secure aggregation (SA mode)
1: Server selects clients $S_t$ and broadcasts $(w^t, \eta_t, \mathrm{seed}_t)$ and secure-aggregation parameters.
2: for each client $i \in S_t$ (in parallel) do
3:   Client runs local training from $w^t$ to obtain $u_i^t$ and per-step gradients $\{g_{i,t}^{(e)}\}_{e=0}^{E-1}$.
4:   Client computes plaintext sketches $f_i^t = \phi(u_i^t)$ and $h_i^t = \sum_{e=0}^{E-1} \phi(g_{i,t}^{(e)})$ and sends $(f_i^t, h_i^t)$ to the server.
5:   (Optional) Client also sends its secure-aggregation setup message (e.g., ephemeral key material) as required by the SA protocol.
6: end for
7: Server computes $a_i^t$ and $\hat{\alpha}_t$ from $(f_i^t, h_i^t)$ (Equations (13)–(15)).
8: Server runs the spectral/persistence pipeline in sketch space using $\{f_i^t\}$ to update $M_i^t$ and produces a scalar multiplier $s_i^t$ for each client (e.g., $s_i^t = \omega_i^t \cdot \lambda_i^t$).
9: Server broadcasts (possibly per-client) the allowlist $\mathcal{A}_t = \{i : s_i^t > \omega_{\min}\}$ and the corresponding scalars $\{s_i^t\}_{i \in \mathcal{A}_t}$.
10: for each surviving client $i \in \mathcal{A}_t$ that completes secure aggregation do
11:   Client locally scales its update $\tilde{u}_i^t \leftarrow s_i^t u_i^t$ (and applies standard norm clipping if desired), then participates in secure aggregation to send a masked $\tilde{u}_i^t$.
12: end for
13: Secure aggregation returns the aggregate $\sum_{i \in \mathcal{A}'_t} \tilde{u}_i^t$ over the surviving set $\mathcal{A}'_t \subseteq \mathcal{A}_t$ (dropouts handled by the SA protocol).
14: Server updates $w^{t+1} = w^t + \sum_{i \in \mathcal{A}'_t} \tilde{u}_i^t \big/ \big( \sum_{i \in \mathcal{A}'_t} s_i^t + \varepsilon_w \big)$.

4.2. Dual-View Update Authentication via Sketch-Consistent Local Trajectories

Dual-view update authentication (DVA) targets integrity-evasive threats where a malicious client transmits an arbitrary vector $u_i^t$ that does not correspond to any plausible local training trajectory under Equation (2). Such attacks can be surprisingly effective because many robust aggregators treat the received update as a black box; if the attacker keeps the update within typical magnitudes or mimics coordinate-wise statistics, purely distributional filters may fail. DVA introduces an orthogonal signal: consistency between the reported update and a compact summary of the local gradient path that generated it.
Two complementary views. During local training at round t, client i starts from $w^t$ and performs E SGD steps. It produces the final update $u_i^t = w_{i,t}^{(E)} - w^t$ and computes two sketches
$$f_i^t \triangleq \phi\big(u_i^t\big) \in \mathbb{R}^m, \qquad h_i^t \triangleq \sum_{e=0}^{E-1} \phi\big(g_{i,t}^{(e)}\big) \in \mathbb{R}^m, \tag{10}$$
where $\phi(\cdot)$ is the public linear sketch (Equation (9)). The first view $f_i^t$ summarizes the final update; the second view $h_i^t$ summarizes the cumulative gradient trace. The key point is that both are linear images in the same sketch space, so the server can compare them without reconstructing high-dimensional gradients.
Consistency model and discrepancy.
Lemma 1
(Trajectory-consistency stability). Let $g_{i,t}^{(e)}$ denote the first-order step direction used in Equation (2) (e.g., the stochastic gradient for SGD, or the momentum velocity for SGD with momentum). If the within-round step size is a constant $\eta_t$, then the realized update satisfies $u_i^t = -\eta_t \sum_{e=0}^{E-1} g_{i,t}^{(e)}$ exactly, and by linearity of $\phi(\cdot)$
$$f_i^t = -\eta_t h_i^t. \tag{11}$$
If the client uses a within-round learning-rate schedule $\{\eta_t^{(e)}\}$, then $f_i^t = -\sum_{e=0}^{E-1} \eta_t^{(e)} \phi(g_{i,t}^{(e)})$ and the mismatch obeys
$$\big\| f_i^t + \eta_t h_i^t \big\| \le \max_e \big| \eta_t^{(e)} - \eta_t \big| \sum_{e=0}^{E-1} \big\| \phi\big(g_{i,t}^{(e)}\big) \big\|. \tag{12}$$
Thus, under typical FL settings where $\eta_t$ is constant within a round and E is small, honest clients exhibit small $d_i^t$ up to numerical/compression noise; we set $\sigma_d$ and $\kappa_d$ via warm-up calibration to accommodate these benign deviations. DVA therefore defines the normalized discrepancy
$$d_i^t \triangleq \frac{\big\| f_i^t + \eta_t h_i^t \big\|}{\big\| f_i^t \big\| + \varepsilon_d}, \tag{13}$$
where $\varepsilon_d > 0$ prevents instability when $\|f_i^t\|$ is small. The numerator measures violation of the sketch-consistency relation; the denominator makes the score comparable across clients with different update scales.
Soft gating and robustness to benign heterogeneity. Instead of hard-rejecting clients (which can harm performance under non-IID data), DVA converts $d_i^t$ into a continuous gate
$$a_i^t \triangleq \exp\!\left( -\frac{(d_i^t)^2}{2 \sigma_d^2} \right) \in (0, 1], \tag{14}$$
where $\sigma_d$ controls tolerance. This design ensures that benign but heterogeneous clients are not discarded simply because their local updates differ in direction or magnitude; as long as their updates are self-consistent with their own gradient traces, they maintain high weights. Conversely, integrity-evasive attacks that directly craft $u_i^t$ (e.g., sign-flip without a corresponding gradient trace, arbitrary scaling that breaks the relation, or random vectors) tend to produce large discrepancies and are strongly down-weighted.
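The discrepancy and gate computations are only a few operations per client; a minimal sketch (our own function name, with illustrative default $\sigma_d$):

```python
import numpy as np

def dva_gate(f, h, eta, sigma_d=0.1, eps_d=1e-12):
    """Normalized discrepancy d = ||f + eta*h|| / (||f|| + eps_d) and the
    soft gate a = exp(-d^2 / (2*sigma_d^2)) in (0, 1].

    For an honest client with constant step size, u = -eta * sum_e g^(e),
    so f = phi(u) = -eta * h and d is ~0 up to numerical noise.
    """
    d = np.linalg.norm(f + eta * h) / (np.linalg.norm(f) + eps_d)
    return d, float(np.exp(-d ** 2 / (2 * sigma_d ** 2)))
```

An honest pair $(f, h)$ yields a gate near 1, while a sign-flipped update with an unchanged trace yields $d = 2$ and a gate that is numerically zero for $\sigma_d = 0.1$.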
What DVA does and does not guarantee. Because the server has no access to client data $D_i$, it cannot recompute the gradient trace, and DVA should be interpreted as an internal-consistency test between two client-reported views. In plaintext-update mode (when $u_i^t$ is available), the server can recompute $f_i^t = \phi(u_i^t)$ to prevent spoofing of the update sketch; however, the trace sketch $h_i^t$ remains a client report. DVA therefore reliably penalizes attacks that modify $u_i^t$ without producing a matching trace (e.g., in-transit tampering, sign-flip applied after local training, or arbitrary random vectors), but a fully adaptive attacker can fabricate a trace sketch to satisfy $f_i^t \approx -\eta_t h_i^t$ for an arbitrary malicious update. We explicitly evaluate this DVA-bypass attacker (Section 5.2) and show that the spectral/persistence layer and robust aggregation remain effective even when DVA provides no signal.
Practical tuning and stability. DVA serves two complementary roles in SecureFedGuard. First, $a_i^t$ directly contributes to the security weight and influences the robust center and covariance estimation in later stages. Second, DVA provides a round-level corruption indicator
$$\hat{\alpha}_t \triangleq \frac{1}{K} \sum_{i \in S_t} \mathbb{I}\big\{ d_i^t > \kappa_d \big\}, \tag{15}$$
which controls the trimming strength $\rho_t$. In practice, $\sigma_d$ and $\kappa_d$ can be set using a short warm-up window (initial rounds assumed mostly benign) by matching a target false-positive rate on the empirical distribution of $d_i^t$. This preserves stability: if all clients are benign, $\hat{\alpha}_t$ stays near zero, yielding minimal trimming and near-FedAvg behavior; if suspicious behavior increases, trimming strengthens automatically.
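The coupling of $\hat{\alpha}_t$ to the trimming level can be sketched as below. The identity map clipped at an upper bound is an illustrative choice of ours; the paper only requires that $\rho_t$ grow with $\hat{\alpha}_t$ and stay below 0.5.

```python
import numpy as np

def adaptive_trim_level(d_scores, kappa_d, rho_max=0.45):
    """Round-level corruption indicator alpha_hat = (1/K) #{d_i > kappa_d},
    mapped to a trimming level rho_t (here: identity clipped at rho_max)."""
    alpha_hat = float(np.mean(np.asarray(d_scores) > kappa_d))
    return alpha_hat, min(alpha_hat, rho_max)
```

When all discrepancies fall below $\kappa_d$, this yields $\rho_t = 0$ and the aggregator degenerates to near-FedAvg behavior, matching the stability property described above.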
DVA does not require access to client data $D_i$ and never transmits raw gradients. Each selected client sends two m-dimensional sketches, and the server computes all DVA scores in $O(Km)$. Clients compute $f_i^t$ and accumulate $h_i^t$ online during local training with negligible overhead.

4.3. Cross-Round Spectral Forensics with Persistence-Aware Residual Clipping

DVA is effective for integrity-evasive attacks that do not correspond to local optimization, but optimization-based poisoning and modern backdoor attacks can remain self-consistent with SGD and thus pass the sketch-consistency check. SecureFedGuard therefore adds a second defense layer that exploits two empirical properties of malicious behavior: (i) malicious updates often contain components that deviate from the dominant benign update geometry, and (ii) such deviations tend to be persistent across rounds when the attacker aims to implant a stable backdoor or consistently degrade performance. This subsection formalizes how we estimate benign geometry each round and how we use cross-round persistence to attenuate suspicious components while preserving benign learning signals.

4.3.1. Weighted Robust Centering and Benign Subspace Estimation

We begin round t by computing a robust update center ū_t from {u_i^t}_{i∈S_t}. Unlike standard robust aggregation that treats all clients equally, we exploit DVA as a reliability prior: updates with a low DVA gate a_i^t (Equation (14)) should have reduced influence on geometry estimation. We therefore compute ū_t as a coordinate-wise weighted median (with weights a_i^t), which is robust to large-magnitude outliers while remaining stable under benign heterogeneity. We define centered updates ũ_i^t ≜ u_i^t − ū_t and form a weighted empirical covariance
C_t \;\triangleq\; \frac{1}{\sum_{i \in S_t} a_i^t} \sum_{i \in S_t} a_i^t \, \tilde{u}_i^t (\tilde{u}_i^t)^\top. \qquad (16)
Let V_t ∈ R^{d×r} contain the top-r eigenvectors of C_t (orthonormal columns) and define the projector P_t ≜ V_t V_t^⊤. Intuitively, P_t captures the dominant variation among (mostly) benign updates in the current round. This aligns with practical FL: although clients are non-IID, their updates often share substantial structure due to common architecture, similar optimization schedules, and shared feature representations, yielding a low-dimensional “benign manifold” in update space.
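As an illustration, the per-round geometry estimation can be sketched as follows in NumPy; the weighted-median and eigendecomposition details below are one straightforward reading of the text, not necessarily the authors' exact implementation.

```python
import numpy as np

def weighted_median_1d(x, w):
    # Smallest value at which the cumulative weight reaches half the total.
    order = np.argsort(x)
    cw = np.cumsum(w[order])
    return x[order][np.searchsorted(cw, 0.5 * cw[-1])]

def robust_center(U, a):
    # Coordinate-wise weighted median of updates U (K x d); a = DVA gates.
    return np.array([weighted_median_1d(U[:, j], a) for j in range(U.shape[1])])

def benign_projector(U, a, r):
    # Weighted covariance of centered updates (Equation (16)) and the
    # projector P_t = V_t V_t^T onto its top-r eigenvectors.
    center = robust_center(U, a)
    Ut = U - center
    C = (Ut.T * a) @ Ut / a.sum()
    _, vecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    V = vecs[:, -r:]                  # top-r orthonormal eigenvectors
    return center, Ut, V @ V.T

# Mostly-benign round: updates share a 2-D dominant structure plus small noise.
rng = np.random.default_rng(0)
K, d, r = 20, 30, 2
B = rng.standard_normal((d, r))
U = rng.standard_normal((K, r)) @ B.T + 0.01 * rng.standard_normal((K, d))
center, Ut, P = benign_projector(U, np.ones(K), r)
```

On such data the estimated projector is a symmetric idempotent matrix that captures nearly all of the centered-update energy.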

4.3.2. Robustness Analysis: When Benign Updates Dominate the Estimated Subspace

We provide a simple perturbation statement that clarifies when the estimated top-r subspace is dominated by benign updates. Let the weighted covariance in Equation (16) decompose as C_t = C_t^{(b)} + C_t^{(m)}, where C_t^{(b)} is the contribution of benign clients and C_t^{(m)} the contribution of malicious clients after DVA weighting. Assume (A1) the benign covariance has an eigengap Δ_t ≜ λ_r(C_t^{(b)}) − λ_{r+1}(C_t^{(b)}) > 0, and (A2) the malicious perturbation is bounded in operator norm, ‖C_t^{(m)}‖_2 ≤ ε_t. Then the Davis–Kahan sinΘ theorem implies that the principal angles between the estimated top-r subspace span(V_t) and the benign top-r subspace span(V_t^{(b)}) satisfy
\big\| \sin \Theta\big( V_t, V_t^{(b)} \big) \big\| \;\le\; \frac{2\, \varepsilon_t}{\Delta_t}. \qquad (17)
Equation (17) connects directly to practical parameter choices: (i) r should be chosen at a spectral “elbow” where Δ_t is large (we operationalize this via the explained-variance rule in Section 4.4); (ii) tighter DVA gating reduces ε_t but must be calibrated to avoid benign false positives; and (iii) the EMA factor β controls how quickly persistence separates benign and malicious residuals, since for a client with mean residual μ_i we have M_i^{t+L} ≈ β^L M_i^t + (1 − β^L) μ_i.
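The bound can be checked numerically. The NumPy sketch below builds a benign covariance with eigengap Δ, adds a symmetric perturbation of operator norm ε, and verifies that the observed largest principal angle respects 2ε/Δ (the constant-2 variant of the Davis–Kahan theorem); the specific spectrum is a hypothetical example.

```python
import numpy as np

def sin_theta(V1, V2):
    # Largest sine of the principal angles between two orthonormal bases:
    # with sigma the singular values of V1^T V2, sin = sqrt(1 - sigma_min^2).
    s = np.linalg.svd(V1.T @ V2, compute_uv=False)
    return np.sqrt(max(0.0, 1.0 - s.min() ** 2))

def top_r_basis(C, r):
    # Top-r eigenvectors of a symmetric matrix (eigh sorts ascending).
    return np.linalg.eigh(C)[1][:, -r:]

rng = np.random.default_rng(0)
d, r = 50, 3
Q = np.linalg.qr(rng.standard_normal((d, d)))[0]
# Benign covariance: top-r eigenvalues 10, the rest 1, so the eigengap is 9.
C_b = Q @ np.diag([10.0] * r + [1.0] * (d - r)) @ Q.T
gap = 9.0
# Malicious contribution: symmetric perturbation with operator norm eps.
E = rng.standard_normal((d, d)); E = (E + E.T) / 2
eps = 1.0
C_m = eps * E / np.linalg.norm(E, 2)

observed = sin_theta(top_r_basis(C_b + C_m, r), top_r_basis(C_b, r))
bound = 2 * eps / gap                 # right-hand side of Equation (17)
```

The observed angle sits well inside the bound, consistent with the claim that a bounded malicious contribution cannot rotate the estimated subspace far when the benign eigengap is large.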

4.3.3. Residual Energy as an Anomaly Score

Given P t and u ¯ t , we quantify how much each update deviates from the estimated benign subspace by residual energy:
r_i^t \;\triangleq\; \big\| (I - P_t)\, \tilde{u}_i^t \big\|_2. \qquad (18)
Residual energy is well suited for security screening because it is insensitive to benign within-subspace diversity: a benign client may update strongly in different directions, yet still lie largely in the same dominant subspace. In contrast, a poisoned update that injects an additional backdoor direction can create a higher-energy residual component even if the overall update magnitude and coordinate statistics appear normal. To calibrate r i t without assuming a known α , we compute a robust residual scale τ t as the weighted median of { r i t } using weights a i t . This yields a per-round reference level that adapts to training stage and data heterogeneity.
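A minimal NumPy sketch of the residual-energy score and its per-round calibration, using a hypothetical 2-D benign subspace and one poisoned update that adds an out-of-subspace direction:

```python
import numpy as np

def weighted_median_1d(x, w):
    # Weighted median used as a robust per-round reference scale.
    order = np.argsort(x)
    cw = np.cumsum(w[order])
    return x[order][np.searchsorted(cw, 0.5 * cw[-1])]

def residual_energies(Ut, P):
    # r_i^t = ||(I - P_t) u~_i^t||_2 for each centered update (Equation (18)).
    return np.linalg.norm(Ut - Ut @ P, axis=1)

rng = np.random.default_rng(0)
d = 40
V = np.linalg.qr(rng.standard_normal((d, 2)))[0]   # benign subspace basis
P = V @ V.T
# 19 benign centered updates inside the subspace (plus small noise) ...
benign = rng.standard_normal((19, 2)) @ V.T + 0.05 * rng.standard_normal((19, d))
# ... and one poisoned update adding an out-of-subspace "backdoor" direction.
backdoor = np.zeros(d); backdoor[-1] = 3.0
poisoned = rng.standard_normal(2) @ V.T + backdoor
Ut = np.vstack([benign, poisoned])

r = residual_energies(Ut, P)
tau = weighted_median_1d(r, np.ones(len(r)))       # robust residual scale
```

Benign residuals cluster around the noise level while the poisoned update's residual sits far above the median scale τ, which is what the clipping stage exploits.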

4.3.4. Persistence Memory for Cross-Round Detection

Single-round screening can be evaded by an adaptive attacker that intermittently poisons or carefully shapes the update distribution. SecureFedGuard therefore maintains a per-client persistence memory using an exponential moving average (EMA):
M_i^t \;\leftarrow\; \begin{cases} \beta M_i^{t-1} + (1 - \beta)\, r_i^t, & i \in S_t, \\ M_i^{t-1}, & i \notin S_t, \end{cases} \qquad (19)
where β ∈ [0, 1) controls the time scale. Benign clients may occasionally exhibit elevated residuals due to stochasticity or local distribution shifts, but persistent attackers that repeatedly introduce a backdoor direction accumulate consistently larger M_i^t. This persistence signal is later combined with DVA to form the security weight ω_i^t in Equation (23), enabling gradual but decisive suppression of repeatedly anomalous contributors.
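The distinction between a transient benign spike and a persistent attacker falls out of the EMA directly, as this small sketch shows (the residual values are hypothetical):

```python
import numpy as np

def update_memory(M, residuals, selected, beta=0.9):
    # Equation (19): EMA update for clients selected this round; the memory of
    # unselected clients is carried over unchanged.
    M = M.copy()
    M[selected] = beta * M[selected] + (1 - beta) * residuals[selected]
    return M

# Client 0 is a persistent attacker (residual ~5 every round); client 1 is a
# benign client with one transient spike at round 10.
M = np.zeros(2)
for t in range(50):
    r = np.array([5.0, 5.0 if t == 10 else 1.0])
    M = update_memory(M, r, selected=np.array([0, 1]))
```

After 50 rounds the attacker's memory converges near its persistent residual level, while the benign client's single spike has decayed away.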

4.3.5. Residual Clipping That Preserves Benign Signal

Hard filtering based on r i t risks removing useful benign updates, especially under non-IID distributions where benign variability is large. SecureFedGuard instead performs component-wise attenuation: it keeps the within-subspace component P t u ˜ i t intact and clips only the residual component ( I P t ) u ˜ i t . Specifically, we define a residual shrink factor
\lambda_i^t \;\triangleq\; \min\!\left( 1, \; \frac{\tau_t}{r_i^t + \varepsilon_r} \right), \qquad (20)
and form a sanitized update
\hat{u}_i^t \;\triangleq\; \bar{u}_t + P_t \tilde{u}_i^t + \lambda_i^t (I - P_t)\, \tilde{u}_i^t. \qquad (21)
This design has two practical benefits. First, benign clients with typical residual energy satisfy r_i^t ≲ τ_t and thus λ_i^t ≈ 1, leaving their updates essentially unchanged. Second, if an attacker injects a backdoor direction that lies largely outside the benign subspace, then r_i^t ≫ τ_t and λ_i^t ≪ 1, shrinking precisely the suspicious component while retaining the component aligned with benign training dynamics. This selective attenuation is crucial in FL: it avoids over-penalizing clients with genuine distributional differences while still suppressing directions that are both geometrically atypical and persistent across rounds.
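The component-wise attenuation of Equations (20) and (21) can be sketched as follows; the subspace, scale τ, and the injected spike are hypothetical test values.

```python
import numpy as np

def sanitize(u, center, P, tau, eps_r=1e-12):
    # Equations (20)-(21): keep the within-subspace component intact and
    # shrink only the residual component to (at most) the reference scale tau.
    ut = u - center
    within = P @ ut
    resid = ut - within
    lam = min(1.0, tau / (np.linalg.norm(resid) + eps_r))
    return center + within + lam * resid

rng = np.random.default_rng(0)
d = 30
V = np.linalg.qr(rng.standard_normal((d, 2)))[0]   # benign subspace basis
P = V @ V.T
center = np.zeros(d)
tau = 0.5
benign = V @ np.array([1.0, -2.0]) + 0.01 * rng.standard_normal(d)
attack = V @ np.array([1.0, -2.0]) + 4.0 * np.eye(d)[0]  # out-of-subspace spike

s_benign = sanitize(benign, center, P, tau)
s_attack = sanitize(attack, center, P, tau)
```

The benign update passes through essentially unchanged, while the attack's within-subspace part is preserved and its residual is shrunk to the scale τ.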
In summary, cross-round spectral forensics reduces the influence of stealthy poisoning directions that evade trajectory consistency checks. We next integrate DVA and forensics into the security-aware robust aggregation rule and summarize the full protocol in Algorithm 2.
Algorithm 2 SecureFedGuard protocol (one round)
1: Server broadcasts (w_t, η_t, seed_t); the seed defines the public sketch ϕ(·).
2: for each selected client i ∈ S_t (in parallel) do
3:   Client runs E local SGD steps from w_t to obtain the update u_i^t and per-step gradients {g_{i,t}^{(e)}}_{e=0}^{E−1}.
4:   Client computes f_i^t ← ϕ(u_i^t) and h_i^t ← Σ_{e=0}^{E−1} ϕ(g_{i,t}^{(e)}).
5:   Client sends (u_i^t, f_i^t, h_i^t) (or masked u_i^t under secure aggregation).
6: end for
7: Compute the DVA discrepancy d_i^t and gate a_i^t via Equations (13) and (14); compute α̂_t via Equation (15).
8: Compute the robust center ū_t (coordinate-wise weighted median using a_i^t) and centered updates ũ_i^t = u_i^t − ū_t.
9: Estimate the benign subspace P_t from the weighted covariance (Equation (16)); compute residuals r_i^t (Equation (18)) and scale τ_t (weighted median).
10: Update the persistence memory M_i^t via Equation (19) and compute the residual clipping factor λ_i^t (Equation (20)).
11: Sanitize updates û_i^t via Equation (21).
12: Compute weights ω_i^t via Equation (23) and the trimming level ρ_t via Equation (24).
13: Aggregate {û_i^t} with the weighted trimmed mean in Equation (25) to obtain Δ_t and update w_{t+1} = w_t + Δ_t.
14: (Secure aggregation mode) Use sketches to compute ω_i^t, broadcast the weights, and obtain a weighted-mean update via secure aggregation.

4.4. Security-Aware Robust Aggregation and Protocol Summary

We now define how SecureFedGuard combines the integrity signal (DVA) and the backdoor signal (cross-round persistence) into a single aggregation rule.
Persistence-to-weight mapping. We convert the EMA memory M i t (Equation (19)) into a soft persistence gate by normalizing with the current residual scale τ t and applying an exponential map:
p_i^t \;\triangleq\; \exp\!\left( -\gamma \, \frac{M_i^t}{\tau_t + \varepsilon_m} \right) \in (0, 1], \qquad (22)
where γ > 0 controls how aggressively persistent anomalies are down-weighted and ε m is a small constant for stability.
Security weight. The final per-client security weight is the product of the DVA gate and the persistence gate:
\omega_i^t \;\triangleq\; a_i^t \cdot p_i^t. \qquad (23)
Adaptive trimming level. We set the coordinate-wise trimming level using the DVA-based corruption indicator (Equation (15)):
\rho_t \;\triangleq\; \min\big( \rho_{\max}, \; \hat{\alpha}_t \big). \qquad (24)
Weighted coordinate-wise trimmed mean on sanitized updates. Let I_j be the indices that remain after removing the top and bottom ρ_t K values among {(û_i^t)_j}_{i∈S_t} (as in Equation (6)). We then aggregate the remaining coordinates using the security weights:
\Delta_{t,j} \;\triangleq\; \frac{1}{\sum_{i \in I_j} \omega_i^t + \varepsilon_w} \sum_{i \in I_j} \omega_i^t \, (\hat{u}_i^t)_j, \qquad (25)
where ε_w avoids division by zero. Finally, the server updates w_{t+1} = w_t + Δ_t, where Δ_t = (Δ_{t,1}, …, Δ_{t,d})^⊤.
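A compact NumPy sketch of the final weighting and aggregation step (Equations (22)–(25)); reading "removing the top and bottom ρ_t K values" as ⌈ρ_t K⌉ per coordinate is an assumption made for this sketch.

```python
import numpy as np

def security_weights(a, M, tau, gamma=1.0, eps_m=1e-12):
    # Equations (22)-(23): persistence gate times DVA gate.
    p = np.exp(-gamma * M / (tau + eps_m))
    return a * p

def weighted_trimmed_mean(U_hat, w, rho, eps_w=1e-12):
    # Equation (25): per coordinate, drop the top/bottom ceil(rho*K) values
    # (an assumed reading of "rho_t K"), then take the weighted mean of the rest.
    K, d = U_hat.shape
    k = int(np.ceil(rho * K))
    delta = np.empty(d)
    for j in range(d):
        order = np.argsort(U_hat[:, j])
        keep = order[k:K - k] if k > 0 else order
        delta[j] = (w[keep] @ U_hat[keep, j]) / (w[keep].sum() + eps_w)
    return delta

# Eight benign sanitized updates near 1, two malicious ones at 100.
U_hat = np.vstack([np.ones((8, 3)), 100.0 * np.ones((2, 3))])
w = security_weights(np.ones(10), np.zeros(10), tau=1.0)  # no persistence yet
delta_trim = weighted_trimmed_mean(U_hat, w, rho=0.2)     # outliers trimmed
delta_mean = weighted_trimmed_mean(U_hat, w, rho=0.0)     # plain weighted mean
```

With ρ_t = 0.2 the two extreme coordinates are trimmed on each side and the aggregate stays at the benign value, whereas the untrimmed weighted mean is dragged toward the malicious updates.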
Secure aggregation mode. When individual updates are hidden, Equation (25) is replaced by a client-weighted mean computed via secure aggregation using the scalar multipliers s i t derived from sketches (Algorithm 1).
Hyperparameter selection. (i) Sketch dimension m trades accuracy for bandwidth: for sign random projections, larger m yields tighter norm/inner-product preservation; in practice m ∈ [256, 1024] is a robust range and we use m = 512 by default. (ii) Subspace rank r controls the capacity of the estimated benign manifold; a simple, reproducible rule is to choose the smallest r whose cumulative explained variance in a warm-up window exceeds a threshold (e.g., 90%), and we use r = 10 as a default. (iii) The EMA parameter β sets an effective memory horizon of roughly 1/(1 − β) rounds; we use β = 0.9 and also report a β sweep in Section 5.4. (iv) DVA parameters (σ_d, κ_d) can be set from the warm-up empirical distribution of d_i^t to match a target false-positive rate.
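The explained-variance rule for choosing r can be sketched as follows; the example spectrum is hypothetical.

```python
import numpy as np

def choose_rank(eigvals, threshold=0.90):
    # Smallest r whose top-r eigenvalues explain at least `threshold` of the
    # total variance (the warm-up rule suggested in the text).
    vals = np.sort(np.asarray(eigvals))[::-1]
    cum = np.cumsum(vals) / vals.sum()
    return int(np.searchsorted(cum, threshold) + 1)

# A spectrum with a clear elbow: three dominant directions, nine noise ones.
spectrum = [50.0, 30.0, 12.0] + [1.0] * 9
r = choose_rank(spectrum)
```

For this spectrum the top three eigenvalues explain 92/101 ≈ 91% of the variance, so the rule returns r = 3, matching the elbow.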

5. Experiments

5.1. Experimental Setup and Evaluation Protocol

We evaluate SecureFedGuard under realistic cross-device FL conditions, emphasizing (i) heterogeneous client data, (ii) persistent adversaries that participate across many rounds, and (iii) threat models spanning untargeted Byzantine disruption and stealthy targeted backdoors. Unless otherwise stated, all methods are run with identical client sampling S t , identical local optimization hyperparameters, and identical attack budgets so that differences arise solely from the server-side defense.
Datasets and client partitions. We use four real datasets that are standard in the FL literature: CIFAR-10 and CIFAR-100 https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 5 November 2025), and FEMNIST and Shakespeare from the LEAF benchmark suite https://leaf.cmu.edu/ (accessed on 5 November 2025). For CIFAR-10/100, we simulate cross-device heterogeneity by Dirichlet label partitioning: each client i is assigned a label mixture drawn from Dir ( δ ) with concentration δ = 0.3 , and samples are allocated accordingly; smaller δ yields stronger non-IID behavior. For LEAF tasks, we use the canonical client splits provided by LEAF, which reflect naturally heterogeneous user data (writers for FEMNIST and speaking styles for Shakespeare). In all cases, client datasets are disjoint and remain local.
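For reference, a common way to realize the Dirichlet label partition described above (a standard construction in the FL literature, not necessarily the authors' exact script) is:

```python
import numpy as np

def dirichlet_partition(labels, n_clients, delta, seed=0):
    # For each class, split its sample indices across clients according to a
    # Dir(delta) draw; smaller delta yields more skewed per-client mixtures.
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        probs = rng.dirichlet(delta * np.ones(n_clients))
        cuts = (np.cumsum(probs)[:-1] * len(idx)).astype(int)
        for cid, part in enumerate(np.split(idx, cuts)):
            clients[cid].extend(part.tolist())
    return clients

labels = np.repeat(np.arange(10), 500)            # 10 classes, 5000 samples
parts = dirichlet_partition(labels, n_clients=20, delta=0.3)
```

By construction every sample is assigned to exactly one client, so the resulting client datasets are disjoint, as required by the protocol.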
Models and optimization details. We select widely used architectures to demonstrate that the defense scales across vision and language: ResNet-18 for CIFAR-10/100, a 2-layer CNN for FEMNIST, and a 2-layer LSTM for Shakespeare. We use the standard cross-device protocol with N = 2000 total clients and K = 100 selected per round, for T = 2000 rounds on CIFAR tasks and T = 1500 on LEAF tasks. Each selected client runs local SGD starting from w_t for E = 2 local epochs with momentum 0.9. For the ResNet, the learning rate η_t follows cosine decay from 0.1 to 0.001 (from 0.05 to 0.001 for the FEMNIST CNN; see Table 2); for the LSTM it decays from 0.8 to 0.05 to accommodate the different scale and curvature of language modeling. Client weights in the global objective (Equation (1)) are set to p_i ∝ |D_i|, matching standard practice in cross-device FL.
Threat models and attacker budgets. We consider a persistent adversary controlling a fixed subset of client identifiers across rounds, consistent with compromise or Sybil-style persistence. The per-round corruption rate is α = 0.2 (Equation (5)) unless otherwise noted, meaning up to 20 of the K = 100 participating clients can be adversarial each round. We evaluate four families of attacks: (i) Sign-flip, where adversaries invert and amplify benign-like updates, u_i^t ← −λ u_i^t with λ ∈ {5, 10}, modeling disruptive Byzantine behavior; (ii) Gaussian noise, where adversaries send u_i^t ∼ N(0, σ²I) and then rescale to match the benign update norm, stressing magnitude-insensitive defenses; (iii) Backdoor/model-replacement attacks, where malicious clients locally optimize a trigger-target objective using a fixed 4 × 4 square trigger at the bottom-right corner and target class y*, then apply scaling λ_b (e.g., λ_b ∈ {8, 10}) so that the poisoned objective dominates aggregation while maintaining near-normal clean performance on benign inputs; and (iv) DVA-bypass (sketch-consistent fabrication), where adversaries craft an arbitrary poisoned update u_i^t but also fabricate the trace sketch to satisfy h_i^t ≈ f_i^t / η_t (thus d_i^t ≈ 0), representing a fully adaptive attacker for which DVA provides no signal. These settings capture both untargeted availability attacks and targeted integrity attacks.
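The per-round adversarial updates can be sketched as follows; the sign convention for the flip and the norm-matching step are assumptions consistent with the descriptions above.

```python
import numpy as np

def sign_flip(u, lam=10.0):
    # (i) Untargeted Byzantine attack: invert and amplify a benign-like update.
    return -lam * u

def norm_matched_noise(u, sigma, rng):
    # (ii) Gaussian update rescaled to the benign norm, defeating purely
    # magnitude-based screening.
    v = sigma * rng.standard_normal(u.shape)
    return v * (np.linalg.norm(u) / np.linalg.norm(v))

def model_replacement(u_poisoned, lam_b=10.0):
    # (iii) Scale the locally poisoned update so it dominates an averaging
    # aggregator while clean behavior stays near normal.
    return lam_b * u_poisoned

rng = np.random.default_rng(0)
u = rng.standard_normal(100)   # stand-in for a benign local update
```

The triggered data poisoning for (iii) and the sketch fabrication for (iv) require the local training loop and the sketch operator, so they are omitted here.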
Baselines and implementation parity. We compare against (a) FedAvg (mean aggregation), (b) coordinate-wise Median, (c) coordinate-wise TrimmedMean, (d) Multi-Krum, and (e) RFA (geometric median aggregation). For backdoor settings, we additionally compare to three recent backdoor-specific FL defenses: CrowdGuard [12], FDCR [13], and AlignIns [15]. All baselines are implemented in the same training codebase and receive identical updates. Hyperparameters for baselines (e.g., trimming ratio for TrimmedMean, candidate selection size for Multi-Krum, and iteration budget for RFA) are tuned using a small clean validation subset for each dataset, but once fixed they are kept unchanged across attack settings to avoid “defense-overfitting” to a particular attacker.
SecureFedGuard settings. Unless otherwise stated, SecureFedGuard uses sketch dimension m = 512, subspace rank r = 10 for P_t, EMA parameter β = 0.9 for persistence, DVA tolerance σ_d = 0.35, and maximum trimming ρ_max = 0.25. These defaults follow the selection guidelines in Section 4.4. The remaining scalars (ε_d, ε_r, ε_m, ε_w) are set to 10^{−12} for numerical stability, and we set γ = 1 unless otherwise stated.
Metrics and reporting. We report Clean Accuracy on the standard test set; for backdoor settings, we additionally report Attack Success Rate (ASR). Unless stated otherwise, we run three seeds { 0 , 1 , 2 } and report means. For a given seed, we fix (i) the client partition (Dirichlet draw/LEAF split), (ii) model initialization, (iii) per-round client sampling, and (iv) the adversarial client IDs; all methods share the same realizations for fair comparison. We will release training scripts, configuration files, and per-run logs to enable exact reproduction.
For convenience, Table 1 and Table 2 summarize the datasets/partitions and training hyperparameters used in all experiments.

5.2. Main Results

We report the primary security and utility outcomes under both backdoor and Byzantine threats. Our evaluation emphasizes two key questions: (i) can the defense suppress targeted backdoors without sacrificing clean accuracy under non-IID data, and (ii) can the defense maintain stable convergence under strong Byzantine perturbations. Across all settings, SecureFedGuard uses the fixed hyperparameters stated in Section 5.1 and is not re-tuned per dataset or attack, while baselines are tuned on a benign validation split to avoid disadvantaging them.
Backdoor robustness on vision and handwriting tasks. Table 3 and Table 4 compare clean accuracy and ASR under a persistent backdoor attacker population ( α = 0.2 ) using model-replacement scaling. FedAvg is highly vulnerable, reaching near-perfect ASR despite strong clean accuracy. Robust aggregators reduce ASR, but a substantial residual backdoor remains. SecureFedGuard yields the lowest ASR while keeping clean accuracy close to FedAvg, indicating that (i) DVA reduces the influence of integrity-inconsistent updates and (ii) spectral residual clipping suppresses backdoor directions that persist outside the benign subspace.
Byzantine robustness. Table 5 and Table 6 report clean accuracy under sign-flip and Gaussian attacks. Under sign-flip, FedAvg collapses as expected. Robust aggregators recover, but SecureFedGuard improves further by attenuating residual components and down-weighting clients with persistent anomalous geometry. Under Gaussian noise, the main failure mode is variance inflation; SecureFedGuard maintains high accuracy by combining security-aware trimming with selective residual clipping, which reduces the effective noise injected outside dominant benign directions.
Visualization of the security-utility trade-off. To complement the tables, Figure 1 plots Clean Accuracy versus ASR for CIFAR-10 backdoor, and Figure 2 plots Clean Accuracy versus corruption ratio α under sign-flip. In both views, SecureFedGuard occupies a favorable region: low ASR at high accuracy, and graceful degradation as α increases.

Sketch-Consistent Fabrication (DVA-Bypass Attacker)

To delineate DVA’s defensive boundary, we evaluate an adaptive Byzantine attacker that fabricates the trace sketch to satisfy Equation (11) for an arbitrary poisoned update (i.e., h_i^t = f_i^t / η_t; hence, d_i^t ≈ 0). This removes the trajectory-consistency signal and isolates the contribution of cross-round spectral forensics and robust aggregation.

5.3. Additional Robustness Evaluations

Extreme non-IID and feature-shift heterogeneity. Beyond the default Dirichlet label skew ( δ = 0.3 ), we include (i) more extreme label skew with smaller δ , (ii) a pathological label-partition where each client holds only a small number of classes, and (iii) feature-distribution shifts created by assigning each client a fixed input transformation (e.g., brightness/contrast/blur) throughout training. Table 7 reports Clean Accuracy/ASR under these harsher conditions.
Comparison with recent backdoor defenses. We additionally report backdoor results for CrowdGuard [12], FDCR [13], and AlignIns [15] under the same local training protocol. Table 8 summarizes Clean Accuracy/ASR.

5.4. Dynamics and Ablation Studies

We now examine how SecureFedGuard achieves robustness and which components contribute most under different threat regimes. We focus on two aspects: (i) training-time dynamics (how quickly the backdoor emerges or is suppressed across rounds), and (ii) component-level ablations that isolate the effect of DVA, cross-round forensics, and security-aware robust aggregation. Throughout this subsection, we use the CIFAR-10 backdoor setting ( α = 0.2 , λ b = 10 ) as the primary case study and provide additional evidence under alternative backdoor strengths and corruption rates.
Backdoor dynamics across rounds. Figure 3 plots ASR as training proceeds. FedAvg rapidly converges to high ASR, indicating that backdoor features are learned early and persist. A purely robust baseline (TrimmedMean) slows the backdoor but often settles at a non-trivial ASR plateau, consistent with the attacker injecting a persistent direction that remains within the acceptance region of coordinate-wise robust summaries. SecureFedGuard suppresses ASR quickly and keeps it low throughout training. This behavior is consistent with the methodology: DVA reduces the influence of inconsistent crafted updates, while spectral residual clipping shrinks the residual component that repeatedly deviates from the benign subspace. Figure 4 shows that this suppression is not achieved by sacrificing clean utility: clean accuracy remains close to the best baselines once training stabilizes.
Ablations of defense components. We evaluate three ablations: (i) DVA-only, which uses the DVA gate a i t in the aggregation weights but disables spectral forensics (no residual clipping and no persistence memory); (ii) Forensics-only, which uses residual clipping but sets a i t 1 and uses uniform weights (no trajectory screening); and (iii) Full SecureFedGuard. Table 9 reports CIFAR-10 backdoor outcomes, and Table 10 reports sign-flip Byzantine outcomes. To stress-test DVA under an adaptive sketch-consistent attacker, Table 11 reports the same ablation when adversaries fabricate h i t to satisfy Equation (11). DVA-only substantially reduces ASR compared with FedAvg but leaves a visible residual backdoor because an optimization-based attacker can still produce self-consistent traces. Forensics-only further suppresses ASR by shrinking anomalous residual directions but is somewhat less effective against integrity-evasive manipulations (e.g., scaling/flip variants) because it lacks trajectory screening. The full method combines both, yielding the lowest ASR and strong Byzantine robustness, validating that the two signals are complementary.
Sensitivity to attacker strength and persistence memory. To stress-test persistence effects, Table 12 varies the model-replacement scaling λ_b and reports the resulting ASR. The table shows that SecureFedGuard remains stable across a wide range of λ_b, whereas robust baselines degrade as λ_b increases. We also evaluate the EMA parameter β controlling the persistence-memory time scale (Equation (19)). Figure 5 shows that intermediate values (e.g., β ∈ [0.85, 0.95]) provide the best robustness: small β reacts quickly but is noisier, while very large β can delay suppression of newly emerging malicious behavior.

6. Conclusions

This paper proposed SecureFedGuard, a practical security framework for federated learning that jointly addresses integrity-evasive Byzantine updates and stealthy backdoor poisoning under non-IID data. SecureFedGuard couples a novel dual-view update authentication mechanism, which screens update plausibility using compact linear sketches of both the final update and the cumulative gradient trace, with cross-round spectral forensics that suppresses persistent anomalous directions via residual clipping and adaptive robust aggregation. Across real FL benchmarks and threat models, the method achieves strong clean accuracy while dramatically reducing backdoor attack success compared with widely used robust aggregators and recent defenses, and it can be deployed in a secure-aggregation-compatible mode using only sketches and scalar weights. These results suggest that combining lightweight trajectory-consistency signals with geometry- and persistence-aware update sanitization is an effective route to integrity-preserving and backdoor-resilient federated learning at scale.

Author Contributions

Conceptualization, T.C. and Y.L.; Methodology, T.C.; Formal analysis, Y.L.; Project administration, S.G.; Funding acquisition, S.G. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by (1) Guangdong University of Science and Technology Fund “Big Data and Digital Business Innovation Research Team” (GKY-2022CQTD-10); (2) Guangdong University of Science and Technology 2024 University-Level Project: Research on Innovation Strategies of Dongguan Cross-border E-commerce Driven by Digital Economy (GKY-2024KYZDW-14); (3) Philosophy and Social Sciences Project of Guangdong Province in 2024 (GD24XGL018); Guangdong Provincial Decision Consulting Research Base “Supply Chain Digital Innovation Research Center”; (4) 2024 Guangdong Provincial Key Discipline Construction Research Capacity Enhancement Project (Project No.: 2024ZDJS067).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Ft. Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282. [Google Scholar]
  2. Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated Learning: Challenges, Methods, and Future Directions. IEEE Signal Process. Mag. 2020, 37, 50–60. [Google Scholar] [CrossRef]
  3. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical Secure Aggregation for Privacy-Preserving Machine Learning. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), Dallas, TX, USA, 30 October–3 November 2017. [Google Scholar]
  4. Blanchard, P.; Mhamdi, E.M.E.; Guerraoui, R.; Stainer, J. Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; pp. 119–129. [Google Scholar]
  5. Yin, D.; Chen, Y.; Ramchandran, K.; Bartlett, P.L. Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates. In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018; pp. 5650–5659. [Google Scholar]
  6. Li, Z.; Lan, J.; Yan, Z.; Gelenbe, E. Backdoor Attacks and Defense Mechanisms in Federated Learning: A Survey. Inf. Fusion 2025, 123, 103248. [Google Scholar] [CrossRef]
  7. Pillutla, K.; Kakade, S.M.; Harchaoui, Z. Robust Aggregation for Federated Learning. arXiv 2019, arXiv:1912.13445. [Google Scholar] [CrossRef]
  8. Huang, W.; Shi, Z.; Ye, M.; Li, H.; Du, B. Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning. In Proceedings of the 41st International Conference on Machine Learning (ICML), Vienna, Austria, 21–27 July 2024; pp. 20096–20110. [Google Scholar]
  9. Fang, M.; Nabavirazavi, S.; Liu, Z.; Sun, W.; Iyengar, S.S.; Yang, H. Do We Really Need to Design New Byzantine-robust Aggregation Rules? In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 24–28 February 2025. [Google Scholar]
  10. Deshmukh, A. Byzantine-Robust Federated Learning: An Overview With Focus on Developing Sybil-based Attacks to Backdoor Augmented Secure Aggregation Protocols. arXiv 2024, arXiv:2410.22680. [Google Scholar]
  11. Zhuang, H.; Yu, M.; Wang, H.; Hua, Y.; Li, J.; Yuan, X. Backdoor Federated Learning by Poisoning Backdoor-Critical Layers. In Proceedings of the The Twelfth International Conference on Learning Representations (ICLR), Vienna, Austria, 7–11 May 2024. [Google Scholar]
  12. Rieger, P.; Krauß, T.; Miettinen, M.; Dmitrienko, A.; Sadeghi, A.R. CrowdGuard: Federated Backdoor Detection in Federated Learning. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 26 February–1 March 2024. [Google Scholar]
  13. Huang, W.; Ye, M.; Shi, Z.; Wan, G.; Li, H.; Du, B. Parameter Disparities Dissection for Backdoor Defense in Heterogeneous Federated Learning. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 10–15 December 2024. [Google Scholar]
  14. Chen, L.; Liu, X.; Wang, A.; Zhai, W.; Cheng, X. FLSAD: Defending Backdoor Attacks in Federated Learning via Self-Attention Distillation. Symmetry 2024, 16, 1497. [Google Scholar] [CrossRef]
  15. Xu, J.; Zhang, Z.; Hu, R. Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 11–15 June 2025; pp. 20654–20664. [Google Scholar]
  16. Sun, H.; Zhang, Y.; Zhuang, H.; Li, J.; Xu, Z.; Wu, L. PEAR: Privacy-preserving and Effective Aggregation for Byzantine-robust Federated Learning in Real-world Scenarios. Comput. J. 2025, 68, 1087–1104. [Google Scholar] [CrossRef]
  17. Hosseini, E.; Chen, S.; Khisti, A. Secure Aggregation in Federated Learning using Multiparty Homomorphic Encryption. arXiv 2025, arXiv:2503.00581. [Google Scholar] [CrossRef]
  18. Wan, G.; Shi, Z.; Huang, W.; Zhang, G.; Tao, D.; Ye, M. Energy-based Backdoor Defense Against Federated Graph Learning. In Proceedings of the The Thirteenth International Conference on Learning Representations (ICLR), Singapore, 24–28 April 2025; OpenReview.net: Amherst, MA, USA, 2025. [Google Scholar]
  19. Chen, P.; Xiang, H.; Du, X.; Xu, X.; Jiang, X.; Lu, Z.; Yang, J.; Duan, Q.; Dou, W. Universal Backdoor Defense via Label Consistency in Vertical Federated Learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Montreal, QC, Canada, 16–22 August 2025. [Google Scholar]
  20. Zhao, S.; Pu, J.; Fu, X.; Liu, L.; Dai, F. Byzantine-robust Federated Learning with Ensemble Incentive Mechanism. Future Gener. Comput. Syst. 2024, 159, 272–283. [Google Scholar] [CrossRef]
  21. Cajaraville-Aboy, D.; Fernández-Vilas, A.; Díaz-Redondo, R.P.; Fernández-Veiga, M. Byzantine-Robust Aggregation for Securing Decentralized Federated Learning. arXiv 2024, arXiv:2409.17754. [Google Scholar] [CrossRef]
  22. Molina-Coronado, B. Client-Side Patching against Backdoor Attacks in Federated Learning. arXiv 2024, arXiv:2412.10605. [Google Scholar] [CrossRef]
Figure 1. Clean Accuracy versus ASR on CIFAR-10 backdoor ( α = 0.2 , λ b = 10 ). Markers correspond to circle = RFA, square = Median, triangle = TrimmedMean, diamond = Multi-Krum, dot = FedAvg, star = SecureFedGuard. Lower ASR and higher accuracy are better.
Figure 2. Clean accuracy on CIFAR-10 under sign-flip attacks ( λ = 10 ) versus corruption ratio α . Curves shown (no legend): solid = FedAvg, dashed = TrimmedMean, dotted = SecureFedGuard. SecureFedGuard degrades gracefully as α increases.
Figure 3. ASR dynamics on CIFAR-10 backdoor ( α = 0.2 , λ b = 10 ). Curves shown (no legend): solid = FedAvg, dashed = TrimmedMean, dotted = SecureFedGuard. SecureFedGuard suppresses ASR early and maintains it near 0– 5 % .
Figure 4. Clean-accuracy dynamics on CIFAR-10 backdoor ( α = 0.2 , λ b = 10 ). Curves shown (no legend): solid = FedAvg, dashed = TrimmedMean, dotted = SecureFedGuard. SecureFedGuard preserves clean accuracy while suppressing the backdoor.
Figure 5. Effect of the persistence-memory time scale (EMA β in Equation (19)) on CIFAR-10 backdoor ASR ( α = 0.2 , λ b = 10 ). Intermediate values balance responsiveness and stability, yielding the best ASR.
Table 1. Datasets and federated partitions used in our evaluation. For CIFAR, clients are synthetically formed with Dirichlet concentration δ to induce non-IID label skew; for LEAF tasks, we use the benchmark-provided user partitions.
| Dataset | Partition/Clients | URL |
|---|---|---|
| CIFAR-10 | Dirichlet label skew (δ = 0.3), N = 2000, K = 100, T = 2000 | https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 5 November 2025) |
| CIFAR-100 | Dirichlet label skew (δ = 0.3), N = 2000, K = 100, T = 2000 | https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 5 November 2025) |
| FEMNIST (LEAF) | LEAF user split, N = 2000, K = 100, T = 1500 | https://leaf.cmu.edu/ (accessed on 5 November 2025) |
| Shakespeare (LEAF) | LEAF user split, N = 2000, K = 100, T = 1500 | https://leaf.cmu.edu/ (accessed on 5 November 2025) |
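The Dirichlet label-skew partitioning in Table 1 can be sketched as follows. For each class, client shares are drawn from Dirichlet(δ, …, δ), so a smaller δ concentrates a class on fewer clients. The function name and seed handling are illustrative assumptions, not code from the paper:

```python
import numpy as np

def dirichlet_partition(labels, num_clients, delta, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    For each class c, per-client shares are drawn from
    Dirichlet(delta, ..., delta); smaller delta => stronger non-IID skew.
    Returns a list of index lists, one per client.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        shares = rng.dirichlet(np.full(num_clients, delta))
        # Convert cumulative shares into split points over this class's samples.
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

With δ = 0.3 as in Table 1, most clients see only a handful of classes with highly uneven counts.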
Table 2. Model architectures and training hyperparameters. All methods share identical client sampling, optimizer, and local training settings; only the server-side aggregation/defense differs.
| Task | Model | Local Training | LR Schedule |
|---|---|---|---|
| CIFAR-10/100 | ResNet-18 | SGD, momentum 0.9, E = 2 epochs, B = 32 | cosine η_t: 0.1 → 0.001 |
| FEMNIST | 2-layer CNN | SGD, momentum 0.9, E = 2 epochs, B = 32 | cosine η_t: 0.05 → 0.001 |
| Shakespeare | 2-layer LSTM | SGD, momentum 0.9, E = 2 epochs, B = 16 | cosine η_t: 0.8 → 0.05 |
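The cosine schedules in Table 2 presumably follow standard cosine annealing, η_t = η_min + ½(η_max − η_min)(1 + cos(πt/T)); the paper does not spell out its exact formula, so this is a conventional sketch:

```python
import math

def cosine_lr(t, total_rounds, lr_max, lr_min):
    """Standard cosine annealing from lr_max (t = 0) down to lr_min (t = T)."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t / total_rounds))
```

For CIFAR this would take η_t from 0.1 at round 0 to 0.001 at round T = 2000, decaying slowly at first and fastest mid-training.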
Table 3. CIFAR-10 under backdoor attack with α = 0.2 and model-replacement scaling ( λ b = 10 ). Trigger: 4 × 4 patch at bottom-right; target label y fixed across runs.
| Method | Clean Acc. (%) | ASR (%) |
|---|---|---|
| FedAvg | 82.6 | 96.8 |
| Median | 80.9 | 41.2 |
| TrimmedMean | 81.4 | 28.7 |
| Multi-Krum | 79.8 | 22.9 |
| RFA | 81.0 | 19.4 |
| SecureFedGuard | 82.1 | 3.7 |
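The ASR column in Table 3 follows the usual convention: stamp the trigger on test images whose true label is not the target, and report the fraction classified as the target. The helpers below assume NHWC image arrays and a generic `predict_fn`; both names are illustrative:

```python
import numpy as np

def stamp_trigger(images, patch_value=1.0, size=4):
    """Stamp a size x size patch at the bottom-right of NHWC images."""
    poisoned = images.copy()
    poisoned[:, -size:, -size:, :] = patch_value
    return poisoned

def attack_success_rate(predict_fn, images, labels, target_label):
    """ASR: fraction of non-target samples classified as target after stamping."""
    keep = labels != target_label  # samples already of the target class don't count
    preds = predict_fn(stamp_trigger(images[keep]))
    return float(np.mean(preds == target_label))
```

Excluding target-class samples matters: including them would inflate ASR even for a model with no backdoor.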
Table 4. FEMNIST (LEAF) under backdoor attack with α = 0.2 and λ b = 8 . Trigger: 4 × 4 patch; target label y chosen uniformly at random and fixed per run.
| Method | Clean Acc. (%) | ASR (%) |
|---|---|---|
| FedAvg | 86.9 | 98.2 |
| Median | 85.5 | 37.6 |
| TrimmedMean | 86.0 | 24.8 |
| Multi-Krum | 84.2 | 18.5 |
| RFA | 85.8 | 16.9 |
| SecureFedGuard | 86.4 | 2.9 |
Table 5. Sign-flip Byzantine attack with α = 0.2 and λ = 10 . For CIFAR tasks, we report clean accuracy (%); for Shakespeare, we report test perplexity (PPL; lower is better).
| Method | CIFAR-10 (%) | CIFAR-100 (%) | Shakespeare (PPL) |
|---|---|---|---|
| FedAvg | 12.4 | 1.8 | 215 |
| Median | 73.5 | 41.2 | 92 |
| TrimmedMean | 75.1 | 43.0 | 88 |
| Multi-Krum | 71.6 | 39.8 | 101 |
| RFA | 74.2 | 42.1 | 90 |
| SecureFedGuard | 77.0 | 44.6 | 84 |
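Table 5's sign-flip adversary and the TrimmedMean baseline are both simple to state. The sketch below uses flat update vectors and illustrative function names; it is not the paper's implementation:

```python
import numpy as np

def sign_flip(update, lam=10.0):
    """Sign-flip Byzantine update: negate and scale the benign update."""
    return -lam * np.asarray(update)

def trimmed_mean(updates, trim_frac=0.2):
    """Coordinate-wise trimmed mean: in every coordinate, drop the
    trim_frac smallest and trim_frac largest client values, then average."""
    updates = np.asarray(updates)               # shape (num_clients, dim)
    k = int(trim_frac * updates.shape[0])
    sorted_u = np.sort(updates, axis=0)
    kept = sorted_u[k:updates.shape[0] - k] if k else sorted_u
    return kept.mean(axis=0)
```

With α = 0.2 malicious clients and trim fraction 0.2, the flipped coordinates land in the trimmed tails, which is why TrimmedMean recovers much of FedAvg's accuracy in Table 5.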
Table 6. Gaussian Byzantine attack with α = 0.2. Adversaries send u_i^t ∼ N(0, σ² I) scaled to match the benign update norm ‖u_i^t‖; metrics follow Table 5.
| Method | CIFAR-10 (%) | CIFAR-100 (%) | Shakespeare (PPL) |
|---|---|---|---|
| FedAvg | 38.7 | 16.2 | 141 |
| Median | 70.9 | 39.5 | 98 |
| TrimmedMean | 72.4 | 41.0 | 94 |
| Multi-Krum | 68.1 | 36.8 | 109 |
| RFA | 71.2 | 40.1 | 96 |
| SecureFedGuard | 74.0 | 42.3 | 90 |
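The norm-matched Gaussian adversary described in the Table 6 caption can be sketched directly: draw isotropic noise and rescale it to the benign update's L2 norm, so simple norm-clipping defenses gain nothing. The function name and RNG handling are illustrative:

```python
import numpy as np

def gaussian_byzantine(benign_update, rng=None):
    """Gaussian Byzantine update scaled to match the benign update's L2 norm."""
    if rng is None:
        rng = np.random.default_rng(0)
    benign_update = np.asarray(benign_update)
    noise = rng.standard_normal(benign_update.shape)
    scale = np.linalg.norm(benign_update) / (np.linalg.norm(noise) + 1e-12)
    return noise * scale
```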
Table 7. Robustness under more extreme heterogeneity on CIFAR-10 backdoor ( α = 0.2 , λ b = 10 ). Entries are Clean Acc (%)/ASR (%).
| Method | Dirichlet δ = 0.3 | Dirichlet δ = 0.1 | 2-Class Pathological | Feature Shift |
|---|---|---|---|---|
| FedAvg | 82.6/96.8 | 80.9/97.3 | 74.5/98.1 | 78.2/97.0 |
| TrimmedMean | 81.4/28.7 | 79.0/35.0 | 72.0/41.2 | 76.5/33.8 |
| RFA | 81.0/19.4 | 78.5/25.0 | 70.8/30.6 | 75.8/23.7 |
| SecureFedGuard | 82.1/3.7 | 80.3/6.1 | 74.0/8.4 | 78.0/6.7 |
Table 8. Comparison with recent backdoor defenses on CIFAR-10 backdoor (Dirichlet δ = 0.3 , α = 0.2 , λ b = 10 ). Entries are Clean Acc (%)/ASR (%).
| Method | Clean Acc/ASR |
|---|---|
| CrowdGuard [12] | 81.6/12.4 |
| FDCR [13] | 81.2/15.8 |
| AlignIns [15] | 81.9/6.9 |
| SecureFedGuard | 82.1/3.7 |
Table 9. Ablation on CIFAR-10 backdoor ( α = 0.2 , λ b = 10 ). DVA-only uses trajectory screening (Equation (14)) without spectral forensics; Forensics-only uses residual clipping (Equation (21)) without DVA; Full combines both.
| Variant | Clean Acc. (%) | ASR (%) |
|---|---|---|
| FedAvg | 82.6 | 96.8 |
| DVA-only (a_i^t enabled) | 81.7 | 14.2 |
| Forensics-only (clipping enabled) | 81.9 | 9.6 |
| Full SecureFedGuard | 82.1 | 3.7 |
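The "Forensics-only" row relies on the residual clipping of Equation (21), which is not reproduced in this section. A common form of such a defense projects each update onto an estimated benign subspace and clips only the out-of-subspace residual; the sketch below assumes that form, with an orthonormal basis matrix and a clipping threshold τ as hypothetical parameters:

```python
import numpy as np

def clip_residual(update, basis, tau):
    """Clip the component of `update` outside an estimated benign subspace.

    update: flat update vector of dimension d.
    basis:  (d, k) matrix with orthonormal columns spanning the benign subspace.
    tau:    maximum allowed L2 norm of the out-of-subspace residual.
    """
    coeffs = basis.T @ update          # coordinates inside the benign subspace
    inside = basis @ coeffs            # in-subspace component (left untouched)
    residual = update - inside         # out-of-subspace component
    norm = np.linalg.norm(residual)
    if norm > tau:
        residual = residual * (tau / norm)
    return inside + residual
```

The in-subspace part carries most benign signal and passes through unchanged, while a backdoor direction orthogonal to the benign subspace is capped at τ.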
Table 10. Ablation under sign-flip Byzantine attack ( α = 0.2 , λ = 10 ). Combining DVA and forensics yields the strongest robustness.
| Variant | CIFAR-10 Acc. (%) | CIFAR-100 Acc. (%) |
|---|---|---|
| FedAvg | 12.4 | 1.8 |
| DVA-only (a_i^t enabled) | 63.1 | 33.7 |
| Forensics-only (clipping enabled) | 71.2 | 39.6 |
| Full SecureFedGuard | 77.0 | 44.6 |
Table 11. Ablation under DVA-bypass sign-flip Byzantine attack ( α = 0.2 , λ = 10 ). Adversaries fabricate the trace sketch so that d_i^t ≈ 0 for an arbitrary poisoned update, eliminating the DVA signal.
| Method | CIFAR-10 (%) | CIFAR-100 (%) |
|---|---|---|
| FedAvg | 12.4 | 1.8 |
| DVA-only (a_i^t enabled) | 13.0 | 2.1 |
| Forensics-only (clipping enabled) | 69.8 | 39.1 |
| Full SecureFedGuard | 74.6 | 43.2 |
Table 12. Sensitivity to backdoor scaling strength λ b on CIFAR-10 with α = 0.2 . Robust baselines degrade as λ b increases, while SecureFedGuard remains stable due to residual clipping and persistence-aware weighting.
| Method | λ_b = 5 ASR (%) | λ_b = 10 ASR (%) | λ_b = 15 ASR (%) |
|---|---|---|---|
| TrimmedMean | 17.3 | 28.7 | 39.8 |
| RFA | 12.6 | 19.4 | 27.5 |
| SecureFedGuard | 3.9 | 3.7 | 4.8 |
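The scaling strength λ_b varied in Table 12 corresponds to the standard model-replacement construction: the attacker amplifies its poisoned delta so it survives averaging with benign updates. The sketch below uses flat weight vectors and an illustrative function name:

```python
import numpy as np

def model_replacement_delta(global_w, poisoned_w, lam_b):
    """Scaled malicious update for a model-replacement backdoor.

    The attacker submits lam_b * (poisoned_w - global_w) instead of an
    honest delta; with lam_b roughly equal to the number of selected
    clients, averaging leaves the aggregate near poisoned_w.
    Larger lam_b entrenches the backdoor but produces an outlier update,
    which is what norm- and residual-based defenses exploit.
    """
    return lam_b * (np.asarray(poisoned_w) - np.asarray(global_w))
```

This explains the trend in Table 12: larger λ_b makes the scaled delta harder for trimming-style baselines to absorb, while SecureFedGuard's clipping caps its effect regardless of scale.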
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Chen, T.; Li, Y.; Gong, S. SecureFedGuard: Authenticated and Backdoor-Resilient Federated Learning with Dual-View Gradient Forensics. Electronics 2026, 15, 1010. https://doi.org/10.3390/electronics15051010
