4.1. Design Goals and End-to-End Protocol
SecureFedGuard is built for the practical FL regime where the server must learn from untrusted updates while respecting the privacy boundary implied by local data isolation and, optionally, secure aggregation. The methodology is guided by three design goals. (G1) Integrity without data access: the server should reject or down-weight obviously non-realizable updates without inspecting any client data. (G2) Backdoor resilience under non-IID: defenses must not confuse benign heterogeneity with attacks, and must suppress targeted triggers while maintaining high clean accuracy. (G3) Deployment compatibility: the defense should remain compatible with secure aggregation and incur modest overhead.
To meet these goals, SecureFedGuard uses two orthogonal security signals, each addressing a different failure mode of robust aggregation. First, trajectory consistency checks whether a reported update is consistent with a plausible local optimization trace, which is effective against integrity-evasive attacks that craft arbitrary vectors. Second, geometric persistence tracks whether a client repeatedly deviates from the dominant benign update geometry across rounds, which is effective against optimization-based poisoning and stealthy backdoors that can mimic local SGD dynamics in a single round.
Protocol overview. At the beginning of round t, the server broadcasts the current model and a public seed that determines a linear sketch map (Equation (9)) shared by all participants. Each selected client performs local training starting from the broadcast model and produces the model delta. In addition to the delta, the client computes and transmits two compact fingerprints: an update sketch and a cumulative gradient-trace sketch. These m-dimensional sketches are designed to be inexpensive, while still enabling meaningful consistency checks at the server.
On the server, SecureFedGuard proceeds in three stages (Section 4.2, Section 4.3 and Section 4.4). First, it computes a DVA gate from the discrepancy between the update sketch and the trace sketch, producing a soft integrity score per client. Second, it performs cross-round spectral forensics: using the current set of updates, it estimates a robust center and a low-rank benign subspace projector, measures the residual energy, and updates a persistence memory via an EMA. These quantities determine a residual shrink factor and a sanitized update that preserves within-subspace learning while attenuating anomalous residual components. Third, SecureFedGuard aggregates sanitized updates with a security-aware robust rule: it combines DVA and persistence into a scalar weight and applies an adaptive, coordinate-wise trimmed mean whose trimming level depends on an estimated corruption indicator, yielding the global update and the next global model.
Secure-aggregation (SA) mode. In cross-device deployments, we consider a standard secure aggregation protocol in which the server learns only an aggregate of masked updates. SecureFedGuard operates in two phases each round. First, each selected client sends plaintext sketches (Equation (10)); the server computes DVA gates and sketch-space forensics scores and then returns a per-client scalar multiplier (and optionally an allowlist) over an authenticated channel. Second, only allowlisted clients participate in secure aggregation and contribute the masked, locally scaled update, so the server observes only the aggregate of scaled updates and updates the model with a weighted mean. Dropout is handled by the underlying secure aggregation protocol: if a client drops after sending sketches, its contribution is absent from the final aggregate and the server normalizes using the surviving set. Algorithm 1 provides the concrete message flow and threat assumptions.
Leakage from sketches. The server observes per-client m-dimensional randomized linear projections of both the final update and the gradient trace. These sketches can leak coarse information about the update direction and norm but are substantially lower-dimensional than the full update; we treat them as defense metadata rather than a privacy mechanism. If stronger privacy guarantees are required, sketches can be noise-perturbed or securely aggregated as well, which is complementary and beyond the scope of this paper.
Deployment notes. In secure aggregation, the server does not observe per-client updates, so coordinate-wise trimming and residual-only clipping in update space cannot be applied on plaintext updates. In SA mode, we therefore (i) compute DVA and persistence weights from plaintext sketches, (ii) optionally remove clients with very small weights before aggregation, and (iii) apply a per-client scalar shrink (derived from sketch residual energy and persistence) that clients apply locally to their update before masking, yielding a weighted-mean secure-aggregation update. If encrypted robust aggregation (e.g., trimmed mean under MPC/HE) is required, it is complementary to SecureFedGuard and not assumed here.
The remainder of this section is organized into three technical subsections: (i) DVA and its scoring/tuning, (ii) cross-round spectral forensics with residual clipping and persistence memory, and (iii) the final security-aware robust aggregation and protocol summary.
Algorithm 1 SecureFedGuard with secure aggregation (SA mode)
1: Server selects clients and broadcasts the current model, the public sketch seed, and secure-aggregation parameters.
2: for each client (in parallel) do
3:   Client runs local training from the broadcast model to obtain its update and per-step gradients.
4:   Client computes the plaintext update sketch and trace sketch and sends them to the server.
5:   (Optional) Client also sends its secure-aggregation setup message (e.g., ephemeral key material) as required by the SA protocol.
6: end for
7: Server computes the DVA scores from the received sketches (Equations (13)–(15)).
8: Server runs the spectral/persistence pipeline in sketch space using the received sketches to update the persistence memory and produces a scalar multiplier for each client.
9: Server broadcasts (possibly per-client) the allowlist and the corresponding scalars.
10: for each surviving client that completes secure aggregation do
11:   Client locally scales its update (and applies standard norm clipping if desired), then participates in secure aggregation to send its masked, scaled update.
12: end for
13: Secure aggregation returns the aggregate over the surviving set (dropouts handled by the SA protocol).
14: Server updates the global model.
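The second phase of Algorithm 1 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sa_round`, the dictionary layout, and the plain summation standing in for the masked secure sum are all hypothetical. It only shows how locally scaled updates from allowlisted survivors combine into a weighted mean normalized over the surviving set.

```python
def sa_round(updates, scalars, allowlist, survivors, eps=1e-12):
    """Weighted-mean update over allowlisted clients that survive the round.

    updates:   dict client_id -> list of floats (visible only to the client)
    scalars:   dict client_id -> multiplier issued by the server from sketches
    allowlist: iterable of client ids allowed to join secure aggregation
    survivors: set of client ids that completed secure aggregation
    """
    participating = [i for i in allowlist if i in survivors]
    dim = len(next(iter(updates.values())))
    total = [0.0] * dim
    for i in participating:
        # Client-side step: scale the local update before masking.
        scaled = [scalars[i] * x for x in updates[i]]
        # Plain addition is a stand-in for the secure sum (assumption);
        # the server would observe only `total`, never `scaled`.
        total = [t + s for t, s in zip(total, scaled)]
    # The server knows the scalars it issued and the surviving set.
    norm = sum(scalars[i] for i in participating)
    return [t / (norm + eps) for t in total]
```

In this toy setting, a client that is left off the allowlist or drops out simply contributes nothing to either the numerator or the normalizer.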
4.2. Dual-View Update Authentication via Sketch-Consistent Local Trajectories
Dual-view update authentication (DVA) targets integrity-evasive threats where a malicious client transmits an arbitrary vector that does not correspond to any plausible local training trajectory under Equation (2). Such attacks can be surprisingly effective because many robust aggregators treat the received update as a black box; if the attacker keeps the update within typical magnitudes or mimics coordinate-wise statistics, purely distributional filters may fail. DVA introduces an orthogonal signal: consistency between the reported update and a compact summary of the local gradient path that generated it.
Two complementary views. During local training at round t, client i starts from the broadcast global model and performs E SGD steps. It produces the final update and computes two sketches (Equation (10)) using the public linear sketch map (Equation (9)). The first view, the update sketch, summarizes the final update; the second view, the trace sketch, summarizes the cumulative gradient trace. The key point is that both are linear images in the same sketch space, so the server can compare them without reconstructing high-dimensional gradients.
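The client-side computation of the two views can be sketched as follows. This is an illustrative sketch under assumptions: it uses a dense sign random projection seeded by the public seed (sign projections are the family named in the hyperparameter discussion, but the exact construction of Equation (9) is not reproduced here), plain SGD with a constant step size, and hypothetical names `sign_sketch_matrix`, `apply_sketch`, and `local_round`.

```python
import random

def sign_sketch_matrix(seed, m, d):
    """m x d matrix of +/- 1/sqrt(m) entries derived from a public seed."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[scale if rng.random() < 0.5 else -scale for _ in range(d)]
            for _ in range(m)]

def apply_sketch(S, x):
    """Linear sketch: matrix-vector product S @ x."""
    return [sum(s * xj for s, xj in zip(row, x)) for row in S]

def local_round(S, w0, grad_fn, lr, steps):
    """Run `steps` SGD steps and return (delta, update_sketch, trace_sketch).

    The trace sketch is accumulated online, one sketched gradient per step,
    so the full gradient history is never stored or transmitted.
    """
    w = list(w0)
    trace_sketch = [0.0] * len(S)
    for _ in range(steps):
        g = grad_fn(w)
        trace_sketch = [t + s for t, s in zip(trace_sketch, apply_sketch(S, g))]
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    delta = [wi - w0i for wi, w0i in zip(w, w0)]
    return delta, apply_sketch(S, delta), trace_sketch
```

Because every participant derives the same matrix from the same seed, the server can compare the two m-dimensional vectors directly.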
Consistency model and discrepancy.
Lemma 1 (Trajectory-consistency stability). Consider the first-order step direction used in Equation (2) (e.g., the stochastic gradient for SGD, or the momentum velocity for SGD with momentum). If the within-round step size is constant, then the realized update equals the negative step size times the sum of the step directions exactly, and by linearity of the sketch map the same relation holds between the update sketch and the trace sketch. If the client uses a within-round learning-rate schedule, then the update is the correspondingly weighted sum of step directions, and the mismatch between the two sketches is controlled by how far the schedule deviates from a constant step size.
Thus, under typical FL settings where the step size is constant within a round and E is small, honest clients exhibit a small discrepancy up to numerical/compression noise; we set the two DVA parameters via warm-up calibration to accommodate these benign deviations. DVA therefore defines a normalized discrepancy in which a small constant in the denominator prevents instability when the update scale is small. The numerator measures violation of the sketch-consistency relation; the denominator makes the score comparable across clients with different update scales.
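A possible form of the normalized discrepancy is sketched below. The exact expression is an assumption: the numerator is taken as the norm of the violation of the constant-step relation (update sketch plus step size times trace sketch), and the denominator as the update-sketch norm plus a small constant. The names `dva_discrepancy`, `lr`, and `eps` are hypothetical.

```python
import math

def l2_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def dva_discrepancy(update_sketch, trace_sketch, lr, eps=1e-8):
    """Normalized violation of update_sketch == -lr * trace_sketch."""
    residual = [u + lr * v for u, v in zip(update_sketch, trace_sketch)]
    return l2_norm(residual) / (l2_norm(update_sketch) + eps)

# Honest client: the two views satisfy the relation, so the score is zero.
trace = [1.0, 2.0, -1.0]
honest_update = [-0.5 * v for v in trace]
honest_score = dva_discrepancy(honest_update, trace, lr=0.5)

# Sign-flip applied after training: the views disagree by twice the update.
flipped_update = [-u for u in honest_update]
flipped_score = dva_discrepancy(flipped_update, trace, lr=0.5)
```

Under this assumed form, a post-hoc sign flip produces a score close to 2 regardless of the update's scale, which is the scale-invariance the denominator is meant to provide.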
Soft gating and robustness to benign heterogeneity. Instead of hard-rejecting clients (which can harm performance under non-IID data), DVA converts the discrepancy into a continuous gate with a parameter that controls tolerance. This design ensures that benign but heterogeneous clients are not discarded simply because their local updates differ in direction or magnitude; as long as their updates are self-consistent with their own gradient traces, they maintain high weights. Conversely, integrity-evasive attacks that directly craft the update (e.g., sign-flip without corresponding gradient trace, arbitrary scaling that breaks the relation, or random vectors) tend to produce large discrepancies and are strongly down-weighted.
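To make the soft gate concrete, the following sketch maps a discrepancy to a weight in (0, 1]. The exponential form and the parameter name `tau` are assumptions chosen for illustration; the source states only that the gate is continuous and has a tolerance parameter.

```python
import math

def dva_gate(discrepancy, tau):
    """Continuous gate: 1.0 for a perfectly consistent client, decaying
    smoothly toward 0 as the discrepancy grows. `tau` sets the tolerance."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    return math.exp(-max(discrepancy, 0.0) / tau)
```

With `tau = 0.5`, a consistent client keeps weight 1.0 while a client with discrepancy 2.0 is reduced to roughly 2% of that weight, yet is never hard-rejected.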
What DVA does and does not guarantee. Because the server has no access to client data, it cannot recompute the gradient trace, and DVA should be interpreted as an internal-consistency test between two client-reported views. In plaintext-update mode (when the raw update is available), the server can recompute the update sketch to prevent spoofing; however, the trace sketch remains a client report. DVA therefore reliably penalizes attacks that modify the update without producing a matching trace (e.g., in-transit tampering, sign-flip applied after local training, or arbitrary random vectors), but a fully adaptive attacker can fabricate a trace sketch that satisfies the consistency relation for an arbitrary malicious update. We explicitly evaluate this DVA-bypass attacker (Section 5.2) and show that the spectral/persistence layer and robust aggregation remain effective even when DVA provides no signal.
Practical tuning and stability. SecureFedGuard uses DVA in two complementary ways. First, the DVA gate directly contributes to the security weight and influences the robust center and covariance estimation in later stages. Second, DVA provides a round-level corruption indicator (Equation (15)), which controls the trimming strength. In practice, the DVA parameters can be set using a short warm-up window (initial rounds assumed mostly benign) by matching a target false-positive rate on the empirical distribution of the discrepancy. This preserves stability: if all clients are benign, the corruption indicator stays near zero, yielding minimal trimming and near-FedAvg behavior; if suspicious behavior increases, trimming strengthens automatically.
DVA does not require access to client data and never transmits raw gradients. Each selected client sends two m-dimensional sketches, and the server computes all DVA scores directly in sketch space. Clients compute and accumulate the trace sketch online during local training with negligible overhead.
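Warm-up calibration against a target false-positive rate can be sketched as an empirical-quantile rule. Both functions below are hypothetical illustrations: `calibrate_threshold` picks a discrepancy threshold from warm-up scores, and `corruption_indicator` uses the fraction of clients above that threshold as a stand-in for the round-level indicator, whose exact definition is not reproduced here.

```python
import math

def calibrate_threshold(warmup_discrepancies, target_fpr):
    """Smallest warm-up value such that at most `target_fpr` of the
    warm-up discrepancies lie strictly above it."""
    if not warmup_discrepancies:
        raise ValueError("need at least one warm-up discrepancy")
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    ordered = sorted(warmup_discrepancies)
    k = math.ceil((1.0 - target_fpr) * len(ordered)) - 1
    k = min(max(k, 0), len(ordered) - 1)
    return ordered[k]

def corruption_indicator(discrepancies, threshold):
    """Fraction of clients in the round whose discrepancy exceeds the threshold."""
    flagged = sum(1 for d in discrepancies if d > threshold)
    return flagged / len(discrepancies)
```

If the warm-up rounds are mostly benign, the indicator computed on later benign rounds stays near the target rate, and it rises as more clients report inconsistent views.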
4.4. Security-Aware Robust Aggregation and Protocol Summary
We now define how SecureFedGuard combines the integrity signal (DVA) and the backdoor signal (cross-round persistence) into a single aggregation rule.
Persistence-to-weight mapping. We convert the EMA memory (Equation (19)) into a soft persistence gate by normalizing with the current residual scale and applying an exponential map, where a rate parameter controls how aggressively persistent anomalies are down-weighted and a small constant is added for stability.
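The mapping from persistence memory to a gate can be sketched as follows. The symbols `rho`, `beta`, and `eps`, the placement of `eps` in the denominator, and the EMA convention in `ema_update` are assumptions; the source specifies only an EMA memory, normalization by the residual scale, and an exponential map.

```python
import math

def ema_update(memory, residual_energy, rho):
    """One EMA step: blend the previous memory with this round's residual energy."""
    return rho * memory + (1.0 - rho) * residual_energy

def persistence_gate(memory, residual_scale, beta, eps=1e-8):
    """Exponential gate on the memory normalized by the current residual scale."""
    return math.exp(-beta * memory / (residual_scale + eps))

# A client that deviates in every round accumulates memory and loses weight;
# a client with small residuals keeps a gate close to 1.
persistent, benign = 0.0, 0.0
for _ in range(10):
    persistent = ema_update(persistent, 4.0, rho=0.9)
    benign = ema_update(benign, 0.5, rho=0.9)
gate_persistent = persistence_gate(persistent, residual_scale=1.0, beta=1.0)
gate_benign = persistence_gate(benign, residual_scale=1.0, beta=1.0)
```

Because the memory is averaged across rounds, a single large residual moves the gate far less than the same residual repeated round after round.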
Security weight. The final per-client security weight is the product of the DVA gate and the persistence gate.
Adaptive trimming level. We set the coordinate-wise trimming level using the DVA-based corruption indicator (Equation (15)).
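Weighted coordinate-wise trimmed mean on sanitized updates. For each coordinate, consider the indices that remain after removing the top and bottom trimmed fraction of values among the sanitized updates at that coordinate (as in Equation (6)). We then aggregate the remaining coordinates using the security weights (Equation (25)), with a small constant in the denominator to avoid division by zero. Finally, the server updates the global model with the aggregated update.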
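The aggregation rule described above can be sketched in a few lines. This is a minimal illustration, not the paper's Equation (25): the function names, the rounding of the trimmed count, and the placement of `eps` are assumptions.

```python
def security_weights(dva_gates, persistence_gates):
    """Per-client weight as the product of the two gates."""
    return [a * p for a, p in zip(dva_gates, persistence_gates)]

def weighted_trimmed_mean(updates, weights, trim_frac, eps=1e-12):
    """Coordinate-wise trimmed mean with per-client weights.

    For each coordinate, drop the k largest and k smallest values
    (k = floor(trim_frac * n)), then take the weighted mean of the rest.
    """
    n, dim = len(updates), len(updates[0])
    k = int(trim_frac * n)
    if 2 * k >= n:
        raise ValueError("trim_frac removes every client")
    aggregate = []
    for j in range(dim):
        order = sorted(range(n), key=lambda i: updates[i][j])
        kept = order[k:n - k]
        num = sum(weights[i] * updates[i][j] for i in kept)
        den = sum(weights[i] for i in kept)
        aggregate.append(num / (den + eps))
    return aggregate
```

With `trim_frac = 0` and equal weights this reduces to a plain mean, which matches the near-FedAvg behavior expected when no corruption is detected.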
Secure aggregation mode. When individual updates are hidden, Equation (25) is replaced by a client-weighted mean computed via secure aggregation using the scalar multipliers derived from sketches (Algorithm 1).
Hyperparameter selection. (i) Sketch dimension m trades accuracy for bandwidth: for sign random projections, larger m yields tighter norm/inner-product preservation, and the default is chosen from a range found to be robust in practice. (ii) Subspace rank r controls the capacity of the estimated benign manifold; a simple, reproducible rule is to choose the smallest r whose cumulative explained variance in a warm-up window exceeds a chosen threshold, and a fixed default rank is also used. (iii) The EMA parameter sets the effective memory horizon in rounds; we use a fixed default and also report a sweep over this parameter in Section 5.4. (iv) The DVA parameters can be set from the warm-up empirical distribution of the discrepancy to match a target false-positive rate.
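Two of these rules are easy to state in code. The sketch below is illustrative only: `select_rank` implements the explained-variance rule on a list of covariance eigenvalues, and `ema_horizon` uses the common rule of thumb 1 / (1 - rho), which assumes the EMA convention memory = rho * memory + (1 - rho) * new value. Thresholds and parameter values in the usage are placeholders, not the paper's defaults.

```python
def select_rank(eigenvalues, threshold):
    """Smallest r whose top-r eigenvalues explain more than `threshold`
    of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have positive total")
    cumulative = 0.0
    for r, lam in enumerate(ordered, start=1):
        cumulative += lam
        if cumulative / total > threshold:
            return r
    return len(ordered)

def ema_horizon(rho):
    """Approximate number of rounds an EMA with decay `rho` remembers."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must be in [0, 1)")
    return 1.0 / (1.0 - rho)
```

For example, with eigenvalues `[5, 3, 1, 1]` the top two directions explain 80% of the variance, so a 75% threshold selects rank 2 and an 85% threshold selects rank 3.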