Article

Privacy-Preserving Byzantine-Tolerant Federated Learning Scheme in Vehicular Networks

Shaohua Liu, Jiahui Hou and Gang Shen
1 Department of Management Engineering and Equipment Economics, Naval University of Engineering, Wuhan 430030, China
2 School of Computer Science, Hubei University of Technology, Wuhan 430068, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(15), 3005; https://doi.org/10.3390/electronics14153005
Submission received: 18 June 2025 / Revised: 13 July 2025 / Accepted: 24 July 2025 / Published: 28 July 2025
(This article belongs to the Special Issue Cryptography in Internet of Things)

Abstract

With the rapid development of vehicular network technology, data sharing and collaborative training among vehicles have become key to enhancing the efficiency of intelligent transportation systems. However, data heterogeneity and potential Byzantine attacks push model updates in different directions during the iterative process, so the boundary between benign and malicious gradients shifts continuously. To address these issues, this paper proposes a privacy-preserving Byzantine-tolerant federated learning scheme. Specifically, we design a gradient detection method based on the median absolute deviation (MAD), which calculates MAD in each round to set a gradient anomaly detection threshold, thereby achieving precise identification and dynamic filtering of malicious gradients. Additionally, to protect vehicle privacy, we obfuscate the uploaded parameters to prevent leakage during transmission. Finally, during the aggregation phase, malicious gradients are eliminated and only benign gradients are selected to participate in the global model update, which improves model accuracy. Experimental results on three datasets demonstrate that the proposed scheme effectively mitigates the impact of non-independent and identically distributed (non-IID) heterogeneity and Byzantine behaviors while maintaining low computational cost.

1. Introduction

In recent years, the rapid development of vehicular networks has injected new vitality into intelligent transportation systems [1]. Leveraging efficient communication and collaborative training among vehicles, vehicular networks not only enable real-time environmental perception but also provide robust technical support for critical applications such as autonomous driving, intelligent route planning, and dynamic traffic flow optimization [2,3,4]. In the federated learning training process, clients (e.g., vehicles) perform local model training on their own datasets and then upload the locally computed gradients to the server for global model updates. Training is conducted entirely by the clients without server involvement, ensuring that local data remain stored locally and avoiding the risk of data leakage [4,5]. Due to its distributed nature, federated learning has become an ideal choice for collaborative training among vehicles [6,7,8,9]. Federated learning has been applied in many fields, such as the Internet of Vehicles [10], healthcare [11], and the Internet of Things [12]. Zhang et al. [13] combine homomorphic encryption and hash key exchange to design a collusion-resistant data aggregation scheme, but the complex encryption may increase the system burden. The scheme in [14] targets unreliable users using threshold Paillier encryption, yet it also introduces significant computational overhead. Therefore, a method with low computational cost that maintains model accuracy is needed to protect data privacy.
However, the non-independent and identically distributed (non-IID) nature of data in vehicular network environments, along with potential Byzantine attacks, poses severe challenges to traditional federated learning frameworks [15,16]. In complex vehicular network scenarios, the data collected by different vehicles often exhibit significant non-IID characteristics, which severely impair the global model's convergence and generalization capabilities [8,17]. More critically, in open vehicular network environments, malicious vehicles may launch Byzantine attacks by uploading carefully crafted fake gradients to disrupt the model training process [18]. Both data heterogeneity and Byzantine attacks cause model updates to diverge across iterations, so the boundary between benign and malicious gradients shifts continuously [19]. These issues make it difficult for traditional detection mechanisms based on static thresholds or fixed rules to accurately identify malicious nodes, ultimately resulting in significant degradation of global model performance.
To deal with the above problems, many works on resisting Byzantine attacks have been proposed. For example, the scheme in [20] proposes the auto-weighted geometric median (AutoGM) aggregation rule, which calculates the aggregated value through an alternating optimization strategy. However, when the non-IID hyperparameter is set too small, it may misidentify normal nodes as outliers, leading to reduced model accuracy. The scheme in [21] implements decentralized defense using blockchain and a committee mechanism but is constrained by the scale of the validation dataset and the communication overhead. The scheme in [19] introduces the SEAR framework, employing a sampling method to detect Byzantine faults. However, improper selection of the parameter dimensions in the sampling detection may result in missed malicious gradients or the erroneous removal of legitimate updates. Therefore, effectively improving the detection accuracy of Byzantine nodes while reducing false positive and false negative rates remains a critical challenge in federated learning.
To address these challenges, this paper proposes a privacy-preserving Byzantine-tolerant federated learning scheme. By adaptively calculating MAD values to set detection thresholds, our proposed scheme effectively filters out malicious gradients. The main contributions of this paper are as follows:
  • To address the continuous boundary shift between benign and malicious gradients, we propose a gradient anomaly detection method based on MAD. This method calculates MAD in each round to set a gradient anomaly detection threshold, achieving precise identification and dynamic filtering of malicious gradients. During the gradient aggregation and model update phase, the identified malicious gradients are excluded, and only benign gradients are selected to participate in the global model update, which mitigates the impact of Byzantine attacks.
  • To ensure data security during transmission, we implement parameter obfuscation prior to uploading. This method effectively protects data privacy during gradient transmission without compromising model accuracy.
  • We conduct simulation experiments on three different datasets. The results demonstrate that the MAD-based gradient detection method can effectively filter out malicious gradients. Compared to existing approaches, it introduces no additional communication overhead and exhibits lower computational costs.
The remainder of this paper is organized as follows. In Section 2, we present the details of the proposed scheme, followed by the security and privacy analysis in Section 3. Section 4 conducts the experimental evaluation. Finally, we conclude this paper in Section 5.

2. The Proposed Scheme

Our proposed scheme primarily addresses Byzantine attacks in vehicular networks under non-IID scenarios while employing obfuscation factors to protect vehicle privacy. The system model involves three entities: a trusted third party (TTP), a traffic cloud server (TCS), and vehicles. In this model, TTP is responsible for generating obfuscation factors for the vehicles, while TCS serves as the aggregator. In this section, we provide a detailed description of the proposed scheme, as shown in Figure 1. For convenience, we list the main notations of this paper in Table 1.

2.1. System Initialization

At the beginning of training, TTP performs the initialization configuration for the entire system. Given security parameters $k_1$ and $k_2$, TTP first generates a set of obfuscation factors $(a_1, a_2, \ldots, a_i, \ldots, a_N)$, where each $a_i$ corresponds to a vehicle client $V_i$. Simultaneously, TTP generates large primes $\varepsilon_1$ and $\varepsilon_2$ of security length $k_1$ and a large prime $p$ of security length $k_2$. Subsequently, TTP distributes these parameters to the vehicles and TCS before exiting the system.
Each vehicle $V_i$ receives the parameter tuple $\langle a_i, \varepsilon_1, p \rangle$, where the obfuscation factor $a_i$ is unique to $V_i$, while $\varepsilon_1$ and $p$ are public parameters shared by all vehicles. TCS only receives the parameter $\varepsilon_2$. All parameters are transmitted through secure channels. Before each training round begins, TCS updates the global model $w^k$ and broadcasts it to all vehicles, where $k$ denotes the current training round.
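To make the initialization concrete, the following sketch shows one way TTP could generate these parameters. It is a minimal illustration, not the paper's prescribed routine: the helper name ttp_setup and the bound on each $a_i$ are our choices, and prime sampling is delegated to sympy's randprime. The bound simply enforces $\sum_{i=1}^{N} a_i < \varepsilon_1$ and $\sum_{i=1}^{N} a_i < \varepsilon_2$, the constraints required later by Equations (3) and (6).

```python
# Illustrative TTP setup (assumed helper, not the paper's exact routine).
import secrets
from sympy import randprime

def ttp_setup(num_vehicles, k1=32, k2=64):
    """Generate per-vehicle obfuscation factors a_i, primes eps1/eps2 of
    security length k1, and a public modulus p of security length k2."""
    eps1 = randprime(2**(k1 - 1), 2**k1)   # shared with all vehicles
    eps2 = randprime(2**(k1 - 1), 2**k1)   # held by TCS only
    p = randprime(2**(k2 - 1), 2**k2)      # public modulus
    # Keep each a_i small so that sum(a_i) stays below eps1 and eps2,
    # as required for the recovery in Equations (3) and (6).
    bound = min(eps1, eps2) // (2 * num_vehicles)
    factors = [secrets.randbelow(bound) + 1 for _ in range(num_vehicles)]
    return factors, eps1, eps2, p
```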

2.2. Gradient Deviations Calculation

In this phase, $V_i$ obtains the guidance gradient through secure aggregation by TCS and calculates the gradient deviation based on the guidance gradient.
  • Calculating the Guidance Gradient: Vehicle $V_i$ trains its local model on its local data and obtains the local gradient $g_i^k$. Since directly uploading gradients may leak private information, we use obfuscation factors to perturb the gradient, as shown in Equation (1). We refer to a local gradient masked by a vehicle's obfuscation factor as an obfuscated gradient.

$$\tilde{g}_i^k = (\varepsilon_1 g_i^k + a_i) \bmod p \tag{1}$$

    Next, $V_i$ uploads the obfuscated gradient $\tilde{g}_i^k$ to TCS. TCS aggregates the obfuscated gradients uploaded by all vehicles to compute a global gradient direction reference value, which still incorporates the obfuscation factors. We define this value as the guidance gradient $\tilde{g}^k$; it is calculated as shown in Equation (2) and then sent back to $V_i$. We define the aggregated gradient after removing the obfuscation factors as the "plaintext" gradient. $V_i$ locally recovers the "plaintext" $\bar{g}^k$ of the guidance gradient, as shown in Equation (3). To ensure Equation (3) holds, we impose the constraint on $\varepsilon_1$ given there. Because gradient parameters are generally decimals, we scale the gradient by $10^\theta$ to mitigate the impact of the obfuscation factors.

$$\tilde{g}^k = \frac{1}{N} \sum_{i \in N} \tilde{g}_i^k \bmod p \tag{2}$$

$$\bar{g}^k = \frac{10^\theta \tilde{g}^k - \left(10^\theta \tilde{g}^k \bmod \varepsilon_1\right)}{10^\theta \varepsilon_1}, \quad \varepsilon_1 > \sum_{i=1}^{N} a_i \tag{3}$$

  • Uploading Gradient Deviation: After obtaining the "plaintext" guidance gradient $\bar{g}^k$, $V_i$ calculates the gradient deviation $D_i^k$, as shown in Equation (4),

$$D_i^k = \sum_{j=1}^{len(g)} \left| g_{ij}^k - \bar{g}_j^k \right| \tag{4}$$

    where $len(g)$ denotes the number of parameters in the gradient, $g_{ij}^k \in g_i^k$, and $\bar{g}_j^k \in \bar{g}^k$. $V_i$ then obfuscates the gradient deviation as in Equation (5) and sends the obfuscated gradient deviation $\tilde{D}_i^k$ to TCS.

$$\tilde{D}_i^k = (\varepsilon_2 D_i^k + a_i) \bmod p \tag{5}$$
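To illustrate how the masking round-trips, the sketch below implements Equations (1)–(5) under two simplifying assumptions of ours: gradient values are fixed-point encoded with THETA decimal digits and shifted by a constant OFFSET so the encoded integers are non-negative, and recovery operates on the sum of the obfuscated gradients (dividing by $N$ only at the end) so the integer arithmetic is exact. Equation (5) reuses the same masking with $\varepsilon_2$ in place of $\varepsilon_1$.

```python
# Sketch of Equations (1)-(5); THETA and OFFSET are illustrative
# fixed-point parameters, not values prescribed by the scheme.
import numpy as np

THETA = 6       # decimal digits kept by the fixed-point encoding
OFFSET = 10.0   # shift keeping encoded gradients non-negative (our assumption)

def encode(grad):
    """Fixed-point encode a float gradient as arbitrary-precision integers."""
    return np.array([int(round((v + OFFSET) * 10**THETA))
                     for v in np.ravel(grad)], dtype=object)

def obfuscate(grad, a_i, eps, p):
    """Equation (1) (and Equation (5) with eps2): (eps * g + a_i) mod p."""
    return (eps * encode(grad) + a_i) % p

def recover_mean(summed_obf, n, eps):
    """Equations (2)-(3): strip the residue sum(a_i) (< eps), then undo the
    scaling and the shift; assumes the sum never wrapped around mod p."""
    g_sum = (summed_obf - (summed_obf % eps)) // eps
    return np.asarray(g_sum, dtype=float) / (n * 10**THETA) - OFFSET

def gradient_deviation(local_grad, guide_grad):
    """Equation (4): L1 distance between local and guidance gradients."""
    return float(np.abs(np.ravel(local_grad) - guide_grad).sum())
```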

2.3. Identifying Byzantine Nodes

In this phase, we use MAD, which has been shown to be more robust than the standard deviation [22], to set thresholds for filtering malicious gradients, enhancing the system's resilience against Byzantine nodes; the procedure is summarized in Algorithm 1.
  • Calculating MAD: TCS receives the obfuscated gradient deviations uploaded by the vehicles participating in this training round. Using the parameter $\varepsilon_2$, TCS recovers all gradient deviations, as shown in Equation (6). To ensure Equation (6) holds, we impose the analogous constraint on $\varepsilon_2$. To compute MAD, TCS first obtains the median $D_{med}^k$ of all vehicle gradient deviations, then calculates the absolute deviation between each vehicle's gradient deviation and this median, and, finally, takes the median of these absolute deviations to obtain $D_{MAD}^k$, as shown in Equation (7).

$$D_i^k = \frac{10^\theta \tilde{D}_i^k - \left(10^\theta \tilde{D}_i^k \bmod \varepsilon_2\right)}{10^\theta \varepsilon_2}, \quad \varepsilon_2 > \sum_{i=1}^{N} a_i \tag{6}$$

$$D_{med}^k = \mathrm{median}\left(D_i^k,\ i \in N\right), \qquad D_{MAD}^k = \mathrm{median}\left(\left|D_i^k - D_{med}^k\right|,\ i \in N\right) \tag{7}$$
  • Marking Nodes: To detect Byzantine nodes, we set upper and lower bounds for the gradient deviation threshold. The upper threshold is $\delta_2 = D_{med}^k + c \times D_{MAD}^k$, and the lower threshold is $\delta_1 = D_{med}^k - c \times D_{MAD}^k$. There is a fixed conversion relationship between MAD and the standard deviation $\sigma$, namely $\sigma \approx 1.4826 \times \mathrm{MAD}$. When $c = 1$, the threshold interval covers roughly the central half of normally distributed deviations, helping to filter outliers and mitigate the impact of extreme values. Thus, we set $c = 1$. A label $f_i$ is introduced to mark Byzantine nodes, as shown in Equation (8). Gradient deviation values outside the threshold range are considered outliers, and the corresponding node is labeled 0; otherwise, it is labeled 1. Nodes labeled 0 are identified as Byzantine nodes, and their uploaded gradients are treated as malicious gradients.

$$f_i = \begin{cases} 1, & \delta_1 < D_i^k < \delta_2 \\ 0, & D_i^k < \delta_1 \ \text{or} \ D_i^k > \delta_2 \end{cases} \tag{8}$$
Algorithm 1 Identify Byzantine Nodes
Input: Gradient deviations $\{D_i^k\}_{i=1}^{N}$
Output: Byzantine nodes
1: Initialize $Num$
2: Compute $D_{med}^k$ and $D_{MAD}^k$
3: $\delta_2 \leftarrow D_{med}^k + c \times D_{MAD}^k$
4: $\delta_1 \leftarrow D_{med}^k - c \times D_{MAD}^k$
5: for $i = 1$ to $N$ do
6:     if $\delta_1 < D_i^k < \delta_2$ then
7:         $Num$.append($i$)
8:     end if
9: end for
10: return $N \setminus Num$
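A compact NumPy rendering of Algorithm 1, operating on the recovered deviations; the variable names mirror Equations (7) and (8), and the function name is ours.

```python
# NumPy sketch of Algorithm 1 on the recovered deviations D_i^k.
import numpy as np

def identify_byzantine(deviations, c=1.0):
    """Label nodes with the MAD window of Equations (7)-(8):
    f_i = 1 (benign) iff delta_1 < D_i^k < delta_2."""
    d = np.asarray(deviations, dtype=float)
    d_med = np.median(d)                     # D_med^k
    d_mad = np.median(np.abs(d - d_med))     # D_MAD^k
    delta1, delta2 = d_med - c * d_mad, d_med + c * d_mad
    labels = ((d > delta1) & (d < delta2)).astype(int)  # f_i per node
    return labels, np.where(labels == 0)[0]  # labels, Byzantine indices
```

Since $\sigma \approx 1.4826 \times \mathrm{MAD}$, passing $c = 1.4826$ would widen the window to roughly $D_{med}^k \pm \sigma$; the scheme keeps $c = 1$ for a tighter filter.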

2.4. Gradient Aggregation and Model Update

In Section 2.3, we mark benign and Byzantine nodes separately, with the number of benign nodes being $Num = \sum_{i=1}^{N} f_i$. Next, we filter out the malicious gradients and retain only benign gradients for aggregation, as shown in Equation (9). The aggregated gradient $\tilde{g}_{global}^k$ obtained in Equation (9) still contains obfuscation factors. Using the same method as in Section 2.2, we remove the obfuscation factors to obtain the "plaintext" gradient $\bar{g}_{global}^k$, as shown in Equation (10). Finally, TCS updates the global model by executing Equation (11).

$$\tilde{g}_{global}^k = \frac{1}{Num} \sum_{i \in Num} \tilde{g}_i^k \bmod p \tag{9}$$

$$\bar{g}_{global}^k = \frac{10^\theta \tilde{g}_{global}^k - \left(10^\theta \tilde{g}_{global}^k \bmod \varepsilon_1\right)}{10^\theta \varepsilon_1} \tag{10}$$

$$w^{k+1} = w^k - \eta \times \bar{g}_{global}^k \tag{11}$$
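Continuing the earlier sketch (and reusing its recover_mean helper), the aggregation and update step of Equations (9)–(11) could be expressed as follows; the global model $w^k$ is assumed here to be a flat NumPy vector, and eta denotes the learning rate $\eta$. Note that the residue of a benign subset, $\sum_{i \in Num} a_i$, is still below $\varepsilon_1$, so the recovery of Equation (10) goes through unchanged.

```python
# Sketch of Equations (9)-(11), reusing recover_mean from the Section 2.2 sketch.
def aggregate_and_update(obf_grads, labels, w_k, eta, eps1):
    """Average only gradients whose label f_i = 1, strip the obfuscation
    factors, and take one gradient step on the global model."""
    benign = [g for g, f in zip(obf_grads, labels) if f == 1]
    summed = sum(benign)                                 # elementwise object-array sum
    g_global = recover_mean(summed, len(benign), eps1)   # Equations (9)-(10)
    return w_k - eta * g_global                          # Equation (11)
```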

3. Security and Privacy Analysis

Theorem 1. 
Gradient information is secure; the original gradient cannot be inferred from the transmitted parameters.
Proof of Theorem 1.
$V_i$ protects its local gradient using the obfuscation of Equation (1), $\tilde{g}_i^k = (\varepsilon_1 g_i^k + a_i) \bmod p$. TCS receives all obfuscated gradients to compute the guidance gradient "ciphertext" $\tilde{g}^k$. The guidance gradient "plaintext" is computed locally by the vehicle as follows:

$$\bar{g}^k = \frac{10^\theta \tilde{g}^k - \left(10^\theta \tilde{g}^k \bmod \varepsilon_1\right)}{10^\theta \varepsilon_1} = \frac{10^\theta \left(\frac{1}{N}\sum_{i \in N} \tilde{g}_i^k\right) - \left(10^\theta \left(\frac{1}{N}\sum_{i \in N} \tilde{g}_i^k\right) \bmod \varepsilon_1\right)}{10^\theta \varepsilon_1} = \frac{10^\theta \left(\frac{\varepsilon_1}{N}\sum_{i \in N} g_i^k + \frac{1}{N}\sum_{i \in N} a_i\right) - \left(10^\theta \left(\frac{\varepsilon_1}{N}\sum_{i \in N} g_i^k + \frac{1}{N}\sum_{i \in N} a_i\right) \bmod \varepsilon_1\right)}{10^\theta \varepsilon_1} = \frac{1}{N}\sum_{i \in N} g_i^k$$
$V_i$ then calculates the gradient deviation $D_i^k = \sum_{j=1}^{len(g)} \left|g_{ij}^k - \bar{g}_j^k\right|$ based on the guidance gradient "plaintext" and uploads the obfuscated gradient deviation to TCS. Under our security assumptions, adversary $\mathcal{A}$ may attack TCS to obtain the collected parameters $\tilde{g}_i^k$, $\tilde{g}^k$, and $D_i^k$, attempting to infer $V_i$'s private information. To recover $g_i^k$, adversary $\mathcal{A}$ must solve the following congruence:

$$g_i^k \equiv (\tilde{g}_i^k - a_i) \cdot \varepsilon_1^{-1} \pmod{p}$$
Since $\varepsilon_1$ and $p$ are public, adversary $\mathcal{A}$ can compute $\varepsilon_1^{-1}$ independently. However, $a_i$ is a private random obfuscation factor distributed by TTP to vehicle $V_i$, which adversary $\mathcal{A}$ cannot obtain. Even if adversary $\mathcal{A}$ collects multiple obfuscated gradients $\tilde{g}_i^k$, each $g_i^k$ corresponds to a unique $a_i$, making it impossible to eliminate $a_i$ by solving simultaneous equations. Similarly, since the local gradient $g_i^k$ and the guidance gradient "plaintext" $\bar{g}^k$ remain unknown, even if adversary $\mathcal{A}$ obtains $V_i$'s gradient deviation $D_i^k$, the local gradient cannot be inferred. Thus, the proposed scheme effectively protects the security of $V_i$'s local gradient. □
Theorem 2. 
Parameters transmitted over communication channels are privacy-preserving.
Proof of Theorem 2.
Both the local gradient and the gradient deviation of $V_i$ are obfuscated locally before being uploaded to TCS. Due to the randomness of the obfuscation factors, no external attacker or third party (including TCS) can accurately infer $V_i$'s original gradient information from the obfuscated data. Furthermore, the obfuscation factor $a_i$ is generated and distributed exclusively by the trusted TTP and remains undisclosed, preventing adversary $\mathcal{A}$ from inferring the true transmitted data through statistical methods. Therefore, the transmitted parameters are protected and satisfy the privacy requirements. □
Theorem 3. 
The MAD-based threshold detection method can effectively identify Byzantine nodes.
Proof of Theorem 3.
By using the median (and MAD) instead of the mean for detection and assuming benign nodes consistently form the majority (>50%), the median gradient deviation must originate from a benign node. While benign nodes' gradient deviations approximately follow a normal distribution, Byzantine nodes' deviations appear as outliers. Unlike the mean, the median remains unaffected by extreme values, so the MAD-based threshold is not inflated by outliers and consequently identifies Byzantine nodes more effectively. □
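A toy numerical check of this argument: a single Byzantine deviation drags the mean far from the benign cluster, while the median, and hence the MAD window built on it, barely moves. The values below are illustrative.

```python
import numpy as np

# Six benign deviations plus one Gaussian-attack outlier.
benign = np.array([0.98, 0.99, 1.00, 1.01, 1.03, 1.05])
with_byz = np.append(benign, 50.0)

print(np.mean(benign), np.mean(with_byz))      # 1.01 vs ~8.01: the mean is hijacked
print(np.median(benign), np.median(with_byz))  # 1.005 vs 1.01: the median stays put
```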

4. Experiments

In this section, we evaluate the performance of the proposed scheme through both theoretical analysis and experimental evaluation under a general FL setup [23]. In our experimental evaluation, the considered Byzantine attack type is the Gaussian attack [24], where each element of the local gradient uploaded by a Byzantine node follows a Gaussian distribution $\mathcal{N}(\mu, \sigma^2) = \mathcal{N}(0, 4)$.
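This attack is straightforward to simulate: each Byzantine node simply replaces its honest local gradient with i.i.d. draws from $\mathcal{N}(0, 4)$, i.e., standard deviation 2. The generator below is our illustrative rendering of that setup.

```python
import numpy as np

def gaussian_attack(shape, sigma=2.0, seed=None):
    """Byzantine gradient for the evaluated attack: every element drawn
    i.i.d. from N(0, sigma^2) = N(0, 4), replacing the honest gradient."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, sigma, size=shape)
```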

4.1. Experimental Setup and Data Sets

Our experiments are conducted in Python 3.10 on a laptop equipped with a 12th Gen Intel(R) Core(TM) i9-12900H processor. We employ a CNN model for federated learning training, consisting of two convolutional layers and two fully connected layers. The data partitioning follows a non-IID approach based on the Dirichlet distribution [17], with the hyperparameter $\alpha$ set to 0.5 to control the degree of data heterogeneity. The setting $\alpha = 0.5$ generates a data distribution that is neither entirely random nor heavily skewed, effectively mimicking the moderately skewed distributions typically encountered in real-world vehicular networks. We take the MNIST dataset partitioning as an example, as shown in Figure 2. Additional hyperparameter settings are detailed in Table 2.
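A common way to realize this partitioning, sketched here under our own function name, is to draw per-class client proportions from $\mathrm{Dir}(\alpha)$ and split each class's sample indices accordingly; smaller $\alpha$ yields more skewed client datasets.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.5, seed=0):
    """Non-IID split: for each class, draw client proportions from
    Dir(alpha) and distribute that class's sample indices accordingly."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.where(labels == cls)[0]
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for cid, part in enumerate(np.split(idx, cuts)):
            client_idx[cid].extend(part.tolist())
    return client_idx  # client_idx[c]: indices of client c's local dataset
```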
For dataset selection, we use three different real-world datasets: MNIST [25], Fashion-MNIST [25], and CIFAR-10 [25]. The MNIST dataset is a simple handwritten digit dataset containing 60,000 training images and 10,000 test images, each sized 28 × 28 pixels in grayscale. The Fashion-MNIST dataset is an apparel classification dataset consisting of 60,000 training images and 10,000 test images, each being a 28 × 28 grayscale image covering 10 categories of clothing items. The CIFAR-10 dataset includes 50,000 training images and 10,000 test images, divided into 10 distinct categories, with each image being a 3 × 32 × 32 natural image. For comparison, we implement the FedAvg algorithm [26] and the Byzantine attack detection scheme DisBezant [27] as baseline methods.

4.2. Communication Overhead

Our proposed scheme's communication overhead involves the interactions between the vehicles and TCS. During each training round, $V_i$ uploads the obfuscated gradient $\tilde{g}_i^k$ and the gradient deviation $\tilde{D}_i^k$ to TCS, resulting in a communication overhead of $O(2\,len(g))$. To eliminate the obfuscation factors from the aggregated gradient, $V_i$ also needs to send the aggregated gradient to TCS, incurring an additional communication overhead of $O(len(g))$. TCS receives all obfuscated gradients uploaded by the vehicles and then computes and broadcasts the guidance gradient $\tilde{g}^k$ to all vehicles, with a communication overhead of $O(N\,len(g))$. TCS then aggregates the benign gradients and updates the global model, resulting in a communication overhead of $O(N\,len(g))$. Therefore, the total communication overhead per training round between the vehicles and TCS is $O((2N + 3)\,len(g))$.
In DisBezant [27], ships locally compute two obfuscated gradients and upload them to the server, resulting in a communication overhead of $O(2\,len(g))$. Sending the aggregated gradients to the server incurs an additional communication overhead of $O(len(g))$. The server sends the aggregated gradient and similarity verification requests based on the received obfuscated parameters, with a communication overhead of $O(N\,len(g))$, and then broadcasts the aggregated obfuscated gradients to all ships, resulting in a communication overhead of $O(N\,len(g))$. Thus, the total communication overhead per training round between the ships and the server is $O((2N + 3)\,len(g))$. Although the proposed scheme and DisBezant [27] have the same communication overhead, the proposed scheme demonstrates significant advantages in the subsequent computational cost analysis.

4.3. Computational Cost

In this section, we analyze the computational cost on the vehicles and TCS. On the vehicle side, to protect gradient privacy, $V_i$ first obfuscates the gradient $g_i^k$ using its obfuscation factor; to support Byzantine node identification, $V_i$ also calculates the gradient deviation $\tilde{D}_i^k$. The computational cost for the gradient obfuscation and the gradient deviation is $O(2\,len(g))$. On the TCS side, TCS first receives the obfuscated gradients uploaded by all vehicles and then computes the guidance gradient $\tilde{g}^k$, with a computational cost of $O(N\,len(g))$. TCS also needs to calculate MAD for Byzantine node detection, incurring a computational cost of $O(N \log N)$. Finally, during the gradient aggregation and model update phase, TCS aggregates the gradients to update the global model, resulting in a computational cost of $O(N\,len(g) + len(g))$. Therefore, in the proposed scheme, the total computational cost per training round is $O(3\,len(g) + 2N\,len(g) + N \log N)$.
In DisBezant [27], ships need to compute obfuscated gradient parameters and gradient similarity, so the computational cost on the ship side reaches $O(3\,len(g))$. On the server side, the computational cost for calculating the aggregated gradient is $O(N\,len(g))$; the computational cost for verifying gradient similarity and updating the contribution value of each ship is $O(3N\,len(g) + len(g))$; and, finally, aggregating the ship gradients incurs a computational cost of $O(N\,len(g))$. Thus, in DisBezant [27], the total computational cost per training round is $O(4\,len(g) + 5N\,len(g))$.
Figure 3 compares the training time of our proposed scheme and DisBezant [27] across the client and server components. DisBezant [27] incurs significant CPU overhead due to its sequential gradient obfuscation and similarity computation loops on vessel nodes, while its server operations benefit from GPU-accelerated tensor processing that reduces latency. In our proposed scheme, TCS handles the computationally intensive tasks, including gradient aggregation, guidance gradient computation, and MAD analysis, operations that inherently require more processing time due to their large-scale tensor computations. However, vehicle nodes only perform gradient obfuscation, contributing to our scheme's overall runtime efficiency advantage. As shown in Figure 4, when scaling from 20 to 90 participating vehicles, the per-vehicle computation time stabilizes while the TCS processing time increases linearly. This scaling pattern occurs because TCS must expend additional computation on MAD-based malicious gradient filtering as more vehicles submit their gradients.

4.4. Accuracy

Under a setting of 20 client nodes, we evaluate the accuracy on three datasets with 30% malicious nodes participating in training. Figure 5 shows the accuracy over 50 iterations of the FedAvg algorithm, DisBezant [27], and our proposed scheme on MNIST, Fashion-MNIST, and CIFAR-10. From Figure 5a, it can be seen that the initial accuracy of the proposed scheme is close to that of DisBezant [27], with both being lower than the accuracy of the FedAvg algorithm without attacks. However, around the 20th iteration, the accuracy of our proposed scheme gradually converges to that of no-attack FedAvg and surpasses the accuracy of DisBezant [27]. Figure 5b shows that in the first 30 iterations the proposed scheme has the lowest accuracy, but after 30 iterations its accuracy gradually exceeds that of the other baselines. Figure 5c demonstrates that, while FedAvg suffers significant fluctuations due to data heterogeneity, both our scheme and DisBezant [27] maintain stable accuracy improvements. Overall, Figure 5 shows that the FedAvg algorithm lacks resistance to Byzantine attacks and cannot meet the model accuracy requirements.
As shown in Figure 6, we plot the relationship between model training convergence and the number of iterations. From Figure 6a, it can be seen that the three curves on the MNIST dataset nearly overlap. From Figure 6b,c, it can be observed that, on the Fashion-MNIST and CIFAR-10 datasets, both our proposed scheme and DisBezant [27] converge more slowly than the traditional FedAvg algorithm. This is because the FedAvg algorithm does not account for malicious nodes, whereas our proposed scheme and DisBezant [27] must identify Byzantine nodes. Additionally, the process of filtering malicious gradients or adjusting node trustworthiness removes some nodes' gradient updates, which affects the convergence of model training to some extent.
Finally, we conducted experiments with malicious node ratios ranging from 0% to 30%. As shown in Figure 7a–c, as the proportion of malicious nodes increases, the final model accuracy decreases slightly, but the global model accuracy is not significantly affected. This result verifies the resilience of our proposed scheme against malicious nodes, as it maintains stable model performance under varying numbers of malicious nodes.

5. Conclusions

In federated learning systems, data heterogeneity and potential Byzantine attacks continuously blur the boundary between benign and malicious gradients. To address security and privacy concerns in vehicular networks, this paper proposes a privacy-preserving Byzantine-tolerant federated learning scheme that utilizes gradient deviation to distinguish between benign and malicious gradients. Specifically, we replace standard deviation with MAD to dynamically set gradient detection thresholds, enabling efficient filtering of malicious gradients without introducing additional computational cost. Furthermore, we employ obfuscation factors to protect gradient privacy. Experimental results demonstrate that the proposed scheme maintains model accuracy and training convergence while achieving low computational cost and communication overhead.
However, the proposed scheme still has some limitations. First, it relies heavily on TTP for system initialization and obfuscation factor distribution. In practical deployments, this centralized design may introduce single-point-of-failure risks, and the assumption of a fully trusted TTP may be unrealistic in open environments. Second, as the number of participating vehicles increases, the computational complexity at TCS grows significantly, posing challenges for real-time processing in resource-constrained edge computing scenarios, especially for large-scale deployments. Finally, while obfuscation factors are employed to protect gradient privacy, this approach depends on specific constraint conditions that may not hold consistently in dynamic networks.
In the future, we will focus on addressing these limitations by designing decentralized trust mechanisms to eliminate reliance on a single trusted third party. Concurrently, we will develop lightweight dynamic Byzantine detection algorithms to improve system scalability for large-scale networks with dynamic node participation.

Author Contributions

Conceptualization and supervision by G.S.; investigation, data collection, formal analysis, and writing—original draft preparation by J.H.; writing—review and editing by S.L. and G.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Guiding Program of Scientific Research Plan of Hubei Province under Grant B2023033, and the Natural Science Foundation of Hubei Province under Grant 2023AFB951.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chen, T.; Yan, J.; Sun, Y.; Zhou, S.; Gündüz, D.; Niu, Z. Mobility accelerates learning: Convergence analysis on hierarchical federated learning in vehicular networks. IEEE Trans. Veh. Technol. 2025, 74, 1657–1673. [Google Scholar] [CrossRef]
  2. Wang, N.; Yang, W.; Wang, X.; Wu, L.; Guan, Z.; Du, X.; Guizani, M. A blockchain based privacy-preserving federated learning scheme for Internet of Vehicles. Digit. Commun. Netw. 2024, 10, 126–134. [Google Scholar] [CrossRef]
  3. Wang, D.; Yi, Y.; Yan, S.; Wan, N.; Zhao, J. A node trust evaluation method of vehicle-road-cloud collaborative system based on federated learning. Ad Hoc Netw. 2023, 138, 103013. [Google Scholar] [CrossRef]
  4. Feng, X.; Liu, H.; Yang, H.; Xie, Q.; Wang, L. Batch-aggregate: Efficient aggregation for private federated learning in VANETs. IEEE Trans. Dependable Secur. Comput. 2024, 21, 4939–4952. [Google Scholar] [CrossRef]
  5. Tang, Y.; Ni, L.; Li, J.; Zhang, J.; Liang, Y. Federated learning based on dynamic hierarchical game incentives in Industrial Internet of Things. Adv. Eng. Inform. 2025, 65, 103214. [Google Scholar] [CrossRef]
  6. Wang, Y.; Zhai, D.; Xia, Y. RFVIR: A robust federated algorithm defending against Byzantine attacks. Inf. Fusion 2024, 105, 102251. [Google Scholar] [CrossRef]
  7. Blanchard, P.; El Mhamdi, E.M.; Guerraoui, R.; Stainer, J. Machine learning with adversaries: Byzantine tolerant gradient descent. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS ’17, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 118–128. [Google Scholar]
  8. Colosimo, F.; De Rango, F. Dynamic gradient filtering in federated learning with Byzantine failure robustness. Future Gener. Comput. Syst. 2024, 160, 784–797. [Google Scholar] [CrossRef]
  9. Gupta, N.; Liu, S.; Vaidya, N.H. Byzantine fault-tolerant distributed machine learning using stochastic gradient descent (SGD) and norm-based comparative gradient elimination (CGE). arXiv 2020, arXiv:2008.04699v2. [Google Scholar]
  10. Yange, C.; Baocang, W.; Hang, J.; Pu, D.; Yuan, P.; Zhiyong, H. PEPFL: A framework for a practical and efficient privacy-preserving federated learning. Digit. Commun. Netw. 2024, 10, 355–368. [Google Scholar]
  11. Zhang, L.; Fang, G.; Tan, Z. FedCCW: A privacy-preserving Byzantine-robust federated learning with local differential privacy for healthcare. Clust. Comput. 2025, 28, 182. [Google Scholar] [CrossRef]
  12. Arazzi, M.; Nicolazzo, S.; Nocera, A. A fully privacy-preserving solution for anomaly detection in IoT using federated learning and homomorphic encryption. Inf. Syst. Front. 2025, 27, 367–390. [Google Scholar] [CrossRef]
  13. Zhang, M.; Chen, S.; Shen, J.; Susilo, W. Privacy EAFL: Privacy-enhanced aggregation for federated learning in mobile crowdsensing. IEEE Trans. Inf. Forensics Secur. 2023, 18, 5804–5816. [Google Scholar] [CrossRef]
  14. Li, Y.; Li, H.; Xu, G.; Huang, X.; Lu, R. Efficient privacy-preserving federated learning with unreliable users. IEEE Internet Things J. 2022, 9, 11590–11603. [Google Scholar] [CrossRef]
  15. Leng, J.; Li, R.; Xie, J.; Zhou, X.; Li, X.; Liu, Q.; Chen, X.; Shen, W.; Wang, L. Federated learning-empowered smart manufacturing and product lifecycle management: A review. Adv. Eng. Inform. 2025, 65, 103179. [Google Scholar] [CrossRef]
  16. Sun, B.; Song, X.; Tu, Y.; Liu, M. FedAgent: Federated learning on non-IID data via reinforcement learning and knowledge distillation. Expert Syst. Appl. 2025, 285, 127973. [Google Scholar] [CrossRef]
  17. Chen, X.; Tian, Y.; Wang, S.; Yang, K.; Zhao, W.; Xiong, J. DBFL: Dynamic Byzantine-robust privacy preserving federated learning in heterogeneous data scenario. Inf. Sci. 2025, 700, 121849. [Google Scholar] [CrossRef]
  18. Li, X.; Li, Y.; Wan, H.; Wang, C. Enhancing Byzantine robustness of federated learning via tripartite adaptive authentication. J. Big Data 2025, 12, 121. [Google Scholar] [CrossRef]
  19. Zhao, L.; Jiang, J.; Feng, B.; Wang, Q.; Shen, C.; Li, Q. SEAR: Secure and efficient aggregation for byzantine-robust federated learning. IEEE Trans. Dependable Secur. Comput. 2022, 19, 3329–3342. [Google Scholar] [CrossRef]
  20. Li, S.; Ngai, E.; Voigt, T. Byzantine-robust aggregation in federated learning empowered industrial IoT. IEEE Trans. Ind. Inform. 2023, 19, 1165–1175. [Google Scholar] [CrossRef]
  21. Xu, G.; Lei, L.; Mao, Y.; Li, Z.; Chen, X.; Zhang, K. CBRFL: A framework for committee-based Byzantine-resilient federated learning. J. Netw. Comput. Appl. 2025, 238, 104165. [Google Scholar] [CrossRef]
  22. Leys, C.; Ley, C. Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. J. Exp. Soc. Psychol. 2013, 49, 764–766. [Google Scholar] [CrossRef]
  23. Shenoy, D.; Bhat, R.; Prakasha, K. Exploring privacy mechanisms and metrics in federated learning. Artif. Intell. Rev. 2025, 58, 223. [Google Scholar] [CrossRef]
  24. Gouissem, A.; Hassanein, S.; Abualsaud, K.; Yaacoub, E.; Mabrok, M.; Abdallah, M.; Khattab, T.; Guizani, M. Low complexity Byzantine-resilient federated learning. IEEE Trans. Inf. Forensics Secur. 2025, 20, 2051–2066. [Google Scholar] [CrossRef]
  25. Zhao, P.; Cao, Z.; Jiang, J.; Gao, F. Practical private aggregation in federated learning against inference attack. IEEE Internet Things J. 2023, 10, 318–329. [Google Scholar] [CrossRef]
  26. McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS, Fort Lauderdale, FL, USA, 20–22 April 2017. [Google Scholar]
  27. Ma, X.; Jiang, Q.; Shojafar, M.; Alazab, M.; Kumar, S.; Kumari, S. DisBezant: Secure and robust federated learning against Byzantine attack in IoT-enabled MTS. IEEE Trans. Intell. Transp. Syst. 2023, 24, 2492–2502. [Google Scholar] [CrossRef]
Figure 1. The core process of the proposed scheme.
Figure 2. Non-IID distribution among client training sets.
Figure 3. Running time of each entity.
Figure 4. The running time of each entity as the number of vehicles increases. (a) shows the running time per vehicle, while (b) shows the running time of TCS.
Figure 5. Accuracy with 30% malicious node participation. (a) MNIST. (b) Fashion-MNIST. (c) CIFAR-10.
Figure 6. Model training convergence evaluation. (a) MNIST. (b) Fashion-MNIST. (c) CIFAR-10.
Figure 7. Accuracy rate with 10% to 30% malicious node participation. (a) MNIST. (b) Fashion-MNIST. (c) CIFAR-10.
Table 1. Description of notations.

Name                      Expression
$g_i^k$                   Local gradient of $V_i$ in the k-th round
$\tilde{g}^k$             Guidance gradient
$D_i^k$                   Gradient deviation of $V_i$ in the k-th round
$D_{MAD}^k$               Median absolute deviation
$f_i$                     $V_i$'s label
$w^k$                     Global model in the k-th round
$\tilde{g}_{global}^k$    Aggregated gradient
Table 2. Experiments configuration.

Hyperparameter    Value
Learning rate     0.001
Rounds            50
Epochs            1
$\alpha$          0.5
Batch size        64 (MNIST, Fashion-MNIST) / 32 (CIFAR-10)