Federated Learning for Secure In-Vehicle Communication

Ghamri, Maroua; Boumerdassi, Selma; Belmeguenai, Aissa; Yellas, Nour-El-Houda

doi:10.3390/telecom6030048

¹ Department of Electronics, Faculty of Science and Technology, University of Jijel, BP 98 Ouled Aissa, Jijel 18000, Algeria

² Laboratory of Electronics, Department of Electronics, Faculty of Science and Technology, University of Skikda, Skikda 21000, Algeria

³ Department of Computer Science, School of Engineering, CNAM, 75003 Paris, France

⁴ Department of Electronics, Faculty of Science and Technology, University of Skikda, Skikda 21000, Algeria

⁵ Department of Mobile Multimedia Networks and Services, SAMOVAR Laboratory, Télécom SudParis, Institut Polytechnique de Paris, 91120 Palaiseau, France

* Authors to whom correspondence should be addressed.

Telecom 2025, 6(3), 48; https://doi.org/10.3390/telecom6030048

Submission received: 9 May 2025 / Revised: 6 June 2025 / Accepted: 26 June 2025 / Published: 2 July 2025

Abstract

The Controller Area Network (CAN) protocol is one of the most important communication standards in autonomous vehicles, enabling real-time information sharing across in-vehicle (IV) components to ensure smooth coordination and dependability in vital activities. Because it lacks encryption and authentication, CAN exposes the IV Network (IVN) to several message-based attacks. Traditional centralized Intrusion Detection Systems (IDS), in which all the historical data is gathered on one node, raise privacy risks and scalability issues, making them unsuitable for real-time intrusion detection. To address these challenges, we propose a Deep Federated Learning (FL) architecture for intrusion detection in the IVN. Within it, we use a Bidirectional Long Short Term Memory (BiLSTM) architecture to capture temporal dependencies in CAN bus traffic and to provide enhanced feature extraction and multi-class classification. By evaluating our framework on three real-world datasets, we show that our proposal outperforms a baseline LSTM model from the state of the art.

Keywords:

CAN; federated learning; deep learning; centralized; decentralized; intrusion detection

1. Introduction

Connected and Autonomous Vehicles (CAVs) are rapidly becoming a reality, enabled by advances in computing, electronics, sensors, and communication. Enriched hardware and advanced sensors significantly improve the situational awareness of CAVs, while network, security, and performance improvements enable them to better interact with infrastructure and other road users [1]. CAVs require large volumes of real-time data collected from various sensors [2]. These sensors include Light Detection and Ranging (LiDAR), radar, cameras, and GPS [3], which continuously collect data on the conditions around the vehicle, obstacles on the road, traffic flow, and moving pedestrians. This data is crucial for correct decision-making, navigation, and safety in autonomous driving systems.

To develop autonomous driving, numerous electronic control units (ECUs) are required to control sensors and communication systems. The CAN bus has been a backbone of IVNs for over three decades, enabling reliable broadcast communication among ECUs [4] and between sensors and ECUs. Unlike traditional networks such as USB or Ethernet, CAN does not send large data blocks point-to-point between nodes; it broadcasts short messages at a maximum signaling rate of 1 Mbps [5]. The CAN network consists of multiple subnetworks [6] interconnected through a CAN gateway. The CAN gateway filters and relays the CAN frames between the CAN bus and the On-Board Unit (OBU), which connects via a CAN-to-USB adapter [7]. The CAN gateway sends CAN frames (comprising sensor data) for vehicle speed, brake status (on/off), emergency brake light status (on/off), and hazard detection status [7] at regular intervals (e.g., every 50 ms). The OBU, in turn, acts as an intermediary between IVNs and Vehicle to Everything (V2X) communication. Figure 1 describes the communication process between IVNs and V2X.

Because CAN messages are not encrypted, data confidentiality is significantly compromised, which opens the door to various cyberattacks. In an IVN, attacks can infiltrate both the CAN bus and the OBU, which poses a severe security risk. Attackers may intercept CAN messages, manipulate data exchanged between ECU and OBU, or inject malicious commands to disrupt vehicle operations. Furthermore, attackers may compromise the OBU and ECU to retrieve sensitive information, such as private data that allows illegitimate access to communications within the CAN network [8].

Deep learning models have been increasingly employed to address these security vulnerabilities. These models detect complex patterns and anomalies in network traffic, making them suitable for detecting intrusions in CAN-based networks [9]. The authors in [10] describe an LSTM-based IVN-IDS to identify and mitigate attacks on the CAN bus. A custom dataset is created by collecting normal CAN traffic and injecting simulated attacks. Training and testing of this model achieved a high detection accuracy (~99.995%). A lightweight BiLSTM-based anomaly detection model with an attention mechanism is proposed in [11] for efficient protection of CAN messages in vehicles; it leverages correlations in CAN ID sequences to identify replay attacks, DoS attacks, and fuzzing attacks. With the increasing connectivity of modern vehicles, ensuring the security of IVN communication has become critical, as connected vehicles are now exposed to various attacks. To this end, we propose a deep learning-based approach for multiclass intrusion detection on in-vehicle CAN messages.

While the centralized deep learning LSTM and BiLSTM methods demonstrate excellent performance, their deployment faces critical practical constraints. Specifically, they require aggregation of all the training data within a central server, which is extremely problematic in terms of privacy, security, and data administration, especially for safety-critical use cases like V2X communication. In addition, in real-world contexts, central data aggregation is often impossible due to geographic dispersal, low connectivity, and constrained bandwidth.

In contrast to centralized deep learning approaches, which require sending raw data to a central server and thereby introduce significant privacy and latency risks, the proposed approach takes advantage of federated learning and uses the V2X edge computing environment to improve privacy and real-time processing. FL enables decentralized training across multiple vehicles or edge devices without exposing raw CAN data. This approach preserves the confidentiality of sensitive driving information and supports heterogeneous data sources across different vehicle types. FL’s compatibility with real-time, low-latency edge computing makes it well suited to safety-critical automotive applications. Furthermore, our model performs multiclass classification, identifying not only the presence of an attack but also the specific type of intrusion, such as Flood, Fuzzing, and Malfunction attacks. Trained on historical time-series CAN bus traffic, the model learns to detect anomalies belonging to different classes. To improve real-time detection and maintain data privacy, we integrate the federated learning paradigm into a V2X edge computing framework. This setup enables edge entities such as OBUs, RSUs, and edge servers, together with the cloud, to collaboratively train models without centralizing raw data, reducing the threat of data exposure. The edge computing layer offers low-latency intrusion detection by processing data close to vehicles, while federated learning allows vehicles to share model updates without transmitting sensitive CAN bus data to a central server. By combining this deep learning-based architecture with FL, this paper offers a scalable, real-time, and privacy-preserving solution to defend vehicular networks against cyberattacks in an intelligent transportation system. Our contributions can be summarized as follows:

We propose adapted-BiLSTM (or a-BiLSTM), an adapted BiLSTM architecture to identify intrusions in in-vehicle network CAN messages.
We present a federated learning-based intrusion detection system for in-vehicle networks to tackle privacy issues and facilitate V2X communication in a distributed manner.
We evaluate our proposal using three real-world datasets to demonstrate its ability to accurately identify attack types and adapt to each dataset's characteristics. The evaluation results show that the a-BiLSTM model achieves higher accuracy than the baseline LSTM architecture in a federated learning system.

The remainder of the paper is structured as follows. Section 2 presents the background and preliminaries needed to understand the proposed framework. Section 3 describes our proposed FL-IVN-IDS framework. Section 4 and Section 5 present the experimental evaluation and the analysis of the results. Finally, Section 6 concludes the paper with a summary of the contributions and perspectives.

2. Background and Preliminaries

In this section, we provide the background on CAN messages, types of cyberattacks on the vehicle network, the role of V2X communication in future transportation, and how FL can be applied to improve security through decentralized IDS with guaranteed privacy of data.

2.1. CAN Protocol

The real-time state of a vehicle is accurately described through CAN messages, which are widely used in the automotive industry due to their low production cost and robustness to electrical noise [12]. The CAN is an ISO-certified international standard for serial communication [13]. The standard frame format in Figure 2 is used to transmit CAN messages. In the standard CAN frame format, a frame is uniquely identified by an 11-bit CAN-ID field (Identifier) [14]. The CAN protocol guarantees reliable performance even under difficult circumstances, is economical, and is resilient in fault handling [15]. It is a prominent standard in automotive and industrial applications due to its robustness, broad acceptance in the semiconductor industry, and adaptability for high-speed, short-message transmission [13]. The frame format in Figure 2 is outlined as follows.

Start of Frame (SOF): a single bit that marks the start of a new frame on the CAN bus [16].

Data Length Code (DLC): four bits that specify the number of data bytes [17].

Data field: the transmitted payload, between 0 and 8 bytes (up to 64 bits) [9].

Cyclic Redundancy Check (CRC): 16 bits used for error detection [10].

Acknowledgment (ACK): 2 bits used to signal receipt [18].

End of Frame (EOF): 7 bits that mark the end of the transmission [16].

This structure ensures efficient and robust data exchange in automotive and other embedded systems.
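The field widths listed above can be captured in a small data structure. The following Python sketch is illustrative only: the `CanFrame` class and its method names are hypothetical, and the bit count covers just the fields named in this section (the 11-bit identifier and the listed fields), ignoring the remaining control bits and bit stuffing.

```python
from dataclasses import dataclass

# Field widths in bits, taken from the list above (standard frame format).
SOF_BITS = 1
ID_BITS = 11
DLC_BITS = 4
CRC_BITS = 16
ACK_BITS = 2
EOF_BITS = 7


@dataclass(frozen=True)
class CanFrame:
    """Hypothetical container for the fields an IDS typically inspects."""
    can_id: int   # 11-bit identifier
    data: bytes   # payload, 0 to 8 bytes

    def __post_init__(self):
        if not 0 <= self.can_id < (1 << ID_BITS):
            raise ValueError("CAN ID must fit in 11 bits")
        if len(self.data) > 8:
            raise ValueError("payload is limited to 8 bytes")

    @property
    def dlc(self) -> int:
        """Data Length Code: number of payload bytes."""
        return len(self.data)

    def listed_field_bits(self) -> int:
        """Sum of the field widths listed above plus the payload.

        Simplification: other control bits and bit stuffing are ignored.
        """
        return (SOF_BITS + ID_BITS + DLC_BITS + 8 * self.dlc
                + CRC_BITS + ACK_BITS + EOF_BITS)


# Example frame with an arbitrary identifier and a 5-byte payload.
frame = CanFrame(can_id=0x2B0, data=bytes([0x00, 0x10, 0x00, 0x00, 0x00]))
print(frame.dlc, frame.listed_field_bits())  # 5 81
```

Validating the identifier and payload length at construction time mirrors the constraints of the frame format, so malformed inputs are rejected before any feature extraction.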

2.2. Attack Types

Attacks on CAN can be physical or remote; they compromise communication channels and can lead to loss of life and of operational integrity [19]. The automotive industry must understand these risks to develop secure systems.

2.2.1. Denial-of-Service (DoS) Attack

A DoS attack or flood attack as described in [20] on the CAN bus attempts to overwhelm the network with a high rate of messages, consuming bandwidth and preventing legitimate traffic. This kind of interference can result in system instability, malfunction, or temporary disablement of some vehicle components.

2.2.2. Spoofing Attack

A spoofing (or malfunction) attack involves impersonating authentic ECUs or users in the IoV network. This allows a malicious actor to inject fake commands, which can lead to traffic congestion or accidents [21].

2.2.3. Masquerade Attack

A masquerade attack is a sophisticated attack that requires strategic planning. As a first step, the attacker disables the communication of a compromised ECU with a specified ID. The attacker then uses a heavily compromised ECU to send forged messages that mimic the original ID and timing, thereby impersonating the targeted ECU [22].

2.2.4. Reconnaissance Attack

A reconnaissance attack involves an attacker passively gathering information from the Internet of Vehicles (IoV) network. This information is mainly used to plan more targeted or malicious attacks in the future [21].

2.2.5. Fuzzy Attack

A fuzzing attack targets IVNs by injecting random or altered CAN messages without requiring detailed knowledge of the system [23].

2.2.6. Replay Attack

A replay attack in the context of CAN, particularly in remote keyless entry systems, occurs when an attacker captures a valid unlock message from the car's original key fob. Since the system accepts any valid message without verifying its origin, the attacker can replay the captured message at a later time to unlock the vehicle without authorization [24].

2.3. Vehicle-to-Everything Communication

V2X (Vehicle-to-Everything) communication allows vehicles to exchange real-time data with surrounding infrastructure, extending their perception beyond onboard sensors. This capability enhances CAV services, contributing to improved road safety and traffic efficiency [7]. V2X includes C-V2X and DSRC, where C-V2X uses cellular networks for direct communication, while DSRC operates on IEEE 802.11p protocols. Both are essential for intelligent transportation systems. Edge computing in V2X minimizes data transfer and improves reaction times. ETSI has established standards for edge computing to enhance data processing and communication efficiency. Edge computing should also support real-time data aggregation and distribution via RSUs. The edge serves as an interface between vehicles and the cloud, with edge servers providing local computing and storage close to vehicles [25]. A practical Vehicular Edge Computing (VEC) deployment scenario is introduced in [26], featuring a three-tier communication network that seamlessly connects the cloud, base stations (BS), roadside units (RSUs), and vehicles. Vehicle-to-Vehicle (V2V) communication relies on DSRC, LTE, and WiFi to facilitate direct data exchange between vehicles, ensuring low-latency interactions. Vehicle-to-Roadside (V2R) communication also utilizes DSRC, LTE, and WiFi, enabling efficient connectivity between vehicles and RSUs for improved data dissemination and traffic management. RSU-to-Cloud (R2C) communication employs Ethernet or optical fiber, ensuring high-speed, reliable data transmission between RSUs and cloud infrastructure for large-scale data processing and decision-making. This hierarchical communication framework enhances real-time data processing, traffic coordination, and autonomous vehicle decision-making, making it a cornerstone for intelligent transportation systems.

2.4. Machine Learning for Intrusion Detection in Vehicular Networks

Machine learning techniques are proposed to analyze CAN bus traffic and detect unauthorized access [27]. In this part, we focus on centralized machine learning that involves collecting, aggregating, and processing data in a central server or cloud, training a model, and performing computations for model training and evaluation [11].

Classical machine learning methods such as Support Vector Machine (SVM), K-Nearest Neighbor (KNN), One-Class Support Vector Machine (OCSVM), and Isolation Forest have been proposed, while deep learning techniques are increasingly employed [28] and are now in greater demand for IVN-IDS. The authors in [28] propose a Convolutional Neural Network (CNN)-based IVN-IDS to detect attacks within the CAN protocol. They created their dataset using three different automobile models and showed that their CNN-based classifier detects CAN bus attacks with an accuracy of more than 99%. To address the lack of security features in the CAN bus, the authors of [16] propose a transformer-based attention network (TAN). TAN uses a self-attention technique to categorize attacks. This model outperforms baseline techniques when given consecutive CAN IDs as input, and it uses transfer learning to adapt to smaller datasets from various car models. The reported results show that TAN identifies intrusions in automobile networks with an accuracy of 100%. The work in [10] outlines an LSTM-based IVN-IDS for identifying and mitigating attacks on the CAN bus. A custom dataset was created by collecting normal CAN traffic and injecting simulated attacks. The training and testing of this model achieved a detection accuracy of 99.995%. A lightweight BiLSTM-based anomaly detection model with an attention mechanism efficiently secures the CAN in vehicles lacking privacy safeguards. It detects replay attacks (92.79% accuracy), DoS attacks (92.85% accuracy), and fuzzing attacks (96.15% detection rate) by leveraging correlations in CAN ID sequences [11].

Building upon these findings, we propose a BiLSTM model tailored to centralized deep learning scenarios, demonstrating enhanced effectiveness in detecting CAN-based attacks and reinforcing the robustness of IVN-IDS.

2.5. Federated Learning for Intrusion Detection in Vehicular Networks

Google introduced federated learning, a decentralized machine learning approach that minimizes data leakage and enhances privacy [27]. The training process relies on two components: a set of FL clients that train an ML model on their local data, and an FL server that retrieves their local weights, aggregates them, and sends the updated values back to the clients for an additional training round. Federated learning thus allows models to be built from datasets distributed across many devices without centralized data aggregation, which reduces data leakage. This approach promotes robust collaborative model training and ensures data privacy [29]. FL operates under two primary architectures: Vertical FL (VFL) and Horizontal FL (HFL). VFL combines locally trained sub-models from clients with vertically split data to build a global model [30]. In contrast, HFL involves multiple clients collaboratively training a model through a central server while keeping the training data on devices, thereby protecting privacy [30]. HFL faces challenges such as non-uniform data distribution among clients [31].

Some authors propose FL-based IDS for V2X communication to enhance security and threat detection in distributed environments. The authors in [32] suggest an FL-based IDS for 5G and Beyond V2X networks to detect zero-day attacks. The method employs a Convolutional Neural Network (CNN) as the local model to identify intrusions. The authors in [33] propose an FL-based IVN-IDS built on edge computing for C-V2X networks, using a feedforward neural network (FNN) as the local model. The approach detects multiple types of attack from the CICToRC-2021 dataset [34]. Although the FNN model classifies intrusions with more than 96% accuracy, it has high memory and computation time requirements and is therefore less efficient to deploy at large scale. In addition, FNNs have no notion of time, which hinders their ability to learn the sequential attack patterns that are common in IVN traffic. In [35], an FL-based IDS using a convolutional LSTM (ConvLSTM) model is proposed to detect intrusions in IVNs. The work is edge computing-oriented: the model is trained on CAVs and the federated models are aggregated by edge servers.

Unlike previous FL-based IVN-IDS studies [32,33,34,35] that assume a single dataset divided among clients, our approach leverages three real-world CAN bus datasets (Hyundai, Chevrolet, and Kia) and assigns each dataset to a separate client. In each federated round, each client is presented with a distinct partition of its own dataset, guaranteeing strict data locality and privacy. This setting more closely resembles real-world deployment in networked vehicle environments, where centralized data collection is unrealistic. By learning incrementally over inherently decentralized data, our solution achieves competitive performance while preserving privacy in a realistic federated environment.

3. Federated Learning-Based Intrusion Detection Framework

In the following, we describe our framework for Federated Learning-based In-Vehicular Network Intrusion Detection System (FL-IVN-IDS).

3.1. Problem Statement

In-vehicle networks use sensors to collect data about the CAV's outside environment. The collected data is then broadcast among the in-vehicle network components, such as ECUs, over the CAN bus. The CAN gateway transfers CAN frames between the CAN bus and the OBU without any encryption or authentication. This lack of protection opens the door to many attack vectors that originate either through compromised V2X interfaces or directly on unsecured ECUs within the local in-vehicle network.

To link the internal and external networks, the OBU acts as a bridge and is directly connected to the CAN gateway. Therefore, attacks that successfully exploit vulnerabilities in the OBU, e.g., those arising from V2X messages or external data injection, can lead to the injection of malicious CAN frames into the internal bus via the gateway. This direct connection enables remote or nearby attackers to affect safety-critical functionality in the vehicle. An attack can lead to dangerous scenarios, including brake failure, a runaway vehicle, or complete loss of vehicle control, putting passengers and other road users at risk. One of the most dangerous threats is the flooding attack, where an attacker injects a massive number of malicious CAN messages at a high rate to jam ordinary communication. This attack exploits the priority-based arbitration mechanism of the CAN protocol: legitimate messages such as braking commands, speed control, and hazard detection signals repeatedly lose arbitration to the attack frames and are delayed or lost before reaching their destination ECUs. Important communication between ECUs is blocked, rendering important vehicle functions inoperative. Continuous message flooding can also cause certain ECUs to crash or reset, leading to unstable vehicle performance. To defend against the consequences of such attacks, we adopt a deep learning-based approach that leverages federated learning to detect attacks in an in-vehicle network. Attack detection begins with each CAV continuously receiving CAN frames to construct real-time network traffic patterns.

The system identifies unusual patterns such as unanticipated bursts in message rates, abnormally short inter-arrival times, and high bus utilization, which are typical of flooding attacks. Each vehicle uses its OBU to locally train a DL model on its historical labeled CAN traffic data so that it learns to distinguish normal from abnormal messages. After local training, the model parameters are transmitted via the CAN gateway to an external federated learning server, located at the edge network, for a global update. In this way, numerous vehicles (OBUs) collaboratively improve the IDS model by sharing model updates with the central server without sacrificing data privacy. The trained IDS model is then rolled out to each vehicle, which continuously monitors incoming CAN traffic for anomalies. By combining FL with edge computing, the solution delivers real-time, privacy-preserving, and adaptive intrusion detection to protect connected vehicles from cyberattacks without centralizing sensitive CAN data.
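To make the traffic indicators mentioned above concrete, the following Python sketch computes a message rate and a mean inter-arrival time over a sliding window. It is a hand-written heuristic for illustration, not the learned detector of this paper: the function names, the window length, and the thresholds (`max_rate_per_s`, `min_gap_ms`) are all assumptions.

```python
from statistics import mean


def window_stats(timestamps_ms, window_ms=1000):
    """Return (messages per second, mean inter-arrival time in ms)
    over the last `window_ms` milliseconds of sorted timestamps."""
    if not timestamps_ms:
        return 0.0, float("inf")
    end = timestamps_ms[-1]
    recent = [t for t in timestamps_ms if end - t <= window_ms]
    rate_per_s = len(recent) * 1000 / window_ms
    gaps = [b - a for a, b in zip(recent, recent[1:])]
    mean_gap_ms = mean(gaps) if gaps else float("inf")
    return rate_per_s, mean_gap_ms


def looks_like_flood(timestamps_ms, max_rate_per_s, min_gap_ms, window_ms=1000):
    """Flag a window whose rate is too high or whose gaps are too short.

    Both thresholds are hypothetical and would need calibration on real traffic.
    """
    rate, gap = window_stats(timestamps_ms, window_ms)
    return rate > max_rate_per_s or gap < min_gap_ms


# Synthetic traces (integer milliseconds avoid floating-point rounding).
normal = [i * 50 for i in range(40)]   # one frame every 50 ms
flood = list(range(2000))              # one frame every 1 ms

print(looks_like_flood(normal, max_rate_per_s=100, min_gap_ms=5))  # False
print(looks_like_flood(flood, max_rate_per_s=100, min_gap_ms=5))   # True
```

Fixed thresholds of this kind are brittle across vehicle models, which is one motivation for learning the normal and abnormal patterns from labeled traffic instead.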

3.2. Proposed Architecture

We consider a vehicular network where a set of vehicles operates within a specific geographical area (e.g., a city) and is connected to the same edge server. The set of vehicles, along with the edge server, constitutes a federated learning environment where the edge server is the FL server, and the vehicles are the federated learning clients and are responsible for locally training the machine learning model defined and initialized by the FL server. Each vehicle continuously collects CAN bus data and processes it using its onboard computing resources (i.e., OBU). The vehicle’s local computing entity preprocesses the data and uses it to locally train a model that has been initialized by the edge server. Once local training is completed, each vehicle sends updated model weights to the edge server via RSUs using WiFi or LTE, reducing direct cloud dependencies. The model consists of multiple layers, where the final layer is responsible for attack classification. To ensure privacy, vehicles do not transmit the last classification layer when sending updates to the edge server. The edge server acts as an aggregator; it receives the model updates from the vehicles, performs an aggregation, and sends the updated weights back to the vehicles. Figure 3 depicts the reference architecture of our framework.
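The partial update described above, in which the final classification layer stays on the vehicle, can be sketched as a filter over a dictionary of named layer weights. The layer names and both helper functions below are hypothetical, and the assumption that each vehicle keeps its own classification layer after receiving the aggregated weights is ours, not stated in the text.

```python
def shareable_update(local_weights, private_prefixes=("output",)):
    """Return only the layers a vehicle sends to the edge server.

    Layers whose name starts with one of `private_prefixes` stay local.
    """
    return {name: w for name, w in local_weights.items()
            if not name.startswith(private_prefixes)}


def merge_global(local_weights, global_weights):
    """Overwrite shared layers with aggregated values; keep private layers."""
    merged = dict(local_weights)
    merged.update(global_weights)
    return merged


# Toy weights: one short list per layer (layer names are illustrative).
local = {
    "bilstm_1": [0.1, 0.2],
    "bilstm_2": [0.3],
    "dense": [0.4],
    "output": [0.9],
}

update = shareable_update(local)
print(sorted(update))  # ['bilstm_1', 'bilstm_2', 'dense']

# Pretend the edge server aggregated the shared layers.
aggregated = {"bilstm_1": [1.0, 1.0], "bilstm_2": [1.0], "dense": [1.0]}
print(merge_global(local, aggregated)["output"])  # [0.9]
```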

3.3. Adapted-BiLSTM Architecture (a-BiLSTM)

BiLSTM is a variant of LSTM, a type of recurrent neural network that effectively learns long-term dependencies and mitigates gradient issues, significantly enhancing classification accuracy [36]. Our a-BiLSTM architecture includes two BiLSTM layers followed by a dense layer, as shown in Figure 4. The first BiLSTM layer captures long-range temporal dependencies with 64 hidden units for each direction (forward and backward) and a return sequence parameter set to True, so that it outputs the entire sequence, one output per time step, as input to the next layer. The second BiLSTM layer also has 64 hidden units per direction but uses a return sequence parameter set to False, outputting only the final state to summarize sequence information for subsequent layers. To prevent overfitting, we introduce regularization by applying a 50% dropout rate [37], randomly setting activations to zero during training. A fully connected layer with 128 units and ReLU activation follows, enabling the model to learn complex representations from the BiLSTM output. The output layer has one unit per class and a softmax activation function, and predicts class probabilities. Although stacked BiLSTM architectures are commonly used for classification in the literature, as in [38,39], we designate this model as “adapted” because its design is tailored to vehicular CAN bus data and to real-time federated learning deployment. The model balances expressiveness and computational efficiency, making it suitable for edge deployment in OBUs (Onboard Units) with limited resources. Moreover, its structure was refined through empirical testing across three different vehicle datasets to ensure robust multiclass intrusion detection under the varied data distributions typical of federated learning. Accordingly, we describe the improvements in terms of context-specific optimization and real-world deployment issues.
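As a rough check of the claim that the model is light enough for OBUs, the number of trainable parameters of the layer stack described above can be estimated with standard LSTM and dense-layer formulas. The sketch below is an assumption-laden estimate: the number of input features per time step (10) is hypothetical, the forward and backward outputs are assumed to be concatenated, and four classes are assumed (normal traffic plus three attack types). Dropout adds no parameters.

```python
def lstm_params(input_dim, units):
    """Parameters of one LSTM direction: 4 gates, each with input
    weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units


def a_bilstm_params(n_features, n_classes, units=64, dense_units=128):
    """Per-layer parameter counts for two BiLSTM layers, a dense layer,
    and a softmax output layer (concatenated directions assumed)."""
    counts = {
        "bilstm_1": 2 * lstm_params(n_features, units),
        "bilstm_2": 2 * lstm_params(2 * units, units),
        "dense": dense_params(2 * units, dense_units),
        "output": dense_params(dense_units, n_classes),
    }
    counts["total"] = sum(counts.values())
    return counts


# n_features=10 is a hypothetical input width, not a value from the paper.
counts = a_bilstm_params(n_features=10, n_classes=4)
print(counts["total"])  # 154244
```

Under these assumptions the model has roughly 154 thousand parameters, or about 0.6 MB in 32-bit floats, which also bounds the size of each update exchanged with the edge server.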

Table 1 provides the notations used in the paper for consistency and clarity.

4. Experimental Setup

In this section, we provide a complete overview of our experimental setup, consisting of the hyper-parameters, the description of the datasets, and the test setup. We describe how the dataset is preprocessed and split across federated clients to mimic realistic distribution scenarios. In addition, we define the metrics we used to assess model performance. Finally, we present the experimental environment, reporting the hardware and software specifications adopted for training and testing.

4.1. Simulation Parameters

As the optimization algorithm, we use the Adam optimizer [40] because of its capability to adaptively adjust the learning rate, leading to quicker convergence and improved management of sparse gradients. We run the training for 1 and 5 epochs. For multi-class classification, the categorical cross-entropy [41] loss function is employed, which provides an effective way to assess the accuracy of probabilistic outcomes. A batch size of 64 samples is selected to balance computational efficiency and the ability to generalize. The learning rate is fixed to 0.001, a widely accepted value known to promote steady convergence in DL and FL applications.

Table 2 summarizes the simulation parameters. Fraction Fit specifies the proportion of clients selected to train the global model in each FL round; it is set to 1, so all three clients (datasets) take part in every round. Fraction Evaluation refers to the proportion of clients selected for model evaluation; it is set to 0.5, which means that two of the three clients are chosen for testing in each round.

We use the FedAvg aggregation algorithm [42]. The global model update at round t is computed as a weighted average of the local models W_{t}^{i} of the N clients, where each client i is weighted by its number of samples n_{i}. Formula (1) [43] is given by:

\frac{\sum_{i = 1}^{N} n_{i} W_{t}^{i}}{\sum_{i = 1}^{N} n_{i}}

(1)
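The weighted average of Equation (1) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the experiments: weights are represented as dictionaries of flat lists, and the function name, layer name, and sample counts are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Sample-weighted average of per-client weights, as in Equation (1).

    client_weights: list of dicts mapping layer name -> list of floats
    client_sizes:   list of local sample counts n_i, one per client
    """
    total = sum(client_sizes)
    reference = client_weights[0]
    return {
        layer: [
            sum(n * w[layer][k] for w, n in zip(client_weights, client_sizes)) / total
            for k in range(len(reference[layer]))
        ]
        for layer in reference
    }


# Three toy clients; the third holds twice as many samples as the others.
clients = [
    {"dense": [1.0, 2.0]},
    {"dense": [3.0, 4.0]},
    {"dense": [5.0, 6.0]},
]
sizes = [100, 100, 200]

print(fedavg(clients, sizes))  # {'dense': [3.5, 4.5]}
```

Because the third client contributes half of all samples, the result is pulled toward its weights rather than landing on the unweighted mean of [3.0, 4.0].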

4.2. Dataset

The In-Vehicle Network Intrusion Detection Dataset (INV_IDC) [44] was established for the 2019 Information Security Research and Development Data Challenge and captures CAN traffic from three vehicle models: KIA Soul, Chevrolet Spark, and Hyundai Sonata. It comprises four categories of CAN message features: CAN ID, DLC, DATA Payload, and Timestamps. It contains both normal CAN messages and three types of simulated attacks that replicate genuine threats to vehicle cybersecurity, each distinguished by distinct patterns in the features of CAN bus traffic. By examining these features, an IDS can recognize deviations from standard traffic behavior and differentiate between benign and malicious actions within the IVN. The attack strategies that we evaluate and detect in this dataset are described below and presented in Figure 5, Figure 6 and Figure 7.

Flooding Attack: In a flooding attack, multiple ECU nodes simultaneously transmit CAN messages to a receiver ECU. Since message priority is determined by CAN ID values (lower values have higher priority), this attack overwhelms the CAN bus, leading to delayed or dropped messages. As a result, some critical signals fail to reach their target ECUs, disrupting vehicle functions.

Fuzzy Attack: A fuzzy attack injects randomly generated CAN messages into the bus, affecting both ID and Data fields. These messages may include valid CAN IDs extracted from the vehicle or fabricated ones. Since the injected messages do not have meaningful data, they are either processed incorrectly by ECUs or discarded, leading to unpredictable system behavior.

Malfunction Attack: This attack targets a specific CAN ID, manipulating its data field while injecting randomly selected CAN IDs. By modifying the 8-byte data values (e.g., replacing them with 00 or random values), the vehicle reacts abnormally. The manipulated messages reach the targeted ECU but contain incorrect information, causing the system to execute faulty operations or ignore valid commands.

This analysis provides insights into the nature of these attacks and their effect on the data, which is critical for developing effective intrusion detection strategies. We chose this dataset to simulate a real-world scenario where each vehicle represents an individual client, and its data remains private to that client. Specifically, we assume that each client operates independently with its own local data. To achieve this, we assigned each vehicle in the dataset to a separate client: Hyundai to Client 1, Chevrolet to Client 2, and Kia to Client 3. This setup allows us to closely model FL's core principle of decentralized training on real-world data while preserving data privacy for each vehicle.

4.3. Data Augmentation

In IID settings, local models converge toward the global optimum, leading to stable aggregation. However, in non-IID scenarios, local model gradients diverge, causing drift and reducing global model accuracy [45]. This results in biased updates, slower convergence, and inefficient communication. The Hellinger distance is a metric used to quantify the similarity between two probability distributions [46]. It can assess the degree of non-IID data among clients. A higher Hellinger distance indicates a greater disparity between local datasets, which can adversely affect the performance of the FedAvg algorithm [46].

The global Hellinger distance formula [40] is defined in Equation (2).

H(G, S) = \frac{1}{\sqrt{2}} \sqrt{\sum_{i=1}^{C} \left(\sqrt{G_{i}} - \sqrt{S_{i}}\right)^{2}}

(2)

where G represents the local label distribution, S is the reference balanced distribution, and C is the number of classes.

To compare the similarity of class distribution across the dataset, we apply the Hellinger distance measure between the three sub-datasets (i.e., Hyundai, Chevrolet and Kia).

The distance is 0.0671 between the Hyundai (Dataset 1) and Chevrolet (Dataset 2) datasets, 0.0494 between Hyundai (Dataset 1) and Kia (Dataset 3), and 0.1009 between Chevrolet (Dataset 2) and Kia (Dataset 3). These values indicate the degree of divergence between the class distributions of the datasets: the higher the Hellinger distance, the more the label distributions differ and the more non-IID the datasets are. Chevrolet and Kia have the greatest distance among the pairs (0.1009), indicating a more varied data distribution than the other pairs. In contrast, Hyundai and Kia have the smallest distance (0.0494), indicating that their class distributions are more similar. This pairwise comparison quantifies the degree of non-IID data, which can significantly impact FL model performance, particularly in algorithms like FedAvg, where differences in local distributions can lead to gradient divergence and slower convergence.
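A pairwise comparison of this kind can be sketched as follows. This is an illustrative example rather than the authors' code: it uses the standard definition of the Hellinger distance (the square root of the summed squared differences of square roots, scaled by 1/√2), and the helper names and per-class counts are hypothetical, not taken from the dataset.

```python
import math

def label_distribution(counts):
    """Turn per-class sample counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of classes")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)

# Hypothetical class counts (Normal, Flooding, Fuzzy, Malfunction) for two clients.
client_a = label_distribution([900, 40, 30, 30])
client_b = label_distribution([800, 100, 50, 50])

d_ab = hellinger(client_a, client_b)
print(f"Hellinger distance: {d_ab:.4f}")
```

The distance is 0 for identical label distributions and 1 for distributions with no class in common, so small values such as those reported above correspond to mildly non-IID clients.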

4.4. Data Preprocessing

The INV-IDC dataset was collected and labeled but requires cleaning and normalization. To clean the data, missing values are replaced with 256. Additionally, the hexadecimal values of the IDs and data payload are converted into decimal values ranging from 0 to 256. The min–max normalization technique is employed to maintain the relationships within the original data [47]. This simple and effective method scales the data within a specified range, typically between 0 and 1 [47].
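The cleaning and scaling steps can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `parse_frame` and `min_max_scale` helpers and the two example frames are hypothetical, missing payload bytes are assumed to be the bytes absent from frames shorter than 8 bytes, and scaling is assumed to be applied per feature column.

```python
MISSING = 256  # placeholder for absent payload bytes, as described above

def parse_frame(can_id_hex, payload_hex, width=8):
    """Convert a hex CAN ID and hex payload bytes to a list of decimal features.

    Frames with fewer than `width` payload bytes are padded with MISSING.
    """
    can_id = int(can_id_hex, 16)
    data = [int(b, 16) for b in payload_hex]
    data += [MISSING] * (width - len(data))
    return [can_id] + data

def min_max_scale(rows):
    """Scale every column to [0, 1]; constant columns are mapped to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical frames: (CAN ID, payload bytes).
frames = [
    ("0316", ["05", "21", "68", "09", "21", "21", "00", "6f"]),
    ("043f", ["00", "40", "60", "ff", "5a"]),  # 5 data bytes, padded to 8
]
features = [parse_frame(i, p) for i, p in frames]
scaled = min_max_scale(features)
print(features)
print(scaled)
```

Using 256 as the placeholder keeps missing bytes distinguishable from every valid byte value (0 to 255) before scaling.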

SMOTE (Synthetic Minority Oversampling Technique) is an efficient approach for tackling imbalanced datasets in machine learning by creating synthetic samples for the under-represented classes [48]. The imbalanced-learn library in Python offers a user-friendly implementation of SMOTE, enabling users to balance datasets by oversampling the minority classes. This strategy enhances model performance by establishing a fairer class distribution while avoiding the simple replication of existing samples, hence maintaining the dataset’s variety [48]. Together with cleaning and normalization, this step prepares the dataset for subsequent modeling tasks.
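The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in plain Python. This is a simplified illustration and not the imbalanced-learn implementation used in practice: the `smote_like` function, its parameters, and the sample points are hypothetical.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical normalized feature vectors of an under-represented class.
minority = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.22], [0.12, 0.30]]
new_points = smote_like(minority, n_new=6)
print(new_points)
```

Because each synthetic point lies on a segment between two real minority samples, the new samples stay inside the region occupied by the class instead of duplicating existing rows.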

4.5. Data Splitting

In our federated learning setup, we utilize three distinct datasets in each training round. These datasets correspond to different vehicle manufacturers: Dataset 1 (Hyundai), Dataset 2 (Chevrolet), and Dataset 3 (Kia). The Hyundai dataset is 41 MB, the Chevrolet dataset is 27.9 MB, and the Kia dataset is 17.8 MB, which indicates variations in data quantities across vehicles. We employ a total of 40 federated learning rounds. To ensure balanced participation of all datasets throughout training, we divide each dataset into 40 equal subsets. In each round, we use 1/40th of each dataset, ensuring a fair and consistent distribution of data across rounds.

For each round, the extracted data subset is split into training and testing sets. We follow an 80–20% split, where 80% of the data is used for model training and 20% is reserved for evaluation. Initially, the datasets exhibit significant class imbalances, as shown in Table 3. To address this, we apply SMOTE to balance class distributions before training. Table 4 presents the new class distributions after applying SMOTE.

Since each dataset is split into 40 subsets, and we apply an 80–20% training-testing split in each round, the number of samples used per round is shown in Table 5.

By implementing this data partitioning, we ensure fair training conditions across federated learning rounds while mitigating the impact of class imbalance through oversampling.
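The partitioning described in this section can be sketched as follows. This is a minimal illustration with assumptions that the text does not specify: subsets are taken as contiguous slices, any remainder rows are dropped, no shuffling is applied, and the helper names and the 10,000-row stand-in dataset are hypothetical.

```python
def round_subsets(samples, n_rounds=40):
    """Split samples into n_rounds contiguous subsets of equal size.

    Rows left over after integer division are dropped (an assumption).
    """
    size = len(samples) // n_rounds
    return [samples[r * size:(r + 1) * size] for r in range(n_rounds)]

def split_80_20(subset, train_frac=0.8):
    """Split one round's subset into training and testing parts."""
    cut = int(len(subset) * train_frac)
    return subset[:cut], subset[cut:]

samples = list(range(10_000))  # stand-in for one client's rows
subsets = round_subsets(samples)
train, test = split_80_20(subsets[0])
print(len(subsets), len(train), len(test))
```

Each client would run the same two steps on its own dataset, so the number of samples per round differs between clients in proportion to their dataset sizes.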

4.6. Evaluation Metrics

The accuracy, recall (Rec), precision (Pre), and F1-score (F1) metrics are used to evaluate the performance of our models based on the True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN) of the confusion matrix [49]. The metrics are presented in Equations (3)–(6) [49,50,51]. For a given attack type, TP are instances of that type that were correctly predicted; FP are instances of other classes that were incorrectly predicted as that type; TN are instances of other classes that were correctly identified as not belonging to that type; and FN are instances of that type that were missed [50].

Accuracy = \left(\frac{TP + TN}{TP + TN + FP + FN}\right) \times 100

(3)

Precision (Pre) = \left(\frac{TP}{TP + FP}\right) \times 100

(4)

Recall (Rec) = \left(\frac{TP}{TP + FN}\right) \times 100

(5)

F1-score (F1) = 2 \times \left(\frac{Pre \times Rec}{Pre + Rec}\right)

(6)
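For a multi-class problem, these quantities can be derived per class from the confusion matrix in a one-vs-rest fashion. The sketch below is illustrative only: the helper names and the 4-class confusion matrix are hypothetical, and overall accuracy is computed as the share of correctly classified samples.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp
        pre = 100 * tp / (tp + fp) if tp + fp else 0.0
        rec = 100 * tp / (tp + fn) if tp + fn else 0.0
        # Pre and Rec are already percentages, so F1 is a percentage too.
        f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
        results.append({"TP": tp, "FP": fp, "TN": tn, "FN": fn,
                        "Pre": pre, "Rec": rec, "F1": f1})
    return results

def overall_accuracy(cm):
    """Percentage of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return 100 * sum(cm[k][k] for k in range(len(cm))) / total

# Hypothetical confusion matrix (rows: true class, columns: predicted class).
cm = [
    [98, 2, 0, 0],
    [0, 50, 0, 0],
    [0, 0, 30, 0],
    [0, 0, 0, 20],
]
metrics = per_class_metrics(cm)
acc = overall_accuracy(cm)
print(acc, metrics[0], metrics[1])
```

In this example the two misclassified samples count as false negatives for the first class and as false positives for the second, which lowers the recall of one and the precision of the other.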

4.7. Environment Evaluation

We adapt the Flower framework from [52] to integrate our a-BiLSTM model. Flower is an open-source federated learning framework, which we use with TensorFlow 2.x for deep learning. The setup was implemented in Python 3.10 using the PyCharm 2023.3.3 Integrated Development Environment (IDE) on Windows 10 (64-bit). The hardware was an Intel i5-6 processor CPU @ 2.4 GHz with 16 GB RAM and no dedicated GPU. Flower handled client-server communication and aggregated client updates with the FedAvg algorithm, enabling local FL simulations.
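The aggregation rule of FedAvg can be sketched in a few lines. The snippet is a plain-Python illustration of the sample-weighted average, not Flower's API or the authors' code: the `fedavg` function, the flattened weight lists, and the client sample counts are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client parameters.

    client_weights: one list of floats (flattened parameters) per client.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical two-parameter models from three clients.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 100, 200]
global_weights = fedavg(weights, sizes)
print(global_weights)
```

Clients with more local samples pull the global parameters further toward their own update, which is why differences in client data distributions matter for convergence.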

5. Results Analysis

To assess the effectiveness of our proposal, we compare the following approaches:

FL-IVN-IDS-1E: our proposed approach described in Section 3 using one epoch of training;
FL-IVN-IDS-5E: our proposed approach described in Section 3 using 5 epochs of training;
LSTM-1E: federated learning based approach that uses the baseline LSTM model from [10], using 1 epoch of training;
LSTM-5E: federated learning based approach that uses the baseline LSTM model from [10] using 5 epochs of training.

5.1. Federated Learning-Based Intrusion Detection Performance

This section provides a detailed evaluation of the decentralized learning process, including validation results after each training round as well as the validation results after all rounds.

5.1.1. Per-Round Performance Evaluation

In the following, we present the history of accuracy and loss achieved by the two aforementioned approaches for different numbers of training epochs. During the evaluation phase, the global model is assessed after each training round by aggregating the evaluation results from all participants. The goal is to track the improvement of the model accuracy over successive rounds.

Figure 8 illustrates the accuracy history of FL-IVN-IDS-1E, FL-IVN-IDS-5E, FL-based LSTM-1E and LSTM-5E as a function of the number of rounds. We can notice that:

Our FL-IVN-IDS learns faster and performs better than the FL-based LSTM with both one and five epochs. With one epoch, FL-IVN-IDS-1E and LSTM-1E reach the same global accuracy of 98.2%, but during the first 10 rounds FL-IVN-IDS-1E converges better than LSTM-1E. The LSTM-1E model learns patterns well in some rounds but does not generalize in others because of its sensitivity to the training data, which may be due to high variance and poor convergence.
With five epochs, both approaches achieve higher accuracy than with one epoch and reach 100% within far fewer rounds. This can be explained by the fact that training for more epochs allows the model to learn useful patterns rather than just noise in the dataset. The accuracy of our FL-IVN-IDS-5E model consistently increases with the number of rounds, whereas the training of LSTM-5E is unstable between rounds 10 and 21 due to poor convergence.

Figure 9 presents the loss history of FL-IVN-IDS-1E, FL-IVN-IDS-5E, FL-based LSTM-1E, and LSTM-5E across federated learning rounds. We can notice that:

Among all models, FL-IVN-IDS-5E achieves the lowest loss, demonstrating superior convergence and stability over multiple rounds. With one epoch, FL-IVN-IDS-1E and FL-based LSTM-1E reach the same global loss of 0.05%, but during the first 10 rounds the loss of FL-IVN-IDS-1E decreases faster than that of LSTM-1E.
With five epochs, the loss of both approaches is lower than with one epoch: FL-IVN-IDS-5E reaches a global loss of 10⁻⁵, lower than the 0.0085% of LSTM-5E. This can be explained by the fact that training for more epochs allows the model to run more optimization iterations, progressively adjusting its parameters and reducing the loss.
Overall, the lower loss obtained with 5 epochs than with 1 epoch indicates that the model was undertrained with a single epoch and required more iterations to reach a better region of the loss landscape. This means that the model is still in the learning stage and benefits from more training without overfitting. The loss of our FL-IVN-IDS-5E model consistently decreases with the number of rounds, whereas the training of LSTM-5E is unstable between rounds 10 and 21 due to poor convergence.

5.1.2. Final Global Model Evaluation

We present the accuracy and loss trends achieved by the two previously described methods across different numbers of training epochs. The evaluation is carried out after all federated rounds have been completed, where the final global model, based on the last aggregated weights, is used to test the local dataset of each client. This approach ensures that the final performance metrics reflect the model’s generalization ability across all participants.

Figure 10 illustrates the accuracy history of FL-IVN-IDS-1E, FL-IVN-IDS-5E, FL-based LSTM-1E and LSTM-5E as a function of the number of rounds. Our proposed FL-IVN-IDS consistently demonstrates faster convergence and superior performance compared to the FL-based LSTM model, in both the one-epoch and five-epoch settings. In the one-epoch setting, FL-IVN-IDS-1E exhibits more stable and efficient convergence, whereas the LSTM-1E model shows inconsistent generalization. With five training epochs, both models achieve higher accuracy than in the one-epoch configuration, with our FL-IVN-IDS-5E model reaching 100% accuracy in fewer rounds.

Figure 11 shows the loss history of FL-IVN-IDS-1E, FL-IVN-IDS-5E, FL-based LSTM-1E and LSTM-5E across federated learning rounds. Among all the models evaluated, FL-IVN-IDS-5E consistently achieves the lowest loss, indicating superior convergence and training stability over multiple rounds. When trained with only one epoch, FL-IVN-IDS-1E records a global loss of 1.206% and converges better than FL-based LSTM-1E, which obtains 5.43%. With five training epochs, both models show improved performance, but FL-IVN-IDS-5E significantly outperforms LSTM-5E, achieving a minimal global loss of 0.095%, compared to 0.0085% for LSTM-5E. This improvement is attributed to the additional optimization iterations available in multi-epoch training, which allow the model to better adjust its parameters and minimize the loss. In general, the consistently decreasing loss of FL-IVN-IDS-5E suggests effective learning without overfitting, while LSTM-5E exhibits unstable training behavior and poor convergence.

Table 6, Table 7, Table 8 and Table 9 present the classification performance of all the models in FL, i.e., Precision, Recall, F1-Score, and the number of test samples (Support).

The comparison between FL-based LSTM and FL-IVN-IDS across one and five training epochs highlights the consistent superiority of the FL-IVN-IDS model. After a single epoch, FL-IVN-IDS-1E achieves an accuracy of 99.73% with perfect F1-scores across all classes, outperforming FL-based LSTM-1E, which records 98.15% accuracy and slightly lower F1-scores for some classes (e.g., 0.96).
As previously explained, with five epochs of training, both models show improved performance, but FL-IVN-IDS-5E reaches 100% accuracy, compared to 99.96% for LSTM-5E. These results indicate that more training epochs yield better performance, and that FL-IVN-IDS outperforms FL-based LSTM across all scores with both one and five epochs.

Figure 12 and Figure 13 show the Confusion Matrix (CM) and Receiver Operating Characteristic (ROC) curve of our FL-IVN-IDS-5E model, clearly illustrating its exceptional classification performance on all four classes: Normal, Flooding, Fuzzy, and Malfunction. By evaluating the final global model obtained after all FL rounds we observe excellent results across all metrics, indicating that our approach successfully aggregated high-quality local models into a highly accurate global model.

The confusion matrix (CM) reflects the relationship between True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) of FL-IVN-IDS-5E and LSTM-5E models. The CM in Figure 12 exhibits only diagonal entries, signifying that all instances were correctly classified with zero misclassifications.
Meanwhile, the ROC curves in Figure 13 reach 100% for every class, reflecting a True Positive Rate (TPR) of 1.0 and a False Positive Rate (FPR) of 0.0. This perfect separation between positive and negative cases confirms that the model achieves optimal detection accuracy.
The ideal diagonal pattern in the confusion matrix, combined with flawless ROC performance, reinforces the robustness and reliability of the FL-IVN-IDS-5E model in accurately identifying both attacks and normal in-vehicle traffic.

Figure 14 and Figure 15 show the CM and ROC curves of the LSTM-5E model.

Figure 14 shows the CM of the LSTM-5E model, which is almost entirely diagonal.
For each class, the model correctly identifies nearly all instances (TP) and correctly rejects the other classes (TN). The exception is the Normal class, where 2 instances are false negatives (FN), leading to near-perfect classification performance.
Figure 15 illustrates the ROC curve, where each class achieves a 100% score. This result signifies that the TPR is 1.0 and the FPR is 0, i.e., the model LSTM-5E classifies classes with perfect separation and no error.
A TPR of 100% guarantees that the model classifies all true occurrences of each class and an FPR of 0% signifies that there are no false alarms.

5.2. Centralized Learning-Based Intrusion Detection Performance

In the centralized evaluation, we merge the three datasets by aligning the rows. The model is trained and tested independently on each dataset corresponding to the three clients (Hyundai, Chevrolet, and Kia). For each dataset, 80% is used for training and 20% for testing. This allows us to evaluate the performance of the model in a non-federated setting on each vehicle-specific dataset separately. During training, the model learns to classify CAN messages using the Adam optimizer. The loss function is categorical cross-entropy, given the multi-class nature of the classification task. The evaluation phase measures the model’s accuracy on the test set after each epoch to assess its generalization performance.
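For reference, the categorical cross-entropy between one-hot labels and predicted class probabilities can be computed as in the following sketch. It is a plain-Python illustration of the loss definition, not the TensorFlow implementation used in training; the function name, the clipping constant `eps`, and the example predictions are hypothetical.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    losses = []
    for t, p in zip(y_true, y_pred):
        # Clip probabilities away from zero so the logarithm stays finite.
        losses.append(-sum(ti * math.log(max(pi, eps)) for ti, pi in zip(t, p)))
    return sum(losses) / len(losses)

# Hypothetical batch: classes are Normal, Flooding, Fuzzy, Malfunction.
y_true = [[1, 0, 0, 0], [0, 1, 0, 0]]
y_pred = [[0.90, 0.05, 0.03, 0.02], [0.10, 0.80, 0.05, 0.05]]
loss = categorical_crossentropy(y_true, y_pred)
print(f"loss = {loss:.4f}")
```

Only the probability assigned to the true class contributes to each sample's loss, so the loss approaches zero as the model becomes confident and correct on every class.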

Figure 16 depicts the accuracy history using the traditional centralized learning as a function of the number of epochs, using a-BiLSTM and LSTM [10]. The three sub-datasets (i.e., Hyundai, Chevrolet and Kia) are used to train the model on the same node. Note that early stopping is applied to determine the optimal number of epochs and prevent overfitting. We can note that:

Our a-BiLSTM model always performs better than the LSTM [10] model for all numbers of epochs and for the three testing sub-datasets.
When testing on the Sonata dataset, our a-BiLSTM learns faster than the LSTM model [10] during the first 10 epochs and achieves higher accuracy. We can also observe that the LSTM model learns patterns well in some epochs but does not generalize in others (e.g., epoch 3) due to sensitivity to the training data. This can be explained by the model memorizing noise or spurious correlations rather than generalizable features, so that it fits some batches perfectly but generalizes poorly to others. In the Spark and Soul datasets, both models perform well, but a-BiLSTM converges more quickly.

Figure 17 illustrates the loss history when training using the traditional centralized approach across epochs. Early stopping is applied to determine the optimal number of epochs and prevent overfitting. We can notice that:

In the Sonata dataset, a-BiLSTM achieves a lower loss compared to LSTM, indicating better convergence. In the Spark and Soul datasets, both models perform well, but a-BiLSTM reaches a lower loss value more quickly.
The LSTM approach shows a loss peak at epoch 3. As already explained, this may be because the model fails to learn generalizable features, fitting some batches perfectly but generalizing poorly to others.

5.3. Network Traffic Overhead

Figure 18 illustrates the amount of network traffic sent and received by the FL server during the training. We can notice that:

Network traffic increases with the number of epochs for both models. A possible explanation is that more training per round leads to larger or more frequent updates from clients.
The communication patterns of both models remain generally comparable. However, a-BiLSTM shows lower or similar data transfer volumes in both the 1-epoch and 5-epoch settings, which suggests that our model has comparable communication demands despite its more complex architecture; the upper bound on data transmission observed with LSTM is on the order of $10^{8}$ bytes.

5.4. Time Evaluation

Figure 19 illustrates the distribution of training time over 40 rounds for the two approaches under consideration using 1 and 5 epochs: FL-IVN-IDS-5E, LSTM-5E, FL-IVN-IDS-1E, and LSTM-1E.

Our FL-IVN-IDS model is computationally more efficient, consuming significantly less training time per round than the LSTM-based model, which makes it more deployable in near real-time applications.
In the 1E (1 epoch per round) configurations, FL-IVN-IDS-1E trains for 3326.08 seconds, whereas LSTM-1E takes 4687.35 seconds, an improvement of 29% in training efficiency. The lower computational expense of FL-IVN-IDS-1E suggests that it can perform well even in resource-constrained systems where saving processing time and energy is paramount.
For the 5E (5 epochs per round) configurations, our FL-IVN-IDS-5E model is trained in 10,202.43 seconds, as opposed to 17,309.56 seconds for LSTM-5E, reflecting a considerable 40.8% reduction in training time. This enhanced efficiency stems from the improved structure of FL-IVN-IDS, which effectively balances model complexity and training efficiency in a federated setup. The shorter training time per round indicates that FL-IVN-IDS-5E converges and updates faster than LSTM-5E, hence reducing computational expense and improving scalability.

The experimentally observed differences between the 1E and 5E settings show that increasing the epochs per round significantly influences training time, with LSTM-5E being the most impacted because of its higher computational demands. Our FL-IVN-IDS model, in both the 1E and 5E variants, achieves a better balance between model performance and training efficiency and is thus a more scalable choice for federated learning deployments in intelligent vehicular networks. By reducing training time without compromising high performance, FL-IVN-IDS is well suited to real-time and large-scale federated learning, as it facilitates faster model updates, lower device energy consumption, and improved adaptability to real-world limitations.

5.5. Models Comparison

Table 10 summarizes the comparative results of the four models regarding testing accuracy and training time. The results show that FL-IVN-IDS-1E and FL-IVN-IDS-5E outperform the LSTM models in accuracy and training efficiency. Notably, FL-IVN-IDS-5E achieves 100% accuracy with significantly less training time than the LSTM-5E model, which also performs very well (99.95%) but with higher computational time (20,139.08 s). Further, with 5 local epochs instead of 1, accuracy consistently improves for both LSTM and FL-IVN-IDS, but FL-IVN-IDS-5E is more efficient and scalable. These results indicate the effectiveness of the proposed FL-IVN-IDS model, especially when applied with multiple local epochs.

6. Conclusions and Future Work

In this paper, we tackled a critical limitation of the CAN protocol: its lack of data encryption, which makes it vulnerable to a wide range of cyberattacks. To mitigate this issue, we proposed FL-IVN-IDS, a federated intrusion detection system that can effectively classify attacks based on multiple datasets. Quantitatively, FL-IVN-IDS-5E achieved 100% testing accuracy, outperforming the LSTM-5E model, which required more time to reach a slightly lower accuracy of 99.95%. Similarly, FL-IVN-IDS-1E reached 99.73% accuracy, compared to 98.15% for LSTM-1E. This demonstrates that FL-IVN-IDS not only improves classification performance but also reduces training time significantly, by up to 40% when compared with LSTM-5E.

Our model achieves 100% on all evaluation metrics, significantly better than the baseline FL-based LSTM architecture in terms of accuracy, precision, recall, F1-score, and training time. The confusion matrix and ROC curves further substantiate the effectiveness of our approach. Furthermore, our solution generates low communication overhead, making it highly scalable and deployable on smart CAV networks. Our FL-IVN-IDS model, in both the 1E and 5E variants, achieved a superior balance between model performance and training efficiency. By reducing training time without compromising accuracy, FL-IVN-IDS emerges as a strong solution for large-scale and real-time federated learning applications. It enables faster model updates, lower device energy consumption, and greater flexibility towards real-world constraints. Thus, it is a very effective choice for improving cybersecurity in intelligent vehicular networks.

Future Works

Future work may consider enhancing privacy by adding strong encryption of the FL-IVN-IDS weights to protect model parameters from attacks. It may also consider CAN-FD (CAN with flexible data rate) datasets to train FL-IVN-IDS for higher data rates, improved efficiency, and better scalability for modern CAVs. Future work can also explore how pre-trained LLMs (e.g., transformer models) can be adapted and customized through federated learning on structured vehicle data (CAN messages). This involves modifying LLMs to handle time-series or sequence-based inputs through the use of embedding layers or tokenization methods suitable for CAN traffic patterns. Every device or edge vehicle can fine-tune the local model using its local traffic data and send only the model updates (not the actual data) to the central aggregator. In this way, the LLM can be trained on local attack trends per vehicle and, over time, improve its generalization to detect more general types of attacks fleet-wide without compromising data privacy.

Author Contributions

Conceptualization, by M.G., N.-E.-H.Y. and S.B.; methodology, M.G., N.-E.-H.Y., A.B. and S.B.; software, M.G.; validation, N.-E.-H.Y., S.B. and A.B.; formal analysis, M.G., N.-E.-H.Y., S.B. and A.B.; investigation, M.G., N.-E.-H.Y., S.B. and A.B.; resources, M.G.; data curation, M.G.; writing—original draft preparation, M.G., N.-E.-H.Y., S.B. and A.B.; writing—review and editing, M.G., N.-E.-H.Y., S.B. and A.B.; visualization, N.-E.-H.Y. and M.G.; supervision, N.-E.-H.Y., S.B. and A.B.; project administration, M.G.; funding acquisition, M.G., N.-E.-H.Y., S.B. and A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were generated as part of this study. The data used in this study is publicly available and can be accessed at https://ocslab.hksecurity.net/Datasets/datachallenge2019/car (accessed on 1 August 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CAN	Controller Area Network
ML	Machine Learning
DL	Deep Learning
IVN	In-Vehicle Network
IDS	Intrusion Detection System
CAV	Connected and Autonomous Vehicle
ECU	Electronic Control Unit
OBU	On-Board Unit
RSU	Road Side Unit
V2X	Vehicle-to-Everything
ReLU	Rectified Linear Unit
FL	Federated Learning
FL-IVN-IDS	FL-based BiLSTM with 5 epochs
LSTM	Long Short-Term Memory
BiLSTM	Bidirectional Long Short-Term Memory
ROC	Receiver Operating Characteristic
CPU	Central Processing Unit
GPU	Graphics Processing Unit
CM	Confusion Matrix

References

Yusuf, S.A.; Khan, A.; Souissi, R. Vehicle-to-everything (V2X) in the autonomous vehicles domain—A technical review of communication, sensor, and AI technologies for road user safety. Transp. Res. Interdiscip. Perspect. 2024, 23, 100980.
Wang, B.; Li, W.; Khattak, Z.H. Anomaly Detection in Connected and Autonomous Vehicle Trajectories Using LSTM Autoencoder and Gaussian Mixture Model. Electronics 2024, 13, 1251.
Ayala, R.; Mohd, T.K. Sensors in autonomous vehicles: A survey. J. Auton. Veh. Syst. 2021, 1, 031003.
Palaniswamy, B.; Camtepe, S.; Foo, E.; Pieprzyk, J. An efficient authentication scheme for intra-vehicular controller area network. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3107–3122.
Oladimeji, D.; Rasheed, A.; Varol, C.; Baza, M.; Alshahrani, H.; Baz, A. CANAttack: Assessing Vulnerabilities within Controller Area Network. Sensors 2023, 23, 8223.
Earth2 Digital. What is Vehicle CAN Bus ECU? Available online: https://www.earth2.digital/blog/what-is-vehicle-can-bus-ecu-evoque-adam-ali.html (accessed on 18 June 2024).
Sedar, R.; Vázquez-Gallego, F.; Casellas, R.; Vilalta, R.; Muñoz, R.; Silva, R.; Alonso-Zarate, J. Standards-compliant multi-protocol on-board unit for the evaluation of connected and automated mobility services in multi-vendor environments. Sensors 2021, 21, 2090.
Yu, D.; Hsu, R.H.; Lee, J.; Lee, S. EC-SVC: Secure CAN bus in-vehicle communications with fine-grained access control based on edge computing. IEEE Trans. Inf. Forensics Secur. 2022, 17, 1388–1403.
Lampe, B.; Meng, W. A survey of deep learning-based intrusion detection in automotive applications. Expert Syst. Appl. 2023, 221, 119771.
Hossain, M.D.; Inoue, H.; Ochiai, H.; Fall, D.; Kadobayashi, Y. LSTM-based intrusion detection system for in-vehicle CAN bus communications. IEEE Access 2020, 8, 185489–185502.
Kan, X.; Zhou, Z.; Yao, L.; Zuo, Y. Research on Anomaly Detection in Vehicular CAN Based on Bi-LSTM. J. Cyber Secur. Mobil. 2023, 12, 629–652.
Buscemi, A.; Turcanu, I.; Castignani, G.; Panchenko, A.; Engel, T.; Shin, K.G. A survey on controller area network reverse engineering. IEEE Commun. Surv. Tutor. 2023, 25, 1445–1481.
Nazakat, I.; Khurshid, K. Intrusion detection system for in-vehicular communication. In Proceedings of the 2019 15th International Conference on Emerging Technologies (ICET), Islamabad, Pakistan, 2–3 December 2019; pp. 1–6.
Bhatia, R.; Kumar, V.; Serag, K.; Celik, Z.B.; Payer, M.; Xu, D. Evading Voltage-Based Intrusion Detection on Automotive CAN. In Proceedings of the NDSS, Virtual Conference, 21–25 February 2021.
Almehdhar, M.; Albaseer, A.; Khan, M.A.; Abdallah, M.; Menouar, H.; Al-Kuwari, S.; Al-Fuqaha, A. Deep learning in the fast lane: A survey on advanced intrusion detection systems for intelligent vehicle networks. IEEE Open J. Veh. Technol. 2024, 5, 869–906.
Nguyen, T.P.; Nam, H.; Kim, D. Transformer-based attention network for in-vehicle intrusion detection. IEEE Access 2023, 11, 55389–55403.
Ma, H.; Cao, J.; Mi, B.; Huang, D.; Liu, Y.; Li, S. A GRU-based lightweight system for CAN intrusion detection in real time. Secur. Commun. Netw. 2022, 2022, 5827056.
Sun, H.; Chen, M.; Weng, J.; Liu, Z.; Geng, G. Anomaly detection for in-vehicle network using CNN-LSTM with attention mechanism. IEEE Trans. Veh. Technol. 2021, 70, 10880–10893.
Al-Aql, N.; Al-Shammari, A. Hybrid RNN-LSTM networks for enhanced intrusion detection in vehicle CAN systems. J. Electr. Syst. 2024, 20, 3019–3031.
Althunayyan, M.; Javed, A.; Rana, O. A Survey of Learning-Based Intrusion Detection Systems for In-Vehicle Network. arXiv 2025, arXiv:2505.11551.
Al-Quayed, F.; Tariq, N.; Humayun, M.; Khan, F.A.; Khan, M.A.; Alnusairi, T.S. Securing the Road Ahead: A Survey on Internet of Vehicles Security Powered by a Conceptual Blockchain-Based Intrusion Detection System for Smart Cities. Trans. Emerg. Telecommun. Technol. 2025, 36, e70133.
Wang, K.; Sun, Z.; Wang, B.; Fan, Q.; Li, M.; Zhang, H. ATHENA: An In-vehicle CAN Intrusion Detection Framework Based on Physical Characteristics of Vehicle Systems. arXiv 2025, arXiv:2503.17067.
Liu, Y.; Xue, L.; Wang, S.; Luo, X.; Zhao, K.; Jing, P.; Ma, X.; Tang, Y.; Zhou, H. Vehicular Intrusion Detection System for Controller Area Network: A Comprehensive Survey and Evaluation. IEEE Trans. Intell. Transp. Syst. 2025, 26, 10979–11009.
Gesteira-Miñarro, R.; López, G.; Palacios, R. Revisiting Wireless Cyberattacks on Vehicles. Sensors 2025, 25, 2605.
Abdullah, M.F.A.; Yogarayan, S.; Razak, S.F.A.; Azman, A.; Amin, A.H.M.; Salleh, M. Edge computing for vehicle to everything: A short review. F1000Research 2023, 10, 1104.
Liu, L.; Wu, B.; Shi, W. A comparison of communication mechanisms in vehicular edge computing. In 3rd USENIX Workshop on Hot Topics in Edge Computing (HotEdge 20); USENIX Association: Berkeley, CA, USA, 2020.
Sharmin, S.; Mansor, H.; Abdul Kadir, A.F.; Aziz, N.A. Benchmarking frameworks and comparative studies of Controller Area Network (CAN) intrusion detection systems: A review. J. Comput. Secur. 2024; preprint.
Hossain, M.D.; Inoue, H.; Ochiai, H.; Fall, D.; Kadobayashi, Y. An effective in-vehicle CAN bus intrusion detection system using CNN deep learning approach. In Proceedings of the 2020 IEEE Global Communications Conference (GLOBECOM), Taipei, Taiwan, 7–11 December 2020.
Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. 2019, 10, 1–19.
Wei, K.; Li, J.; Ma, C.; Ding, M.; Wei, S.; Wu, F.; Ranbaduge, T. Vertical federated learning: Challenges, methodologies and experiments. arXiv 2022, arXiv:2202.04309.
Huang, W.; Li, T.; Wang, D.; Du, S.; Zhang, J.; Huang, T. Fairness and accuracy in horizontal federated learning. Inf. Sci. 2022, 589, 170–185.
Boualouache, A.; Brik, B.; Rahal, R.; Ghamri-Doudane, Y.; Senouci, S.M. Federated Learning for Zero-Day Attack Detection in 5G and Beyond V2X Networks. arXiv 2024, arXiv:2407.03070.
Selamnia, A.; Brik, B.; Senouci, S.M.; Boualouache, A.; Hossain, S. Edge Computing-enabled Intrusion Detection for C-V2X Networks using Federated Learning. In Proceedings of the GLOBECOM 2022–2022 IEEE Global Communications Conference, Rio de Janeiro, Brazil, 4–8 December 2022.
Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), Madeira, Portugal, 22–24 January 2018.
Yang, J.; Hu, J.; Yu, T. Federated AI-enabled in-vehicle network intrusion detection for Internet of Vehicles. Electronics 2022, 11, 3658.
Al-Smadi, B.S. DeBERTa-BiLSTM: A multi-label classification model of Arabic medical questions using pre-trained models and deep learning. Comput. Biol. Med. 2024, 170, 107921.
Desta, A.K.; Ohira, S.; Arai, I. ID sequence analysis for intrusion detection in the CAN bus using long, short-term memory networks. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Austin, TX, USA, 23–27 March 2020.
Khan, W.; Minallah, N.; Sher, M.; Khan, M.A.; Rehman, A.U.; Al-Ansari, T.; Bermak, A. Advancing crop classification in smallholder agriculture: A multifaceted approach combining frequency-domain image coregistration, transformer-based parcel segmentation, and Bi-LSTM for crop classification. PLoS ONE 2024, 19, e0299350. [Google Scholar] [CrossRef]
Natha, S.; Ahmed, F.; Siraj, M.; Lagari, M.; Altamimi, M.; Chandio, A.A. Deep BiLSTM Attention Model for Spatial and Temporal Anomaly Detection in Video Surveillance. Sensors 2025, 25, 251. [Google Scholar] [CrossRef]
Yadav, D.P.; Rathor, S. Bone fracture detection and classification using deep learning approach. In Proceedings of the 2020 International Conference on Power Electronics & IoT Applications in Renewable Energy and its Control (PARC), Mathura, India, 28–29 February 2020. [Google Scholar] [CrossRef]
Ho, Y.; Wookey, S. The real-world-weight cross-entropy loss function: Modeling the costs of mislabeling. IEEE Access 2019, 8, 4806–4813. [Google Scholar] [CrossRef]
Wu, H. Intrusion Detection Model for Wireless Sensor Networks Based on FedAvg and XGBoost Algorithm. Int. J. Distrib. Sens. Netw. 2024, 2024, 5536615. [Google Scholar] [CrossRef]
McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. Artif. Intell. Stat. 2019, 54, 1273–1282. [Google Scholar]
In-Vehicle Network Intrusion Detection Dataset. Available online: https://ocslab.hksecurity.net/Datasets/datachallenge2019/car (accessed on 18 June 2024).
Lv, Y.; Ding, H.; Wu, H.; Zhao, Y.; Zhang, L. FedRDS: Federated learning on non-iid data via regularization and data sharing. Appl. Sci. 2023, 13, 12962. [Google Scholar] [CrossRef]
Tan, Q.; Wu, S.; Tao, Y. Privacy-enhanced federated learning for non-iid data. Mathematics 2023, 11, 4123. [Google Scholar] [CrossRef]
Zhang, J.; Zhang, X.; Liu, Z.; Fu, F.; Jiao, Y.; Xu, F. A Network Intrusion Detection Model Based on BiLSTM with Multi-Head Attention Mechanism. Electronics 2023, 12, 4170. [Google Scholar] [CrossRef]
Lemaitre, G.; Nogueira, F.; Aridas, C.K. Imbalanced-learn: A Python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 2017, 18, 559–563. [Google Scholar]
Wang, K.; Zhang, A.; Sun, H.; Wang, B. Analysis of recent deep-learning-based intrusion detection methods for in-vehicle network. IEEE Trans. Intell. Transp. Syst. 2022, 24, 1843–1854. [Google Scholar] [CrossRef]
Alalwany, E.; Mahgou, I. An Effective Ensemble Learning-Based Real-Time Intrusion Detection Scheme for an In-Vehicle Network. Electronics 2024, 13, 919. [Google Scholar] [CrossRef]
Ji, H.; Wang, Y.; Qin, H.; Wang, Y. Comparative performance evaluation of intrusion detection methods for in-vehicle networks. IEEE Access 2018, 6, 37523–37532. [Google Scholar] [CrossRef]
Flower Framework. Available online: https://flower.ai/docs/framework/tutorial-series-what-is-federated-learning.html (accessed on 1 August 2024).

Figure 1. CAN communication in autonomous vehicles.

Figure 2. CAN frame format.

Figure 3. Reference architecture of FL-IVN-IDS.

Figure 4. Adapted-BiLSTM Architecture for IVN-IDS.

Figure 5. Flood Attack Strategy.

Figure 6. Fuzzy Attack Strategy.

Figure 7. Malfunction Attack Strategy.

Figure 8. Accuracy Training History of Distributed Learning for Per-Round Evaluation.

Figure 9. Loss Training History of Distributed Learning for Per-Round Evaluation.

Figure 10. Accuracy Training History of Distributed Learning for Final Global Model Evaluation.

Figure 11. Loss Training History of Distributed Learning for Final Global Model Evaluation.

Figure 12. Confusion Matrix of FL-IVN-IDS-5E Architecture.

Figure 13. ROC of FL-IVN-IDS-5E Architecture.

Figure 14. Confusion Matrix of LSTM-5E Architecture.

Figure 15. ROC of LSTM-5E Architecture.

Figure 16. Accuracy Training History of Centralized Learning.

Figure 17. Loss Training History of Centralized Learning.

Figure 18. Network Traffic Exchanged During the Training Period.

Figure 19. Distribution of the per-round training time using the four approaches.

Table 1. Summary of Notations.

Symbol	Description
N	Number of clients in federated learning
E	Number of local training epochs per round
B	Batch size for local training
$w_{t}$	Global model weights at round t
$w_{t}^{i}$	Local model weights of client i at round t
H	Hellinger distance
G	Local label distribution
S	Reference balanced distribution
C	Local dataset of client i
$TP$	Instances of a given attack type that were correctly predicted as that type
$TN$	Instances not belonging to a given attack type that were correctly predicted as not belonging to it
$FP$	Instances not belonging to a given attack type that were incorrectly predicted as that type
$FN$	Instances of a given attack type that were missed (predicted as another class)
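
The symbols H, G, and S in Table 1 suggest that label skew at a client is quantified as the Hellinger distance between its local label distribution and a balanced reference. A minimal Python sketch of that measurement follows, assuming the standard discrete definition H(G, S) = (1/√2) · sqrt(Σ_k (√g_k − √s_k)²); the paper's exact normalization may differ. The function names are illustrative, and the class counts are the Hyundai row of Table 3.

```python
import math


def normalize(counts):
    """Turn raw class counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2)


# Hyundai class counts before oversampling (Normal, Flooding, Fuzzy, Malfunction).
hyundai_counts = [236607, 17093, 9095, 8202]
G = normalize(hyundai_counts)   # local label distribution
S = [0.25, 0.25, 0.25, 0.25]    # balanced reference over the four classes
H = hellinger(G, S)
print(f"H(G, S) = {H:.3f}")
```

A value near 0 would indicate an almost balanced client, while values approaching 1 indicate that nearly all samples fall into classes the reference barely covers.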

Table 2. Complete Simulation Parameters and Settings.

Category	Parameter	Value/Description
General Setup	Number of Clients	3 clients (each using a separate dataset)
	Server Rounds	40
	Fraction of Clients per Round	1.0 (fit), 0.5 (evaluate)
	Local Epochs per Round	{1, 5}
	Batch Size	64
Model Architecture	Input Shape	(11, 1)
	Layer 1	Bidirectional LSTM with 64 units, return_sequences = True
	Layer 2	Bidirectional LSTM with 64 units
	Dropout Layer	Dropout rate = 0.5
	Dense Layer	Dense layer with 128 units, ReLU activation
	Output Layer	Dense layer with 4 units, Softmax activation
	Loss Function	Categorical Crossentropy
	Optimizer	Adam (learning rate = 0.001)
	Metrics	Accuracy
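
To relate the architecture in Table 2 to the size of a model update, the trainable parameters can be counted by hand. The sketch below assumes the usual LSTM parameterization (four gates, each with an input kernel, a recurrent kernel, and one bias vector), that the two directions of each bidirectional layer are concatenated, and that weights are exchanged as uncompressed 32-bit floats. None of these details are stated in the table, so the figures are an estimate.

```python
def lstm_params(units: int, input_dim: int) -> int:
    """Trainable parameters of one LSTM direction: 4 gates, each with an
    input kernel, a recurrent kernel, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(inputs: int, outputs: int) -> int:
    """Trainable parameters of a fully connected layer (weights + biases)."""
    return inputs * outputs + outputs


# Layer sizes from Table 2; input shape (11, 1) means 1 feature per time step.
layers = {
    "bilstm_1": 2 * lstm_params(64, 1),    # two directions
    "bilstm_2": 2 * lstm_params(64, 128),  # input is the 128-wide concatenation
    "dense": dense_params(128, 128),
    "output": dense_params(128, 4),
}
total_params = sum(layers.values())
update_bytes = total_params * 4  # assumption: float32 weights, no compression

for name, count in layers.items():
    print(f"{name:9s} {count:>7,d}")
print(f"total     {total_params:>7,d} parameters, ~{update_bytes / 1e6:.2f} MB per update")
```

Under these assumptions the model has roughly 150 thousand parameters, so each client upload or server broadcast would carry about 0.6 MB before any protocol overhead. The dropout layer contributes no parameters.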

Table 3. Class Distribution Before Oversampling.

Dataset	Normal	Flooding	Fuzzy	Malfunction
Hyundai	236,607	17,093	9095	8202
Chevrolet	154,960	14,999	3043	3995
Kia	334,542	16,072	21,613	4770

Table 4. Class Distribution After Oversampling (SMOTE).

Dataset	Normal	Flooding	Fuzzy	Malfunction
Hyundai	189,297	189,297	189,297	189,297
Chevrolet	123,987	123,987	123,987	123,987
Kia	267,616	267,616	267,616	267,616
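
Table 4 shows every minority class raised to the size of the Normal class. The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in plain Python as below. This is an illustration of the principle only, not the imbalanced-learn implementation cited in the references (where the equivalent call would be `SMOTE().fit_resample(X, y)`); the function name `smote_like`, the neighbour count, and the toy data are assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy 2-D minority class (hypothetical values, not CAN features).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(minority, n_new=6)
print(len(new_points), new_points[0])
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region already covered by the original data.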

Table 5. Number of Training and Testing Samples Per Round After Oversampling.

Dataset	Training Samples (80%)	Testing Samples (20%)
Hyundai	15,144	3786
Chevrolet	9918	2480
Kia	21,410	5352
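
The per-round sample counts in Table 5 matter for aggregation if the server uses sample-weighted FedAvg (McMahan et al., cited in the references), where the global weights are w_t = Σ_i (n_i / n) · w_t^i. The sketch below assumes that rule; n_i (the number of training samples of client i) is notation introduced here, and the toy weight vectors are hypothetical stand-ins for real model weights.

```python
def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client weight vectors (FedAvg aggregation)."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(n_params)
    ]


# Per-round training samples from Table 5.
train_sizes = {"Hyundai": 15144, "Chevrolet": 9918, "Kia": 21410}
total = sum(train_sizes.values())
coefficients = {name: n / total for name, n in train_sizes.items()}
for name, c in coefficients.items():
    print(f"{name:9s} n_i/n = {c:.3f}")

# Toy two-parameter models standing in for w_t^i (hypothetical values).
local_weights = [[0.10, -0.20], [0.30, 0.00], [0.20, 0.40]]
global_weights = fedavg(local_weights, list(train_sizes.values()))
print(global_weights)
```

With these counts the Kia client would contribute close to half of each aggregate and the Chevrolet client about a fifth. The Table 5 values also appear consistent with each client spreading its oversampled data (four classes from Table 4) evenly over the 40 rounds of Table 2 and splitting each share 80/20; for Hyundai, 4 × 189,297 / 40 × 0.8 ≈ 15,144.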

Table 6. Classification Report for FL-based LSTM-1E Model: Precision, Recall, F1-Score, and Support.

Class	Precision	Recall	F1-Score	Support
Normal	0.96	0.97	0.96	1203
Flooding	0.99	1.00	1.00	1148
Fuzzy	0.98	0.96	0.97	1177
Malfunction	0.99	1.00	1.00	1218
Accuracy (FL-based LSTM-1E)	0.9815
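
The per-class columns in the classification reports follow from the counts defined in Table 1: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is their harmonic mean. The sketch below applies these formulas to illustrative counts for the Fuzzy class. The counts are not taken from the paper; they were chosen so that TP + FN equals the reported support of 1177 and the rounded results match the Fuzzy row of Table 6.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall, and F1 from one-vs-rest counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative one-vs-rest counts for the Fuzzy class (assumed, not reported).
tp, fp, fn = 1130, 23, 47
p, r, f1 = precision_recall_f1(tp, fp, fn)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# prints: precision=0.98 recall=0.96 f1=0.97
```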

Table 7. Classification Report for FL-IVN-IDS-1E Model: Precision, Recall, F1-Score, and Support.

Class	Precision	Recall	F1-Score	Support
Normal	1.00	0.99	0.99	1203
Flooding	1.00	1.00	1.00	1148
Fuzzy	1.00	1.00	1.00	1177
Malfunction	1.00	1.00	1.00	1218
Accuracy (FL-IVN-IDS-1E)	0.9973

Table 8. Classification Report for LSTM-5E Model: Precision, Recall, F1-Score, and Support.

Class	Precision	Recall	F1-Score	Support
Normal	1.00	1.00	1.00	1203
Flooding	1.00	1.00	1.00	1148
Fuzzy	1.00	1.00	1.00	1177
Malfunction	1.00	1.00	1.00	1218
Accuracy (LSTM-5E)	0.9996

Table 9. Classification Report for FL-IVN-IDS-5E Model: Precision, Recall, F1-Score, and Support.

Class	Precision	Recall	F1-Score	Support
Normal	1.00	1.00	1.00	1203
Flooding	1.00	1.00	1.00	1148
Fuzzy	1.00	1.00	1.00	1177
Malfunction	1.00	1.00	1.00	1218
Accuracy (FL-IVN-IDS-5E)	1.0000

Table 10. Comparison of Model Performance in Testing Accuracy and Training Time (Global Model).

Model	Accuracy (%)	Time (s)
LSTM-1E	98.15	5300.74
LSTM-5E	99.95	20,139.08
FL-IVN-IDS-1E	99.73	3822.06
FL-IVN-IDS-5E	100	12,001.79
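
The training times in Table 10 can be turned into relative savings with a short calculation. The sketch below uses only the values reported in the table; the helper name `reduction_pct` is illustrative.

```python
# Training times in seconds, copied from Table 10.
times = {
    "LSTM-1E": 5300.74,
    "LSTM-5E": 20139.08,
    "FL-IVN-IDS-1E": 3822.06,
    "FL-IVN-IDS-5E": 12001.79,
}


def reduction_pct(baseline, proposed):
    """Relative time saved by the proposed model, in percent of the baseline."""
    return 100 * (baseline - proposed) / baseline


for epochs in ("1E", "5E"):
    pct = reduction_pct(times[f"LSTM-{epochs}"], times[f"FL-IVN-IDS-{epochs}"])
    print(f"{epochs}: FL-IVN-IDS is {pct:.1f}% faster than the LSTM baseline")
```

This gives a reduction of about 27.9% with one local epoch and about 40.4% with five local epochs, alongside the higher accuracy reported in the same rows.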

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

