A Novel Adaptive Cluster-Based Federated Learning Framework for Anomaly Detection in VANETs

Ch, Ravikumar; Sudheer, P; Batra, Isha; Sembiring, Falentino

doi:10.3390/engproc2025107079

Open Access Proceeding Paper

A Novel Adaptive Cluster-Based Federated Learning Framework for Anomaly Detection in VANETs^†

¹

Department of CSE, Sreenidhi Institute of Science and Technology (SNIST), Sreenidhi University, Hyderabad 501504, India

²

Department of CSE (AI&ML), CVR College of Engineering, Hyderabad 500029, India

³

Department of CSE, Lovely Professional University, Phagwara 14441, India

⁴

Information Systems Study Program, Nusa Putra University, Sukabumi 43152, West Java, Indonesia

^*

Author to whom correspondence should be addressed.

^†

Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.

Eng. Proc. 2025, 107(1), 79; https://doi.org/10.3390/engproc2025107079

Published: 10 September 2025

(This article belongs to the Proceedings of The 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society)

Abstract

Vehicular Ad Hoc Networks (VANETs) encounter significant hurdles in anomaly detection owing to their dynamic characteristics, scalability demands, and privacy issues. This research presents a new Adaptive Cluster-Based Federated Learning (ACFL) architecture to tackle these challenges. In contrast to conventional machine learning models, the ACFL framework dynamically organizes vehicles through the Context-Aware Cluster Manager (CACM), which adjusts clusters according to real-time variables such as mobility, node density, and communication patterns. Each cluster utilizes Modified Temporal Neural Networks (MTNNs) for localized anomaly detection, employing time-series analysis to improve precision. Federated learning is enabled via the Hierarchical Aggregation Layer (HAL), which consolidates updates across clusters, ensuring scalability and data confidentiality. The proposed framework was compared with established machine learning models, including Support Vector Machines (SVM), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbors (KNN), and the K-Nearest Neighbors with Kernelized Feature Selection and Clustering (KNN-KFSC) approach, using the VeReMi dataset. Findings demonstrate that ACFL surpasses existing models in identifying anomalies, including Global Positioning System (GPS) spoofing and Denial of Service (DoS) attacks, exhibiting enhanced accuracy, adaptability, and scalability. This work emphasizes the capability of ACFL to tackle urgent security issues in VANETs, facilitating the development of secure next-generation intelligent transportation systems.

Keywords:

anomaly detection; adaptive cluster-based federated learning; context-aware cluster manager; modified temporal neural networks; hierarchical aggregation layer; privacy-preserving models; VANET

1. Introduction

Vehicular ad hoc networks (VANETs), a vital component of modern intelligent transportation systems, are a category of mobile ad hoc networks (MANETs) that facilitate communication between vehicles and roadside infrastructure. The dynamic nature of VANETs, characterized by high node mobility and frequently changing network topologies, significantly impairs network performance and security. Effective and secure communication inside VANETs is crucial for improving traffic management and road safety, as well as facilitating autonomous driving solutions, given the increasing integration of wireless communication technologies in vehicles [1]. Because of their high mobility and decentralization, VANETs are vulnerable to many security threats, requiring dependable real-time anomaly detection and intrusion prevention systems [2].

In VANETs, intrusion detection systems (IDSs) are crucial for detecting and preventing attacks; yet traditional methods often struggle to adapt to the rapidly changing network landscape. The evolving topologies and growing number of attack vectors in VANET systems hinder the adaptability of conventional IDS models, such as rule-based and statistical methods. Moreover, as the number of vehicles and the volume of data increase, these systems face scaling challenges [3]. Recent advancements in deep learning and machine learning have improved IDS capabilities, enabling more precise detection of anomalies and hostile activities. However, the development of more efficient IDSs for VANETs is obstructed by challenges such as adversarial attacks, in which malicious data is introduced to deceive IDS models, and the lack of comprehensive publicly available datasets.

This study proposes Adaptive Cluster-Based Federated Learning (ACFL), an innovative anomaly detection approach for VANETs, as a remedy for these challenges. ACFL transcends traditional machine learning techniques by implementing dynamic vehicle clustering that is informed by spatiotemporal behaviors and communication patterns. This technique facilitates more efficient and adaptable learning in real-time environments. The Context-Aware Cluster Manager (CACM), a notable enhancement of the ACFL architecture, facilitates real-time adaptation by dynamically adjusting clusters based on node density, mobility, and traffic patterns [4]. Each cluster employs MTNN for traffic forecasting and localized anomaly detection through time-series analysis.

The ACFL design enhances federated learning by integrating dynamic clustering with a Hierarchical Aggregation Layer (HAL), which consolidates updates from local clusters across many hierarchical levels. This process ensures scalability and data privacy by maintaining raw data in a decentralized manner while only disseminating model changes across the network [5]. Federated learning enhances system security and efficiency by diminishing the necessity for centralized data processing.

The aim of this paper is to develop a more effective anomaly detection system for VANETs that is scalable, customizable, and privacy-preserving. Traditional machine learning models, particularly SVM, RF, and LR, struggle with the ever-changing nature of VANETs, with maintaining privacy, and with offering intrusion detection services. This study shows that the proposed ACFL framework outperforms the KNN-KFSC method in response time and detection accuracy [6]. The assessment also highlights some important areas of improvement on the well-known VeReMi dataset, which comes with an extensive collection of attack scenarios for vehicular communications [7]. This paper adds to the body of work on VANET security by proposing a new ACFL framework that combines federated model aggregation with dynamic clustering to enhance real-time anomaly detection. The framework uses CACM for adaptability to ever-changing topologies, MTNNs for detection performance, and HAL for data privacy. The evaluation on the VeReMi dataset demonstrates the potential of the ACFL architecture to meet fundamental security requirements for intelligent transport systems.

2. Literature Review

VANETs and their associated security requirements motivate research on intrusion detection for such networks. VANETs allow vehicles to communicate with each other and with infrastructure, which facilitates autonomous driving, enhances traffic management, and increases road safety. However, the high mobility of nodes and the constantly changing topologies make it highly challenging to develop robust IDSs that adapt to these environments. Many alternative solutions have emerged to address these problems, and machine learning is gaining popularity. This section discusses related work on intrusion detection in VANETs, with a focus on machine learning, hybrid systems, and modern neural networks aimed at improving system stability and intrusion detection accuracy. To better distinguish between the classes of attacks in VANETs, Hind Bangui et al. [6] proposed a hybrid data-driven framework that combined different data models into a coherent setup for identifying rogue nodes.

A distinct approach using a rule-based security filter to identify and eliminate abnormal nodes in VANETs was suggested by Alsarhan et al. [7]. This strategy analyzed a large real-time dataset using the Dempster-Shafer theory for filtering nodes and deriving linear attributes. To analyze the performance of the anomaly detection approach, they compared the results with other machine learning-based intrusion detection methodologies. While the approach was useful, analyzing it alongside more sophisticated machine learning techniques revealed some shortcomings, which were nonetheless beneficial for further research. This underscores the need to incorporate more novel approaches in the design of intrusion detection systems (IDSs), focusing on improving detection efficacy while addressing the dynamic challenge posed by threats, as rule-based and machine learning systems converge.

S. Thorat et al. [8] applied deep learning techniques and changed the basic structure of IDS modules, which resulted in satisfactory improvements. Their work used a Deep Belief Network (DBN) model and the CIC-IDS2017 dataset to detect physical attacks on roadside units and vehicle intersections. This approach sought to improve the accuracy and dependability of attack detection by using deep learning’s capability to capture sophisticated patterns and relationships. The research indicated that DBNs could enhance the performance of IDSs in VANETs, especially for complex attack scenarios. It also pointed out the difficulty of employing deep learning models in distributed and mobile VANET systems.

Abdulaziz Alshammari et al. [9] advanced an IDS module with enhanced classification features that utilized various categorization methods and multi-level distinctions. Their research stressed the importance of rigorous testing and validation in IDS development, using a comprehensive testing methodology with multiple validation strategies to analyze the collected results. In a related work, Zeng et al. [10] studied the effect of Neural Networks (NNs) on the performance of VANET systems. Their study demonstrated that neural networks can improve the efficacy of IDSs through a detailed analysis of model components such as hidden layers, weights, and biases.

Erfan A. Shams et al. [11] initially developed a kernel-based SVM algorithm as a secondary approach to classify different types of intrusions. The presence of many vehicle nodes challenged the promise offered by the SVM model, which raised further questions about the scalability of their IDS model in large VANETs. Almi’ani et al. [12] implemented a novel non-linear IDS approach using self-organizing maps (SOMs) for threat classification in networks. The clustering approach improves the detection of unusual behavior patterns in VANETs and thus improves detection accuracy.

Laisen Nie et al. [13] made further improvements in the detection of anomalies in VANETs using Convolutional Neural Networks (CNN) by analyzing the spatial and temporal movements of vehicle nodes. Collecting spatial and temporal information helped in better training and classification, which in turn enhanced the identification of abnormalities. In more active settings like VANETs, these networks are very powerful tools for finding abnormal network behavior due to their intricate pattern learning and representation capabilities.

S. K. Tayyaba et al. [14] discussed the drawbacks of rule-based systems, especially in dealing with sophisticated, multi-layered attack techniques, and proposed more agile and adaptable frameworks for IDS. Zhou et al. [15] advanced the field by proposing an invariant-based distributed collaborative framework for intrusion detection in VANETs. The authors stressed the necessity for coordination among vehicles for prompt threat identification and threat mitigation. This structure’s distributed nature enabled better performance in detecting sophisticated, coordinated attacks while improving the response time for detection. Federated learning (FL), which is particularly applicable to VANETs, has surfaced as a suitable method for distributed learning that respects privacy.

Cao and Gong [16] pinpointed gaps in federated learning, especially the model poisoning attack, which greatly reduces the reliability of anomaly detection systems. Hossain et al. [17] studied stealthier model-poisoning attacks under differential privacy and discussed the need for stronger and more effective defenses. Alshudukhi et al. [18] discussed the use of blockchain-enabled federated learning in vehicular ad hoc networks and pointed out its usefulness for data sharing in critical situations such as prolonged emergency treatment, where secure data exchange is needed. Simra et al. [19] introduced knowledge-based federated deep learning for IoT and showed its effectiveness in improving anomaly detection. Xia et al. [20] proposed concepts and models for split federated learning in IoT, focusing on scalable, secure implementations. Taken together, these results emphasize the need for cluster-based, adaptive federated learning frameworks to tackle the unique problems posed by anomaly detection in vehicular ad hoc networks [21].

A review of the available literature on vehicular ad hoc networks cannot overlook the IDSs integrated into them. What becomes apparent is a lack of sophistication in the face of increasingly advanced challenges [22,23,24] within the problem area. Both modern and older approaches offer benefits; however, they tend to ignore the continuously changing, decentralized nature of VANETs as well as the resources available in the environment in which they operate.

2.1. Literature gaps highlight the need for the development of the ACFL model.

Shortcomings of Classical Rule-Based Models

Examples include the rule-based IDSs developed by Patel and Sonker. Such systems recognize some known threats, such as port scanning; however, they fail to cope with new attack patterns in environments like VANETs and require a lot of manual updating. Similarly, IDSs that employ machine learning achieve some success but scale poorly with the ever-increasing size of the network because of centralized processing, which creates privacy concerns as well. While deep learning models achieve impressive precision, their extensive computational requirements make them impractical for distributed systems like VANETs, where resources are constrained. Assessment is also limited because complete attack datasets are lacking and static IDS models cannot cope with the changing topologies of VANETs. To overcome these challenges, the ACFL model combines MTNNs for distributed deep learning, federated learning for maintaining data privacy, and a CACM that continuously remaps clusters as topologies change. All of these have been validated using the VeReMi dataset to confirm performance in VANET environments.
Even with the continual improvements made to these systems, the issues of dynamic network topologies, scaling, privacy safeguards, and real-time response remain unsatisfactorily solved.
As noted in prior research, the limitations of classical rule-based systems, the difficulties in scaling machine learning algorithms, and the absence of sufficient datasets suggest the need for more sophisticated and flexible approaches. In this paper, we design a flexible and scalable VANET IDS based on ACFL, which employs dynamic clustering, decentralized learning, and hierarchical aggregation.

3. Techniques for Intrusion Detection in VANETs

This section describes the methods used for anomaly detection in VANETs with the proposed ACFL architecture. Because VANETs are highly mobile and dynamic, innovative approaches are required to prevent and detect security issues within them. The ACFL paradigm adds several new components that work together to solve problems of scalability, data privacy, real-time responsiveness, detection accuracy, and adaptability within VANET ecosystems. The section covers the datasets utilized, the types of attacks handled, and a comparative study of the ACFL model against the traditional machine learning models K-Nearest Neighbors (KNN), SVM, Random Forest (RF), and Logistic Regression (LR). Comparing against these models demonstrates the advantage of ACFL over static models, which is especially pronounced in dynamic environments like VANETs.

3.1. Datasets and Attack Scenarios

The primary dataset used to test the effectiveness of the ACFL architecture is the VeReMi dataset, a benchmarking resource for VANET misbehavior detection systems. The dataset is designed to mimic real VANET environments, so researchers can use it to benchmark how well misbehavior detection algorithms detect and identify malicious activities in VANETs. It comprises message logs from a simulation environment, in which each message is classified as either hostile or benign according to a predefined ground truth model. This allows the accuracy with which models detect different kinds of misbehavior to be gauged.

Position falsification attacks, in which rogue vehicles report false positions to interfere with the healthy operation of vehicular networks, are among the most common attacks in VANETs. The VeReMi dataset focuses on this class of attack and includes five kinds of position falsification attacks.

The integration of dynamic clustering, temporal data, and federated learning helps ACFL overcome the limitations of traditional models in VANETs. The framework has several essential components that together improve the detection of anomalies while ensuring data privacy and scalability.

The strengths of the ACFL architecture include the CACM, which organizes and remodels vehicle clusters automatically and immediately according to spatiotemporal behaviors, node density, communication patterns, and real-time data. Unlike traditional static clustering methods, CACM allows clusters to adapt to changes in vehicular mobility and traffic patterns, thus making certain that each cluster’s structure represents the current status of the network. This flexibility is needed in VANETs because positions, speeds, and intercommunication links of vehicles change rapidly.

CACM groups vehicles with similar features, which enhances local anomaly detection accuracy. Because vehicles are classified dynamically rather than statically, the system can detect anomalous behavior within clusters. If vehicles within a cluster regularly report positions that conform to expected traffic flow patterns, any vehicle providing substantially divergent or inconsistent data can be identified as potentially malicious. The dynamic clustering technique also enhances scalability by distributing the computing responsibility across several clusters. Each cluster operates in a partially autonomous mode, which reduces centralized data processing while improving the system’s efficiency and its ability to handle large-scale VANET systems.
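
The within-cluster consistency check described above can be illustrated with a minimal sketch. It is not the authors' method: it assumes that reported speed is the compared feature, that a robust z-score based on the median absolute deviation (MAD) is the divergence measure, and that 3.5 is a reasonable cut-off. The function name `flag_divergent` and the sample values are hypothetical.

```python
from statistics import median

def flag_divergent(reports, threshold=3.5):
    """Return IDs of vehicles whose reported speed deviates strongly
    from the cluster median (robust z-score based on the MAD).

    reports: dict mapping vehicle ID -> reported speed (assumed m/s).
    threshold: assumed cut-off for the robust z-score.
    """
    speeds = list(reports.values())
    med = median(speeds)
    mad = median(abs(s - med) for s in speeds)
    if mad == 0:
        # More than half of the reports agree exactly; flag any that differ.
        return [vid for vid, s in reports.items() if s != med]
    return [vid for vid, s in reports.items()
            if 0.6745 * abs(s - med) / mad > threshold]

# Hypothetical cluster: four consistent vehicles and one divergent report.
cluster_reports = {"v1": 13.9, "v2": 14.2, "v3": 13.7, "v4": 14.0, "v5": 42.0}
suspects = flag_divergent(cluster_reports)
```

Because the median and the MAD are computed per cluster, the notion of "expected" behavior follows whatever the cluster currently looks like, which is the property the dynamic clustering is meant to provide.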

Once the vehicle clusters are formed, the ACFL architecture applies MTNNs for local anomaly detection. Documented evidence shows that MTNNs are very proficient in the analysis of time-series data, which makes them well suited to capturing the temporal structure of vehicle communications. By processing sequences of positional data, velocity, and other communication parameters over time, MTNNs can find anomalies that suggest some form of attack.

MTNNs have an added advantage of sophisticated attack pattern detection, such as Eventual Stop Attacks and Random Offset Attacks. Although these attacks are not immediately perceivable through a single snapshot or by one data point analysis, MTNNs are able to detect deviations or abnormalities that evolve over a longer timescale due to the temporal relationships within a dataset. While executing its operations, each cluster has its own MTNN, which observes the movements of the cars within the clustered group. This type of anomaly detection makes it possible for systems to respond to potential threats much faster and without the need for central processor dependence, which often incurs a time lag. By distributing the responsibility of anomaly detection, the ACFL architecture ensures real-time efficiency throughout vast networks.
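
To make the role of temporal context concrete, the sketch below shows one way a per-vehicle track could be turned into overlapping windows before being passed to a temporal model. It does not implement an MTNN; the helper names, the window length of 3, and the sample track (a vehicle whose reported position stops changing) are assumptions chosen for illustration.

```python
def speeds_from_positions(positions, dt=1.0):
    """Derive speed between consecutive (x, y) position reports.

    dt is the assumed, constant time between reports in seconds.
    """
    return [((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5 / dt
            for (x1, y1), (x2, y2) in zip(positions, positions[1:])]

def make_windows(series, window=3):
    """Split a time series into overlapping fixed-length windows."""
    return [series[i:i + window] for i in range(len(series) - window + 1)]

# Hypothetical track: the vehicle moves, then its reported position freezes.
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (6.0, 8.0), (6.0, 8.0)]
speeds = speeds_from_positions(track)
windows = make_windows(speeds, window=3)
```

No single report in this track is implausible on its own; the change only becomes visible across a window, which is why a sequence model is a natural fit for attacks that evolve over time.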

The integration of federated learning is a distinguishing feature of the ACFL framework. It enables the model of each vehicle cluster to be trained locally without sending sensitive data to a central server. Data privacy improves because vehicle details are kept within the cluster and not exposed to the network. Each cluster passes on only the changes made to its model, not the data, and these changes are integrated at the global server to build a global model. The consolidation of model changes is performed by HAL, which aggregates at several hierarchical levels. By integrating model changes from several clusters in this way, HAL increases the scalability and efficiency of federated learning by lowering the communication overhead of model updates. With HAL, the ACFL framework can accommodate thousands of vehicles without overusing network resources or degrading learning efficacy. The hierarchical aggregation approach ensures that the global model remains useful locally at different levels, improving the accuracy of the anomaly detection system.
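
A two-level aggregation of this kind can be sketched as follows. The paper does not specify HAL's aggregation rule, so this example assumes sample-count-weighted averaging at both levels; the function name `weighted_average`, the grouping into two regions, and all numbers are hypothetical.

```python
def weighted_average(updates):
    """Average weight vectors, weighting each by its sample count.

    updates: list of (weights, n_samples) pairs.
    Returns (averaged_weights, total_samples) so that the result can be
    fed into the next aggregation level unchanged.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total

# Level 1: each regional aggregator combines the clusters it serves.
region_a = weighted_average([([1.0, 2.0], 10), ([3.0, 4.0], 30)])
region_b = weighted_average([([5.0, 6.0], 60)])

# Level 2: the global aggregator combines the regional results.
global_weights, n_total = weighted_average([region_a, region_b])
```

With sample-count weighting, the two-level result equals what a single flat average over all three clusters would give, while the global server receives two messages instead of three. The saving grows with the number of clusters per region.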

To measure the effectiveness of the ACFL framework, we conducted a comparative analysis against KNN, SVM, RF, and LR. Each of these models has drawn significant interest for anomaly detection, and their results provide a baseline for the value that can be realized using the ACFL framework.

3.2. K-Nearest Neighbors

KNN is an example of a lazy learning scheme, in which class membership is assigned according to the dominant class among the closest neighbors. KNN is simple and performs adequately on a great number of problems, but it faces some issues in the context of VANETs. The extreme mobility of the nodes causes the relationships between data points to change frequently, which makes it very difficult for KNN to retain accurate classifications over time.

In addition, KNN needs to retain vast amounts of data, which is costly in large-scale networks. The ACFL technique addresses these issues through dynamic clustering: local anomaly detection is performed at the cluster level, among groups of similar vehicles. Unlike standard KNN models, this decreases bulk data retention requirements and enables more rapid adjustment to varying network conditions.
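
The neighbor-voting rule can be sketched in a few lines. This is a generic illustration of KNN, not the baseline configuration used in the paper; the two-dimensional features, the labels, and k = 3 are assumptions.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Label `query` with the majority class of its k nearest training points.

    train: list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set.
train = [((0.0, 0.0), "benign"), ((0.1, 0.2), "benign"), ((0.2, 0.1), "benign"),
         ((5.0, 5.0), "attack"), ((5.1, 4.9), "attack")]
label = knn_predict(train, (0.3, 0.3))
```

Every prediction scans the whole stored training set, which illustrates why the method becomes costly when the set is large and must be refreshed as vehicles move.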

3.3. Support Vector Machine

The effectiveness of SVMs lies in their ability to cope with high-dimensional data with intricate boundaries. SVMs are especially valuable where the data shows clear borders between the classes, which is why many intrusion detection systems use them. However, SVMs face challenges in the context of VANETs because of their centralized data processing and their tendency to overfit when there are many features and a limited sample size. The ACFL system mitigates these constraints by distributing learning across numerous clusters, which reduces overfitting and enhances scalability. The ACFL system adapts the SVM model using federated learning, which requires no centralized data storage and permits swift, on-the-fly changes as new information emerges.

3.4. Random Forest Classifier

The RF builds several decision trees and selects the class that is most voted on by the trees. Random Forest is popular for its robustness against overfitting and its ability to handle noisy data, thus making it a preferred algorithm for detecting intrusions. Increasing the number of decision trees, however, can increase the computational cost for Random Forest models, particularly in large-scale networks such as VANETs. The ACFL system employs dynamic clustering and federated learning to alleviate the training cost of large RF models. Permitting individual clusters to train a local RF model allows the system to sustain real-time performance while benefiting from integrated learning approaches.

3.5. Logistic Regression

LR is a basic model that works well for binary classification problems. Like other linear models, LR is effective when the features of the dataset relate linearly to the target variable. However, it does not perform well with the more intricate non-linear relationships that characterize most VANET data. Nonetheless, its precision and low complexity make it a reliable model for detecting anomalies, albeit with the limitations discussed above. The ACFL framework incorporates MTNNs to enhance traditional LR models, which aids in capturing the intricate temporal dynamics of vehicular communication data. This helps the system identify anomalies that more basic linear models, such as logistic regression, are incapable of detecting.
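
The linear character of LR is visible in its prediction rule, sketched below: a weighted sum of the features is passed through the sigmoid function, so the decision boundary is a straight line (or hyperplane) in feature space. The weights, bias, and feature values are made up for illustration and are not fitted to any dataset.

```python
from math import exp

def predict_proba(weights, bias, features):
    """Probability of the positive class under logistic regression."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + exp(-z))

def classify(weights, bias, features, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return int(predict_proba(weights, bias, features) >= threshold)

# Hypothetical parameters; the boundary is the line 2*x1 - x2 = 0.
weights, bias = [2.0, -1.0], 0.0
on_boundary = predict_proba(weights, bias, [1.0, 2.0])
```

Any pattern that cannot be separated by such a line, for instance one that depends on how a value changes over several time steps, is out of reach for this rule without additional feature engineering.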

3.6. The Case of Data Security in ACFL

Perhaps the most challenging issue in VANETs is safeguarding the confidentiality of vehicle information. A centralized system relies on a single central server and requires vehicles to transmit their raw data, which is highly privacy invasive. In contrast, the privacy-preserving ACFL architecture employs clustered federated learning, a form of decentralized processing that keeps sensitive data at the cluster level. In VANETs, sensitive items that could otherwise be transmitted include a vehicle’s location, speed, and communication history. Transmitting only model update information to the central server significantly mitigates this threat to privacy. The HAL strengthens this strategy by permitting the aggregation of model updates at different hierarchical levels, which decreases the volume of data transmitted over the network.

Leys and Delacre [21] shed light on ACFL and demonstrate its substantial superiority in anomaly detection for VANETs. The ACFL framework alleviates numerous issues that have traditionally hampered the effectiveness of intrusion detection systems in BASE-ST VANETs: it handles vehicle mobility patterns with dynamic vehicle clustering, anomaly detection with MTNNs, and data privacy concerns with federated learning. In the comparative analysis against KNN, SVM, RF, and LR, the ACFL framework yielded better results than the traditional machine learning models with regard to accuracy, scalability, and reaction time. The use of the VeReMi dataset also demonstrates the competence of the ACFL model in dealing with various attack scenarios in real-world implementations within VANET systems. The ACFL framework offers a balanced solution that enhances both security and robustness in VANET systems.

4. ACFL Framework Implementation into VANETs

The ACFL architecture provides a novel solution for anomaly detection in VANETs. The ACFL solution relies on dynamic clustering derived from vehicle trajectories, real-time federated learning, and multi-task learning to address the scalability, privacy, and anomaly detection problems [22,23]. The following sections describe the implementation steps, which include selecting the dataset, data preprocessing, model design, and training the model. The architecture workflow of ACFL is shown in Figure 1.

4.1. Dataset: VeReMi

The VeReMi dataset serves as the basis for the design and evaluation of the ACFL framework. This dataset was constructed using VEINS (Version 4.6) and LuST (an urban mobility simulation model), and it provides labeled ground truth data together with message log files from the on-board units (OBUs). It emulates numerous vehicle-to-vehicle communication scenarios, enabling adequate testing and benchmarking of misbehavior detection algorithms in a realistic VANET environment. The systematic approach adopted by VeReMi enhances the reliability of the detection algorithms while fostering the development of more sophisticated and accurate anomaly detection techniques.

In addition, the dataset includes a wide range of attack types, from GPS spoofing to Sybil attacks, replicating minor as well as major incursions into vehicular networks. Such richness ensures that the ACFL model is tested against a realistic range of security vulnerabilities, making it robust enough for practical applications in VANETs. As shown in Table 1, the VeReMi dataset was resampled to balance the distribution of instances for model evaluation.

Because of its versatility, the dataset is well suited to evaluating how effectively the ACFL framework detects attacks and to testing the dynamic configuration of the clustering and federated learning modules.

Before model training, the raw data in the VeReMi collection is pre-processed into a standard form that can easily be analyzed. Several columns, including “send time”, “sender”, “message ID”, position (pos), and speed (spd), are concatenated into a single column of textual content. The information in this text is sufficient to be used as the model’s input.
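
The column-merging step might look like the sketch below. The dictionary keys (`sendTime`, `sender`, `messageID`, `pos`, `spd`), the whitespace separator, and the sample record are assumptions made for illustration; the paper does not give the exact field names or the joining format.

```python
def to_text(record):
    """Join selected message fields into one whitespace-separated string."""
    # Assumed field names; adjust to match the actual log schema.
    fields = ("sendTime", "sender", "messageID", "pos", "spd")
    return " ".join(str(record[name]) for name in fields)

# Hypothetical message record with made-up values.
record = {"sendTime": 25200.5, "sender": 13, "messageID": 482,
          "pos": [3587.2, 5820.1], "spd": [-1.3, 12.4]}
text = to_text(record)
```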

After this, the Attack Type column, which classifies different attack types, is changed to a numerical format using label encoding. Label encoding assists in converting categorical variables to values that machine learning algorithms can work with using numbers.
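
A minimal sketch of label encoding is shown below. The attack-type names are placeholders rather than the dataset's actual label values, and assigning integers in sorted order is one common convention, assumed here.

```python
def fit_label_encoder(labels):
    """Map each distinct label to an integer, in sorted order."""
    return {label: index for index, label in enumerate(sorted(set(labels)))}

# Hypothetical label column.
attack_types = ["benign", "random_offset", "benign", "eventual_stop"]
mapping = fit_label_encoder(attack_types)
encoded = [mapping[label] for label in attack_types]
```

The integers carry no ordering meaning; they only give each class a stable identifier that a classifier's output layer or loss function can index.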

Once the data encoding is completed, the dataset is divided into a training set of 80% and a testing set of 20%. This lets the model be trained on most of the data and validated on unseen data, providing a more balanced way to evaluate its performance and adaptability.
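
An 80/20 split can be sketched as below. The shuffling, the fixed seed, and the function name are assumptions; the paper does not state whether the split was random or stratified.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 100 row indices.
train_rows, test_rows = train_test_split(list(range(100)))
```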

After encoding, tokenization creates sequences ready for time-series analysis. The KFSC tokenizer is applied to the text data to prepare the inputs for the MTNNs within the ACFL framework. A custom class named “CustomDataset” automates the relevant steps, including text and label retrieval, tokenization, and setting the sequence length for effective training of the model.
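
The paper names the class but does not show it, so the following is only a guess at its general shape, written in plain Python. The constructor arguments, the padding scheme, and `toy_tokenize` (a stand-in for the KFSC tokenizer, whose interface is not described) are all assumptions.

```python
class CustomDataset:
    """Pairs tokenized, fixed-length text sequences with integer labels."""

    def __init__(self, texts, labels, tokenize, max_length=8, pad_id=0):
        self.texts = texts
        self.labels = labels
        self.tokenize = tokenize      # callable: str -> list of int token IDs
        self.max_length = max_length
        self.pad_id = pad_id

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, index):
        # Truncate to max_length, then pad so every sequence has equal length.
        ids = self.tokenize(self.texts[index])[:self.max_length]
        ids = ids + [self.pad_id] * (self.max_length - len(ids))
        return {"input_ids": ids, "label": self.labels[index]}

def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one ID (>= 1) per whitespace token."""
    return [1 + sum(map(ord, token)) % 999 for token in text.split()]

dataset = CustomDataset(["13 482 12.4", "7 90"], [0, 1], toy_tokenize, max_length=4)
first = dataset[0]
```

Implementing `__len__` and `__getitem__` is the interface that map-style dataset loaders in common deep learning libraries expect, so a class of this shape can usually be wrapped by such a loader with little change.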

4.2. ACFL Model Implementation

The ACFL framework integrates several innovations, including dynamic clustering, privacy-preserving federated learning, and hierarchical aggregation, to enhance scalability, detection accuracy, and data privacy in VANETs.

The CACM is the core component of the ACFL framework, which constructs and alters clusters based on real-time traffic elements like density, mobility, and communication patterns. Unlike static clustering, CACM rearranges vehicles based on their trajectories and proximity, thus enabling the system to detect abnormalities within smaller and more localized clusters efficiently.

Vehicles moving nearby and at roughly the same speed, or using similar routes, are placed into distinct clusters. This strategy focuses on local data which improves anomaly detection at the expense of resource spending efficiency. The detection of anomalies is executed at the cluster level, which lessens the overall computational burden. The extensive scale of VANET makes reasonable adaptation necessary.
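To make the grouping rule concrete, the following sketch assigns each vehicle to the first cluster whose centroid is nearby and whose mean speed is similar, and otherwise opens a new cluster. It is a simple greedy heuristic chosen for illustration, not the CACM algorithm itself; the function name `cluster_vehicles` and the thresholds `max_dist` and `max_speed_diff` are assumptions.

```python
import math


def cluster_vehicles(vehicles, max_dist=100.0, max_speed_diff=5.0):
    """Greedily group vehicles by proximity and speed similarity.

    vehicles: list of dicts with keys "id", "x", "y" (metres) and "speed" (m/s).
    Returns a list of clusters, each a list of vehicle ids.
    """
    clusters = []  # each cluster tracks its members and running centroid
    for v in vehicles:
        for c in clusters:
            close = math.hypot(v["x"] - c["x"], v["y"] - c["y"]) <= max_dist
            similar = abs(v["speed"] - c["speed"]) <= max_speed_diff
            if close and similar:
                c["members"].append(v["id"])
                n = len(c["members"])
                # Update the running means of position and speed.
                c["x"] += (v["x"] - c["x"]) / n
                c["y"] += (v["y"] - c["y"]) / n
                c["speed"] += (v["speed"] - c["speed"]) / n
                break
        else:
            # No suitable cluster exists, so this vehicle starts a new one.
            clusters.append(
                {"members": [v["id"]], "x": v["x"], "y": v["y"], "speed": v["speed"]}
            )
    return [c["members"] for c in clusters]


vehicles = [
    {"id": "a", "x": 0.0, "y": 0.0, "speed": 20.0},
    {"id": "b", "x": 30.0, "y": 40.0, "speed": 22.0},
    {"id": "c", "x": 500.0, "y": 0.0, "speed": 21.0},
    {"id": "d", "x": 20.0, "y": 10.0, "speed": 35.0},
]
print(cluster_vehicles(vehicles))
```

Vehicle "d" is close to "a" and "b" but much faster, so it is kept apart. Re-running the function on fresh position and speed reports is one simple way to let cluster membership follow the traffic as it changes.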

Modified Temporal Neural Networks (MTNNs) perform anomaly detection within each cluster. Because MTNNs analyze time-dependent data, they are suited to detecting malicious behaviors such as GPS spoofing or Sybil attacks from positional and speed data.

The MTNNs are trained locally in each cluster, which enables the model to distinguish short-term from long-term deviations from normal vehicle behavior. This simplifies anomaly detection because no cluster depends on a central analysis system for its operation.
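Why temporal position and speed data expose attacks such as GPS spoofing can be seen with a hand-written plausibility check: if the distance covered between two consecutive messages implies a speed far from the reported one, the sequence is suspicious. The sketch below is a deliberately simple rule, not an MTNN; the function name `flag_implausible_jumps`, the message layout, and the `tolerance` value are assumptions.

```python
import math


def flag_implausible_jumps(messages, tolerance=5.0):
    """Return indices of messages whose position jump contradicts the reported speed.

    messages: list of (time_s, x_m, y_m, speed_mps) tuples sorted by time.
    """
    flagged = []
    for i in range(1, len(messages)):
        t0, x0, y0, v0 = messages[i - 1]
        t1, x1, y1, v1 = messages[i]
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        implied_speed = math.hypot(x1 - x0, y1 - y0) / dt
        reported_speed = (v0 + v1) / 2
        if abs(implied_speed - reported_speed) > tolerance:
            flagged.append(i)
    return flagged


# A vehicle reporting 10 m/s that suddenly "moves" 100 m in one second.
trace = [
    (0.0, 0.0, 0.0, 10.0),
    (1.0, 10.0, 0.0, 10.0),
    (2.0, 20.0, 0.0, 10.0),
    (3.0, 120.0, 0.0, 10.0),
    (4.0, 130.0, 0.0, 10.0),
]
print(flag_implausible_jumps(trace))
```

A learned temporal model replaces the fixed tolerance with patterns inferred from data, but it draws on the same kind of sequence-level inconsistency.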

To preserve privacy during model training, the ACFL framework employs federated learning, which allows decentralized training. Conventional machine learning models require raw data to be sent to a central server for training, which raises privacy concerns. With federated learning, by contrast, raw data need not leave its source, because model training takes place at the device or cluster level.

Rather than raw data, only the model updates (changes in weights, etc.) are sent to the central server. The central server then combines the updates into a global model and sends it back to the clusters. This process allows shared learning without compromising the confidentiality of vehicle data.

The steps are as follows:

Local Model Training: Each cluster trains its MTNN model on its own local data;
Model Updates: The updates from the trained model (parameters or gradients) are collected and sent to the central server;
Global Model Aggregation: The central anomaly detection model is updated with the information received in each communication round;
Model Distribution: The global model is sent back to the clusters for additional local training and refinement;
Iterative Process: These steps repeat until the models converge, allowing each cluster to maintain detailed, privacy-preserving detection.
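The steps above can be sketched as a loop of federated rounds. The aggregation rule shown is sample-weighted averaging (the standard FedAvg rule), which the text does not name explicitly, so it is an assumption here. The function `local_train` is a hypothetical stand-in for MTNN training that simply moves the weights part-way toward a cluster-specific optimum, and the cluster data are invented.

```python
def local_train(global_weights, local_optimum, lr=0.5):
    """Hypothetical stand-in for local training: step toward the cluster's optimum."""
    return [g + lr * (t - g) for g, t in zip(global_weights, local_optimum)]


def federated_average(updates):
    """Combine (weights, n_samples) pairs into a sample-weighted mean."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Each cluster: (hypothetical local optimum, number of local samples).
clusters = [([1.0, 2.0], 100), ([3.0, 6.0], 300)]

global_weights = [0.0, 0.0]
for round_no in range(1, 101):
    # Steps 1-2: clusters train locally and send only their updated weights.
    updates = [(local_train(global_weights, opt), n) for opt, n in clusters]
    # Step 3: the central server aggregates the updates.
    new_global = federated_average(updates)
    change = max(abs(a - b) for a, b in zip(new_global, global_weights))
    # Step 4: the new global model is distributed for the next round.
    global_weights = new_global
    # Step 5: stop once the global model has stopped changing.
    if change < 1e-9:
        break

print(round_no, global_weights)
```

Only weight vectors and sample counts cross the network in this loop; the local data that define each cluster's optimum never leave the cluster.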

To address scalability across large VANET regions, the ACFL framework includes a Hierarchical Aggregation Layer (HAL). The HAL aggregates model updates at an intermediate hierarchical level before forwarding them to the central server, which reduces communication costs and improves network scalability.

In large-scale deployments, it is not feasible for every cluster to communicate directly with the central server because of network bottlenecks. With hierarchical aggregation, clusters in the same region send their updates to an intermediate server, which aggregates them and forwards the result to the central server. This reduces the number of updates that would otherwise congest the network and improves the scalability and efficiency of the system.
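A two-level aggregation of this kind can be sketched as follows, assuming sample-weighted averaging at both levels and assuming that each intermediate server forwards its regional average together with the total sample count. The region names, weights, and counts are invented for illustration.

```python
def weighted_average(updates):
    """Return the sample-weighted mean of (weights, n_samples) pairs and the total count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    avg = [sum(w[i] * n for w, n in updates) / total for i in range(dim)]
    return avg, total


# Hypothetical cluster updates grouped by region: (weights, n_samples).
regions = {
    "north": [([1.0, 0.0], 10), ([2.0, 2.0], 30)],
    "south": [([4.0, 1.0], 20), ([0.0, 3.0], 40)],
}

# Level 1: each intermediate server aggregates the clusters in its region.
regional_updates = [weighted_average(updates) for updates in regions.values()]

# Level 2: the central server aggregates one message per region.
hier_global, _ = weighted_average(regional_updates)

# Reference: flat aggregation, where every cluster contacts the central server.
all_updates = [u for updates in regions.values() for u in updates]
flat_global, _ = weighted_average(all_updates)

print(hier_global, flat_global)
print("messages to central server:", len(regional_updates), "vs", len(all_updates))
```

Because each regional average carries its sample count, the hierarchical result matches the flat one while the central server receives one message per region instead of one per cluster.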

4.3. Evaluation and Validation of ACFL Framework

The ACFL framework was validated on the VeReMi dataset using the following metrics:

Mean Accuracy: The overall proportion of records classified correctly;
Mean Precision: The proportion of flagged anomalies that are true anomalies, reflecting how few false positives are raised;
Mean Recall: The proportion of actual anomalies that the model identifies;
Mean F1 Score: The harmonic mean of precision and recall.
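For reference, the four metrics can be computed from true and predicted labels as in the sketch below. It covers the binary case (1 = anomaly, 0 = benign) for brevity, which is an assumption; the function name `binary_metrics` and the example labels are illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: 4 anomalies and 6 benign records, one miss and one false alarm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Averaging each of these values over several evaluation runs or folds gives "mean" figures of the kind named in the list above.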

The results demonstrated that the ACFL framework improved detection accuracy and scalability compared with conventional machine learning models. The federated learning approach maintained strong model performance while safeguarding the privacy of the vehicular data.

5. Results and Evaluation

In this work, we performed a detailed comparative study of conventional machine learning models and the proposed ACFL framework for anomaly detection in VANETs. Using the VeReMi dataset, we analyzed the performance of several models, including K-Nearest Neighbors with Kernelized Feature Selection and Clustering (KNN-KFSC), SVM, RF, and LR. The dataset contained several attack scenarios: Constant Jamming, GPS Spoofing, Selective Forwarding, Sybil, and Denial of Service (DoS) attacks. Each model was evaluated on every attack scenario using essential features such as vehicle position, speed, and communication history.

Figure 2 shows the output of the Python (3.7) evaluation script, which contains the following:

Confirmation that six datasets were loaded successfully: ATTACK1, ATTACK2, ATTACK4, ATTACK8, ATTACK16, and Modified ATTACK16;
The per-dataset evaluation of the five models (ACFL, KNN-KFSC, SVM, Random Forest, and Logistic Regression);
Confirmation that each model executed successfully on every dataset;
The overall processing time of 68.49 s.

Figure 2. Steps for loading and evaluation of models.

The models were assessed on four metrics: Mean Accuracy, Mean Precision, Mean Recall, and Mean F1 Score. To aid reproducibility, we used five-fold cross-validation, which also indicates how dependable each model is across different data subsets.
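Cross-validation with five splits can be sketched as below: the sample indices are shuffled once, dealt into five folds, and each fold serves as the test set exactly once while the rest form the training set. The helper names, the toy labels, and the majority-class baseline used as the "model" are assumptions made so that the example is self-contained.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # deal indices round-robin
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def majority_baseline_accuracy(labels, train_idx, test_idx):
    """Hypothetical stand-in model: always predict the most common training label."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


labels = [0] * 15 + [1] * 8  # toy labels: 15 benign, 8 anomalous
splits = list(k_fold_indices(len(labels), k=5))
fold_scores = [majority_baseline_accuracy(labels, tr, te) for tr, te in splits]
mean_accuracy = mean(fold_scores)
print(fold_scores, mean_accuracy)
```

Reporting the mean over the five folds, as in the metric names used here, smooths out the effect of any single favorable or unfavorable split.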

These outcomes are presented in Table 2 and in comparative bar graphs. The ACFL framework outperformed all other models with a mean accuracy of 99.5%, with the KNN-KFSC model close behind at 99%. In contrast, the older models, Random Forest, SVM, and Logistic Regression, had accuracy rates of 89%, 92%, and 93%, respectively.

As shown in Table 2, the ACFL framework outperformed the legacy models on every metric (accuracy, precision, recall, and F1 score), which supports its suitability for complex datasets. The results indicate that ACFL is preferable where rapid anomaly detection with few false positives is required in complex VANET environments, owing to its high precision.

Key Findings

ACFL Performance: The ACFL framework obtained the best overall performance, with 99.5% Mean Accuracy, 99.2% Mean Precision, 99.3% Mean Recall, and 99.25% Mean F1 Score. This shows that it detects diverse kinds of anomalies effectively while preserving privacy and scalability.
KNN-KFSC Model: The KNN-KFSC model also performed well, obtaining 99% Mean Accuracy across diverse attack scenarios. However, it relies on static clusters, whereas the dynamic clustering and hierarchical aggregation of intermediate results in the ACFL framework gave better scalability and real-time adaptability.
Traditional Models: The Random Forest, SVM, and Logistic Regression models performed reasonably well but were outperformed by a considerable margin by the ACFL and KNN-KFSC models. Random Forest had the lowest accuracy at 89%, whilst SVM and Logistic Regression scored 92% and 93%, respectively.
Improved Scalability with ACFL: Incorporating the HAL module into the ACFL framework improved responsiveness and scalability for large-scale use in VANET systems.
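As a quick consistency check on the figures in Table 2, the reported Mean F1 Score of each model can be compared with the harmonic mean of its reported Mean Precision and Mean Recall. The values below are copied from Table 2; the check itself is an illustration added here. A mean of per-fold F1 scores need not equal the F1 of the mean precision and recall in general, so only approximate agreement should be expected.

```python
# (mean precision %, mean recall %, reported mean F1 %) taken from Table 2.
table2 = {
    "ACFL": (99.2, 99.3, 99.25),
    "KNN-KFSC": (98.0, 99.0, 98.5),
    "Random Forest": (87.0, 88.0, 87.5),
    "Support Vector Machine": (91.0, 92.0, 91.5),
    "Logistic Regression": (90.0, 93.0, 91.5),
}


def harmonic_f1(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Absolute gap, in percentage points, between recomputed and reported F1.
gaps = {
    model: abs(harmonic_f1(p, r) - reported)
    for model, (p, r, reported) in table2.items()
}
for model, gap in gaps.items():
    print(f"{model}: gap = {gap:.4f} percentage points")
```

For every model the recomputed value lies within 0.05 percentage points of the reported one, so the precision, recall, and F1 columns are mutually consistent.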

The performance metrics are also given in bar charts to facilitate comparison across the models. The results validate the advantages of the ACFL framework for anomaly detection in VANETs. Its real-time adaptability to vehicle movements, along with privacy-preserving federated learning, underscores its value in modern vehicular networks.

The evaluation demonstrated that the ACFL architecture surpassed traditional machine learning models in anomaly detection. Dynamic clustering coupled with MTNNs for local anomaly detection, as well as federated learning, allowed ACFL to enhance its accuracy, scalability, and data privacy. Such advancements position ACFL as a potential candidate for fortifying security and reliability in VANETs. Further research could enhance the performance of ACFL by optimizing the clustering algorithms and extending its application to other types of ad hoc networks.

6. Conclusions

The findings of this study show the effectiveness of the ACFL architecture for anomaly detection in VANETs. By combining dynamic clustering via the CACM with federated learning, the ACFL architecture outperformed traditional models such as Random Forest, Support Vector Machine, and Logistic Regression. In addition, the incorporation of the HAL increased system responsiveness and scalability, making the system more effective in real-time, large-scale VANET deployments. The empirical evidence from the VeReMi dataset demonstrates the superiority of ACFL over existing models in detecting various attacks, including GPS spoofing and DoS attacks. This first attempt to address safety, scalability, and privacy jointly in VANET anomaly detection yielded accuracy, precision, recall, and F1 scores that exceeded those of traditional approaches. It is a significant finding for the security of modern vehicular networks and also contributes to the discourse on applying federated learning in other distributed network systems.

Author Contributions

P.S. contributed to Methodology, Software, Investigation, and Visualization. R.C. contributed to Conceptualization, Methodology, Formal Analysis, Data Curation, Writing Original Draft Preparation, and Project Administration. I.B. contributed to Validation, Investigation, Data Curation, and Writing Review and Editing. F.S. contributed to Formal Analysis, Resources, Writing Review and Editing, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Workflow diagram for ACFL.

Table 1. Resampled VeReMi dataset.

Attacks	Size
BENIGN	60,000
Attack type 1 (Constant Jamming Attack)	30,473
Attack type 2 (GPS Spoofing)	30,473
Attack type 4 (Selective Forwarding)	30,510
Attack type 8 (Sybil Attack)	29,460
Attack type 16 (Denial of Service Attack)	28,832

Table 2. Results with different machine learning models.

Model	Mean Accuracy (%)	Mean Precision (%)	Mean Recall (%)	Mean F1 Score (%)
ACFL	99.5	99.2	99.3	99.25
KNN-KFSC	99.0	98.0	99.0	98.5
Random Forest	89.0	87.0	88.0	87.5
Support Vector Machine	92.0	91.0	92.0	91.5
Logistic Regression	93.0	90.0	93.0	91.5

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
