Connected Vehicles Security: A Lightweight Machine Learning Model to Detect VANET Attacks

Elsadig, Muawia A.; Altigani, Abdelrahman; Mohamed, Yasir; Mohamed, Abdul Hakim; Kannan, Akbar; Bashir, Mohamed; Adiel, Mousab A. E.

doi:10.3390/wevj16060324

by Muawia A. Elsadig ¹,*, Abdelrahman Altigani ², Yasir Mohamed ³, Abdul Hakim Mohamed ³, Akbar Kannan ³, Mohamed Bashir ³ and Mousab A. E. Adiel ⁴

¹ College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, Dammam 34212, Saudi Arabia

² Computer Information Science, Higher Colleges of Technology, Al Ain 17561, United Arab Emirates

³ Department of Information Systems and Business Analytics, College of Business Administration, A Sharqiyah University, Ibra 400, Oman

⁴ Sudan Audit Chamber, Port Sudan 33311, Sudan

* Author to whom correspondence should be addressed.

World Electr. Veh. J. 2025, 16(6), 324; https://doi.org/10.3390/wevj16060324

Submission received: 27 March 2025 / Revised: 27 May 2025 / Accepted: 9 June 2025 / Published: 11 June 2025

(This article belongs to the Special Issue Internet of Vehicles and Autonomous Connected Vehicle: Privacy and Security)

Abstract

Vehicular ad hoc networks (VANETs) aim to manage traffic, prevent accidents, and regulate various aspects of traffic. However, owing to their nature, the security of VANETs remains a significant concern. This study provides insightful information regarding VANET vulnerabilities and attacks. It investigates a number of security models that have recently been introduced to counter VANET security attacks, with a focus on machine learning detection methods. This investigation confirms that several challenges remain unsolved. Accordingly, this study introduces a lightweight machine learning model with a gain information feature selection method to detect VANET attacks. A balanced version of the well-known and recent dataset CICIDS2017 was developed by applying a random oversampling technique. The developed dataset was used to train, test, and evaluate the proposed model. In other words, two layers of enhancements were applied: using a suitable feature selection technique and fixing the dataset imbalance problem. The results show that the proposed model, which is based on the Random Forest (RF) classifier, achieved excellent performance in terms of classification accuracy, computational cost, and classification error. It achieved an accuracy rate of 99.8%, outperforming all benchmark classifiers, including AdaBoost, decision tree (DT), K-nearest neighbors (KNN), and multi-layer perceptron (MLP). To the best of our knowledge, this model outperforms all the existing classification techniques. In terms of processing cost, it consumes the least processing time, requiring only 69%, 59%, 35%, and 1.4% of the AdaBoost, DT, KNN, and MLP processing times, respectively. Its classification error is negligible.

Keywords:

VANETs; CICIDS2017; connected vehicles; cyber security; internet of things security; vehicular ad hoc networks; balanced dataset; imbalanced dataset; random oversampling; DoS; machine learning; deep learning; feature selection

1. Introduction

Vehicular ad hoc networks (VANETs) have emerged as a critical component of intelligent transportation networks [1,2,3]. VANETs seek to ensure safe driving by enhancing traffic flow, thereby lowering the number of accidents. They provide a driver or vehicle with the necessary information to help avoid these accidents. However, any modification to this real-time information could result in a system failure that compromises driver safety. Keeping this information secure is the primary priority for security professionals and researchers because it guarantees the smooth operation of VANETs [4].

A VANET is a unique type of mobile ad hoc network (MANET) [5,6,7,8,9] with pre-established routes or roads. Roadside units (RSUs) and on-board units (OBUs) are two distinct authorities that are required for registration and management. OBUs are installed in vehicles and RSUs are widely used on road edges to carry out certain services. Every vehicle can travel freely on a road network and communicate with other vehicles, RSUs, and designated authorities [4].

Because VANETs are wireless communication systems that are difficult to secure, it is imperative to protect against misuse activities and carefully specify the security architecture [10]. Owing to the exposed nature of the wireless medium [11], VANETs are vulnerable to a wide range of attacks. These attacks affect the VANET’s functionality and cause challenging issues for drivers who operate legitimately. As a result, safeguarding a VANET from interception, alteration, and message deletion has proven to be a difficult task and is a top concern for both academia and industry. Messages used to guide drivers may be changed by malicious nodes. Furthermore, attackers can disseminate false information, resulting in massive accidents and damages [12]. Several years ago, many researchers investigated security attacks and sought relevant remedies. Others have attempted to codify standards and protocols or define the security infrastructure. However, there is still much to learn about trends in misbehavior detection and node trustworthiness [4].

Owing to their significant role in several applications, VANETs have attracted the interest of both the general public and the scientific community. These applications include lane merging and intersection alert signal functionality, value-added services (i.e., providing drivers access to the Internet to enhance their travel experience), toll payment services, traffic avoidance warning messages, navigation, road conditions, emergency vehicle alarm signals, and detour notifications [13].

VANET security has garnered considerable attention because of concerns regarding human safety when driving. The maintenance of availability is one of the most crucial security considerations. There will be significant harm if the VANET services are unavailable. Denial of service (DoS) assaults are therefore seen as one of the major threats that might harm VANETs. Chen et al. [14] indicated that traditional security vulnerabilities in connected vehicle systems include DoS attacks. The methods used to execute these attacks against automobiles have evolved rapidly as the number of intelligent vehicles and on-board devices linked to the network has increased. Among these risks, distributed denial of service (DDoS) attacks pose a severe risk to driving safety as they can directly disrupt the availability of connected vehicles, infrastructure, and network services by sending malicious data from a number of compromised devices. Alrehan et al. [15] indicated that DDoS attacks are the primary concern affecting VANET availability. Sufficient security measures should be offered to prevent this type of attack.

To improve road safety, traffic management, and overall vehicular connections, VANETs constitute crucial technological advancements at the intersection of communication and transportation systems. By exchanging vital information regarding emergency alerts, road hazards, and real-time traffic conditions, VANETs enable vehicles to connect easily with infrastructure elements (V2I) and with each other (V2V). The communicated sensitive information must be safeguarded against potential privacy and security breaches in the communication channel because both V2V and V2I communications use public channels. The use of authentication procedures by VANETs is essential for building trust between automobiles and infrastructure [16]. However, many aspects of authentication in a VANET require further enhancement and pose real challenges.

In general, security in this type of network is a true problem owing to the highly dynamic topology of VANETs. Owing to the dynamic nature of these networks, they are exposed to distributed malicious attacks. Therefore, VANETs require several security measures to ensure their consistency and integrity.

Before deploying an application reliant on VANETs, the security problem must be carefully considered as one of the main challenges. Consider what would happen if a safety message sent by a VANET system were altered, delayed, or deleted by an intruder or attacker: the outcomes could be grave, including accidents, fatalities, and damage to infrastructure.

In general, integrity, availability, confidentiality, non-repudiation, accountability, and authentication are security requirements that must be satisfied to preserve the security of VANETs. Even though standard wired networks and VANETs have similar security goals, security attacks in VANETs demand unique solutions with a low computing overhead, considering the high mobility and quick dynamic changes in VANET topology.

This study provides in-depth information on VANET vulnerabilities, attack classification, authentication, and countermeasures. In particular, it examines the most advanced machine learning (ML) techniques that have recently been presented to identify VANET attacks. Artificial intelligence (AI) is widely used in various application domains because of its ability to supplement conventional data-driven approaches [17,18]. Accordingly, the authors introduced a lightweight ML approach based on a Random Forest (RF) classifier to identify VANET attacks. The contributions of this study are summarized as follows:

It offers thorough information on VANET systems, vulnerabilities, and attack classification.
It demonstrates the drawbacks of common authentication approaches. Authentication accuracy is still a key issue in VANETs, which is a great challenge.
It investigates the recent ML and deep learning (DL) techniques that have been proposed to counter VANET attacks with focus on their advantages and drawbacks.
It improves the CICIDS2017 dataset using the gain information feature selection technique. This significantly enhances the prediction performance, as only 61 out of 79 features were considered.
It develops a balanced version of the CICIDS2017 dataset using a random oversampling technique. An imbalanced dataset leads to bias in classification accuracy and erroneous prediction, which affects classification performance. The developed dataset, BCICIDS2017-GI, exhibited significant improvements in classification accuracy and performance.
It presents a lightweight ML model capable of precisely identifying VANET attacks with an outstanding accuracy rate and manageable overhead. It outperformed the other relevant classification methods. The proposed model is based on an RF classification model with a gain information feature selection technique.
It compares the proposed model with several reputable classification models and shows that the proposed model performs better. In addition, the proposed detection model is compared with recent classification approaches and surpasses all of them in terms of prediction accuracy, as indicated in the Related Work section.
It offers a comprehensive analysis of the experimental findings.
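
The two dataset enhancements listed above (feature ranking by what the text calls gain information, more commonly known as information gain, and class balancing by random oversampling) can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the toy rows, the labels, and the function names are hypothetical, and the features are assumed to be discrete (continuous CICIDS2017 features would first need binning).

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus label entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical toy data: two discrete features, imbalanced labels.
rows = [(1, 0), (1, 1), (1, 0), (1, 1), (1, 0), (0, 1)]
labels = ["benign"] * 5 + ["attack"]

# Feature 0 separates the classes perfectly, so it receives the higher gain.
gains = [information_gain([r[i] for r in rows], labels) for i in range(2)]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

In a selection step, features would be sorted by their gain and only the top-ranked ones kept; how the cut-off of 61 features was chosen is not stated in this section.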

In this section, we present a short introduction to VANET systems and show that security concerns are crucial challenges that must be addressed to maintain successful VANET applications. In addition, the contributions of this study are also presented in this section. The rest of the paper is structured as follows. Section 2 discusses the VANET attack classifications considering different aspects. A thorough review and comprehensive discussion of the relevant work is presented in Section 3. It consists of two parts. The first part demonstrates some common authentication approaches with a focus on their drawbacks and weaknesses. Authentication is an important challenge faced by VANETs. The second part provides a significant investigation of recent methods that use ML and DL approaches to identify VANET attacks. Their achievements and limitations are discussed and highlighted. A comparison with the proposed model is also presented in this section. Section 4 provides a detailed description of the proposed detection model. It provides information regarding the model’s construction, dataset, and feature selection, and the creation of a balanced dataset, tools, and evaluation metrics. The results and discussion are presented in Section 5, and the paper is concluded in Section 6.

2. VANET Attack Classification

Modern transportation systems rely heavily on VANETs to facilitate communication between cars and infrastructure components. Nonetheless, VANETs are vulnerable to a range of security threats owing to their open and dynamic architectures. Many studies have presented different classifications for VANET attacks. Recently, Nandy et al. [19] classified VANET attacks into five categories: service-based, sensing-based, forgery-based, identity-based, and message-based. Table 1 presents the aforementioned classification of VANET attacks.

In the same context, Table 2 [10,20] shows the classification of some common VANET attacks based on their effects. It covers the effects of these attacks on authenticity, availability, confidentiality, integrity, privacy, and non-repudiation. Moreover, a layer-based classification of VANET attacks is presented in Table 3 [21]. This clearly indicates that most VANET attacks involve threatening the VANET layer.

3. Related Work

A VANET’s structure consists of three main components, OBUs, RSUs, and the trusted authority (TA) [22,23]: (i) Vehicles are equipped with OBUs that facilitate communication between them using the dedicated short-range communication (DSRC) protocol. Message exchange and key requests are ongoing procedures, as long as cars are driven on the road. (ii) RSUs are wireless devices positioned all over the road to gather and process messages and perform intelligent traffic control operations. (iii) The TA is responsible for assigning system parameters and managing all network entities. Before being permitted to join the VANET, the RSUs and vehicles should be registered at the TA. The TA possesses the greatest storage and communication capacity. Although the RSUs and TA use protected channels for communication, the vehicles and RSUs use an open wireless medium to transmit messages. Consequently, wireless environments are subject to a variety of attacks that affect VANET functionality and security. These attacks are increasing in terms of their ability to monitor, track, and modify the network traffic. Accordingly, multiple authentication protocols have been introduced to fortify VANET security; however, the majority of current authentication solutions that address security flaws result in significant computational and storage demands on OBUs [24].

3.1. Authentication

This section discusses some studies on authentication and highlights their common drawbacks. One of the most important and challenging aspects of VANET security is authentication [25]. Maintaining a sufficient authentication system while simultaneously protecting privacy calls for careful consideration and continues to pose a serious challenge.

Abdelfatah [24] pointed out that the lack of non-repudiation functionality in symmetric key-based authentication protocols has resulted in several security breaches in VANETs. Furthermore, elliptic curve cryptography (ECC) is used in public key-based authentication techniques, which complicates protocol implementation. To overcome these issues, Abdelfatah presented an authentication protocol that uses Chebyshev chaotic maps. As a result, compared to other network models, their protocol presents a model with the least hardware complexity and surpasses others in terms of both performance and security. However, it should be highlighted that the main disadvantage of this approach is how time-consuming it is [26].
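
Protocols built on Chebyshev maps typically rely on the semigroup property of Chebyshev polynomials, T_r(T_s(x)) = T_rs(x) = T_s(T_r(x)), which lets two parties derive the same value from different private exponents. The sketch below only demonstrates that property; it does not reproduce the protocol in [24]. Working modulo a prime, the modulus `P`, the seed `x`, and the exponents `r` and `s` are all illustrative assumptions.

```python
def chebyshev(n, x, p):
    """T_n(x) mod p, using the recurrence T_k = 2*x*T_{k-1} - T_{k-2}."""
    if n == 0:
        return 1 % p
    prev, curr = 1, x % p
    for _ in range(n - 1):
        prev, curr = curr, (2 * x * curr - prev) % p
    return curr


P = 2**31 - 1   # illustrative prime modulus (assumption)
x = 123456      # public seed value (assumption)
r, s = 37, 101  # private exponents of the two parties (assumption)

# Each side applies its own exponent to the other side's public value.
key_a = chebyshev(r, chebyshev(s, x, P), P)
key_b = chebyshev(s, chebyshev(r, x, P), P)
```

Both sides end with the same value, which also equals `chebyshev(r * s, x, P)`. The linear-time recurrence is only suitable for small exponents; a practical scheme would need a faster evaluation method and much larger parameters.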

Naskar et al. [16] pointed out that authentication techniques are essential in VANETs for building mutual confidence between interacting entities. To ensure the reciprocal verifiability of transmitted communications, a centralized trust authority (CA) is often used in traditional authentication systems. However, in situations where there is a very high vehicle density inside the network, an excessive volume of authentication requests may cause a communication bottleneck in the CA, resulting in a single point of failure. Wei et al. [27] reported that authenticated key agreement (AKA) methods currently in use have two major shortcomings. First, the overhead in communication and computation is too high to satisfy the demands of applications that are sensitive to delays (delay-sensitive applications). Second, the concept of a multi-trusted authority has not been considered.

Wei et al. [28] indicated that traditional conditional privacy-preserving authentication (CPPA) techniques have two drawbacks. First, an extremely short transmission delay is required for traffic emergency messages, yet the overhead in communication or storage is not sufficiently low. Second, although side-channel attack techniques and the wide use of the system secret key (SSK) increase the likelihood of SSK breaking, traditional CPPA systems do not consider updating the SSK. Wei et al. proposed a new CPPA scheme to overcome these issues. To address the first issue, they proposed a CPPA signature scheme based on elliptic curve cryptography. This scheme can protect emergency traffic messages at extremely low communication costs. To address the second issue, they designed an SSK updating algorithm based on a secure pseudorandom function and Shamir’s secret sharing algorithm. According to the authors, the security and privacy criteria of VANETs are satisfied by the proposed scheme. Compared to other methods, it offers a smaller transmission delay and consumes less storage space. However, Zhang and Zhang [29] indicated that the proposed CPPA scheme is forgeable, insecure, and unable to satisfy conditional privacy.
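
Shamir's secret sharing, one of the two building blocks named for the SSK updating algorithm, can be sketched as follows. This shows only the generic primitive (split a secret into n shares so that any t of them recover it), not the updating algorithm of [28]. The field size, the function names, and the seeded generator are assumptions made for a reproducible illustration; real use would require a cryptographically secure random source.

```python
import random

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus (illustrative choice)


def split_secret(secret, threshold, num_shares, rng):
    """Return points on a random polynomial of degree threshold-1 with f(0) = secret."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of f(x) mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Seeded generator for reproducibility only; NOT suitable for real keys.
shares = split_secret(123456789, threshold=3, num_shares=5, rng=random.Random(1))
recovered = recover_secret(shares[:3])
```

Any three of the five shares reconstruct the secret, whereas fewer than three reveal nothing about it. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.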

A technique is required for VANETs to authenticate messages, identify legitimate vehicles, and eliminate malicious vehicles. This technique can be offered via a public key infrastructure (PKI) that uses fixed public keys and certificates. However, fixed keys violate the privacy of drivers because they enable an eavesdropper to link a key to a certain vehicle and location [30].

The primary concept behind PKI systems is that each vehicle’s OBU must preload a significant number of pair keys and their corresponding anonymous certificates, which significantly increases the TA’s certification management workload. Furthermore, because of its limited storage capacity, a vehicle bears the responsibility of storage management. Additionally, during the authentication phase, the verifying vehicle must determine whether the certificate is legitimate, which increases the processing complexity of the VANET system [31].

According to Al-Shareeda [32], the open-medium nature of vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) communications makes VANETs susceptible to security issues. Numerous studies have proposed security plans to address these issues. Unfortunately, many of these approaches have high computing overheads, particularly when using the batch verification method, which simultaneously checks several messages. Consequently, using ECC, a Lightweight Security Without Using Batch Verification Method (LSWBVM) technique is provided in [32]. The LSWBVM technique uses an XOR operation and a generic hash function. In an area with high traffic density, the proposed technique employs single verification rather than batch verification. According to the authors, LSWBVM accomplishes the security objectives of enabling mutual authentication between nodes and demonstrates that single verification is more effective than batch verification.

Li and Wang [33] proposed a secure authentication method that could withstand typical attacks. Digital signature technology is considered suitable for security authentication procedures. However, this can render the identity of the node visible. Accordingly, an anonymous signature authentication method based on group/alias has been developed to address this issue. However, there are still restrictions on authentication costs and certificate revocation [34].

Li et al. [35] proposed an authentication system with conditional privacy preservation and non-repudiation. They employed two authentication systems: ID-based Signature and ID-based Online/Offline Signature. They reported that their proposed system met all requirements and was adequate for urban vehicular communications. However, the proposed system employs public key cryptography, which may result in additional overhead. Therefore, it is necessary to verify the efficiency of the proposed system in large-scale networks [10].

In general, authentication security, which is the main concern for VANET security, remains a challenge. Moreover, creating a suitable authentication system is just one of the requirements. Another is that we must carefully consider protecting privacy when creating that system.

3.2. Machine Learning

VANETs are susceptible to a wide range of security threats and vulnerabilities [36] owing to the variety of communication schemas and the inherent characteristics of wireless communications. These networks are vulnerable to novel and targeted attacks that use features unique to vehicles and the standard flaws in wireless environments. For VANETs, the majority of the security solutions designed for traditional networks are inappropriate. Therefore, to support the characteristics of vehicle networks and offer strong security procedures, researchers are searching for suitable systems. Numerous security countermeasures have been proposed, including cryptographic algorithms, traceability techniques, anonymity, and key-management systems. Recently, many studies have demonstrated that the efficiency of intrusion detection systems (IDSs) in identifying VANET attacks can be improved by incorporating artificial intelligence (AI). IDSs are popular methods that look for signs of security breaches in traffic and raise alarms whenever a security abnormality is noticed. Furthermore, ML can be used to create anomaly-based detection systems that can recognize unknown and zero-day threats, learn from and train themselves by examining network behavior, and gradually improve detection accuracy. The application of ML algorithms for intrusion detection in VANETs is of special interest, given the volume of data that is transferred and the variety of attacks that can occur. In recent years, many published datasets have described actual VANET communication traces, making it possible to evaluate the effectiveness of ML algorithms [20]. ML algorithms are a major contributor to the detection of many security attacks and can effectively meet current real-world requirements in the security field [37,38] and many other fields such as health [39] and machine translation [40,41].

Nevertheless, any attempt to mimic normal traffic causes ML algorithms to either fail to detect attacks or to become less accurate. Furthermore, ML algorithms have several flaws that allow attackers to launch complex attacks. Thus, it is imperative that weaknesses in ML techniques be evaluated in the early stages of development [42]. The extracted features of a detection model will not work if an attacker can evade the statistical analysis of VANET attacks. As a result, it is important to emphasize that more contributions to the ML domain are required for further enhancement.

Anyanwu et al. [43] pointed out that securing VANETs against DDoS attacks is a challenge. Because these networks operate in real time, intrusion detection techniques based on encryption and authentication that incorporate broad or general mitigation measures are unreliable or problematic. These intrusion detection techniques reduce the dimensions of the supposed data by concentrating only on the payload of the data. However, AI and ML techniques are less affected by encryption and preserve the vehicular space; they have been adopted to identify and categorize different types of distributed malicious intent on VANETs and their technologies.

Alsarhan et al. [44] proposed a hybrid IDS for preventing cybersecurity attacks on VANET. Three approaches were integrated. These approaches include event-specific trust, rule-based filtering systems, and prior knowledge. While Bayesian Learning (BL) is used to update the risk level of attacks using the database’s historical data and attack categorization model, the Dempster–Shafer theory is used to compute the risk of attacks by integrating various pieces of evidence. Comparative studies have demonstrated the efficacy of the proposed IDS through a series of tests. However, their proposed method is limited to identifying known attacks; therefore, the authors’ future work will incorporate DL techniques such that their proposed system can identify new attacks. In addition, the proposed IDS requires a high detection time for attacks [45].
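
The evidence-fusion step mentioned above can be illustrated with Dempster's rule of combination, the core operation of Dempster–Shafer theory. The sketch is generic: how [44] assigns belief masses to its evidence sources is not described here, so the two sources and their mass values are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


ATTACK = frozenset({"attack"})
BENIGN = frozenset({"benign"})
EITHER = ATTACK | BENIGN  # the whole frame: "cannot tell"

# Hypothetical masses from two evidence sources.
source_1 = {ATTACK: 0.6, EITHER: 0.4}
source_2 = {ATTACK: 0.5, BENIGN: 0.2, EITHER: 0.3}

fused = combine(source_1, source_2)
```

With these inputs the conflicting mass is 0.12, and the fused belief in an attack rises to 0.68 / 0.88, roughly 0.77, which is higher than either source assigned on its own.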

Karthiga et al. [45] presented an IDS that can identify known and unknown attacks targeting VANETs. Their approach involves two methods, DL and ML. The authors emphasized that, whereas existing systems concentrate solely on known VANET threats, their proposed system addresses both known and unknown attacks. Convolutional neural networks (CNNs) are used in their suggested method to identify unknown attacks, while the Adaptive Neuro-Fuzzy Inference System (ANFIS) is used to identify known attacks. When tested on the i-VANET and CIC-IDS 2017 datasets, the proposed system performed better than several of the existing methods.

The RF methodology has numerous benefits compared to other supervised classification models. It can handle unbalanced databases, enhance decision tree performance by randomizing the variable feature selection, and accelerate the prediction process. RF classification is an ensemble in which weak learners are combined to create a powerful learner. To prevent the overfitting of the training datasets, the RF algorithm creates a large number of classification trees, each of which is built using a bootstrap randomized resampling method. The method gathers each tree’s prediction results, sets up a voting system, and then determines the classification by taking a plurality vote among classifiers [46].
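
The mechanism described above (bootstrap resampling, randomized feature choice, and a plurality vote) can be sketched with a deliberately tiny forest. This is a simplified stand-in, not the Random Forest implementation used in the study: each "tree" is a one-split stump that thresholds one randomly chosen feature at its sample mean, and the data and function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples with replacement."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


def fit_stump(rows, labels, rng):
    """Weak learner: threshold one randomly chosen feature at its mean."""
    feature = rng.randrange(len(rows[0]))
    threshold = sum(r[feature] for r in rows) / len(rows)
    left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
    majority = Counter(labels).most_common(1)[0][0]
    left_label = Counter(left).most_common(1)[0][0] if left else majority
    right_label = Counter(right).most_common(1)[0][0] if right else majority
    return feature, threshold, left_label, right_label


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def plurality_vote(votes):
    """Return the label predicted by the largest number of learners."""
    return Counter(votes).most_common(1)[0][0]


def fit_forest(rows, labels, n_trees, seed=0):
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(rows, labels, rng), rng) for _ in range(n_trees)]


def predict_forest(forest, row):
    return plurality_vote([predict_stump(s, row) for s in forest])


# Hypothetical, well-separated toy data.
rows = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (1.0, 0.9), (0.9, 1.0), (1.0, 1.0)]
labels = ["benign"] * 3 + ["attack"] * 3

forest = fit_forest(rows, labels, n_trees=25)
low = predict_forest(forest, (0.05, 0.05))
high = predict_forest(forest, (0.95, 0.95))
```

An individual stump trained on an unlucky bootstrap sample (for example, one containing a single class) predicts poorly, but the vote across 25 stumps absorbs such cases, which is the ensemble effect the paragraph describes.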

Setia et al. [47] investigated several ML approaches using a simulation-based attack dataset including DDoS attacks. The investigated classifiers were decision tree (DT), Logistic Regression (LR), RF, k-nearest neighbors (KNN), Naive Bayes (NB), and Kernel Support Vector Machine (Kernel SVM). The results indicated that the models with the best accuracy, precision, recall, and F1 scores were the DT and RF models. NB, LR, and KNN came next, while Kernel SVM had the lowest scores. DT and RF achieved an accuracy rate of 99.59%. Therefore, the authors recommend them as appropriate for simulation-based attack identification. However, the generated dataset requires more evaluation, and the study focuses on only one type of VANET attack, namely DDoS.
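
For reference, the four scores named above follow directly from the confusion counts of a binary attack/benign classifier. The sketch below uses the standard definitions; the label names and the example predictions are hypothetical and unrelated to the results reported in [47].

```python
def binary_metrics(y_true, y_pred, positive="attack"):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical predictions: 2 true positives, 1 miss, 1 false alarm, 4 true negatives.
y_true = ["attack"] * 3 + ["benign"] * 5
y_pred = ["attack", "attack", "benign", "benign", "benign", "benign", "benign", "attack"]

scores = binary_metrics(y_true, y_pred)
```

Here accuracy is 6/8 = 0.75, while precision, recall, and F1 are each 2/3. On an imbalanced dataset, accuracy alone can look high even when recall on the attack class is poor, which is why the studies reviewed in this section often report all four.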

Kawale et al. [48] proposed an ensemble ML attack detection approach for VANETs. The proposed ensemble approach comprises four classifiers: multi-layer perceptron (MLP), Support Vector Machine (SVM), AdaBoost, and Bayesian. The outputs of these classification approaches are assembled using a fuzzy-based ranking technique. The proposed approach achieved an accuracy rate of 94.97%. However, there is no clear description of how the dataset was generated; it is mentioned only that the data were combined from different online sources. In addition, although ensemble techniques are commonly known to enhance detection accuracy, they incur a higher computational cost. The authors do not provide information regarding the proposed model’s computational cost and processing time.

ALMahadin et al. [49] proposed a semi-supervised model to detect VANET activity anomalies. This is a GRU-based DL model called SEMI-GRU. The proposed model was evaluated using the NSL-KDD dataset and demonstrated better performance. It achieved an accuracy rate of 83.31%; therefore, it outperformed some existing techniques provided in their article. However, the accuracy obtained is moderate. DL models demand a lot of computing power and may not function in all circumstances. In addition, the NSL-KDD dataset does not include recent attacks.

Ercan et al. [50] proposed an ensemble ML classification approach in which the outputs of four classifiers (DT, AdaBoost, RF, and KNN) were combined using an LR classifier. The dataset employed was VeReMi. After applying feature selection procedures and resampling methods, the authors claimed an improvement in classification performance, but the overall accuracy of the proposed model was not given. Moreover, the computational cost of the proposed approach is questionable and requires more investigation: ensemble classification approaches are known to enhance prediction accuracy greatly, but they require more computational power.

Rashid et al. [51] proposed a real-time malicious node prediction approach using ML. Their results were evaluated using SUMO and OMNET++ with ML classification approaches that included SVM, RF, MLPC, LR, and Gradient Boosted Tree (GBT). The obtained results show that RF and GBT performed better than SVM and LR, by achieving accuracy rates of 98% and 97%, respectively. The authors claimed that the accuracy rate of the proposed system approaches 99%. However, their target attacks were only misbehavior and DDoS.

Marouane et al. [52] provided an empirical assessment of the use of DL and ML methods for misbehavior detection in VANETs. The classification methods evaluated in this study include long short-term memory (LSTM), artificial neural networks (ANNs), deep neural networks (DNNs), AdaBoost, RF, KNN, and DT. The study also focused on the trade-off between training and execution durations, showing that while ensemble and DL models yield more accurate outcomes, simpler techniques are quicker overall. With the exception of the ANN, which exhibited a recall lower than 90%, the authors reported that all applied classifiers showed good detection performance, often better than 94%. RF surpassed all of them, with a classification accuracy of 99%. However, the authors indicated that CNN, SVM, and RNN were also assessed, yet their article presents no results for these models.

Anyanwu et al. [43] indicated that securing software-defined networking (SDN)-based VANETs is essential and requires the application of AI methods. To identify DDoS attacks in the automotive domain, they proposed an intrusion detection model. The proposed model uses a grid search cross-validation technique and the SVM classifier’s radial basis function (RBF) kernel. The OBUs of each vehicle can be equipped with the proposed model. These units collect vehicular data and perform intrusion detection to determine whether a message is authentic or contains a DDoS attack. The overall accuracy of the proposed model is 99.33%. However, the model focuses on one type of attack, instead of considering various types. The ability to detect several types of attacks can significantly increase the visibility of the proposed model in terms of the computational cost.

Alsarhan et al. [53] examined SVM optimization using three ML algorithms: a genetic algorithm (GA), ant colony optimization (ACO), and particle swarm optimization (PSO). They evaluated the optimization results with respect to classification accuracy. Their findings demonstrate that using a GA to optimize SVM parameters resulted in a detection rate of 99%. This indicates that the GA outperformed the PSO and ACO algorithms. These models were tested using the NSL-KDD dataset. However, training their model on a limited number of network scenarios raises questions about the model’s robustness [43]. In addition, the NSL-KDD dataset does not include recent attacks [54].

Kaur and Kakkar [12] suggested a hybrid optimization model for attack classification on a VANET based on a Deep Maxout Network (DMN) classifier. Attack categorization was performed using the DMN, which was trained using the presented optimization approach. The proposed model achieved a recall of 0.9462 and an accuracy of 0.9395 in the classification process. However, its computational cost is high [48].

Karthiga et al. [45] proposed an IDS by integrating ML and DL approaches to detect malicious attacks on VANETs. They claimed that the proposed IDS can discover both known and unknown attacks. Known attacks can be identified using the ANFIS classification model, and a DL model is employed to predict unknown attacks. Four types of attacks are considered using the CIC-IDS 2017 and i-VANET datasets. These include DoS, botnet, portscan, and brute force attacks. The average detection accuracy for these attacks was 98.6% when tested on both the i-VANET and CIC-IDS 2017. However, the detection of unknown attacks needs more verification.

Vitalkar et al. [55] proposed an intrusion detection system for VANETs based on a deep belief network (DBN) classification model. For the training, testing, and evaluation, the CICIDS2017 dataset was employed. The results obtained showed that the proposed model achieved accuracy rates of 98% and 90% for binary classification and multi-classification, respectively. However, it is a complex detection algorithm [45]. In addition, the computational cost is questionable as the proposed classification model is based on a DL method that requires more computational power.

Anyanwu et al. [56] proposed an ensemble model detection approach. The proposed approach was trained and tested using the BurST-ADMA dataset. Because it is an updated dataset with potential characteristics for false message detection in the Internet of Vehicles (IoV), this dataset was chosen for the assessment of the proposed model. The findings indicate that optimizing ensemble models can increase accuracy and lower the expense of misclassification. The authors claimed that the proposed AdaBoost achieved better accuracy, reaching 98.92%, compared to the other ensemble classification approaches. Ensemble classification approaches are capable of providing better performance; however, they involve more computations that need extra computational power, which can affect the network performance. Low-cost solutions are more applicable.

Bangui et al. [57] proposed a hybrid ML model to enhance the performance of intrusion detection systems. Their model uses an RF classifier to identify known attacks and an unsupervised clustering algorithm based on coresets to detect unknown attacks. The model was tested on the CICIDS2017 dataset and showed good performance. The detection efficiency was significantly increased compared to some common ML approaches with a classification accuracy that reached 96.93%. However, it requires more training time, more resources, and more computational power [48].

The most popular in-car communication protocol is the controller area network (CAN). However, CANs are devoid of appropriate security features such as encryption and message authentication. Therefore, a CAN bus is open to various cyberattacks. These attacks pose a direct risk to the safety of drivers, passengers, and the surrounding areas of moving vehicles, in addition to being a threat to information security and privacy. To identify cyberattacks on a CAN bus, Rajapaksha et al. [58] presented an intrusion detection system (IDS). This system can be used in commercial vehicles, military vehicles, passenger cars, and other CAN-based applications including medical equipment, industrial automation, and aerospace. Their proposed model is an ensemble of a time-based model and a gated recurrent unit (GRU) classification model. The efficacy of the proposed model was evaluated on three distinct datasets encompassing 16 attacks, including both fabricated and masquerade attacks. For 13 of these attacks, the model received an F1 score greater than 99%. However, it has limited capacity to identify attacks on high-frequency aperiodic IDs [59].

Refat et al. [60] proposed a CAN intrusion detection system (IDS), in which seven graph-based properties are extracted for use as features to detect attacks by employing two classification methods, SVM and KNN. The performance of the proposed IDS was tested on three CAN bus attacks, including DoS, fuzzy, and spoofing attacks, using a real vehicular CAN bus dataset. The experimental results showed that the proposed IDS achieved a detection accuracy of 97.99% and 97.92% for KNN and SVM, respectively. However, the detection rate for the spoofing attack was low [59].

Sharma and Jaekel [61] proposed an ML classification approach to detect position falsification attacks in VANETs. In the proposed approach, the computational overhead was shifted from the OBU to the RSU. The misbehavior detection approach was implemented in RSUs, so that RSUs can share this information with vehicles and other RSUs. The results obtained show that the proposed approach outperforms existing methods in terms of recall and precision. The proposed model was trained using the VeReMi dataset, which includes five types of attacks. The classifiers investigated were KNN, DT, RF, and NB. Both KNN and RF performed better than the other algorithms. KNN was the best, achieving 99.2% and 98.8% for precision and recall, respectively. However, the dataset employed in this study does not represent all possible VANET position falsification attacks. In addition, there is no information regarding the accuracy of the proposed approach [62].

Gad et al. [54] proposed an intrusion detection system for VANETs. Several classification methods were examined to select the most applicable method. The investigated classifiers included LR, NB, RF, DT, AdaBoost, KNN, SVM, and XGBoost. All were trained on the ToN-IoT dataset. To improve the classification performance, the authors applied a feature selection technique to retain the features with a high influence on the prediction accuracy and implemented the SMOTE method to solve the class imbalance problem. The results show that XGBoost outperformed the other classification methods. It achieved accuracy rates of 97.9% and 98.2% for multi-class classification and binary classification, respectively. However, XGBoost is an ensemble classifier, and therefore its computational cost needs to be investigated.

Sonker and Gupta [63] compared different classification approaches including NB, KNN, stochastic gradient descent (SGD), DT, and RF. The classifiers were trained and tested using the VeReMi dataset. Accordingly, the authors proposed an approach based on the RF classifier, which achieved the highest accuracy among the classifiers, reaching 97.62%. The attacks investigated in their study were constant, constant offset, random, random offset, and eventual. However, the comparison does not include the computational cost of these classifiers. This is an important factor to be considered, especially when dealing with networks that involve high mobility and rapid changes in topology.

Adhikary et al. [64] proposed a hybrid detection model that combines a DT and a Neural Network (NN) to detect DDoS attacks in VANETs. Based on their experimental results, the performance of the proposed hybrid model increased significantly compared with that of each individual model. It achieved a classification accuracy of 96.40%. The authors used a self-generated dataset produced by simulating a VANET under two conditions, normal and DDoS attack. Even though the accuracy of the proposed hybrid model is high and might be regarded as a near-reliable criterion for a VANET, the model leads to an increase in computing complexity [43].

Shams et al. [65] proposed a classification method for detecting VANET attacks based on SVM. They indicated that the SVM is a powerful ML algorithm that is applicable for suitable classification with low resource consumption. Their model was trained and tested using a self-made dataset created using simulated VANET traffic. The proposed model can classify the absence and presence of DoS attacks. It was compared with the Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) methods and showed better performance. It achieved an average of 99% for recall and precision. However, their proposed model targets one type of attack. More verification of their model using a public and trusted dataset is required.

Vitalkar et al. [66], in their article, proposed a DL method to detect VANET attacks. The method was based on the DBN classifier, which was trained and tested using the CICIDS2017 dataset. The proposed method achieved an accuracy rate of 98.07%. However, DL models cause considerable computational overhead.

By leveraging deep learning approaches that were trained using IoT datasets, Thorat et al. [67] claimed that deep learning approaches are more capable of detecting known and unknown VANET attacks compared to traditional approaches. Their outcomes indicate that CNNs and LSTMs performed better in terms of discovering known attacks with detection rates of 96% and 95%, respectively, while autoencoders are capable of identifying unknown attacks with a detection rate that reached 85%. However, the authors stated that CNNs may struggle with long-term dependencies and LSTM is more complex to train. In addition, an autoencoder may struggle with complex patterns.

Based on this thorough review, the authors of this article proposed a lightweight ML model based on an RF classifier with an information gain feature selection method to detect VANET attacks. A balanced version of the well-known and recent CICIDS2017 dataset was developed in this study by applying a random oversampling technique. The developed dataset was used to train and test the proposed model. Two layers of enhancement were employed to improve the model’s performance: applying a suitable feature selection method and applying a random oversampling technique to solve the class imbalance problem. Eliminating this problem can greatly improve classification accuracy and avoid bias in prediction caused by the influence of majority classes. Accordingly, the proposed model showed strong performance in terms of classification accuracy, computational cost, and classification error. It achieved an accuracy rate of 99.8%, outperforming all benchmark classifiers: AdaBoost, DT, KNN, and MLP. In terms of processing cost, the proposed classifier consumed the least processing time, requiring only 69%, 59%, 35%, and 1.4% of the AdaBoost, DT, KNN, and MLP processing times, respectively.

A summary of the above related work is given in Table 4, which provides key information on each method, including the author, achieved accuracy, dataset used, and limitations. In addition, the model proposed in this article is provided in the last row of the table. This clearly indicates that the proposed model outperforms all relevant and recent methods.

4. Materials and Methods

ML algorithms play a crucial role in guaranteeing the security and dependability of network systems and applications because of their capacity to handle massive amounts of data and detect patterns and abnormalities that may lead to security breaches.

This section provides details regarding the proposed model, which is intended to detect the most common and effective attacks that threaten VANETs and their applications. All classification models in this work were implemented using Orange version 3.34.0, running on a computer with 8 GB of DDR3 RAM and an Intel(R) Core™ i7-3720QM CPU at 2.60 GHz.

Five classification models were carefully selected based on their extensive use in the security field and their outstanding prediction performance, as proven in several studies. These classifiers include AdaBoost, RF, MLP, KNN, and DT. Table 5 shows the classifiers and their corresponding parameters. These classifiers were built and then trained and tested using a well-known, recent, trusted, and updated dataset called CICIDS2017 to select the most effective classifier based on the performance and computational cost. The dataset was then improved by selecting features that had the greatest influence on the prediction process. This was performed by computing the weight of each feature. Subsequently, the selected classifiers were tested on our enhanced dataset, CICIDS2017-GI, and they exhibited better performance when the computational overhead was considered. Another layer of enhancement was applied to the dataset. A random oversampling technique was applied to balance the dataset and eliminate the imbalance problem. In conclusion, these layers of enhancement significantly enhanced the prediction accuracy and processing time of the proposed model. Compared with state-of-the-art models, the proposed model outperformed all existing models. The remainder of this section provides details regarding the dataset, the feature selection method, the tools used for the experiments, the evaluation metrics, and a discussion of the results.

4.1. Dataset

The CICIDS2017 dataset has drawn academic attention since its release as a basis for developing new models and algorithms. It contains five days of normal and attack traffic data from the Canadian Institute for Cybersecurity, spread across eight separate files. It includes 3,119,345 instances and 83 features with 15 class labels: 14 attack labels plus one normal class [70].

In CICIDS2017, data were captured across different periods. This dataset includes attacks that are not included in the UNSW-NB15, KDD Cup 1999, or NSL-KDD datasets. The CICIDS2017 attacks are categorized as DDoS, DoS, heartbleed, botnet, infiltration, brute force SSH, brute force FTP, and web attacks. The features of this dataset set it apart from the others in terms of trustworthiness and realistic benchmarking. To ensure the reliability of the CICIDS2017 evaluation, the benchmarking used 11 criteria involving available protocols and complete traffic [71]. Sharafaldin et al. [72] revealed that a range of modern multi-stage attacks, including heartbleed and various DoS and DDoS attacks, were included in this dataset. Moreover, a range of modern protocols were included.

Panigrahi et al. [70] outlined some drawbacks of this dataset, including its scattered presence, large volume of data, missing values, and high class imbalance. Accordingly, they developed an enhanced version of the dataset that addresses these drawbacks. This enhanced version was used in the present study; it was obtained from the author upon request. The dataset consists of 91,830 instances with 79 features. Six types of attacks are included in this dataset, in addition to the normal instances. Table 6 lists the number of instances for each attack, together with the number of normal instances.

4.2. Feature Selection

Using a suitable feature selection technique is essential to reduce the computational overhead and increase prediction accuracy. Feature selection is a useful technique for lightweight detection models [73]. This procedure greatly improves the performance of classification models by reducing the number of features, retaining those that have a strong effect, and ignoring the rest. Feature selection methods can serve different purposes. For example, to eliminate redundant features, Fang et al. [74] proposed a genetic algorithm-based feature selection approach and demonstrated its effectiveness on a real dataset. Taheri et al. [75] proposed a feature selection method that reduces the number of features to enhance intrusion detection systems. More relevant work can be found in [76,77]. Choosing a computationally efficient feature selection technique is a sensible first step towards improving prediction performance and accuracy when working with VANETs, which exhibit rapid changes in topology and fast mobility. The information gain feature selection technique was used in this study. It is a widely used method for choosing pertinent features because it is easy to calculate and interpret. This method measures how strongly each feature and the class label depend on one another [78]; that is, it quantifies the amount of information a feature provides about the class. The technique was implemented using Orange software to compute the weight of each feature. Only features with considerable weights were selected: all features with weights less than 0.1 were dropped. Accordingly, 61 of the 79 features were retained and 18 were dropped. As an example, Table 7 lists the computed weights of the six highest-weighted features, and Table 8 lists the ignored features with their calculated weights.
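The scoring and thresholding step described above can be sketched as follows. Information gain for a feature X and class label Y is H(Y) − H(Y | X), where H denotes Shannon entropy. This is only a minimal illustration, assuming the features have already been discretized; the study itself used Orange, whose internal implementation is not reproduced here. The function names (`entropy`, `information_gain`, `select_features`) and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG = H(Y) - H(Y | X) for one discrete feature X and class label Y."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    # Conditional entropy: entropy within each feature value, weighted by frequency.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold=0.1):
    """Drop features whose information gain is below `threshold` (0.1 in the text)."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores


# Toy data (hypothetical): one informative feature and one uninformative one.
labels = ["attack", "attack", "normal", "normal"]
columns = {
    "flag_count": [1, 1, 0, 0],  # perfectly separates the two classes
    "padding": [0, 1, 0, 1],     # carries no information about the class
}
kept, scores = select_features(columns, labels)
```

In this toy case `flag_count` scores 1 bit and is kept, while `padding` scores 0 and is dropped. Continuous features would need to be binned first; the binning strategy is a separate choice that this sketch does not address.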

4.3. Creation of a Balanced Dataset

Table 6 shows the imbalance in the CICIDS2017 dataset: the number of instances varies considerably from one attack to another. This imbalance contributes to bias in the classification accuracy and to erroneous predictions. A useful solution to this problem is to apply an appropriate resampling method, which entails increasing the number of records in the minority class or decreasing the number of records in the majority class.

To overcome this drawback, a random oversampling approach was used to create a balanced version of the dataset. By either repeating existing instances or creating new ones, the random oversampling strategy increases the number of instances in the minority classes. Oversampling addresses the issue of learning from skewed data in which one class is underrepresented in relation to other classes [79]. It has been reported to perform better than undersampling: Mohammed et al. [80] indicate that, for different classification models, oversampling showed better performance and received high scores for many evaluation metrics. This is the reason behind the selection of the oversampling technique. The technique was implemented in Python (version 3.12.0) to balance the dataset. Consequently, a balanced dataset of 1,700,330 records was created with 26,185 instances for each attack.
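The authors' Python code is not included in the article. As an illustration only, random oversampling by duplication can be sketched as below, assuming that every class is raised to the size of the largest class; the function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement from the existing rows of this class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical): 5 normal rows, 2 DoS rows, 1 botnet row.
rows = [[i] for i in range(8)]
labels = ["normal"] * 5 + ["dos"] * 2 + ["botnet"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
```

After the call, each of the three toy classes has five rows. No new feature values are synthesized in this sketch; every added row is a copy of an existing minority-class row.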

4.4. Evaluation

The acceptance of any classification model requires validation to ensure reliable and realistic results [81]. In this context, the authors employed a cross-validation procedure with 10 folds for each model. The classification accuracy (AC) of each model was calculated using Equation (1).

AC = \frac{TP + TN}{TP + TN + FP + FN}

(1)

where true negatives (TNs) and true positives (TPs) are the numbers of correctly predicted negative and positive cases, respectively. False negatives (FNs) are the number of positive cases mistakenly classified as negative, whereas false positives (FPs) are the number of negative cases mistakenly classified as positive.

A confusion matrix was used to evaluate the performance and efficacy of the models. To quantify classification errors, false positives (FPs) and false negatives (FNs) were employed. Equations (2)–(4) were also used to calculate the F1 score (F1), precision (P), and recall (R), among other metrics that were used to assess the models. For further evaluation, receiver operating characteristic (ROC) curves were calculated for each classification model.

F1 = 2 \times \frac{P \times R}{P + R}

(2)

P = \frac{TP}{TP + FP}

(3)

R = \frac{TP}{TP + FN}

(4)
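Equations (1)–(4) translate directly into code. The sketch below computes the four metrics from confusion-matrix counts; the function name `classification_metrics`, the example counts, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not details taken from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts,
    following Equations (1)-(4)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Assumed convention: report 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for one attack class treated as the positive class.
m = classification_metrics(tp=90, tn=880, fp=10, fn=20)
```

For a multi-class problem such as this one, the counts would be taken per class (one class against the rest) and the per-class metrics then averaged; which averaging scheme the reported tables use is not stated in this section.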

5. Results and Discussion

The selected classification models, including RF, AdaBoost, DT, KNN, and MLP were built, trained, and tested using the revised version of the CICIDS2017 dataset. The computed accuracy, recall, precision, F1 score, and Matthews correlation coefficient (MCC) are listed in Table 9.
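The MCC is reported in Table 9 but is not defined alongside Equations (1)–(4). For reference, its standard binary form, written with the same symbols, is given below. The results here are multi-class, so a multi-class generalization was presumably applied; that is an assumption, as the section does not say which variant was computed.

```latex
MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
```

The value ranges from −1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction).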

The performance metrics indicate that the RF classifier outperforms all the other benchmark classifiers. It achieved 99.5% for all performance metrics: accuracy, recall, precision, and F1 score. For visualization, all of the computed metrics are shown in Figure 1.

Another important factor supporting this finding is the computational cost of each model. To assess this, the processing time for all classifiers was computed, and this indicates that RF still outperforms the other classifiers by consuming the least processing time. Figure 2 shows the processing times for all classifiers while Table 10 lists the training time, testing time, and computation time.

RF required the least processing time, followed by AdaBoost, DT, and KNN, while MLP lagged far behind. This is expected because neural network models are computationally expensive.

The computational cost is a crucial factor when considering networks that require real-time actions. This confirms that the RF classifier is much better; it requires only 71%, 51%, 27%, and 1.3% of the AdaBoost, DT, KNN, and MLP processing times, respectively.
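Percentages of this kind are simple ratios of measured times. The sketch below shows one way to time a call and express one classifier's time as a percentage of the others'. The helper names `timed` and `relative_cost` are hypothetical, and the timings are made-up numbers, not the values from Table 10.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def relative_cost(reference_time, other_times):
    """Express `reference_time` as a percentage of each other classifier's time."""
    return {name: 100.0 * reference_time / t for name, t in other_times.items()}


# Timing an arbitrary call (here, summing a range) to show the helper's shape.
total, elapsed = timed(sum, range(1000))

# Hypothetical timings in seconds; these are NOT the values from Table 10.
rf_seconds = 2.0
others = {"AdaBoost": 2.5, "DT": 4.0, "KNN": 8.0, "MLP": 160.0}
ratios = relative_cost(rf_seconds, others)
```

With these illustrative numbers, the reference classifier needs 80%, 50%, 25%, and 1.25% of the other classifiers' times. In a real comparison, `timed` would wrap the training and testing calls of each model on the same hardware.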

After concluding that RF was the best classification model in terms of accuracy and processing cost, two layers of enhancements were applied to further improve both the classification accuracy and processing cost. These two layers include (i) applying an appropriate feature selection approach to only choose the features with a high level of influence on the classification results, which boosts both the prediction accuracy and processing cost, and (ii) applying a resampling method to solve the imbalance problem in the dataset. Having a balanced dataset can significantly enhance prediction performance and eliminate bias in the classification results.

In the first layer of enhancement, the information gain feature selection method described in the Materials and Methods section was applied. Accordingly, the 61 features with a score of at least 0.1 were selected to form the newly developed dataset, CICIDS2017 with gain information (CICIDS2017-GI).

All classifiers were trained on the aforementioned dataset, CICIDS2017-GI. Almost all classifiers achieved the same classification accuracies as before the feature selection method was applied. This indicates that the feature selection technique successfully selects the features that have a significant influence on prediction accuracy and neglects those that do not. In addition, removing features that carry little weight in the classification can significantly reduce the processing cost, given that fewer features need to be processed. Table 11 shows the accuracy and processing time. For each classifier, the accuracy remains the same while the training time, testing time, and execution time decrease, which means that applying the information gain feature selection method improves the processing cost.

The second layer of enhancement is the development of a balanced dataset that can reduce bias in the prediction process and improve classification accuracy. As described in the Materials and Methods section, a random oversampling method was implemented in Python and applied to the CICIDS2017-GI dataset, which was developed by applying the information gain feature selection method. For this study, the developed balanced dataset was called Balanced CICIDS2017 with gain information (BCICIDS2017-GI).

All classifiers were trained, tested, and evaluated using the BCICIDS2017-GI dataset, which includes 1,700,330 records. A ten-fold cross-validation technique was employed for each model to validate the results. The computed evaluation metrics, which include accuracy, recall, precision, F1 score, the Matthews correlation coefficient (MCC), and the area under the ROC curve (AUC), are listed in Table 12.
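To make the ten-fold procedure concrete, the sketch below generates fold indices in plain Python. It assumes a stratified split, in which each class is spread as evenly as possible across the folds; whether the experiments used stratification is not stated in the text. The function name `stratified_k_fold` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k folds, spreading each
    class as evenly as possible across the folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Toy labels (hypothetical): 20 samples for each of three classes.
labels = ["normal"] * 20 + ["dos"] * 20 + ["botnet"] * 20
splits = list(stratified_k_fold(labels, k=10))
```

Each sample appears in exactly one test fold, so every model is evaluated once on every record, and the reported metric is typically the average over the ten folds.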

The results show a significant improvement in the classification accuracy obtained by all classifiers; however, RF was dominant. It achieved an outstanding accuracy rate of 99.8%, followed by AdaBoost and MLP. DT and KNN are at the bottom level with accuracies of 98.5% and 98.4%, respectively. Figure 3 shows the accuracy rates achieved by the benchmark classifiers.

When examining the accuracy for each attack among all these classifiers, Table 13 shows good performance for all classifiers after using the developed dataset, BCICIDS2017-GI. These results support the conclusion that RF is the best. It demonstrated the best performance with a high capacity to detect all attacks, with an outstanding classification accuracy rate.

It is noteworthy that AdaBoost comes next, and demonstrates excellent performance. However, RF still surpassed all of them when considering processing time and cost, as is shown in Figure 2. Figure 4 shows the accuracy achieved by all classifiers for each attack which clearly indicates that our proposed classifier (RF) is a dominant classifier.

For further evaluation, the ROC curves for DT, AdaBoost, MLP, KNN, and the suggested classifier (RF) are shown in Figure 5, Figure 6, Figure 7, Figure 8, and Figure 9, respectively. The ROC curves and performance metrics show how well each classifier performed, indicating that the proposed classifier performed better than the other classification techniques. In addition, the confidence interval (CI) for the accuracy of the proposed classifier has a half-width of 0.034555021 percentage points, with 99.83455502% and 99.76544498% as the upper bound and lower bound, respectively. Moreover, the classification errors calculated from the confusion matrix show that RF caused negligible classification errors, as shown in Table 14. This result further validates that our proposed classifier is an optimal classifier for detecting VANET attacks.
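As a reminder of how an ROC curve and its AUC are obtained, the following sketch sweeps a decision threshold over predicted scores for a single class treated as positive (one-vs-rest) and integrates the curve with the trapezoidal rule. The function names and the scores are hypothetical and are unrelated to the curves in Figures 5–9; the sketch also assumes that both positive and negative samples are present.

```python
def roc_curve_points(scores, is_positive):
    """Return ROC points (FPR, TPR) obtained by lowering a threshold over scores.
    Samples with tied scores are consumed together. Requires at least one
    positive and one negative sample."""
    pairs = sorted(zip(scores, is_positive), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, pos in pairs if pos)
    n_neg = len(pairs) - n_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample that shares this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical one-vs-rest scores for a single attack class.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
is_positive = [True, True, False, True, False, False]
points = roc_curve_points(scores, is_positive)
area = auc(points)
```

The resulting area equals the fraction of positive/negative pairs that the scores rank correctly, which is 8 of 9 pairs in this toy example.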

The research findings confirm that the proposed model, which is based on RF and employs the information gain feature selection method and a balanced dataset, has achieved excellent performance in terms of classification accuracy, computational cost, and classification error. It has achieved an accuracy of 99.8%, outperforming all benchmark classifiers: AdaBoost, DT, KNN, and MLP. In addition, to the best of our knowledge, it outperforms all the existing relevant classification techniques, as indicated in the Related Work section.

The proposed classifier consumed the least processing time, requiring only 69%, 59%, 35%, and 1.4% of the AdaBoost, DT, KNN, and MLP processing times, respectively. The processing cost is an important factor for VANETs, which require fast, real-time actions to avoid damage and protect human life. Any delay can cause considerable damage.

6. Conclusions

The rapid development of VANET systems will lead to the greater usage of intelligent transportation systems. This type of network has attracted attention in recent years because of its enormous influence on improving traffic management systems and road safety. Many studies have been conducted to improve many elements of VANETs, including their coverage, protocols, and other relevant factors. Although VANET security is a major concern, the architecture of these networks appears to render appropriate and effective protection a real challenge.

ML models play a crucial role in guaranteeing the security and dependability of network systems and applications because of their capacity to handle massive amounts of data and detect patterns and abnormalities that may lead to security breaches. Therefore, ML offers a potential security solution for VANET systems.

In this paper, VANET vulnerabilities, attacks, and countermeasures are thoroughly discussed, with a focus on ML models that have recently been introduced to identify VANET attacks. Their achievements and drawbacks are thoroughly addressed. Accordingly, this study presents an effective ML model for detecting VANET attacks. This model is based on the RF classification model and the information gain feature selection method. It was trained, tested, and evaluated on an enhanced version of the CICIDS2017 dataset. The information gain method was used to select the features with the greatest influence on the prediction, and a random oversampling technique was applied to balance the dataset. As a result, a balanced dataset with information gain was developed, the BCICIDS2017-GI. The proposed model achieved outstanding performance compared with the other classification models investigated in this study. The proposed model achieved an accuracy rate of 99.8%, with acceptable classification errors. In terms of computational cost, the proposed classifier surpasses all other classifiers by requiring the least processing time; hence, it is more appropriate for VANETs. Our future work will focus on evaluating the proposed model by using other datasets.

Author Contributions

Conceptualization, M.A.E.; Formal analysis, M.A.E., A.A., Y.M., A.H.M., A.K. and M.B.; Investigation, M.A.E., A.A., Y.M., A.H.M., A.K. and M.B.; Methodology, M.A.E., A.A., Y.M., A.H.M., A.K. and M.B.; Resources, M.A.E., A.A., Y.M., A.H.M., A.K. and M.B.; Software, M.A.E., A.A., Y.M., A.H.M., A.K. and M.B.; Validation, M.A.E., A.A., Y.M., A.H.M., A.K. and M.B.; Visualization, M.A.E., A.A., Y.M., A.H.M., A.K., M.B. and M.A.E.A.; Writing—original draft, M.A.E., A.A., Y.M., A.H.M., A.K., M.B. and M.A.E.A.; Writing—review & editing, M.A.E., A.A., Y.M., A.H.M., A.K., M.B. and M.A.E.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhang, H.; Li, M. Towards an intelligent and automatic irrigation system based on internet of things with authentication feature in VANET. J. Inf. Secur. Appl. 2025, 88, 103927.
2. Manasrah, A.; Yaseen, Q.; Al-Aqrabi, H.; Liu, L. Identity-Based Authentication in VANETs: A Review. IEEE Trans. Intell. Transp. Syst. 2025, 26, 4260–4282.
3. Mansouri, F.; Tarhouni, M.; Alaya, B.; Zidi, S. A distributed intrusion detection framework for vehicular Ad Hoc networks via federated learning and Blockchain. Ad Hoc Netw. 2025, 167, 103677.
4. Hasrouny, H.; Samhat, A.E.; Bassil, C.; Laouiti, A. VANet security challenges and solutions: A survey. Veh. Commun. 2017, 7, 7–20.
5. Shawky, M.A. Authentication Enhancement in Command and Control Networks: (A Study in Vehicular Ad-Hoc Networks). Ph.D. Thesis, University of Glasgow, UK, 2024.
6. Samaras, N.S. Using basic MANET routing algorithms for data dissemination in vehicular Ad Hoc Networks (VANETs). In Proceedings of the 24th Telecommunications Forum (TELFOR), Belgrade, Serbia, 22–26 November 2016; pp. 1–4.
7. Lee, M.; Atkison, T. VANET applications: Past, present, and future. Veh. Commun. 2021, 28, 100310.
8. Elsadig, M.A.; Fadlalla, Y.A. Mobile Ad Hoc Network Routing Protocols: Performance Evaluation and Assessment. Int. J. Comput. Digit. Syst. 2018, 7, 59–66.
9. Deng, J.; Deng, J.; Liu, P.; Wang, H.; Yan, J.; Pan, D.; Liu, J. A Survey on Vehicular Cloud Network Security. IEEE Access 2023, 11, 136741–136757.
10. Elsadig, M.A.; Fadlalla, Y.A. VANETs Security Issues and Challenges: A Survey. Indian J. Sci. Technol. 2016, 9, 1–8.
11. Elsadig, M.A.; Altigani, A.; Baraka, M.A.A. Security issues and challenges on wireless sensor networks. Int. J. Adv. Trends Comput. Sci. Eng. 2019, 8, 1551–1559.
12. Kaur, G.; Kakkar, D. Hybrid optimization enabled trust-based secure routing with deep learning-based attack detection in VANET. Ad Hoc Netw. 2022, 136, 102961.
13. Sun, J.; Fang, Y. Defense against misbehavior in anonymous vehicular ad hoc networks. Ad Hoc Netw. 2009, 7, 1515–1525.
14. Chen, X.; Feng, W.; Chen, Y.; Ge, N.; He, Y. Access-Side DDoS Defense for Space-Air-Ground Integrated 6G V2X Networks. IEEE Open J. Commun. Soc. 2024, 5, 2847–2868.
15. Alrehan, A.M.; Alhaidari, F.A. Machine Learning Techniques to Detect DDoS Attacks on VANET System: A Survey. In Proceedings of the 2nd International Conference on Computer Applications & Information Security (ICCAIS), Riyadh, Saudi Arabia, 1–3 May 2019; pp. 1–6.
16. Naskar, S.; Brunetta, C.; Hancke, G.; Zhang, T.; Gidlund, M. A Scheme for Distributed Vehicle Authentication and Revocation in Decentralized VANETs. IEEE Access 2024, 12, 68648–68667.
17. Saoud, B.; Shayea, I.; Yahya, A.E.; Shamsan, Z.A.; Alhammadi, A.; Alawad, M.A.; Alkhrijah, Y. Artificial Intelligence, Internet of things and 6G methodologies in the context of Vehicular Ad-hoc Networks (VANETs): Survey. ICT Express 2024, 10, 959–980.
18. Elsadig, M.A. ChatGPT and Cybersecurity: Risk Knocking the Door. J. Internet Serv. Inf. Secur. 2023, 14, 1–15.
19. Nandy, T.; Noor, R.M.; Kolandaisamy, R.; Idris, M.Y.I.; Bhattacharyya, S. A review of security attacks and intrusion detection in the vehicular networks. J. King Saud Univ. Comput. Inf. Sci. 2024, 36, 101945.
20. Ben Rabah, N.; Idoudi, H. A Machine Learning Framework for Intrusion Detection in VANET Communications. In Emerging Trends in Cybersecurity Applications; Daimi, K., Alsadoon, A., Peoples, C., El Madhoun, N., Eds.; Springer International Publishing: Cham, Switzerland, 2023; pp. 209–227.
21. Amari, H.; El Houda, Z.A.; Khoukhi, L.; Belguith, L.H. Trust Management in Vehicular Ad-Hoc Networks: Extensive Survey. IEEE Access 2023, 11, 47659–47680.
22. Al-Shareeda, M.A.; Manickam, S. A Systematic Literature Review on Security of Vehicular Ad-Hoc Network (VANET) Based on VEINS Framework. IEEE Access 2023, 11, 46218–46228.
23. Shawky, M.A.; Shah, S.T.; Abdrabou, M.; Usman, M.; Abbasi, Q.H.; Flynn, D.; Imran, M.A.; Ansari, S.; Taha, A. How secure are our roads? An in-depth review of authentication in vehicular communications. Veh. Commun. 2024, 47, 100784.
24. Abdelfatah, R.I.; Abdal-Ghafour, N.M.; Nasr, M.E. Secure VANET Authentication Protocol (SVAP) Using Chebyshev Chaotic Maps for Emergency Conditions. IEEE Access 2022, 10, 1096–1115.
25. Dong, S.; Su, H.; Xia, Y.; Zhu, F.; Hu, X.; Wang, B. A Comprehensive Survey on Authentication and Attack Detection Schemes That Threaten It in Vehicular Ad-Hoc Networks. IEEE Trans. Intell. Transp. Syst. 2023, 24, 13573–13602.
26. Praba, M.S.B.; Ramesh, S.S.S. VANET Communication System with HENON Based Privacy Preserving Authentication. Int. J. Intell. Syst. Appl. Eng. 2023, 11, 340–349.
27. Wei, L.; Cui, J.; Zhong, H.; Bolodurina, I.; Liu, L. A Lightweight and Conditional Privacy-Preserving Authenticated Key Agreement Scheme With Multi-TA Model for Fog-Based VANETs. IEEE Trans. Dependable Secur. Comput. 2023, 20, 422–436.
28. Wei, L.; Cui, J.; Xu, Y.; Cheng, J.; Zhong, H. Secure and Lightweight Conditional Privacy-Preserving Authentication for Securing Traffic Emergency Messages in VANETs. IEEE Trans. Inf. Forensics Secur. 2021, 16, 1681–1695.
29. Zhang, J.; Zhang, Q. Comment on “Secure and Lightweight Conditional Privacy-Preserving Authentication for Securing Traffic Emergency Messages in VANETs”. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1037–1038.
30. Studer, A.; Shi, E.; Bai, F.; Perrig, A. TACKing Together Efficient Authentication, Revocation, and Privacy in VANETs. In Proceedings of the 6th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, Rome, Italy, 22–26 June 2009; pp. 1–9.
31. Alshudukhi, J.S.; Mohammed, B.A.; Al-Mekhlafi, Z.G. Conditional Privacy-Preserving Authentication Scheme Without Using Point Multiplication Operations Based on Elliptic Curve Cryptography (ECC). IEEE Access 2020, 8, 222032–222040.
32. Al-Shareeda, M.A.; Anbar, M.; Alazzawi, M.A.; Manickam, S.; Al-Hiti, A.S. LSWBVM: A Lightweight Security Without Using Batch Verification Method Scheme for a Vehicle Ad Hoc Network. IEEE Access 2020, 8, 170507–170518.
33. Li, C.; Wang, Z. Location-based Security Authentication Mechanism for Ad hoc Network. In Proceedings of the National Conference on Information Technology and Computer Science, Lanzhou, China, 16–18 November 2012.
34. Chen, L.; Tang, H.; Wang, J. Analysis of VANET security based on routing protocol information. In Proceedings of the Fourth International Conference on Intelligent Control and Information Processing (ICICIP), Beijing, China, 9–11 June 2013; pp. 134–138.
35. Li, J.; Lu, H.; Guizani, M. ACPN: A Novel Authentication Framework with Conditional Privacy-Preservation and Non-Repudiation for VANETs. IEEE Trans. Parallel Distrib. Syst. 2015, 26, 938–948.
36. Shu, J.; Zhou, L.; Zhang, W.; Du, X.; Guizani, M. Collaborative Intrusion Detection for VANETs: A Deep Learning-Based Distributed SDN Approach. IEEE Trans. Intell. Transp. Syst. 2021, 22, 4519–4530.
37. Muawia, E. Network Covert Channels. In Steganography—The Art of Hiding Information; Joceli, M., Ed.; IntechOpen: Rijeka, Croatia, 2024; Chapter 8.
38. Elsadig, M.; Gafar, A. Packet length covert channel detection: An ensemble machine learning approach. J. Theor. Appl. Inf. Technol. 2022, 100, 38391–38405.
39. Elsadig, M.A.; Altigani, A.; Elshoush, H.T. Breast cancer detection using machine learning approaches: A comparative study. Int. J. Electr. Comput. Eng. (IJECE) 2023, 13, 736–745.
40. Nagarhalli, T.P.; Vaze, V.; Rana, N.K. Impact of Machine Learning in Natural Language Processing: A Review. In Proceedings of the Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), Tirunelveli, India, 4–6 February 2021; pp. 1529–1534.
41. Mohamed, Y.A.; Khanan, A.; Bashir, M.; Mohamed, A.H.H.M.; Adiel, M.A.E.; Elsadig, M.A. The Impact of Artificial Intelligence on Language Translation: A Review. IEEE Access 2024, 12, 25553–25579.
42. Elsadig, M.A.; Gafar, A. Covert Channel Detection: Machine Learning Approaches. IEEE Access 2022, 10, 38391–38405.
43. Anyanwu, G.O.; Nwakanma, C.I.; Lee, J.-M.; Kim, D.-S. Optimization of RBF-SVM Kernel Using Grid Search Algorithm for DDoS Attack Detection in SDN-Based VANET. IEEE Internet Things J. 2023, 10, 8477–8490.
44. Alsarhan, A.; Al-Ghuwairi, A.-R.; Almalkawi, I.T.; Alauthman, M.; Al-Dubai, A. Machine Learning-Driven Optimization for Intrusion Detection in Smart Vehicular Networks. Wirel. Pers. Commun. 2021, 117, 3129–3152.
45. Karthiga, B.; Durairaj, D.; Nawaz, N.; Venkatasamy, T.K.; Ramasamy, G.; Hariharasudan, A. Intelligent Intrusion Detection System for VANET Using Machine Learning and Deep Learning Approaches. Wirel. Commun. Mob. Comput. 2022, 2022, 5069104.
46. Bangui, H.; Ge, M.; Buhnova, B. A Hybrid Data-driven Model for Intrusion Detection in VANET. Procedia Comput. Sci. 2021, 184, 516–523.
47. Setia, H.; Chhabra, A.; Singh, S.K.; Kumar, S.; Sharma, S.; Arya, V.; Gupta, B.B.; Wu, J. Securing the road ahead: Machine learning-driven DDoS attack detection in VANET cloud environments. Cyber Secur. Appl. 2024, 2, 100037.
48. Kawale, R.N.; Patil, R.V.; Mahajan, S.A. Optimal Attack or Malicious Activity Detection in VANET Using Ensemble Machine Learning Approach. Int. J. Intell. Syst. Appl. Eng. 2024, 12, 669–675.
49. Almahadin, G.; Aoudni, Y.; Shabaz, M.; Agrawal, A.V.; Yasmin, G.; Alomari, E.S.; Al-Khafaji, H.M.R.; Dansana, D.; Maaliw, R.R. VANET Network Traffic Anomaly Detection Using GRU-Based Deep Learning Model. IEEE Trans. Consum. Electron. 2024, 70, 4548–4555.
50. Ercan, S.; Mendiboure, L.; Alouache, L.; Maaloul, S.; Sylla, T.; Aniss, H. An Enhanced Model for Machine Learning-Based DoS Detection in Vehicular Networks. In Proceedings of the IFIP Networking Conference (IFIP Networking), Barcelona, Spain, 12–15 June 2023; pp. 1–9.
51. Rashid, K.; Saeed, Y.; Ali, A.; Jamil, F.; Alkanhel, R.; Muthanna, A. An Adaptive Real-Time Malicious Node Detection Framework Using Machine Learning in Vehicular Ad-Hoc Networks (VANETs). Sensors 2023, 23, 2594.
52. Marouane, H.; Dandoush, A.; Amour, L.; Erbad, A. Performance Evaluation of Machine Learning-Based Misbehavior Detection Systems in VANETs: A Comprehensive Study. In Proceedings of the International Symposium on Networks, Computers and Communications (ISNCC), Doha, Qatar, 23–26 October 2023; pp. 1–6.
53. Alsarhan, A.; Alauthman, M.; Alshdaifat, E.; Al-Ghuwairi, A.-R.; Al-Dubai, A. Machine Learning-driven optimization for SVM-based intrusion detection system in vehicular ad hoc networks. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 6113–6122.
54. Gad, A.R.; Nashat, A.A.; Barkat, T.M. Intrusion Detection System Using Machine Learning for Vehicular Ad Hoc Networks Based on ToN-IoT Dataset. IEEE Access 2021, 9, 142206–142217.
55. Vitalkar, R.S.; Thorat, S.S.; Rojatkar, D.V. Intrusion Detection for Vehicular Ad Hoc Network Based on Deep Belief Network. In Computer Networks and Inventive Communication Technologies; Smys, S., Bestak, R., Palanisamy, R., Kotuliak, I., Eds.; Springer Nature: Singapore, 2022; pp. 853–865.
56. Anyanwu, G.O.; Nwakanma, C.I.; Kim, J.-H.; Lee, J.-M.; Kim, D.-S. Misbehavior Detection in Connected Vehicles using BurST-ADMA Dataset. In Proceedings of the 13th International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 19–21 October 2022; pp. 874–878.
57. Bangui, H.; Ge, M.; Buhnova, B. A hybrid machine learning model for intrusion detection in VANET. Computing 2022, 104, 503–531.
58. Rajapaksha, S.; Kalutarage, H.; Al-Kadri, M.O.; Madzudzo, G.; Petrovski, A.V. Keep the Moving Vehicle Secure: Context-Aware Intrusion Detection System for In-Vehicle CAN Bus Security. In Proceedings of the 14th International Conference on Cyber Conflict: Keep Moving! (CyCon), Tallinn, Estonia, 31 May–3 June 2022; pp. 309–330.
59. Rajapaksha, S.; Kalutarage, H.; Al-Kadri, M.O.; Petrovski, A.; Madzudzo, G.; Cheah, M. AI-Based Intrusion Detection Systems for In-Vehicle Networks: A Survey. ACM Comput. Surv. 2023, 55, 1–40.
60. Refat, R.U.D.; Abu Elkhail, A.; Hafeez, A.; Malik, H. Detecting CAN Bus Intrusion by Applying Machine Learning Method to Graph Based Features. In Intelligent Systems and Applications; Arai, K., Ed.; Springer International Publishing: Cham, Switzerland, 2022; pp. 730–748.
61. Sharma, A.; Jaekel, A. Machine Learning Based Misbehaviour Detection in VANET Using Consecutive BSM Approach. IEEE Open J. Veh. Technol. 2022, 3, 1–14.
62. Abdelkreem, E.; Hussein, S.; Tammam, A. Feature engineering impact on position falsification attacks detection in vehicular ad-hoc network. Int. J. Inf. Secur. 2024, 23, 1939–1961.
63. Sonker, A.; Gupta, R.K. A new procedure for misbehavior detection in vehicular ad-hoc networks using machine learning. Int. J. Electr. Comput. Eng. (IJECE) 2021, 11, 2535–2547.
64. Adhikary, K.; Bhushan, S.; Kumar, S.; Dutta, K. Decision Tree and Neural Network Based Hybrid Algorithm for Detecting and Preventing Ddos Attacks in VANETS. Int. J. Innov. Technol. Explor. Eng. 2020, 9, 669–675.
65. Shams, E.A.; Ulusoy, A.H.; Rizaner, A. Performance Analysis and Comparison of Anomaly-based Intrusion Detection in Vehicular Ad Hoc Networks. Radioengineering 2020, 29, 664–671.
66. Vitalkar, R.S.; Thorat, S.S.; Rojatkar, D.V. Intrusion detection system for vehicular ad-hoc network using deep learning. Int. Res. J. Eng. Technol. 2020, 7, 2294–2300.
67. Thorat, S.; Rojatkar, D.; Deshmukh, P. Detection of Unknown Attacks in VANET using a Deep Learning Approach and IoT-based Data Set. Indian Soc. Tech. Educ. 2025, 48, 20–28.
68. Sambangi, S.; Gondi, L.; Aljawarneh, S.; Annaluri, S.R. SDN DDoS attack image dataset. IEEE Dataport 2021. [Online].
69. Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing Realistic Distributed Denial of Service (DDoS) Attack Dataset and Taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India, 1–3 October 2019; pp. 1–8.
70. Panigrahi, R.; Borah, S. A detailed analysis of CICIDS2017 dataset for designing Intrusion Detection Systems. Int. J. Eng. Technol. 2018, 7, 479–482.
71. Maseer, Z.K.; Yusof, R.; Bahaman, N.; Mostafa, S.A.; Foozy, C.F.M. Benchmarking of Machine Learning for Anomaly Based Intrusion Detection Systems in the CICIDS2017 Dataset. IEEE Access 2021, 9, 22351–22370.
72. Sharafaldin, I.; Habibi Lashkari, A.; Ghorbani, A.A. A Detailed Analysis of the CICIDS2017 Data Set. In Information Systems Security and Privacy; Mori, P., Furnell, S., Camp, O., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 172–188.
73. Elsadig, M.A. Detection of Denial-of-Service Attack in Wireless Sensor Networks: A Lightweight Machine Learning Approach. IEEE Access 2023, 11, 83537–83552.
74. Fang, Y.; Yao, Y.; Lin, X.; Wang, J.; Zhai, H. A feature selection based on genetic algorithm for intrusion detection of industrial control systems. Comput. Secur. 2023, 139, 103675.
75. Taheri, R.; Ahmadzadeh, M.; Kharazmi, M.R. A new approach for feature selection in intrusion detection system. Sci. J. 2015, 36, 1344–1357.
76. Parsaei, M.R.; Taheri, R.; Javidan, R. Perusing the effect of discretization of data on accuracy of predicting naive bayes algorithm. J. Curr. Res. Sci. 2016, 1, 457.
77. Amoozegar, M.; Minaei-Bidgoli, B. Optimizing multi-objective PSO based feature selection method using a feature elitism mechanism. Expert Syst. Appl. 2018, 113, 499–514.
78. Theng, D.; Bhoyar, K.K. Feature selection techniques for machine learning: A survey of more than two decades of research. Knowl. Inf. Syst. 2024, 66, 1575–1637.
79. Shirvan, M.H.; Moattar, M.H.; Hosseinzadeh, M. Deep generative approaches for oversampling in imbalanced data classification problems: A comprehensive review and comparative analysis. Appl. Soft Comput. 2025, 170, 112677.
80. Mohammed, R.; Rawashdeh, J.; Abdullah, M. Machine Learning with Oversampling and Undersampling Techniques: Overview Study and Experimental Results. In Proceedings of the 11th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 7–9 April 2020; pp. 243–248.
81. Elsadig, M.A.; Gafar, A. An ensemble model to detect packet length covert channels. Int. J. Electr. Comput. Eng. 2023, 13, 5296–5304.

Figure 1. The accuracy of the MLP, RF, AdaBoost, DT, and KNN classifiers.

Figure 2. The computer processing time for the AdaBoost, RF, MLP, DT, and KNN classifiers.

Figure 3. The classification accuracy rates achieved by all classifiers when using the developed balanced dataset with an information gain feature selection method (BCICIDS2017-GI).

Figure 4. The classification accuracy for all attacks obtained by the proposed classifier (RF), DT, AdaBoost, MLP, and KNN.

Figure 5. ROC for DT classification model.

Figure 6. ROC for AdaBoost classification model.

Figure 7. ROC for MLP classification model.

Figure 8. ROC for KNN classification model.

Figure 9. ROC for RF classification model.

Table 1. VANET Attack Classification.

Service-Based Attacks	Sensing-Based Attacks	Forgery-Based Attacks	Identity-Based Attacks	Message-Based Attacks
Brute force attack	Snooping	Masquerade attack	Movement tracking	Packet dropping
DoS attack	Traffic analysis attack	Greedy behavior attack	Identity revealing	Social attack
Replay attack	GPS spoofing	Unauthorized preemption attack	Hardware tampering	Message suppression
Timing attack	Eavesdropping	Impersonation attack	Malicious vehicle	Pollution attack
Black hole attack		Forging	Sybil attack	Message modification
Malware and spam		Man-in-the-middle attack	Repudiation attack	Bogus information
Intentional attack		Illusion attack		Broadcast tampering attack
Jamming		Wormhole attack
		Tunneling
		Sleep deprivation torture attack

Table 2. VANET Attack Classifications Based on their Effects.

Attack Name	Authentication	Confidentiality	Privacy	Availability	Integrity	Authenticity	Non-Repudiation
Impersonation		Yes	Yes			Yes [20]
DoS				Yes		Yes [20]
Masquerading	Yes
Wormhole/tunneling	Yes	Yes
Bogus information	Yes
Black hole				Yes
Social attack					Yes
Malware				Yes
Man in the middle		Yes	Yes		Yes
Monitoring attack			Yes			Yes
Spamming				Yes
Illusion attack					Yes	Yes
Timing attack					Yes
Sybil attack	Yes		Yes				Yes [20]
GPS spoofing	Yes
Gray hole				Yes [20]
Hidden vehicle		Yes [20]				Yes [20]
Spoofing						Yes [20]
Relay					Yes [20]	Yes [20]	Yes [20]
Position falsification					Yes [20]

Table 3. Layer-Based Classification of VANET Attacks.

Physical Layer	Datalink Layer	Network Layer	Transport Layer	Application Layer
Impersonation attack; Jamming; Free riding; Replication; Eavesdropping; Man in the middle	Traffic analysis; Illusion attack; Greedy behavior	Tunneling; Sybil attack; Message tampering; Black hole; Jellyfish; Gray hole; Wormhole	DoS; Masquerading; Replay; GPS Spoofing; DDoS; Spamming	Non-repudiation

Table 4. VANET Attack Detection Models.

Author	Accuracy	Dataset	Limitations/Drawbacks	Year
Thorat et al. [67]	96%	IoT datasets	The authors stated that CNNs may struggle with long-term dependencies and LSTM is more complex to train. In addition, an autoencoder may struggle with complex patterns	2025
Setia et al. [47]	99.59%	Self-generated dataset includes Normal and DDoS classes	The generated dataset requires more evaluation. The study focuses on only one type of VANET attack, which is DDoS.	2024
Kawale et al. [48]	94.97%	Several online resources	The dataset generation is not clearly described; the authors state only that the data were combined from different online sources. Ensemble techniques are known to improve detection accuracy, but at a higher computational cost, and the authors report neither the computational cost nor the processing time of the proposed model.	2024
Almahadin et al. [49]	83.31%	NSL-KDD	The accuracy obtained is moderate. DL models demand substantial computing power and may not function in all circumstances. In addition, the NSL-KDD dataset does not include recent attacks.	2024
Ercan et al. [50]	-	VeReMi	The computational cost of the proposed approach is questionable and requires more investigation. It is known that ensemble classification approaches can greatly enhance prediction accuracy; however, they require more computational power.	2023
Rashid et al. [51]	99%	Self-generated + Kaggle	Their target attacks are only Misbehavior and DDoS.	2023
Marouane et al. [52]	99%	Self-generated dataset using the SUMO, OMNET++, and VEINS simulators	The generated dataset needs more verification and evaluation to be trusted. The authors state that the performance of CNN, SVM, and RNN was assessed, but the article does not present the corresponding results.	2023
Anyanwu et al. [43]	99.33%	SDN DDoS [68] and CICDDoS2019 [69]	The model focuses on one type of attack, instead of considering various types. The ability to detect several types of attacks can significantly increase the visibility of the proposed model in terms of the computational cost.	2023
Alsarhan et al. [53]	99%	NSL-KDD	Training their model on a limited number of network scenarios raises questions about the model’s robustness [43]. The NSL-KDD dataset does not include recent attacks [54].	2023
Kaur and Kakkar [12]	93.95%	Self-generated	The cost requirement is high [48].	2022
Karthiga et al. [45]	98.6%	i-VANET and CIC-IDS 2017.	The detection of unknown attacks needs more verification.	2022
Vitalkar et al. [55]	98% and 90%	CICIDS2017	A complex detection algorithm [45]. In addition, the computational cost is questionable as the proposed classification model is based on a DL method that requires more computational power.	2022
Anyanwu et al. [56]	98.92%	BurST-ADMA	Ensemble classification approaches are capable of providing better performance; however, they involve more computations that need extra computational power, which can affect network performance. Low-cost solutions are more applicable.	2022
Bangui et al. [57]	96.93%	CICIDS2017	It requires more training time, more resources, and more computational power [48].	2022
Rajapaksha et al. [58]	F1-Score of greater than 99%	Public real data (HCRL CH, HCRL SA, ROAD)	Limited capacity to identify attacks on high-frequency aperiodic IDs [59].	2022
Refat et al. [60]	97.99% and 97.92% for KNN and SVM	HCRL CH	The detection rate for the spoofing attack was low [59].	2022
Sharma and Jaekel [61]	99.2% and 98.8% for precision and recall.	VeReMi	The dataset employed in this study does not represent all possible VANET position falsification attacks. In addition, no information regarding the accuracy of the proposed approach is included [62].	2022
Gad et al. [54]	97.9% and 98.2%	ToN-IoT	XGBoost is an ensemble classifier and therefore its computational cost needs to be investigated.	2021
Sonker and Gupta [63]	97.62%	VeReMi	The comparison does not include the computational cost of these classifiers, which is an important factor to be considered especially when dealing with networks that involve high mobility and rapid changes in topology.	2021
Adhikary et al. [64]	96.40%	Self-generated	Even though the accuracy of the proposed hybrid model is high and might be regarded as a near-reliable criterion for a VANET, their proposed model leads to an increase in computing complexity [43].	2020
Shams et al. [65]	99% for recall and precision.	Self-generated	Their proposed model targets one type of attack. More verification of their model using a public and trusted dataset is required.	2020
Vitalkar et al. [66]	98.07%	CICIDS2017	DL models cause more computational overhead.	2020
The Proposed Model	99.8%	CICIDS2017	Our future work will focus on performing more evaluations using different datasets.	2025

Table 5. Classifier Parameters.

Classifier	Parameters
KNN	Number of neighbors = 5
RF	Number of trees = 10, tree depth = 5
MLP	Neurons in hidden layers = 100, maximal number of iterations = 200
DT	Min. number of instances in leaves = 2, maximal tree depth = 100
AdaBoost	Number of estimators = 50

Table 6. CICIDS2017 Dataset.

Attack Type	Number of Instances
Botnet ARES	1873
Brute Force	10,201
DoS/DDoS	26,066
Infiltration	29
Normal	26,185
PortScan	25,409
Web Attack	2067
Total number of instances	91,830
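
The class counts in Table 6 are strongly imbalanced (29 Infiltration instances against 26,185 Normal ones). The following is a minimal sketch of random oversampling applied to those counts, assuming every minority class is resampled with replacement until it matches the largest class. The function name `random_oversample`, the fixed seed, and the choice of the majority count as the target are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import Counter

# Class counts copied from Table 6.
CLASS_COUNTS = {
    "Botnet ARES": 1873,
    "Brute Force": 10201,
    "DoS/DDoS": 26066,
    "Infiltration": 29,
    "Normal": 26185,
    "PortScan": 25409,
    "Web Attack": 2067,
}


def random_oversample(labels, seed=0):
    """Return sample indices in which every class is as frequent as the largest one.

    All original samples are kept; minority-class indices are then drawn
    at random with replacement until the class reaches the majority count.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    return balanced


# Build a label vector with the Table 6 distribution and balance it.
labels = [name for name, n in CLASS_COUNTS.items() for _ in range(n)]
balanced_idx = random_oversample(labels)
balanced_counts = Counter(labels[i] for i in balanced_idx)
```

In practice the returned indices would be used to select rows of the feature matrix as well as the label vector, so that each duplicated label stays paired with its features.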

Table 7. Feature Weight.

Features	Weight (Score)
Destination Port	1.093
Bwd Packet Length Mean	1.073
Avg Bwd Segment Size	1.073
Total Length of Bwd Packets	1.065
Subflow Bwd Bytes	1.065
Bwd Packet Length Max	1.053

Table 8. The Removed Features.

Feature	Weight	Feature
FIN Flag Count	0.093	CWE Flag Count
URG Flag Count	0.085	Bwd PSH Flags
Fwd PSH Flags	0.082	Bwd URG Flags
SYN Flag Count	0.082	Fwd Avg Bytes Bulk
Idle Std	0.054	Fwd Avg Packets Bulk
Active Std	0.041	Fwd Avg Bulk Rate
RST Flag Count	0.000	Bwd Avg Bytes Bulk
ECE Flag Count	0.000	Bwd Avg Packets Bulk
Fwd URG Flags	0.000	Bwd Avg Bulk Rate
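
The weights in Tables 7 and 8 are information gain scores. The sketch below shows one common way to compute such a score for a discrete feature, as the entropy of the class labels minus the entropy that remains after splitting on the feature. It assumes base-2 logarithms and already discretized feature values; how the authors' tool bins continuous features is not stated here, so the toy data and function names are illustrative only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the conditional entropy given the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (hypothetical): one feature separates the classes perfectly,
# the other is constant and therefore carries no information.
toy_labels = ["attack", "attack", "normal", "normal"]
informative_feature = [80, 80, 443, 443]
constant_feature = [0, 0, 0, 0]

gain_informative = information_gain(informative_feature, toy_labels)
gain_constant = information_gain(constant_feature, toy_labels)
```

A feature that never varies, like the toy constant feature, scores zero, which is consistent with the 0.000 weights listed for several flag-count features in Table 8.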

Table 9. Performance Metrics for All Classifiers when using the Revised Version of the CICIDS2017 Dataset.

Classifier	Accuracy (%)	Recall (%)	Precision (%)	F1 Score (%)	MCC (%)
MLP	98.9	98.9	98.9	98.9	98.5
RF	99.5	99.5	99.5	99.5	99.3
AdaBoost	99.3	99.3	99.3	99.3	99.0
DT	97.9	97.9	97.9	97.8	97.2
KNN	96.9	96.9	96.9	96.8	95.8
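
As a reminder of how the metrics in Table 9 relate to a confusion matrix, here is a minimal sketch that computes accuracy, precision, recall, F1 score, and the multiclass Matthews correlation coefficient (MCC). Macro averaging and the toy matrix are assumptions made for illustration; the averaging scheme used by the authors is not stated in this section, and the table reports all values multiplied by 100.

```python
import math


def summarize(cm):
    """Compute summary metrics from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Precision, recall, and F1 are macro-averaged.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    true_totals = [sum(cm[i]) for i in range(k)]
    pred_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]

    recalls = [cm[i][i] / true_totals[i] if true_totals[i] else 0.0 for i in range(k)]
    precisions = [cm[i][i] / pred_totals[i] if pred_totals[i] else 0.0 for i in range(k)]
    f1s = [2 * p * r / (p + r) if p + r else 0.0 for p, r in zip(precisions, recalls)]

    # Multiclass generalization of MCC.
    numerator = correct * total - sum(p * t for p, t in zip(pred_totals, true_totals))
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_totals))
        * (total ** 2 - sum(t * t for t in true_totals))
    )
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
        "mcc": numerator / denominator if denominator else 0.0,
    }


# Toy two-class confusion matrix (hypothetical numbers).
toy_metrics = summarize([[5, 1], [2, 4]])
perfect_metrics = summarize([[3, 0], [0, 3]])
```

For two classes the MCC expression reduces to the familiar (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)).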

Table 10. Execution Time for All Classifiers.

Classifier	Training Time (s)	Testing Time (s)	Computation Time (s)
KNN	11.374	138.747	150.121
MLP	3188.379	2.548	3190.927
RF	38.043	1.827	39.870
DT	78.155	0.053	78.208
AdaBoost	55.246	1.406	56.652

Table 11. Accuracy and Processing Time for All Classifiers.

Classifier	Accuracy (%)	Training Time (s)	Testing Time (s)	Execution Time (s)
MLP	98.9	2743.669	2.114	2745.783
RF	99.5	36.345	1.656	38.001
AdaBoost	99.3	53.566	1.091	54.657
DT	97.9	63.832	0.060	63.892
KNN	96.9	6.839	102.980	109.819
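
The relative processing costs quoted in the abstract (RF needing roughly 69%, 59%, 35%, and 1.4% of the AdaBoost, DT, KNN, and MLP times) can be checked against the execution times in Table 11. The short sketch below does that arithmetic; the variable names are illustrative.

```python
# Execution times in seconds, copied from Table 11.
EXECUTION_TIME_S = {
    "MLP": 2745.783,
    "RF": 38.001,
    "AdaBoost": 54.657,
    "DT": 63.892,
    "KNN": 109.819,
}

rf_time = EXECUTION_TIME_S["RF"]

# RF execution time expressed as a percentage of each benchmark classifier's time.
rf_relative_percent = {
    name: 100.0 * rf_time / seconds
    for name, seconds in EXECUTION_TIME_S.items()
    if name != "RF"
}

for name, percent in sorted(rf_relative_percent.items(), key=lambda item: -item[1]):
    print(f"RF needs {percent:.1f}% of the {name} execution time")
```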

Table 12. Performance Metrics for All Classifiers when using the BCICIDS2017-GI dataset.

Classifier	Accuracy (%)	Recall (%)	Precision (%)	F1 Score (%)	MCC (%)	AUC (%)
MLP	99.1	99.1	99.1	99.1	98.9	99.9
RF	99.8	99.8	99.8	99.8	99.7	100
AdaBoost	99.7	99.7	99.7	99.7	99.6	99.8
DT	98.5	98.5	98.5	98.5	98.3	99.5
KNN	98.4	98.4	98.5	98.4	98.2	99.7

Table 13. Attack Classification Accuracy (%).

Classifier	Botnet ARES	Brute Force	DoS/DDoS	Infiltration	PortScan	Web Attack	Normal
DT	99.5	99.9	99.6	100	99.8	99.6	98.7
RF	100	100	99.8	100	100	100	99.8
AdaBoost	100	100	99.7	100	100	100	99.7
MLP	100	99.7	99.6	100	99.8	99.7	99.3
KNN	99.9	99.9	99.0	100	99.7	99.9	98.6

Table 14. Classification errors of the proposed classifier, RF.

Attack	FN	FP
Botnet ARES	0.0000	0.0005
Brute Force	0.0000	0.0001
DoS/DDoS	0.0027	0.0126
Infiltration	0.0000	0.0000
PortScan	0.0005	0.0008
Web Attack	0.0000	0.0002
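
Table 14 lists false negative (FN) and false positive (FP) values per attack class. Assuming these are rates derived one-vs-rest from a multiclass confusion matrix, which is an interpretation rather than something stated in this section, they could be computed as in the sketch below. The toy matrix and class names are hypothetical.

```python
def per_class_error_rates(cm, class_names):
    """One-vs-rest false negative and false positive rates per class.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    FN rate = FN / (FN + TP); FP rate = FP / (FP + TN).
    """
    total = sum(sum(row) for row in cm)
    rates = {}
    for k, name in enumerate(class_names):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                  # class k predicted as something else
        fp = sum(row[k] for row in cm) - tp   # other classes predicted as k
        tn = total - tp - fn - fp
        rates[name] = {
            "FN": fn / (fn + tp) if fn + tp else 0.0,
            "FP": fp / (fp + tn) if fp + tn else 0.0,
        }
    return rates


# Toy three-class confusion matrix (hypothetical numbers).
toy_cm = [
    [8, 1, 1],
    [0, 10, 0],
    [2, 0, 8],
]
toy_rates = per_class_error_rates(toy_cm, ["A", "B", "C"])
```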

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Published by MDPI on behalf of the World Electric Vehicle Association. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Elsadig, M.A.; Altigani, A.; Mohamed, Y.; Mohamed, A.H.; Kannan, A.; Bashir, M.; Adiel, M.A.E. Connected Vehicles Security: A Lightweight Machine Learning Model to Detect VANET Attacks. World Electr. Veh. J. 2025, 16, 324. https://doi.org/10.3390/wevj16060324
