Article

A Deep Learning Approach for Real-Time Intrusion Mitigation in Automotive Controller Area Networks

1 Department of Electrical Engineering, Mirpur University of Science and Technology (MUST), Mirpur AJK-10250, Pakistan
2 School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, UK
* Author to whom correspondence should be addressed.
World Electr. Veh. J. 2025, 16(9), 492; https://doi.org/10.3390/wevj16090492
Submission received: 5 July 2025 / Revised: 19 August 2025 / Accepted: 21 August 2025 / Published: 1 September 2025
(This article belongs to the Special Issue Vehicular Communications for Cooperative and Automated Mobility)

Abstract

The digital revolution has profoundly influenced the automotive industry, shifting the paradigm from conventional vehicles to smart cars (SCs). SCs rely on in-vehicle communication among electronic control units (ECUs) enabled by assorted protocols. The Controller Area Network (CAN) serves as the de facto standard for interconnecting these units, enabling critical functionalities. However, the CAN's inherent lack of message delineation (frames are broadcast without explicit destination addressing) poses significant security risks, necessitating the development of an astute and resilient self-defense mechanism (SDM) to neutralize cyber threats. To this end, this study introduces a lightweight intrusion mitigation mechanism based on an adaptive momentum-based deep denoising autoencoder (AM-DDAE). Employing real-time CAN bus data from well-known smart vehicles, the proposed framework effectively reconstructs original data compromised by adversarial activities. Simulation results illustrate the efficacy of the AM-DDAE-based SDM, achieving a reconstruction error (RE) of less than 1% and an average execution time of 0.145532 s for data recovery. When validated on a new unseen attack and on an adversarial machine learning attack, the proposed model demonstrated equally strong performance with RE < 1%. Furthermore, the model's decision-making capabilities were analysed using Explainable AI techniques such as SHAP and LIME. Additionally, the scheme offers practical deployment flexibility: it can either be (a) embedded directly into individual ECU firmware or (b) implemented as a centralized hardware component interfacing between the CAN bus and the ECUs, preloaded with the proposed mitigation algorithm.

Graphical Abstract

1. Introduction

Digital advancement has remarkably revolutionized the automotive industry, shifting the paradigm from traditional vehicles to futuristic smart cars (SCs). Along with conventional control mechanisms, these intelligent cars are equipped with sophisticated computation and automation capabilities that interact with physical processes. This integration of conventional and advanced features has brought several benefits to passengers and drivers, including meticulous security and safety features. Nevertheless, the World Health Organization reports about 1.3 million deaths per year from road accidents [1]. The situation is likely to deteriorate further in the coming years, as projected by Upstream Security, a renowned cloud-based cybersecurity company: it revealed a 60% increase in cyber-attacks on the auto industry between 2023 and 2024 [2] and anticipates continued growth in the impact and frequency of cyber-intrusions in the coming years [3].
Stemming from the continuously evolving nature of cyber-attacks and threat actors, numerous incidents since 2010 have targeted SCs by exploiting inherent vulnerabilities, as illustrated in Figure 1. By exploiting vulnerabilities and attack surfaces such as telematics, diagnostics, infotainment, and other systems, hackers have succeeded in compromising critical functions including acceleration, braking, and engine control. In addition, adversaries have managed to gain unauthorized access to unlock, start, and ultimately steal vehicles. The "Oz Car Parts" hack by the Stux Team in December 2023 [4], the "Spireon Vehicles" hack by security researchers in 2022 [5], the luxury car heist in London in 2022, and the keyless entry hack in Oakville in 2022 [6] are well-known real-life cyber-security incidents involving smart cars. Such incidents necessitate the integration of a robust and efficient self-defense mechanism to neutralize and mitigate the impact of cyber-intrusions once they are detected and identified by the intrusion detection system embedded in the network monitoring unit of a smart car. However, the majority of existing studies [7,8,9,10,11,12,13,14,15] on the cybersecurity of SCs have primarily focused on anomaly detection in the controller area network, overlooking mitigation mechanisms that reconstruct the original data by removing noise to ensure continuity of operations. Only a few existing studies [16,17,18,19,20,21] address mitigation in smart cars, but their proposed solutions, such as isolating the affected node or blocking the compromised data, interrupt the system and disrupt the continuity of operations. To tackle this issue of disruption, a real-time mitigation strategy that not only ensures continuity of operation but is also efficient is proposed in this study.
Smart cars are sophisticated cyber-physical systems that integrate sensors, control units, and intricate communication networks, enabling intelligent features such as obstacle identification, emergency braking, lane keeping, keyless entry, and many others. The realization of these smart features is linked to the integration of advanced digital systems, including electronic control units (ECUs) and networking and communication modules. ECUs are regarded as the brains of the various subsystems in SCs. Their core functions include safety management (traction and stability control), driver assistance (adaptive cruise control, lane keeping, collision avoidance), and control and coordination (steering, braking, engine performance). In addition, communication management, battery and energy management, infotainment and comfort, and security functions are also handled by ECUs. Over the communication networks, ECUs exchange messages on the data bus using different network protocols such as the controller area network, FlexRay, the local interconnect network, and Ethernet. Among these, the automotive Controller Area Network (CAN) is the de facto communication standard, owing to its inherently minimal latency and priority-based transmission. CAN connects the various ECUs, actuators, and sensors, ensuring real-time coordination, control, and safety in SCs. However, these components are highly vulnerable to cyber-intrusions through external systems such as the infotainment system, telematics, the human-machine interface, the OBD-II port, and Wi-Fi connections. Serving as gateways, these elements streamline anomaly injection for adversaries, allowing them to intrude into SCs with various attacks, including malfunction, replay, false message injection, fuzzy, and sensor manipulation attacks. A generic representation of the automotive communication network, along with the key vulnerabilities that motivate the design and execution of this work, is presented in Figure 2.
However, the CAN bus features poor data authentication compared to other protocols, posing a serious challenge to data security and passenger safety. This underscores the necessity of developing a state-of-the-art intrusion mitigation model that reconstructs the input data in real time. Typically, reconstruction models are deep neural network-based, consisting of (i) an encoder, (ii) a latent-space representation, and (iii) a decoder. Employing non-linear transformations, the encoder enables the model to learn the underlying data structure and map the input into a compact latent space, from which the decoder reconstructs the original input by removing noise from the data. Employing various machine learning-based neural network approaches, researchers have conducted numerous studies proposing intrusion mitigation techniques, discussed in the subsequent section.

1.1. Related Works

Mitigation techniques usually involve blocking anomalies by (a) discarding the compromised data altogether, (b) reconstructing the original values by removing noise from the data, or (c) isolating the affected node from the system. However, removing the data or isolating the node disrupts the flow of data traffic and halts normal operation for a considerable time, whereas real-time reconstruction of the original values ensures continuity of operations in smart cars. Building on machine learning techniques in which the model learns from input data patterns, deep learning models learn a compressed latent-space representation by applying non-linear transformations to the input data, allowing them to remove noise and reconstruct the original data.
Researchers have investigated different approaches to present various machine learning and deep learning-based intrusion detection and mitigation mechanisms in smart cars [15,16,17,19,20,21,24,25]. Moradi et al. [17] proposed a two-tier defense mechanism based on sensor and fusion decisions to detect and mitigate cyber-intrusions that compromise in-vehicle data traffic. Comprising two processes (sensor validation and sensor estimation), the proposed mechanism employed a fusion-based approach for intrusion detection and then validated the detection accuracy using Yager's rule. After detection, the manipulated values were replaced using an LSTM-based deep regressor estimator. The fusion approach utilized (a) RReliefF (Regression ReliefF), mRMR (minimum redundancy maximum relevance), and PCC (Pearson Correlation Coefficient) for feature ranking, (b) a convolutional autoencoder, and (c) a trade-off-based detector for detection. The study employed the AEGIS dataset for model training and testing to investigate the impact of false data injection, DoS, and replay attacks in smart cars, and concluded with an appreciable detection accuracy of 99.93%. However, the study failed to provide clear performance results for the mitigation phase. Khanna et al. [24] proposed a threat mitigation model for vehicular ad hoc networks in smart cars. The model combined k-means clustering, to group data of a similar nature, with a hybrid SVM and feed-forward model for accuracy assessment based on the firefly algorithm. The performance evaluation metrics included true detection rate, jitter, throughput, and packet delivery ratio. These evaluators are indicators of threat detection rather than mitigation, which normally involves isolating the device, discarding anomalous data, or reconstructing the original values by removing noise from the data.
Working on the security of information in cyberspace, Wang [19] introduced a hybrid model for anomaly detection and response generation in communication networks. The study employed a CNN and an RNN for spatial and temporal analysis, utilizing a zero-trust architecture to ensure network integrity. The results revealed 95.4% accuracy in anomaly detection, with blocking of suspicious accounts and isolation of compromised devices from the network as mitigation steps. Sontakke and Chopade [20] investigated the performance of deep learning-based intrusion detection and mitigation mechanisms proposed for vehicular ad hoc networks. The detection model combined an improved particle swarm optimization technique with deep neural networks for fine model tuning. The mitigation process utilized the BAIT approach to locate the intruder's position in the network. The attack vectors employed in the study included false message injection (sending bogus, fictitious values) and Sybil attacks (creating multiple illegal message identifiers). The simulation results showed outstanding performance in detecting anomalies and identifying the adversary's position, with a 100% true detection rate; the mitigation measure involved isolating the offending node (the intruder) from the network.
Hidalgo et al. [21] advocated the efficient use of the SerIoT system in vehicular communication networks for anomaly detection and mitigation. The SerIoT system can monitor real-time network traffic, analyze the data for unusual behavior and irregular patterns, and take the necessary mitigation action in communication networks. Consisting of real and virtual components, the experimental setup included a Renault Twizy 80 and the Dynacar environment implemented using Matlab/Simulink. The study used a graph neural network-based multi-layer perceptron technique for the detection and mitigation of DoS anomalies in the network. The model proposed block (temporary data blockade), block-list (data blockade for an extended time), block-list MAC (blocking the affected MAC device), and deflect (redirecting the intruder to a decoy system) as mitigation strategies. The proposed model was evaluated in terms of response time for both detection and mitigation, which was fairly low; however, a detection accuracy score for correctly detecting the intrusions was missing from the study. Khanapuri et al. [25] presented a DL-based anomaly detection and mitigation controller for the security of smart vehicle platoons. Using the CARLA and MATLAB platforms for the simulation setup and keeping the system decentralized, the study utilized local sensor information to identify the malicious actor in the platoon. Preparing the data using Gramian Angular Fields, the Short-Time Fourier Transform, time-series-to-grayscale conversion, and Markov Transition Fields, the detection process utilized a CNN to identify flaws. The proposed controller employed the Routh-Hurwitz criterion to determine constraints on controller gains for attack mitigation. The study concluded with 96.3% detection accuracy and an increased distance between attacking and normal vehicles as a mitigation measure. To neutralize the impact of false data injection attacks in vehicle platoons, Ahmed et al. [15] introduced a state-space model and unknown input observer (UIO)-based anomaly detection and mitigation model. The UIOs were used for state estimation and were implemented with a residual function to detect the presence of anomalies in the data. Afterward, the anomalous data was subtracted from the associated input fed to the platoon controller as a mitigation strategy.
Limitations of existing studies: As evident from the literature above and Table 1, existing studies have primarily focused on the first tier of the defense mechanism, i.e., anomaly detection; although intrusion mitigation is a critical process for ensuring car security and passenger safety, comparatively little attention has been paid to mitigation strategies against cyber intrusions in SCs. Unfortunately, there are quite limited studies addressing intrusion mitigation in the controller area network of smart cars, an area underexplored by researchers. Moreover, the documented studies in the literature suggest isolating the affected node or blocking and discarding the data completely. Besides disengaging the system, these solutions halt the continuity of operations, resulting in the loss of critical information, which is damaging for the commuters in smart cars. To overcome these issues, this study proposes a real-time anomaly mitigation technique based on a data reconstruction strategy that operates without interrupting the system's functionalities. To the best of our knowledge, the literature lacks an extensively investigated intrusion mitigation mechanism for smart cars that reconstructs data efficiently in real time. Filling this gap, this study proposes a novel lightweight AM-DDAE-based intrusion mitigation model to remove anomalous values and reconstruct the original values compromised by cyber-intrusions injected into the CAN bus of smart cars. The proposed model was rigorously tested against multiple cyber-attacks using six different datasets to validate its effectiveness over a range of intrusions. To check the model's adaptability to more sophisticated and emerging attacks, it was also tested on a new unseen attack. In addition, to build stakeholder trust in AM-DDAE, Explainable AI techniques are employed to analyse the decision-making process of the proposed model.
Furthermore, leveraging the deep denoising autoencoder scheme, the proposed method adapts the momentum dynamically at each epoch during model training. This strengthens the model's learning, allowing it to capture the underlying patterns more efficiently. Tested across multiple known attack designs, car models, and a new unseen attack, the proposed model achieved < 1% error in original data reconstruction and is also lightweight, consuming 0.145532 s on average per execution.

1.2. Challenges in Intrusion Mitigation

Real-time mitigation of intrusions in smart cars is crucial to ensure a secure and safe driving experience. Data transferred between different electronic control units over the CAN bus varies widely in nature, which poses multiple challenges for the mitigation model. The main challenges are:
  • Non-linear complex data handling: CAN-bus data is highly non-linear, as the different nodes communicate with each other to perform tasks that differ in functional complexity. Deep denoising autoencoder (DDAE) models have a built-in capability to handle non-linear data comfortably.
  • Robust noise removal: DDAE models are designed to clean noisy data. However, effective noise filtering is a challenging task: overfitting and underfitting must be avoided, as they often result in the loss of significant information and a failure to learn the underlying data patterns needed to generalize well to new data.
  • Gradient handling: Gradient vanishing and divergence are major challenges with non-linear data, where vanishing can slow down model learning and divergence can accelerate weight updates, causing instability in the model.
  • Generalization: With the evolving nature of cyber-intrusions demonstrating new attack vector designs, it is imperative to have an adaptive model that can denoise new unseen data efficiently.

1.3. Motivation

The existing studies mentioned in Section 1.1 propose mitigation strategies that either discard the data or isolate the affected node from the network, causing prolonged interruption of the vehicular communication network. By disrupting the normal data flow, these approaches can cause car immobilization, which in turn can lead to road obstructions, blocked lanes, rear-end collisions, and stress and panic among drivers, resulting in an unpleasant driving experience. In contrast, reconstructing the normal data by removing the attack is a more viable option to ensure safe and continuous SC operation.
Further, data retrieval using an AM-based deep denoising autoencoder offers promising performance while overcoming the associated challenges. Deep learning-based autoencoders are inherently capable of handling non-linear data and learning underlying data patterns efficiently. Adaptive momentum enables the model to balance learning efficiency and stability, avoiding underfitting and overfitting. Rather than using a fixed learning rate and momentum, the AM-DDAE model utilizes optimized parameter values based on the model's performance during training, generalizing well to unseen data.
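To make the idea concrete, the snippet below sketches one plausible adaptive momentum schedule: the momentum is raised while the training loss keeps improving and damped when it diverges. The function name, step size, and bounds are illustrative assumptions for this sketch, not the exact update rule of AM-DDAE, which is defined in Section 3.

```python
def adapt_momentum(momentum, prev_loss, curr_loss,
                   step=0.05, m_min=0.5, m_max=0.99):
    """Raise momentum on improvement, lower it on divergence (illustrative rule)."""
    if curr_loss < prev_loss:              # loss improving: accelerate learning
        momentum = min(momentum + step, m_max)
    else:                                  # loss diverging: damp the updates
        momentum = max(momentum - step, m_min)
    return momentum

# Example: track momentum across epochs with a mixed loss trajectory.
m = 0.9
losses = [0.50, 0.40, 0.45, 0.30]
for prev, curr in zip(losses, losses[1:]):
    m = adapt_momentum(m, prev, curr)
print(round(m, 2))
```

Clamping the momentum between fixed bounds keeps the optimizer from stalling entirely or oscillating without limit, which mirrors the stability/efficiency balance discussed above.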

1.4. Contributions

For a smart car, where cyber-attack detection is crucial, attack mitigation is an integral part of the defense mechanism. In this study, an efficient and robust AM-DDAE-based intrusion mitigation mechanism is presented to reconstruct the original CAN-bus data values manipulated by the malicious actions of adversaries through the OBD-II port of the SCs. The proposed AM-DDAE model is extensively investigated for multiple attack vectors, viz., DoS, replay, spoofing, fuzzing, impersonation, and malfunction, using CAN-bus data from different smart car models such as the Hyundai YF Sonata, KIA Soul, Chevrolet Spark, and Genesis g80. The main contributions to the literature are given below:
  • Novel lightweight AM-DDAE model design: Development of a novel adaptive momentum-based deep denoising autoencoder for intrusion mitigation in smart cars, with considerably low computational cost. Additionally, Explainable AI techniques, SHAP and LIME, are incorporated to examine the model's decision-making capabilities.
  • Generalization: Investigation of the proposed mechanism against multiple cyber-attacks for different makes of smart cars, ensuring the generalizability and applicability of the proposed mechanism over various attack designs and car models.
  • Efficiency and robustness: Analysis of the results revealed commendable performance, with a 2.87 × 10−7 mean reconstruction error, less than 1% percentage error, and a 0.145532 s average execution time, which verifies the robustness and efficiency of the proposed mechanism. The model presented equally strong performance when further evaluated on a new unseen attack design and on an adversarial machine learning attack. Additionally, in comparison with generative adversarial networks, the proposed model demonstrated exceedingly high performance, with 99 times higher accuracy in intrusion mitigation.
The rest of the paper is organized as follows. Section 2 gives an understanding of potential attack surfaces and cyber-intrusions in a smart car. The proposed AM-DDAE for intrusion mitigation is described in Section 3 followed by the findings of the proposed mechanism in Section 4. Finally, Section 5 concludes the study with the generalizability of the proposed mechanism over different types of smart cars and attacks.

2. The Smart Cars: Potential Attack Surfaces & Cyber-Intrusions

Smart cars are cars with embedded intelligence that tightly couple physical devices (sensors and actuators) to a communication and computational network (software and processors). These components communicate with each other to share data and perform intelligent operations like anti-lock braking, engine control, and keyless entry. One hallmark of smart cars is their ability to reach high speed in a matter of seconds; the Smart Fortwo, for example, can accelerate from 1 mph to 60 mph in just 15 s and can handle high speed efficiently through its anti-lock braking system in emergencies. The infotainment system, powertrain, advanced driver assistance system, GPS, and engine temperature and fuel injection control are prominent intelligent features of smart cars. In addition, smart cars improve luxury driving by enabling connectivity to the internet through the occupant's mobile phone or through built-in hardware that provides 4G and 5G connections. However, over-the-air communication makes SCs highly vulnerable to cyber-threats. The various attack surfaces in SCs are diagnostic ports, infotainment systems, sensors, communication networks, over-the-air updates, USB connections, and cloud-based services.
An attack surface is an entry point through which an adversary can gain unauthorized access to the system, either physically or through wireless links. By exploiting potential vulnerabilities in these surfaces, intruders can introduce multiple intrusions with varying attack designs. The potential cyber threats in SCs include malware injections, CAN-bus injections, fake OTA updates, Wi-Fi jamming, key cloning, phishing apps, and many others. Figure 3 describes various potential attack surfaces and threats that disrupt in-vehicle network (IVN) communication, reducing overall car efficiency and potentially posing serious threats to the lives of the occupants. The current study focuses on CAN-bus injections, which include DoS, fuzzing, spoofing, replay, impersonation, flooding, and malfunction intrusions.
A denial-of-service (DoS) attack interrupts the normal CAN-bus data flow by injecting 0x000 values; fuzzing adds random noise to the CAN data; a replay attack transmits the same message repeatedly; spoofing introduces spoofed messages; impersonation corrupts data via a hacker impersonating a legitimate user; flooding creates a network bottleneck with tampered data; and a malfunction attack destroys data integrity by inserting malicious values.

3. Proposed Methodology

Leveraging digitization, technological advancement has enhanced the driving experience by integrating smart features into the conventional transportation network. However, these smart features inherit the cyber vulnerabilities associated with communication in cyberspace. Various mechanisms have been proposed for intrusion detection to alert commuters to the presence of anomalies in the network, but there is still a gap in the literature regarding removing the intrusion and reconstructing the original data efficiently in smart cars. In this study, a novel lightweight AM-DDAE mechanism is proposed for the reconstruction of original data manipulated by cyber-intrusions in the CAN-bus network of smart cars. The AM-DDAE model utilizes real-time datasets, collected from different smart car models, for model training and testing.

3.1. Datasets

The behavior of a deep learning-based model depends on the data provided for training it to learn the latent space and regenerate the original values by removing noise from the actual data, so the selection of the dataset plays a major role in determining the efficiency of a model. This study utilizes real-time datasets collected from different car models such as the Hyundai YF Sonata, KIA Soul, Chevrolet Spark, and Genesis g80. The database was generated at the Hacking and Countermeasure Research Lab, South Korea, by injecting intrusions into the CAN bus through the OBD-II port via a wireless connection to the data acquisition system. Six independent datasets, differing in collection time, attack type, intrusion injection frequency, and car model, are used for the training and testing of the proposed AM-DDAE mechanism. These are (1) the CAN-Intrusion Dataset (OTIDS) generated using a KIA Soul, 2017 [26]; (2) the Car Hacking Dataset compiled from a Hyundai YF Sonata, 2018 [27]; (3) the In-vehicle Network Intrusion Detection Challenge Dataset produced with data collected from a Chevrolet Spark, KIA Soul, and Hyundai Sonata, 2018 [28]; (4) the M-CAN Intrusion Dataset derived from a Genesis g80, 2022 [29]; (5) the B-CAN Intrusion Dataset generated using a Genesis g80, 2022 [30]; and (6) the CAN-FD Intrusion Dataset obtained from smart cars released in 2021, 2022 [31]. The various characteristics of the datasets, including car model, number of attacked samples, number of normal samples, and attack types, are given in Table 2. The attributes of a CAN bus dataset are the timestamp, CAN identifier (ID), data length code (DLC), data value, and a flag marking the record as normal or anomalous. The timestamp represents the time at which the data was logged and recorded. The CAN ID is an identifier for the CAN message in HEX, e.g., 024A. It is either an 11-bit identifier (standard frame) or a 29-bit identifier (extended frame), depending on the format used.
The DLC specifies the number of data bytes in the CAN message transmitted to a particular electronic control unit. This value normally ranges from 0 to 8 bytes; however, for CAN-FD data, it can extend to 64 bytes. The data value represents the actual information contained in the CAN message to carry out a specific operation, whether controlling the car speed, the fuel injection rate, the tyre pressure, or any other function. The data value can be at most 64 bits for normal CAN messages, increasing to 512 bits for CAN-FD. The flag is the final feature of a CAN-bus message and differentiates the data as either attacked or normal: a 'T' flag shows that the data is compromised by an anomaly, while attack-free data is represented by an 'R' flag. As the CAN bus enables in-vehicle communication among the different ECUs, all these attributes carry substantial information about normal and abnormal traffic flow on the CAN bus, helping the mitigation model learn the data pattern and mitigate an anomaly once it is identified by the intrusion detection system. Further, these attributes enable understanding of the communication semantics by identifying sensitive CAN IDs, allowing targeted monitoring and mitigation of intrusions that imperil safety. Generally, each ECU in a smart car is assigned a unique fixed CAN ID, which helps identify a compromised ECU when it starts sending data at an unnecessarily high frequency, reflecting the injection of a DoS, flooding, or spoofing attack. Similarly, the DLC value plays a critical role in flagging anomalous data when it varies significantly, for example from 2 to 8. Lastly, the actual data value is the most significant for training a mitigation model, as an abrupt change in a sensor reading, steering angle, or acceleration indicates a high likelihood of an intrusion such as a spoofing, replay, or false message insertion attack.
Therefore, the performance of a mitigation model is directly linked with and highly dependent on these features for learning the complex patterns in the data and understanding the contextual relationships and temporal dependencies among messages in the controller area network traffic. The attack injection rate is an important parameter reflecting the intensity with which the CAN bus is manipulated. The original data is logged by recording the time interval between attack injections. The attack injection rate (AIR), in packets per second (pps), is calculated by dividing the total number of intrusions by the time interval between two injections. AIR values for the different attack designs are given in Table 2.
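To illustrate the dataset attributes and the AIR computation described above, the sketch below parses one CAN log row and computes the attack injection rate in packets per second. The whitespace-separated row layout, the helper names, and the example values are assumptions made for this illustration, not a specification of the datasets' exact file format.

```python
from dataclasses import dataclass

@dataclass
class CanRecord:
    timestamp: float   # time at which the frame was logged (seconds)
    can_id: str        # hex identifier, e.g. "024A"
    dlc: int           # data length code: 0-8 bytes (up to 64 for CAN-FD)
    data: list         # payload bytes as integers
    flag: str          # 'T' = attacked, 'R' = normal

def parse_record(line):
    """Parse one whitespace-separated log row into a CanRecord (assumed layout)."""
    ts, cid, dlc, *rest = line.split()
    flag = rest[-1]                            # last field: 'T' or 'R'
    data = [int(b, 16) for b in rest[:-1]]     # payload bytes in hex
    return CanRecord(float(ts), cid, int(dlc), data, flag)

def attack_injection_rate(num_intrusions, interval_seconds):
    """AIR in pps: total intrusions divided by the injection time interval."""
    return num_intrusions / interval_seconds

rec = parse_record("1478198376.389427 0316 8 05 21 68 09 21 21 00 6f R")
print(rec.can_id, rec.dlc, rec.flag)
print(attack_injection_rate(3000, 10.0))   # e.g. 3000 frames over 10 s
```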

3.2. Data Preprocessing

To enhance the model's stability and convergence, data preprocessing is an integral step in the training of deep learning models for generating an effective latent-space representation in the DDAE. Normally, preprocessing is performed using either normalization or standardization, depending on the nature of the data. For Gaussian data distributions, standardization is employed, calculated from the mean and variance of the data. Normalization is preferred when the data is highly non-Gaussian and it is required to preserve the impact of intrusions in the original data. Given the non-Gaussian data with a significant number of attack samples, normalization is employed in this work to achieve consistency across the range of the data. More precisely, the min-max approach is adopted for normalization to adjust the values between 0 and 1, which helps to avoid bias and rescales the complete data proportionally. Mathematically, the normalization process is given in (1), while (2) expresses the transformation to the new scale between 0 and 1.
$$Z = \frac{z - z_{mn}}{z_{mx} - z_{mn}} \tag{1}$$

$$Z = \frac{z - z_{mn}}{z_{mx} - z_{mn}} \left( Y_{mx} - Y_{mn} \right) + Y_{mn} \tag{2}$$

where $Z$ is the normalized data, $z$ is the original data, $z_{mn}$ and $z_{mx}$ represent the minimum and maximum data points, and $Y_{mn}$ and $Y_{mx}$ represent the new minimum and maximum data values on the 0–1 scale.
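A minimal sketch of the min-max normalization of Eqs. (1) and (2), assuming the CAN features arrive as a numeric array; the function name is ours:

```python
import numpy as np

def min_max_normalize(z, y_min=0.0, y_max=1.0):
    """Rescale data to [y_min, y_max]: Z = (z - z_mn)/(z_mx - z_mn) * (Y_mx - Y_mn) + Y_mn."""
    z = np.asarray(z, dtype=float)
    z_mn, z_mx = z.min(), z.max()
    scaled = (z - z_mn) / (z_mx - z_mn)        # Eq. (1): map onto [0, 1]
    return scaled * (y_max - y_min) + y_min    # Eq. (2): map onto [y_min, y_max]

print(min_max_normalize([10, 15, 20]))   # minimum -> 0, maximum -> 1
```

With the default bounds this reduces exactly to Eq. (1); passing other bounds applies the general rescaling of Eq. (2).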

3.3. Attack Design

The attack design depends on the target environment, its architecture, and the threat actor's awareness of the system. It is assumed that the intruder, being well-informed of the system and its vulnerabilities, injects smartly designed intrusions such as DoS, replay, fuzzy, and spoofing attacks on the CAN bus through the OBD-II port. The attack design for denial-of-service is the injection of messages of the highest priority by replacing all non-zero values with zeros in a CAN-bus message; for a replay attack, it is the transmission of the same data point repeatedly on the bus. In the case of a fuzzy intrusion, the attack design is the addition of random data points to the network traffic for an arbitrarily chosen CAN ID, whereas data manipulation for a specific CAN ID is categorized as a spoofing attack. The attack designs for the different attack types are described in (3).
\[
\epsilon =
\begin{cases}
[0, 0, \ldots, 0], & \text{for DoS intrusion} \\
[d_x, d_x, \ldots, d_x], & \text{for replay intrusion} \\
\mathrm{arbitrary\_CAN\_ID}(\mathrm{rand}[d_x]), & \text{for fuzzing intrusion} \\
\mathrm{specific\_CAN\_ID}[d_x], & \text{for spoofing intrusion}
\end{cases} \tag{3}
\]
where d x represents manipulated data points. The input Z changes to Z ^ once the attack vector is injected into the original data, as given in (4).
\[ \hat{Z} = Z + \epsilon \tag{4} \]
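The attack designs of (3)-(4) can be sketched as follows. This Python function is a hypothetical illustration: the field layout (column 0 holding the CAN ID, remaining columns the normalized payload) and all names are my assumptions, not the authors' code.

```python
import numpy as np

def inject_attack(Z, kind, d_x=0.5):
    """Return a compromised copy of Z per the attack designs in Eq. (3).
    Assumed layout: column 0 = CAN ID, columns 1.. = normalized payload."""
    rng = np.random.default_rng(0)
    Z = np.asarray(Z, dtype=float)
    Z_hat = Z.copy()
    if kind == "dos":        # highest-priority frames: payload forced to zeros
        Z_hat[:, 1:] = 0.0
    elif kind == "replay":   # the same data point transmitted repeatedly
        Z_hat[:, 1:] = d_x
    elif kind == "fuzzing":  # random payloads for an arbitrarily chosen CAN ID
        target = rng.choice(np.unique(Z[:, 0]))
        mask = Z[:, 0] == target
        Z_hat[mask, 1:] = rng.random((mask.sum(), Z.shape[1] - 1))
    elif kind == "spoofing": # manipulated payload for one specific CAN ID
        mask = Z[:, 0] == Z[0, 0]   # first frame's ID chosen for illustration
        Z_hat[mask, 1:] = d_x
    return Z_hat
```

The CAN IDs themselves are left untouched, so only the payload columns carry the injected perturbation ε.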

3.4. Proposed AM-Based Deep-Denoising Autoencoder

This study proposes a novel adaptive momentum-based deep denoising autoencoder (AM-DDAE) mechanism for intrusion mitigation, reconstructing original data from noisy input compromised by cyber-intrusions. Data reconstruction relies on the model's ability to learn the behavior and trends of the data, which requires splitting the data in an appropriate ratio for training and testing. In this study, 65% of the input data is used for model training (further divided into 90% training and 10% validation data) and 35% is used for testing.
The proposed mechanism is a two-step model comprising (a) intrusion injection and (b) the AM-DDAE intrusion mitigation process, independent of the intrusion detection system, which is a separate component. Intrusion injection involves attack insertion into CAN-bus data by an intruder impersonating an authentic ECU to gain illegitimate access to the controller area network through the OBD-II port of smart cars connected via a wireless link to the data acquisition system. The compromised ECU could be an advanced driver assistance system, a powertrain, body, or engine control module, or any other unit. The normal and attacked data are then fed to the AM-DDAE intrusion mitigation model, a key component of the network monitoring unit in smart cars. Capturing the incoming traffic, the model performs encoding and decoding on the data based on its training, updates the parameters including weights, biases, momentum, and learning rate, and finally reconstructs the original data by removing noise from the input. The pseudocode summarizing the proposed mechanism is presented in Algorithm 1, and its working is described in the subsequent section.
Algorithm 1: AM-DDAE Mechanism for Intrusion Mitigation
% Data Preparation
    input_data = SC_CAN-Bus_data()
    norm_data = normalize(input_data)
    [train_data, test_data] = split(norm_data, train_ratio = 0.65)
% Parameter Initialization
    def features_number, hidden_layers_size
    def initial_momentum, initial_learning_rate
    def batch_size, epochs_number
    def weights, biases, velocities, activation_function
% Training Loop
    for epoch = 1 : epochs_number
        learning_rate = learning_rate_scheduler(epoch)
        for each batch_input in mini_batches(train_data, batch_size)
            latent_space = encoder(batch_input)
            retrieved_output = decoder(latent_space)
            loss = compute_loss(batch_input, retrieved_output)
            grad = compute_gradients(loss, weights, activation_function)
            % Weights and Biases Update (momentum form of (14)-(15))
            velocity.weights[layer] = momentum * velocity.weights[layer] + learning_rate * grad.weights[layer]
            weights[layer] = weights[layer] - velocity.weights[layer]
            velocity.biases[layer] = momentum * velocity.biases[layer] + learning_rate * grad.biases[layer]
            biases[layer] = biases[layer] - velocity.biases[layer]
        validation_loss = evaluate_model(validation_data, weights, biases)
        % Momentum Adjustment
        if epoch > 1
            if validation_loss > prev_validation_loss
                momentum = max(0.5, momentum * 0.9)
            else
                momentum = min(0.99, momentum * 1.1)
        prev_validation_loss = validation_loss
% Model Testing and Evaluation
    test_reconstruction = forward_pass(test_data, weights, biases)
    reconstruction_error = mse(test_data, test_reconstruction)
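The loop in Algorithm 1 can be sketched end to end as follows. This is a deliberately simplified, single-hidden-layer NumPy illustration: the paper's model uses three encoder layers, mini-batches, and a held-out validation split, whereas here full-batch training is used, the training loss stands in for the validation loss, and all names are my own.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def train_am_ddae(X_noisy, X_clean, hidden=4, epochs=30, lr=0.01, momentum=0.9):
    """Single-hidden-layer denoising autoencoder trained with momentum updates
    and the adaptive-momentum rule of Algorithm 1 (simplified sketch)."""
    rng = np.random.default_rng(1)
    n, d = X_noisy.shape
    W1 = rng.normal(0.0, 0.1, (hidden, d)); b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (d, hidden)); b2 = np.zeros(d)
    vels = [np.zeros_like(p) for p in (W1, b1, W2, b2)]
    prev_loss = None
    for epoch in range(epochs):
        # Forward pass: encode (Eqs. (5), (7)) then decode (Eq. (9))
        h = relu(X_noisy @ W1.T + b1)
        Z_rec = h @ W2.T + b2
        # Backward pass: gradients of the mean-squared error
        delta = 2.0 * (Z_rec - X_clean) / n
        gW2 = delta.T @ h
        gb2 = delta.sum(axis=0)
        dh = (delta @ W2) * (h > 0)   # ReLU derivative gates the error signal
        gW1 = dh.T @ X_noisy
        gb1 = dh.sum(axis=0)
        # Momentum updates, Eqs. (14)-(15), applied in place
        for v, g, p in zip(vels, (gW1, gb1, gW2, gb2), (W1, b1, W2, b2)):
            v *= momentum
            v += lr * g
            p -= v
        loss = np.mean((Z_rec - X_clean) ** 2)
        # Adaptive momentum: shrink on worsening loss, grow on improvement
        if prev_loss is not None:
            momentum = max(0.5, momentum * 0.9) if loss > prev_loss else min(0.99, momentum * 1.1)
        prev_loss = loss
    return (W1, b1, W2, b2), loss
```

Training on a noisy copy of clean data and penalizing the reconstruction against the clean target is what makes the autoencoder "denoising".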

3.4.1. Working of the Proposed AM-DDAE Mechanism

The working of the proposed model involves four steps: (i) data preprocessing, (ii) training, (iii) validation, and (iv) evaluation. It begins with (i) normalization of the input data to scale the different features within a common range, since large feature values dominating the learning process can negatively impact the model's training and validation steps and its overall stability. After normalization using the min-max technique, the input data (X) is split into training ( x t r a i n ), validation ( x v a l ), and test data ( x t e s t ). (ii) Training uses the x t r a i n data over a fixed number of epochs, with each epoch further split into 16 mini-batches ( x b a t c h ). Model training is an iterative, per-batch process: the complete training procedure is executed on one mini-batch and then repeated for all 16 mini-batches in the epoch. Using x b a t c h and the initial values of weights and biases, the forward pass performs data encoding and decoding. The data is encoded by three encoder layers, each performing a linear transformation to generate pre-activations, followed by the ReLU function, which introduces non-linearity and produces activations. During encoding, the output of one layer serves as the input to the next, transforming the input data into a compact latent space. This latent representation is fed to a decoder, which reverses the encoding process and attempts to reconstruct the original data with minimal reconstruction error. Thereafter, the loss function is computed, which determines the magnitude and direction of the gradients in the subsequent process.
Next, the gradients of the weights and biases are calculated by performing backpropagation in the backward pass, using the loss function, the initial weights and biases, and the activations produced during the forward pass. At the end of each per-batch training step, these gradient values, the initial velocity, and the learning rate are used to update the optimization parameters, generating updated weights, biases, velocity, and learning rate. These updated values are then used in the next iteration, and the process continues until all mini-batches have been trained, i.e., one epoch. Upon completion of an epoch, the model performs (iii) validation. Validation applies the forward pass to the validation data, using the weights and biases produced during the last mini-batch of training, and outputs the reconstructed value x v a l h a t . Following this, the validation loss is computed and used to adjust the momentum needed for enhanced learning and improved convergence. If the validation loss for the current epoch is greater than that of the last epoch, the momentum is decreased, but not below 0.5; if the loss decreases, the momentum is increased, up to 0.99. The 0.5–0.99 limits are set to avoid both a diminishing impact of previous updates and divergence from overshooting the minima. The final step is (iv) evaluation. Once validation across the epochs is complete, the model performs a final evaluation by applying the forward pass to the test data using the weights and biases obtained during the last epoch. This generates the retrieved output y t e s t , which is used to calculate the reconstruction error and build an understanding of the model's performance. A higher error value indicates inaccuracy in the model, whereas a lower error value demonstrates the robustness and reliability of the developed model. Finally, the working of the proposed model ends with the measurement of the reconstruction error.
The data flow and interaction between different components of the proposed model are presented in Figure 4. The data preprocessing and training process are represented in Figure 4a, whereas Figure 4b shows validation and the final evaluation processes. The architecture of the proposed mechanism is presented in the following subsection.

3.4.2. Architecture of the Proposed AM-DDAE Mechanism

The architecture of the proposed mechanism based on the deep-denoising autoencoder technique consists of multiple processes which include forward pass, backward pass, update of the optimization parameters and performance metrics. These processes are explained below in detail.
Forward Pass: A function responsible for the computational flow of data among the encoder, latent-space, and decoder of the proposed AM-DDAE mechanism. It is an iterative process to reduce the loss function to its minimum value to provide an efficient reconstruction of the original input. While inputting the data to the encoder, (5) and (7) serve to transform the input into the latent space by applying a non-linear activation function which is then fed to the decoder component. During decoding, (8) and (9) are used to generate the reconstructed values utilizing the final values of weights and biases.
Encoder: Normally, encoding is the process of converting given data into a new format. In the proposed AM-DDAE, it is the transformation of the noise-added input data into a compressed representation via a non-linear transformation. Initially, a linear transformation is applied, as given in (5), generating the pre-activation output x for each layer i of the input data.
\[ x = \hat{Z} \cdot W_i^{T} + b_i \tag{5} \]
where W and b are weights and biases.
Describing the strength of links between layers, weights measure the impact of a neuron in the previous layer on a neuron in the next layer. The weight matrix \( W \in \mathbb{R}^{m \times n} \) is generally represented as in (6):
\[
W =
\begin{bmatrix}
w_{1,1} & w_{1,2} & \cdots & w_{1,n} \\
w_{2,1} & w_{2,2} & \cdots & w_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
w_{m,1} & w_{m,2} & \cdots & w_{m,n}
\end{bmatrix} \tag{6}
\]
where m and n are the number of neurons in the current and previous layers respectively.
Biases are the offset values included in the weighted sum of input, allowing the model to learn displacements in data distribution.
Next, non-linearity is introduced in x using the Rectified Linear Unit (ReLU) function to create a compact representation of Z ^ . ReLU is a non-linear activation function, enabling the model to learn intricate patterns in the data and represent them in latent space to effectively remove noise from the input. The normal functioning of ReLU is as given in (7).
\[ f(x) = \max(0, x) \tag{7} \]
For values x > 0, the output is x itself allowing smooth flow for gradient computation in back-propagation during the backward pass process, whereas, for x ≤ 0, the output is ‘0’, to avoid diminishing gradients.
The choice of ReLU in this work is linked with its high computational efficiency as its processing involves a simple comparison of incoming data x with ’0’ rendering low computational cost and making it suitable for large datasets. During each subsequent layer, the data size reduces progressively extracting the most relevant features for intrusion mitigation by bypassing the less significant features such as noise. This process ends with the development of latent space presenting a noise-tolerant compact representation of the input data.
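A minimal sketch of the encoder stack described above, assuming a list of per-layer (W, b) pairs (names and shapes are illustrative, not the paper's code):

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit, Eq. (7)."""
    return np.maximum(0.0, x)

def encode(Z_hat, layers):
    """Pass the noisy input through successive encoder layers: each layer
    applies the linear transform of Eq. (5) followed by ReLU, and feeds
    its activations to the next layer, shrinking toward the latent space."""
    a = Z_hat
    for W, b in layers:
        a = relu(a @ W.T + b)   # pre-activation, then non-linearity
    return a                    # compact latent-space representation
```

Each (W, b) pair narrows the representation, so the returned array has the latent dimension of the final encoder layer.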
Latent-space representation: It is a final and highly compact presentation of the given data generated by the encoder. It reduces data size and computational cost while enabling effective storage. The most prominent feature of latent-space representation is that it serves as a base for the process of the decoder to retrieve the attack-free original input.
Decoder: Decoding is the regeneration of encoded data back to the original format. In the proposed AM-DDAE, the decoder works as the reversal of the encoder, expanding the latent space back to the original data size and retrieving the initial input by removing noise and eliminating the impact of intrusions on the input data. The reconstruction process for each decoder layer j is given in (8) and (9). Equation (8) produces the intermediate output x for each internal layer, whereas (9) presents the reconstruction of the original data, Z′, by the final decoder layer.
\[ x = \text{latent representation} \cdot W_j^{T} + b_j \tag{8} \]
\[ Z' = \text{latent representation} \cdot W_{decoder}^{T} + b_{decoder} \tag{9} \]
where W and b are weights and biases. Note that the order of the weight matrix is reversed, \( W \in \mathbb{R}^{n \times m} \), because, starting from the output layer, the decoder functions in the reverse order of the encoder.
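A matching decoder sketch under the same assumptions. The final layer is kept linear here so reconstructions can span the full normalized range; whether the paper applies an activation on the last decoder layer is not stated, so that choice is an assumption.

```python
import numpy as np

def decode(latent, layers):
    """Expand the latent representation back to the input size, Eqs. (8)-(9).
    Hidden decoder layers use ReLU; the last layer stays linear so the
    reconstructed values are not clipped at zero."""
    a = latent
    for W, b in layers[:-1]:
        a = np.maximum(0.0, a @ W.T + b)  # Eq. (8): intermediate layers
    W, b = layers[-1]
    return a @ W.T + b                    # Eq. (9): reconstructed Z'
```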
The architecture of the proposed model is portrayed in Figure 5, showing the encoder, the decoder, and the dense layer (latent space), along with the number of neurons at each layer. The methodology employs an adaptive momentum strategy, dynamically adjusted according to the validation loss, together with the forward pass and backward pass processes. During the forward pass, data flows from the encoder to the latent space and then to the decoder; during the backward pass, the error calculated by the loss function flows from the decoder through the latent space to the encoder to update the weights for the next epoch. This is an iterative process repeated until training is completed for all epochs. The adaptive momentum value is incorporated into the backward pass after the third hidden layer, propagating across all layers towards the encoder.
Loss Function: The proposed model aims to find the optimal parameters W , b during training, that could help to reduce the reconstruction error by calculating the loss function. Normally, neural networks utilize mean-squared error (mse) as a loss function to compute the loss during training. Mean-squared error is a measure of the squared difference between reconstructed and original values, calculated using (10). For larger differences, the model is reiterated to improve the learning by focusing on more meaningful features that effectively reduce the mse value generating a well-generalized latent space.
\[ \mathrm{mse} = \frac{1}{N} \sum_{i=1}^{N} \left( z_i - z'_i \right)^2 \tag{10} \]
where z and z′ are the original and reconstructed values respectively, and N represents the total number of observations.
This study aims at reducing the reconstruction error between original and reconstructed values, that allows focused optimization by minimizing reconstruction loss which is directly aligned with the goal of Denoising—removing noise from the data.
Backward Pass: A function that processes backward to compute the gradients of the weights and biases using the error value ( δ ), as given in (11). These gradients tell the model how much each parameter must change, via gradient descent, to reach the minimum value of the loss function.
\[ \delta = -\left( Z - Z' \right) \tag{11} \]
The negative sign indicates that the gradient descends in the direction of lower reconstruction error.
The basic steps of the backward pass are (i) calculation of the error value using (11); (ii) calculation of the weight and bias gradients for the encoder using the latent-space activations and the error signal; (iii) calculation of the weight and bias gradients for the decoder using the activations from the previous layer and the error signal; and (iv) application of the derivative of the activation function to tune the error signal in every layer. The weight gradients ∇W and bias gradients ∇b are calculated using (12) and (13), propagating backward from the output layer to the input layer. The ∇W calculation uses the deltas and the activations from the previous layer, while ∇b is the mean of the deltas over all samples.
\[ \nabla W_i = \frac{\delta^{T} \cdot \mathrm{activation}_{i-1}}{N} \tag{12} \]
\[ \nabla b_i = \frac{1}{N} \sum_{j=1}^{N} \delta_j \tag{13} \]
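Equations (12)-(13) translate directly into a per-layer gradient helper (an illustrative NumPy sketch, not the authors' code):

```python
import numpy as np

def layer_gradients(delta, activation_prev):
    """Per-layer gradients per Eqs. (12)-(13): the weight gradient couples the
    error signal with the previous layer's activations, averaged over the
    batch; the bias gradient is the mean delta."""
    N = delta.shape[0]
    grad_W = (delta.T @ activation_prev) / N   # Eq. (12)
    grad_b = delta.mean(axis=0)                # Eq. (13)
    return grad_W, grad_b
```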
Adaptive Momentum: In this study, an adaptive momentum (AM) strategy is employed to automatically adjust the weights and biases according to the validation loss for model optimization. AM is adjusted at each iteration to avoid instability in the proposed AM-DDAE model. During training, if the validation loss increases, the momentum drops towards a minimum value of 0.5, while it increases up to a maximum of 0.99 when the validation loss decreases. The selection of 0.5 as the lower limit reflects the fact that values below 0.5 give too little weight to past gradients, while values greater than 0.99 may destabilize the learning process and overshoot the optimal point. Momentum augments gradient descent by carrying over a fraction of the previous update into the current one, helping the optimizer escape local minima. It accelerates convergence by enabling the optimizer to move quickly through flat regions, minimizes oscillations by averaging updates, and addresses over-fitting by promoting stability in the model. The velocity vector, the combined effect of previous updates on the current update, is used to update the parameters, taking the AM coefficient C into account. The equations governing the update process are given in (14) and (15): Equation (14) calculates the velocity for each parameter, which is then used in (15) to update that parameter.
\[ v_t = C \, v_{t-1} + L \, \nabla P \tag{14} \]
\[ P_t = P_{t-1} - v_t \tag{15} \]
where v t is the updated velocity, C the momentum coefficient, v t 1 the previous velocity, L the learning rate, ∇P the gradient of the parameter being updated, P t the current updated parameter, and P t 1 the previous parameter value.
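A one-step illustration of the velocity update in (14)-(15) (function and argument names are mine):

```python
import numpy as np

def momentum_step(param, velocity, grad, C=0.9, L=0.01):
    """One momentum update: v_t = C*v_{t-1} + L*grad (Eq. 14),
    then p_t = p_{t-1} - v_t (Eq. 15). Works on scalars or arrays."""
    velocity = C * velocity + L * grad   # Eq. (14)
    param = param - velocity             # Eq. (15)
    return param, velocity
```

Repeated calls accumulate velocity in a persistent gradient direction, which is what carries the optimizer through flat regions of the loss surface.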
Learning Rate: Controlling the step size to update the parameters, the learning rate determines the value to scale the gradients for smooth, stable, and fast model convergence. Normally, the learning rate is static—a fixed value during complete training, or dynamically adjusted based on the predefined learning rate L 0 , the number of epochs E, and the decay factor λ as given in (16).
\[ L = \frac{L_0}{1 + \lambda \cdot E} \tag{16} \]
In this work, the learning rate is updated by adjusting the decay factor based on the number of epochs. Starting with an initial rate of 0.01 for epochs 1–10 with a decay factor of 0.933, the rate is subsequently adjusted to 0.005 for epochs 11–20 and then to 0.001 for higher epochs. The decay factor of 0.933 was selected for the proposed AM-DDAE based on its efficiency in reconstructing the original data.
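One plausible reading of this stepped schedule, combined with the decay form of (16); the exact interaction of the 0.933 factor with the per-band base rates is not fully specified in the text, so this sketch is an assumption:

```python
def learning_rate_scheduler(epoch, decay=0.933):
    """Stepped base rate (0.01 / 0.005 / 0.001) scaled by the decay form of
    Eq. (16): L = L0 / (1 + lambda * E), with lambda = 0.933 assumed."""
    if epoch <= 10:
        base = 0.01
    elif epoch <= 20:
        base = 0.005
    else:
        base = 0.001
    return base / (1.0 + decay * epoch)
```

Under this reading the rate decreases monotonically across epochs, including at the band boundaries.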
In addition, training the proposed model without an early stopping technique allows the model to train for the maximum number of epochs, enabling robust learning of complex and subtle features, and makes it possible to observe degradation trends and long-term training behavior.
Summarizing the working of the proposed AM-DDAE model, training of encoder and decoder is an iterative process carried out jointly by reducing the mean-squared error between given input and retrieved output. Encoding, latent-space representation, and decoding are performed during forward pass. Backward pass computes gradients in relevance to weights and biases by performing backpropagation. Optimization is achieved by adjusting the weights and biases to ensure accelerated convergence and stability through adaptive momentum and learning rate schedulers. Lastly, model efficiency is determined by measuring reconstruction error between input data and reconstructed output which is explained in the next subsection.

3.4.3. Error Metrics

The proposed AM-DDAE model is aimed at regenerating the original input by removing intrusions injected into the normal data. The most efficient way to determine the reconstruction accuracy is by analyzing the deviation in reconstructed value from the original value, i.e., performing a simple comparison between original input data and retrieved de-noised output. This comparison is possible by calculating absolute error which is considered as the maximum error possible by any model. It is a measure of the difference between input and output values, calculated as given in (17).
\[ \mathrm{Absolute\;Error} = \left| z_{p,q} - z'_{p,q} \right| \tag{17} \]
where p indexes the data points and q the feature number for the input z and the denoised output z′.
The error ratio, another performance indicator, determines the relative reconstruction error, reflecting the model's ability to learn and understand the input data and how accurately the input is represented in the denoised output. It is calculated using (18).
\[ \mathrm{Error\;Ratio}_{p,q} = \frac{\left| z_{p,q} - z'_{p,q} \right|}{\left| z_{p,q} \right|} \tag{18} \]
Standard deviation (std) plays a critical role in evaluating the model's performance while underlining its consistency level. A high std value indicates large variations in reconstruction errors across the samples, showing high inconsistency in the model's performance. Conversely, a low std value indicates the model's consistency and competency in retrieving and reconstructing the original data points with high accuracy. Std is measured using (19).
\[ \sigma = \sqrt{ \frac{1}{m-1} \sum_{j=1}^{m} \left( \mu_j - \bar{\mu} \right)^2 } \tag{19} \]
where
\[ \mu_j = \frac{1}{n} \sum_{i=1}^{n} \left| z_i - z'_i \right| \]
and
\[ \bar{\mu} = \frac{1}{m} \sum_{j=1}^{m} \mu_j \]
Here, μ j is the mean absolute error of sample j and μ ¯ is the mean of all μ j .
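The three metrics in (17)-(19) can be computed together as follows (an illustrative NumPy sketch; it assumes the input contains no zero entries when forming the error ratio):

```python
import numpy as np

def reconstruction_metrics(Z, Z_rec):
    """Absolute error (17), error ratio (18), and the sample standard
    deviation (19) of the per-sample mean absolute errors."""
    Z, Z_rec = np.asarray(Z, float), np.asarray(Z_rec, float)
    abs_err = np.abs(Z - Z_rec)        # Eq. (17), element-wise
    err_ratio = abs_err / np.abs(Z)    # Eq. (18); assumes no zero entries in Z
    mu = abs_err.mean(axis=1)          # mean absolute error per sample
    sigma = mu.std(ddof=1)             # Eq. (19), ddof=1 gives the 1/(m-1) form
    return abs_err, err_ratio, sigma
```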

4. Results and Discussions

Generally, to analyze the reconstruction performance of a model in reconstructing original data from attacked or compromised data, the best evaluation metric is the absolute error, which calculates the difference between the original and reconstructed data. Trained on six independent datasets, the proposed model is tested and evaluated by calculating absolute error, error ratio, and standard deviation. Training cost and validation costs for determining how well the model is learning and its generalizability on unseen data are also investigated in this study. The different simulation parameters used for optimal model training to avoid over-fitting and under-fitting problems are given in Table 3.

4.1. Error Metrics

Reconstruction error (RE) gauges the model's efficiency and accuracy in data retrieval. A high RE indicates significant deviation of the output from the input, reflecting poor model performance, while a low error shows a higher correlation between input and output, revealing excellent model learning with low variation in the reconstructed data. Error ratio (ER) is a relative error providing a scale-independent measure of the difference between actual and predicted values. A small ER demonstrates the model's ability to produce output proportionally close to the input, acting as an insightful indicator for mitigation models. Standard deviation (std) is another performance indicator for the mitigation model, answering the question of how much the values deviate from the mean. A low std value shows tight clustering around the mean, reflecting high consistency and reliability of the model. The mathematical relations for these metrics are given in (17)–(19).
The proposed model is extensively trained, validated, and tested on all six independent datasets, each comprising 500,000 samples per attack design. However, owing to space constraints, only the results for the best five and worst five reconstructed data points, along with the corresponding reconstruction error and error ratio, are presented here. Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9 present the simulation results for the different datasets and attack designs.
Table 4 enumerates the original data value, reconstructed value, reconstruction error, and error ratio for three different intrusions, i.e., DoS, fuzzy, and impersonation, from Dataset 1. The results are presented for the best- and worst-case scenarios, along with the mean reconstruction error and standard deviation for each case. For the DoS intrusion, the best reconstruction error is 6.5297 × 10−8 with an 8.6722 × 10−8 error ratio. Substantially low error ratios are also reported for the fuzzy and impersonation intrusions: 4.999 × 10−9 and 9.0795 × 10−7 for the best cases, and 1.7792 × 10−1 and 2.0983 × 10−2 for the worst cases. In particular, the model showed improved performance in mitigating fuzzy attacks compared to DoS and impersonation anomalies. The probable reason for the superior results is that fuzzy intrusion, injecting random values, compromises the data heavily and generates highly anomalous samples, which are easier for the AM-DDAE model to detect and reconstruct. Furthermore, the fuzzy samples are probably more balanced and distinct compared to other anomalous data, enabling enhanced resilience. Overall, the model showed exceptional performance in data reconstruction and generated extremely low mean reconstruction error and standard deviation, on the order of 10−4, reflecting the high accuracy and robustness of the proposed AM-DDAE model.
More insight into the model's performance can be gained from Table 5, which depicts results for DoS, fuzzy, spoofing (gear), and spoofing (RPM) intrusions introduced in the Car Hacking dataset for the CAN bus. The reconstruction error, the defining performance parameter for the AM-DDAE model, is extremely low for all four cases, even for the worst data reconstructions. The RE value ranges between 0.015228 and 0.018601 for the worst cases in mitigating the DoS attack, and drops to 1.44 × 10−7 for the best data reconstruction, revealing the exceptional efficiency of the AM-DDAE model in learning and reproducing normal data. Similarly, for the fuzzy, spoofing (gear), and spoofing (RPM) anomalies, the model manifested high accuracy with average REs of 3.9921 × 10−6, 1.6033 × 10−4, and 1.0828 × 10−4, respectively. These low values demonstrate the model's effectiveness across multiple attack designs. Similarly, Table 6 illustrates the reconstruction results for the In-vehicle intrusion dataset. The RE and ER scores for all the intrusions are exceedingly low, ranging from 10−12 to 10−2. Against the flooding attack, the model achieved 2.871 × 10−7 average RE with a 1.2391 × 10−5 standard deviation. For the fuzzy and malfunction intrusions, RE ranges from 10−11 to 10−2, while it ranges between 10−8 and 10−2 for the replay attack. The RE is comparatively high for the replay attack because a replay attack sends the same normal data repeatedly, producing only slight variations in the normal data pattern, which complicates data differentiation and model learning. Nevertheless, the proposed model efficiently reproduced the original data compromised by the replay attack with an RE as low as 10−2.
Simulation results for mitigating the impact of DoS and fuzzing anomalies on the M-CAN data are presented in Table 7. The RE and ER scores confirm the consistency of the AM-DDAE model in reconstructing the original data with considerably high accuracy. For the optimal case in DoS, the REs are strikingly small, ranging between 1.43 × 10−6 and 7.15 × 10−6, and the ER ranges from 4.57 × 10−5 to 2.02 × 10−4. Nevertheless, the results are also appreciable for the worst case, reporting a maximum RE of 1.163 × 10−2 and a maximum ER of 1.1816 × 10−2. The model showed improved performance for the fuzzing intrusion, exhibiting a maximum RE of 3.33 × 10−9 and a maximum ER of 8.51 × 10−8. Similarly, Table 8 shows simulation results for B-CAN data compromised by DoS and fuzzing intrusions. The model produced comprehensible outcomes, reconstructing the original value 0.2 as 0.1999999 with 1.5 × 10−11 RE and 7.3 × 10−11 ER in the best case for DoS, while in the worst case 0.2509803 was reconstructed as 0.2346066, featuring 1.6374 × 10−2 RE and 6.5239 × 10−2 ER. These trivially low values demonstrate the model's strength and robustness. A similar pattern is observed for the fuzzing intrusion, where the maximum RE is 3.76 × 10−9 with 1.47 × 10−8 ER for the optimal data regeneration, and 1.026 × 10−2 RE with 3.7375 × 10−1 ER for the poorest data regeneration.
Lastly, Table 9 displays the results for CAN-FD data compromised by flooding, fuzzing, and malfunction attacks. Aligned with the model’s performance for the previous cases, the AM-DDAE proved its effectiveness to reproduce the CAN-FD data compromised by flooding, fuzzing and malfunction attacks. It is observed that RE for optimal data in all the three cases is remarkably low falling in the order of 10−11, 10−10 and 10−9, and for the instances of poorly reconstructed data, the RE is reduced to the order of 10−2.
Summarizing the above, these extremely low RE, ER and standard deviation scores indicate consistency, robustness and reliability of the proposed AM-DDAE model over a diverse range of intrusions.
Besides absolute RE, percentage RE is also calculated which makes performance results independent of data scale, making them easily interpretable. Table 10 demonstrates the percentage RE for multiple attack designs. It is noticeable that percentage RE is exceptionally low, less than 1% for all the used cases, ranging between 1.3271 × 10−5 and 9.0666 × 10−5. It signifies remarkable delivery by the proposed AM-DDAE model to reconstruct data accurately and efficiently.

4.2. Training and Validation Cost

Training Cost (TC) measures the model's learning, explaining how well the model learns the underlying data pattern to remove noise and reconstruct the original values, while Validation Cost (VC) reflects the degree of the model's generalizability to unseen data. TC and VC are calculated as the mean-squared error over the training data and the validation data, respectively. The choice of mse is linked to its effectiveness in gradient-based model optimization and data reconstruction.
Simulation curves of VC and TC for all the attack designs are given in Figure 6, Figure 7, Figure 8, Figure 9, Figure 10 and Figure 11. Figure 6 visualizes the TC and VC curves for multiple anomalies in the Car-Intrusion dataset (OTIDS). At the start, the TC and VC values are comparatively high. However, as learning proceeds, the curves converge towards zero, reflecting the stability gained by the proposed model. An analogous pattern is observed for the Car Hacking and In-vehicle Intrusion Detection datasets in Figure 7 and Figure 8, where the starting values are slightly less than 0.1 for all the used cases. Furthermore, a similar pattern is evident for the M-CAN, B-CAN, and CAN-FD datasets in Figure 9, Figure 10 and Figure 11. However, for the fuzzy intrusion, VC is slightly lower than TC at the start in all cases. The possible reason could be statistical variation between the training and validation data. Despite this difference, the proposed model learns the underlying pattern effectively, attains stabilization, and converges after several epochs. As a whole, the proposed AM-DDAE model converges towards zero after a few epochs, showing comprehensible performance in attaining steady convergence with no signs of overfitting.

4.3. Computational Time

Deep-denoising autoencoder models are generally considered heavyweight, with high computational cost in data processing. However, by employing a shallow architecture, this study proposes a lightweight AM-DDAE for intrusion mitigation and data reconstruction. The simulation results show extraordinarily low inference time to retrieve normal data from anomalous data compromised by various attacks. Table 11 shows the computational time for different attack scenarios. The execution time in all cases is less than a second, averaging 0.145532 s. This low computational time is linked to the fact that training is performed once; thereafter, new incoming data is evaluated directly and recovered if any compromised data is found. Besides execution time, Amortized Time (AT) is another significant parameter for evaluating a model's computational efficiency. It measures the average computational cost per instance or task when the total cost is distributed across many instances or tasks. For smart cars, the model can be invoked repeatedly depending on varying environmental factors and driving behavior. Thus, amortized time is calculated by dividing the total computational cost by the estimated number of usages. Equation (20) is used for the calculation of AT. The corresponding results are presented in Table 11 for various estimated usages. The per-use cost reduces considerably as the frequency of usage increases, reflecting the practical viability and effectiveness of the proposed model in an extensive operating environment.
\[ \mathrm{Cost}_{AM} = \frac{T_{tot}}{N} \tag{20} \]
where T t o t is the total execution time and N is the number of usages.
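Equation (20) is a one-liner; for example, the reported 0.145532 s average execution time amortized over a hypothetical 10 invocations gives about 0.0146 s per use:

```python
def amortized_cost(total_time, n_usages):
    """Eq. (20): average computational cost per invocation when the total
    execution time is spread over n_usages uses."""
    return total_time / n_usages

# Hypothetical usage count of 10, applied to the paper's reported average time.
per_use = amortized_cost(0.145532, 10)
```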
The computational platform for the development of the proposed AM-DDAE model utilized an Intel(R) Core(TM) i7-1165G7 @ 2.80 GHz CPU and 1.30 GHz GPU computational machine with 12.0 GB RAM. The proposed model was implemented completely in a custom Matlab (R2023a) code by utilizing its native control structures and matrix operations, with no use of built-in Neural Network Frameworks or Deep Learning Toolbox. All the deep learning components including forward pass, backward pass, parameter initialization and optimizer updates were performed manually. However, Statistics and Machine Learning toolbox was used for data preprocessing and numerical computation. In addition, utilizing the same machine as mentioned, Jupyter Notebook 6.4.8 was used to implement XAI techniques using Python 3.

4.4. Adaptability of the Proposed AM-DDAE Model to Unseen Attack

Analysing the model’s adaptability to new, unseen attacks is crucial to ensuring its long-term viability. To this end, the proposed AM-DDAE model is tested with unseen data: the UNSW-NB15 dataset, generated by security researchers in the Cyber Range Lab of the Australian Centre for Cyber Security. The model successfully reconstructed original data compromised by a previously unseen attack, namely ‘Exploits’. The corresponding simulation results are highlighted in Table 12.
The proposed model yielded a very small percentage RE of 0.3744% (less than 1%) for the unseen Exploits intrusion, reflecting the high accuracy and efficiency of the AM-DDAE model in reconstructing original data affected by a new, unseen attack. In addition, the proposed model remains lightweight when mitigating unseen attacks: for the Exploits attack, it consumed 0.058369 s to denoise and regenerate the clean data.
To further support the efficacy of the proposed model, Table 13 compares the performance of the AM-DDAE model on seen and unseen data. Seen data means the same dataset is used for training, validation, and testing; unseen means the model is trained on one dataset and tested on a different, unseen dataset. The reconstruction error and execution time are compared between the two categories; both remain as low for unseen data as observed in the previous cases. This reveals that the proposed model performs equally well on known seen data and new unseen data, enhancing its generalizability and adaptability.

4.5. Decision-Making Process of the Proposed Model Through Explainable AI

Explainable AI (XAI) comprises advanced techniques that aim to open the black box of AI models. They make the decision-making process of AI models more transparent and interpretable by highlighting key features and factors and identifying potential biases that are significant in determining a model’s outcome. Among these, Shapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) are two prominent techniques used to explain a model’s strategic evaluation. SHAP is model-agnostic in formulation, but its Shapley values are computed with respect to a specific trained model; it offers both global and local explanations, elucidating the role of features across all scenarios and for a specific scenario. SHAP works by assigning a contribution value (CV) to each feature towards the final output. A high CV indicates high impact, whereas a low CV highlights minimal contribution by the particular feature. LIME is a purely model-agnostic technique offering only a local explanation for a specific instance. LIME assigns a probability to the outcome and a coefficient (weight) to each feature, and typically uses colors to highlight each feature’s role in differentiating between the two possible outcomes of a specific instance. To build stakeholder trust in the proposed model, this study applies both SHAP- and LIME-based XAI techniques to interpret the model’s latent space and reconstruction process. The latent space is a compressed form of the input data produced by the encoder; the reconstruction process involves the decoder, which takes the compressed latent data and reconstructs the original input. If the reconstructed values are close to the input, the reconstruction error is low, and vice versa. The simulation results for SHAP and LIME are presented in Figure 12 and Figure 13.
Figure 12 shows the SHAP decision plot relating model output (reconstruction error) to actual feature values. The values in parentheses are the original feature values before scaling. Starting from the left (the initial base value), the plot proceeds continuously to the right towards red, an indicator of positively increasing contribution to the final output. Moreover, the reconstruction error is consistently low, indicating that the model effectively captures the underlying data pattern in its latent representation and accurately regenerates it through the decoder. This reflects the model’s strong ability to compress the data into meaningful lower-dimensional features while preserving critical information. The decoder successfully utilizes these latent features to reproduce the original input with minimal loss, demonstrating that the encoder, latent space, and decoder have achieved a reliable mapping and good reconstruction quality.
Figure 13 presents the LIME summary plot. As the input data is normalized between 0 and 1, the probability score of the outcome is 0 for anomalous data and 1 for normal data. The blue bars indicate features with a high negative contribution towards data reconstruction (increased RE), while the orange bars indicate features that contributed positively towards anomaly mitigation (low RE). For instance, Feature_1 and Feature_2, with values 0.51 and 0.75, are highly relevant to data regeneration and intrusion mitigation, suggesting that a slight deviation in their values will impact the outcome significantly. Conversely, Feature_7 and Feature_8, with values 0.01 and 0.04, are inclined towards zero, demonstrating minimal relevance in the reconstruction process. Features with intermediate values have low impact but still influence the reconstruction process depending on the model’s weights.
The simulation results for SHAP and LIME provide insight into the model’s latent decision-making and reconstruction processes. The majority of the bars in the LIME plot are orange, and most features in the SHAP plot are inclined towards red, indicating that the most influential features work together to reconstruct the data positively. Domain experts can interpret which feature values initiate and trigger intrusion mitigation, ensuring model alignment with anomaly patterns in the automotive controller area network. Furthermore, these explanations indicate how the RE values of specific features correspond to their CVs and how observed deviations in feature values impact the mitigation process. This transparency not only validates the proposed model’s reasoning but also helps stakeholders interpret and trust the mitigation model.
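The perturbation idea underlying LIME can be mimicked with a small, library-free sketch: perturb an instance around its original values, observe how the reconstruction error changes, and fit a linear surrogate whose coefficients approximate per-feature influence. The reconstruction function below is a hypothetical stand-in, not the AM-DDAE, and the feature values echo the illustrative ones discussed above.

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruction_error(x):
    """Stand-in for the autoencoder: squared error against an identity-like
    reconstruction (purely illustrative, not the paper's model)."""
    recon = 0.95 * x  # pretend the decoder slightly under-reconstructs
    return float(np.mean((x - recon) ** 2))

def local_feature_weights(x, n_samples=500, scale=0.05):
    """LIME-style local explanation: perturb the instance, observe the change
    in reconstruction error, and fit a linear surrogate by least squares."""
    perturbations = rng.normal(0.0, scale, size=(n_samples, x.size))
    X = x + perturbations
    y = np.array([reconstruction_error(row) for row in X])
    # Linear surrogate: coefficients approximate each feature's local influence
    A = np.hstack([X, np.ones((n_samples, 1))])  # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1]  # drop the intercept

x0 = np.array([0.51, 0.75, 0.20, 0.33, 0.10, 0.42, 0.01, 0.04])  # 8 CAN features
weights = local_feature_weights(x0)
for i, w in enumerate(weights, start=1):
    print(f"Feature_{i}: weight={w:+.4f}")
```

In the real study the surrogate is fitted by the LIME library against the trained AM-DDAE; this sketch only illustrates the mechanism.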
It is important to note that the CAN dataset used in this study contains eight features per message, corresponding to the eight data bytes. The HCRL does not disclose the exact physical parameters represented by each feature, probably for proprietary reasons. It is presumed that these features could be wheel speed, brake status, throttle position, tyre pressure, internal temperature, engine speed, or any other ECU signal. Regardless of which ECU signals the features represent, the proposed model has learnt the latent space effectively, generating output with <1% error despite the missing physical-parameter information.

4.6. Results and Analysis of Adversarial Machine Learning Attack

To further support the effectiveness of the proposed model, a new adversarial machine learning attack is employed alongside the other attack designs. The adversarial perturbation is generated by adding Gaussian noise, scaled by each feature’s range, to the CAN bus data. While maintaining physical plausibility, the attack is reproducible and scalable to even larger datasets. The simulation results are presented in Table 14, showing that the model performed appreciably well in data reconstruction, with a mean RE of 4.2402 × 10−5 and an execution time of 0.084789 s against this emerging attack, reflecting its high applicability and adaptability to a range of sophisticated new attacks.
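A minimal sketch of this perturbation scheme is shown below. It assumes the CAN payload is an (n_samples × n_features) matrix of normalized byte values; the noise level and seed are hypothetical parameters, with the fixed seed providing the reproducibility mentioned above.

```python
import numpy as np

def adversarial_perturb(data, noise_level=0.1, seed=42):
    """Add Gaussian noise scaled by each feature's observed range.

    data: (n_samples, n_features) array of CAN payload features.
    noise_level: fraction of the feature range used as the noise std-dev.
    A fixed seed keeps the attack reproducible across runs."""
    rng = np.random.default_rng(seed)
    feat_range = data.max(axis=0) - data.min(axis=0)
    noise = rng.normal(0.0, 1.0, size=data.shape) * (noise_level * feat_range)
    # Clip back to the observed per-feature range to keep values plausible
    return np.clip(data + noise, data.min(axis=0), data.max(axis=0))

# Example: 5 messages x 8 data bytes, normalized to [0, 1]
clean = np.random.default_rng(0).random((5, 8))
attacked = adversarial_perturb(clean)
print("mean absolute perturbation:", np.mean(np.abs(attacked - clean)))
```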
Figure 14 shows the TC and VC curves for the adversarial machine learning attack. The model showed comparatively high values, around 0.09, at the start; a sharp decline is then observed as learning progresses, highlighting the model’s rapid generalization, and the curves finally converge to zero after a few epochs, depicting attainment of a steady state.

4.7. Comparison with Generative Adversarial Networks

A comprehensive comparison with Generative Adversarial Networks (GANs) has been conducted to further demonstrate the effectiveness of the proposed method. Both the AM-DDAE and GAN models are extensively evaluated on six HCRL-generated datasets, which follow widely accepted data collection and attack simulation standards and effectively serve as a standardized benchmark for performance evaluation. The performance comparison results are highlighted in Table 15 and Table 16. It is evident from Table 15 that the mean RE of the AM-DDAE is significantly lower than that of the GANs, whose value is approximately 99 times higher across all cases. A probable reason for this large difference is the instability associated with the generator and discriminator due to the min-max game in GANs. Furthermore, the reliance on random latent vectors to produce the output could be another reason for the increased RE in GANs. Table 16 reflects the execution time comparison between the AM-DDAE and GANs. As in the previous scenario, the GANs consumed substantially more time, approximately 97 times, to generate output. The extended processing time is due to the use of two networks (generator and discriminator) in GANs. Moreover, adversarial evaluation in GANs (generating random samples from the latent space and feeding them through the discriminator) increases computational cost. In summary, the comparative analysis confirms the superior performance and robustness of the proposed approach.
In addition, Table 17, Table 18, Table 19, Table 20, Table 21 and Table 22 present simulation results for the GANs evaluated on the same six datasets used to test the proposed model. Although the GANs performed well in all cases with appreciable data reconstruction efficiency, their reconstruction error and error ratio are higher than those of the proposed model for every attack design, reflecting the dominant performance of the proposed AM-DDAE model over GANs.

4.8. Comparison with Existing Studies

This study proposes a novel mechanism for intrusion mitigation in the controller area network of smart cars. The existing literature proposes various mitigation techniques, including isolation of the compromised node from the system, blockade of data temporarily or for an extended time, deflection of the intruder to a decoy system, and data reconstruction. Except for data reconstruction, all of these techniques interrupt the normal functioning of smart cars for some duration, which could endanger commuters and other vehicles on the road. To avoid interruption, the proposed deep denoising autoencoder-based mechanism reconstructs data in real time, ensuring continuity of operations. For data reconstruction methods, percentage reconstruction error and execution time are the key performance indicators for comparative analysis.
A performance comparison of various studies, including details of methods and mitigation strategies, is presented in Table 23. Investigating intrusion detection and mitigation for a smart intersection system in in-vehicular communication, Hidalgo et al. [21] used a graph neural network-based multilayer perceptron technique. The study reported 0.0466 s execution time to apply the mitigation strategy, which includes data blockade and diversion of the intruder to a decoy system. Applying the BAIT approach along with a deep neural network for anomaly detection and mitigation in vehicular ad hoc networks, Sontakke and Chopade [20] adopted node isolation as the mitigation strategy. Working on a vehicle platoon, Khanapuri et al. [25] proposed a CNN- and Routh-Hurwitz-criterion-based intrusion detection and mitigation technique for the selection of controller gains. Akin to node isolation, the study employed increased vehicle spacing as the mitigation strategy. In another study, Shirazi et al. [32] proposed an LSTM-based mitigation mechanism for the controller area network in smart cars. Adopting data reconstruction as the mitigation strategy, the study reported less than 6% error between the reconstructed and actual values. The proposed AM-DDAE-based intrusion mitigation mechanism uses real-time data reconstruction as its mitigation technique. Comparatively, this study showed outstanding performance, with less than 1% reconstruction error and execution times ranging between 0.0877 and 0.2807 s for all the cases employed in the model development.

4.9. Impacts of the Proposed Mechanism on End-Users

Confined to the technical aspects of vehicle safety, this study provides technical evaluation metrics of the proposed model for mitigating the impact of intrusions on SCs, quantified by reconstruction error, error ratio, validation and training cost, and execution time. These metrics are direct measures of vehicle performance and safety; however, they do not directly quantify end-user impact and relate to driver trust and safety only indirectly. A small RE secures the system from instability and unexpected interruptions, a lower ER minimizes false alarms, minimized VC and TC provide stable performance and enhance long-term reliability, and a low execution time ensures faster real-time response. The major impacts of the proposed model on end-users are highlighted in Figure 15.

4.10. Integration of Proposed Model with Existing Security Frameworks

To ensure compatibility and scalability with industry standards, the proposed AM-DDAE can be embedded into existing automotive security frameworks, such as AUTOSAR and ISO/SAE 21434, to enhance the security and safety of SCs and commuters. The proposed model can be integrated into the intrusion detection and mitigation layer within the AUTOSAR Adaptive Platform to function as a security service component without compromising Runtime Environment specifications. Considering ISO/SAE 21434, the model can be added during the cybersecurity concept phase, in accordance with Threat Analysis and Risk Assessment findings. The mitigation logs can support cybersecurity validation and incident readiness. This modular integration of the proposed model with the AUTOSAR platform, linked with ISO/SAE 21434 processes, is portrayed in Figure 16. This approach highlights the model’s compatibility, its deployability across automotive architectures, and its scalability to future standards and vehicle platforms.

4.11. Advantages of the Proposed Mechanism

The proposed scheme offers several key benefits that ensure the model’s effectiveness and practicality in intrusion mitigation. The model is based on a deep denoising autoencoder that compresses the input into a latent space by capturing the effective features, thereby reducing noise in the data and leading to robust feature learning. It is also lightweight: once trained, any new data is directly subjected to testing and evaluation.
Furthermore, it enables real-time response by quickly mitigating the intrusions as they are injected in the system, minimizing damage. It also reduces down-time by reconstructing compromised data in real-time, ensuring continuity of functions.
Lastly, it offers reliability and adaptability in tackling emerging attacks, in view of the continuously evolving nature of intrusions.
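The denoising principle behind these advantages (compress a noisy input to a latent code, then reconstruct the clean signal) can be illustrated with a toy single-hidden-layer denoising autoencoder in NumPy. The dimensions, hyperparameters, and synthetic data here are illustrative assumptions, not the AM-DDAE's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 200 "clean" 8-feature samples with low-rank (rank-3) structure
basis = rng.random((3, 8))
clean = rng.random((200, 3)) @ basis                  # structure to learn
noisy = clean + rng.normal(0, 0.05, clean.shape)      # corrupted input

# Single hidden layer AE: 8 -> 4 (latent) -> 8
W1 = rng.normal(0, 0.1, (8, 4)); b1 = np.ones(4)      # positive bias keeps ReLU units alive
W2 = rng.normal(0, 0.1, (4, 8)); b2 = np.zeros(8)
lr = 0.05
n = len(noisy)

for epoch in range(500):
    z = np.maximum(noisy @ W1 + b1, 0.0)              # ReLU encoder -> latent space
    out = z @ W2 + b2                                 # linear decoder
    err = out - clean                                 # denoising target is the CLEAN data
    # Backpropagation of the mean squared error loss
    gW2 = z.T @ err / n; gb2 = err.mean(axis=0)
    dz = (err @ W2.T) * (z > 0)
    gW1 = noisy.T @ dz / n; gb1 = dz.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((np.maximum(noisy @ W1 + b1, 0) @ W2 + b2 - clean) ** 2))
print("final reconstruction MSE:", mse)
```

Training against the clean target while feeding the noisy input is what forces the latent code to discard noise, which is the property the proposed scheme exploits for data recovery.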

4.12. Limitations and Future Scope

The proposed approach offers a novel mechanism for intrusion mitigation; however, it has certain limitations, which open new directions for future work.
First, there is limited flexibility in momentum optimization: momentum adaptability is confined to the range 0.5–0.99, which may not be ideal for all types of data. Using meta-learning for optimal selection of the momentum coefficient, or a dynamic momentum based on gradient behavior during training, would improve model learning.
Second, training proceeds for a fixed number of epochs, ignoring the possible issue of overfitting, which may lead to a loss in efficiency, increased computational cost, and limited generalization. Incorporating early stopping as a regularization technique would enhance model performance and generalization.
Furthermore, the reconstruction objective is based on a single loss metric without considering multi-objective learning. Integrating a regularization loss would optimize feature learning.
Lastly, the model performs epoch-wise validation over the entire validation set, overlooking detailed learning patterns. Using mini-batches for validation would provide faster feedback, allowing optimized dynamic adjustment of parameters and enhancing overall model performance.
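One possible realization of the gradient-driven dynamic momentum suggested among these future directions (a hypothetical update rule, not the paper's algorithm) is to grow the momentum while successive gradients agree in direction and shrink it when they oppose, clamped to the stated 0.5–0.99 range:

```python
import numpy as np

def update_momentum(beta, grad, prev_grad, lo=0.5, hi=0.99, step=0.01):
    """Illustrative dynamic momentum: increase beta when successive gradients
    agree in direction (stable descent), decrease it when they oppose
    (oscillation), and clamp to the [lo, hi] range used in the paper."""
    agreement = float(np.dot(grad.ravel(), prev_grad.ravel()))
    beta = beta + step if agreement > 0 else beta - step
    return min(max(beta, lo), hi)

beta = 0.89  # initial momentum from Table 3
g_prev = np.array([1.0, -0.5])
for g in (np.array([0.8, -0.4]), np.array([-0.9, 0.6]), np.array([0.7, -0.3])):
    beta = update_momentum(beta, g, g_prev)
    g_prev = g
    print(f"momentum -> {beta:.2f}")
```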

5. Conclusions

The integration of digital advancements into the conventional transport network has shifted the paradigm from traditional cars to futuristic smart cars. These intelligent cars are equipped with sophisticated electronics, computing, and control to offer luxury and intelligent features such as keyless entry, anti-lock braking, fuel injection control, and advanced driver assistance systems. In-vehicular communication is established through different networks, primarily the CAN bus, for data transfer among the ECUs. However, owing to the absence of data encryption and authentication, CAN-bus data is highly prone to cyber intrusions. To secure CAN-bus data, this study proposed the adaptive momentum-based deep denoising autoencoder for intrusion mitigation and attack-free data reconstruction. The proposed model was tested using real-time CAN-bus data from different car models. The model achieved remarkably high performance, with less than 1% reconstruction error, when tested on both benchmark datasets with known attacks and on a new unseen attack. Comparative analysis highlighted the proposed model’s superiority over GANs in intrusion mitigation. The outstanding performance results from the adaptive momentum strategy, which enables the model to adjust and update the weights in each succeeding layer following the change in validation cost during training. The state-of-the-art performance, verified using multiple attack designs introduced on real-time data from different smart cars, signifies the model’s generalization and applicability to new unseen attacks on any new car model.

Author Contributions

Conceptualization, Z.A.K. and S.A.; methodology, S.A., Z.A.K. and A.K.; software, A.K.; validation, S.A. and A.K.; formal analysis, Z.A.K., S.A. and A.K.; investigation, A.K.; resources, Z.A.K. and S.A.; data curation, A.K.; writing—original draft preparation, A.K., S.A. and Z.A.K.; writing—review and editing, A.K., S.A. and Z.A.K.; visualization, A.K.; supervision, Z.A.K. and S.A.; funding, A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are openly available in Hacking and CounterMeasure Research Lab, South Korea at https://ocslab.hksecurity.net/Datasets.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AM: Adaptive Momentum
AI: Artificial Intelligence
AP: Access Point
API: Application Programming Interface
CAN: Controller Area Network
CNN: Convolutional Neural Network
DDAE: Deep Denoising Autoencoder
DL: Deep Learning
DoS: Denial of Service
ECUs: Electronic Control Units
ETI: Event-triggered Interval
FDL: Federated Deep Learning
FL: Federated Learning
GAN: Generative Adversarial Network
GPS: Global Positioning System
HID: Human Interface Display
ID: Identifier
IDMS: Intrusion Detection and Mitigation System
IDS: Intrusion Detection System
IVN: In-Vehicular Network
JTAG: Joint Test Action Group
LSTM: Long Short-Term Memory
MAC: Media Access Control
ML: Machine Learning
OBD: On-Board Diagnostics
OTA: Over-the-Air
RE: Reconstruction Error
ReLU: Rectified Linear Unit
RNN: Recurrent Neural Network
SCs: Smart Cars
SDM: Self-defence Mechanism
SerIoT: Secure and Safe Internet of Things
SSM: State-Space Model
SVM: Support Vector Machine
UIOs: Unknown Input Observers
USB: Universal Serial Bus

References

1. Alsaade, F.W.; Al-Adhaileh, M.H. Cyber attack detection for self-driving vehicle networks using deep autoencoder algorithms. Sensors 2023, 23, 4086.
2. Upstream. Upstream’s 2025 Global Automotive Cybersecurity Report Executive Summary. 2025. Available online: https://upstream.auto/ty-2025-gacr-executive-summary/ (accessed on 19 August 2025).
3. Upstream. 2025 Predictions: The Future of Automotive Cybersecurity. 2025. Available online: https://upstream.auto/ty-2025-predictions/ (accessed on 19 August 2025).
4. SOCRadar. Major Cyber Attacks Targeting the Automotive Industry. 2024. Available online: https://socradar.io/major-cyber-attacks-targeting-the-automotive-industry/ (accessed on 10 November 2024).
5. Arghire, I. 16 Car Makers and Their Vehicles Hacked via Telematics, APIs, Infrastructure. 2023. Available online: https://www.securityweek.com/16-car-makers-and-their-vehicles-hacked-telematics-apis-infrastructure/ (accessed on 6 June 2024).
6. SLNT. Under the Hood: The Modern Reality of Car Hacking. 2024. Available online: https://slnt.com/blogs/insights/under-the-hood-the-modern-reality-of-car-hacking (accessed on 10 November 2024).
7. Zhou, X.; Wu, Y.; Lin, J.; Xu, Y.; Woo, S. A Stacked Machine Learning-Based Intrusion Detection System for Internal and External Networks in Smart Connected Vehicles. Symmetry 2025, 17, 874.
8. Tanksale, V. Intrusion detection system for controller area network. Cybersecurity 2024, 7, 4.
9. Alfardus, A.; Rawat, D.B. Machine Learning-Based Anomaly Detection for Securing In-Vehicle Networks. Electronics 2024, 13, 1962.
10. Bari, B.S.; Yelamarthi, K.; Ghafoor, S. Intrusion detection in vehicle controller area network (CAN) bus using machine learning: A comparative performance study. Sensors 2023, 23, 3610.
11. Shahriar, M.H.; Xiao, Y.; Moriano, P.; Lou, W.; Hou, Y.T. CANShield: Deep Learning-Based Intrusion Detection Framework for Controller Area Networks at the Signal-Level. IEEE Internet Things J. 2023, 10, 22111–22127.
12. Cheng, P.; Xu, K.; Li, S.; Han, M. TCAN-IDS: Intrusion detection system for internet of vehicle using temporal convolutional attention network. Symmetry 2022, 14, 310.
13. Moulahi, T.; Zidi, S.; Alabdulatif, A.; Atiquzzaman, M. Comparative performance evaluation of intrusion detection based on machine learning in in-vehicle controller area network bus. IEEE Access 2021, 9, 99595–99605.
14. Kavousi-Fard, A.; Dabbaghjamanesh, M.; Jin, T.; Su, W.; Roustaei, M. An evolutionary deep learning-based anomaly detection model for securing vehicles. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4478–4486.
15. Ahmed, N. Detection, Identification, and Mitigation of False Data Injection Attacks and Faults in Vehicle Platooning. Ph.D. Thesis, Lakehead University, Thunder Bay, ON, Canada, 2023. Available online: https://knowledgecommons.lakeheadu.ca/handle/2453/5254 (accessed on 12 December 2024).
16. Hassan, S.M.; Mohamad, M.M.; Muchtar, F.B. Advanced intrusion detection in MANETs: A survey of machine learning and optimization techniques for mitigating black/gray hole attacks. IEEE Access 2024, 12, 150046–150090.
17. Moradi, M.; Kordestani, M.; Jalali, M.; Rezamand, M.; Mousavi, M.; Chaibakhsh, A.; Saif, M. Sensor and Decision Fusion-based Intrusion Detection and Mitigation Approach for Connected Autonomous Vehicles. IEEE Sens. J. 2024, 24, 20908–20919.
18. Samani, M.A.; Farrokhi, M. Adverse to Normal Image Reconstruction Using Inverse of StarGAN for Autonomous Vehicle Control. IEEE Access 2025, 13, 77305–77316.
19. Wang, K. Leveraging Deep Learning for Enhanced Information Security: A Comprehensive Approach to Threat Detection and Mitigation. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 964.
20. Sontakke, P.V.; Chopade, N.B. Optimized Deep Neural Model-Based Intrusion Detection and Mitigation System for Vehicular Ad-Hoc Network. Cybern. Syst. 2023, 54, 985–1013.
21. Hidalgo, C.; Vaca, M.; Nowak, M.P.; Frölich, P.; Reed, M.; Al-Naday, M.; Mpatziakas, A.; Protogerou, A.; Drosou, A.; Tzovaras, D. Detection, control and mitigation system for secure vehicular communication. Veh. Commun. 2022, 34, 100425.
22. Schädlich, E. The Most Influential Automotive Hacks. 2024. Available online: https://dissec.to/general/the-most-influential-automotive-hacks/ (accessed on 4 February 2025).
23. Majumdar, A.R.C. 42 Luxury Cars Stolen over Four Weeks in Oakville. 2021. Available online: https://www.oakvillenews.org/local-news/42-luxury-cars-stolen-over-four-weeks-oakville-ontario-8486515 (accessed on 4 February 2025).
24. Khanna, H.; Kumar, M.; Bhardwaj, V. An Integrated Security VANET Algorithm for Threat Mitigation and Performance Improvement Using Machine Learning. SN Comput. Sci. 2024, 5, 1089.
25. Khanapuri, E.; Chintalapati, T.; Sharma, R.; Gerdes, R. Learning based longitudinal vehicle platooning threat detection, identification and mitigation. IEEE Trans. Intell. Veh. 2021, 8, 290–300.
26. HCRL. CAN Dataset for Intrusion Detection (OTIDS). Available online: https://ocslab.hksecurity.net/Dataset/CAN-intrusion-dataset (accessed on 1 March 2024).
27. HCRL. Car-Hacking Dataset. Available online: https://ocslab.hksecurity.net/Datasets/car-hacking-dataset (accessed on 1 March 2024).
28. HCRL. In-Vehicle Network Intrusion Detection Challenge. Available online: https://ocslab.hksecurity.net/Datasets/datachallenge2019/car (accessed on 1 March 2024).
29. HCRL. M-CAN Intrusion Dataset. Available online: https://ocslab.hksecurity.net/Datasets/m-can-intrusion-dataset (accessed on 1 March 2024).
30. HCRL. B-CAN Intrusion Dataset. Available online: https://ocslab.hksecurity.net/Datasets/b-can-intrusion-dataset (accessed on 1 March 2024).
31. HCRL. CAN-FD Intrusion Dataset. Available online: https://ocslab.hksecurity.net/Datasets/can-fd-intrusion-dataset (accessed on 1 March 2024).
32. Shirazi, H.; Pickard, W.; Ray, I.; Wang, H. Towards resiliency of heavy vehicles through compromised sensor data reconstruction. In Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, Baltimore, MD, USA, 24–27 April 2022; pp. 276–287.
Figure 1. A comprehensive overview of various cyber-intrusions exploiting multiple attack surfaces and vulnerabilities in smart cars over the past decade. Adapted from [22,23].
Figure 2. Representation of automotive controller area network with key vulnerabilities.
Figure 3. An overview of various potential attack surfaces and attack types in smart cars.
Figure 4. Flowchart of the proposed mechanism. (a) Data Preprocessing and Model Training. (b) Validation and Final Evaluation.
Figure 5. Architecture of the proposed methodology, highlighting the integration of adaptive momentum.
Figure 6. Validation and training costs for (a) DoS (b) Fuzzy and (c) Impersonation intrusions in Dataset 1.
Figure 7. Validation and training costs for (a) DoS (b) Fuzzy (c) Spoofing (Gear) and (d) Spoofing (RPM) intrusions in Dataset 2.
Figure 8. Validation and training costs for (a) Flooding (b) Fuzzy, (c) Malfunction and (d) Replay intrusion in Dataset 3.
Figure 9. Validation and training costs for (a) DoS and (b) Fuzzing intrusions in Dataset 4.
Figure 10. Validation and training costs for (a) DoS and (b) Fuzzing intrusions in Dataset 5.
Figure 11. Validation and training costs for (a) Flooding (b) Fuzzing and (c) Malfunction intrusions in Dataset 6.
Figure 12. SHAP decision plot to visualize the decision-making process of the proposed model.
Figure 13. LIME local explanation plot of the proposed model.
Figure 14. Training and Validation Costs of Adversarial Machine Learning Attack.
Figure 15. End-User Impacts of the proposed model.
Figure 16. Modular Integration of the proposed model with AUTOSAR adaptive platform linked to ISO/SAE 21434 processes.
Table 1. A comprehensive summary of existing studies conducted to develop the self-defense mechanism against cyber-intrusions in smart cyber-physical systems.
Reference | Purpose | Strengths | Weaknesses | Year | Category
[7] | Stacked machine learning (ML)-based intrusion detection system (IDS) for automotive networks | 99.99% detection accuracy | Undefined car; heavyweight | 2025 | Detection
[8] | ML-based IDS for automotive controller area network | 0.9968 specificity, 0.9948 sensitivity | No description of accuracy measure and computational time | 2024 | Detection
[9] | DL-based IDS for automotive networks | 95% detection accuracy | Limited to single attack type; no detail about computational time | 2024 | Detection
[10] | ML-based IDS for automotive controller area network | 99.9% detection accuracy | Heavyweight | 2023 | Detection
[11] | DL-based IDS for automotive controller area network | 0.952 area under the curve | No description of computational time; undefined car | 2023 | Detection
[12] | ML-based IDS for automotive controller area network | 0.9998 F1-score | Heavyweight | 2022 | Detection
[13] | ML-based IDS for automotive controller area network | 98.5269% detection accuracy | Heavyweight | 2021 | Detection
[14] | GAN-based IDS for automotive controller area network | 96.84% hit rate | Undefined car; limited to single attack type; no detail about computational time | 2020 | Detection
[18] | StarGAN-based image reconstruction for autonomous vehicle control | 22.21 PSNR, 0.92 SSIM | Heavyweight; limited object detection | 2025 | Mitigation
[19] | DL-based intrusion mitigation for information security in cyberspace | 95.4% detection accuracy; compromised data blockade; affected node isolation | No real-time data reconstruction | 2024 | Mitigation
[20] | DL-based IDMS for vehicular ad hoc networks | 100% true detection rate; affected node isolation | No real-time data reconstruction | 2023 | Mitigation
[21] | SerIoT system for vehicular communication networks | Compromised data blockade; deflection to decoy system | No real-time data reconstruction | 2022 | Mitigation
[25] | DL-based IDMS for vehicle platoons | 96.3% detection accuracy; gap widening between vehicles for mitigation | No real-time data reconstruction | 2021 | Mitigation
Acronyms are defined in the “Abbreviations” section.
Table 2. A comprehensive summary of six different datasets employed in this work. (Labels: 1 for attack, 0 for normal).
| Dataset | Attack Type | Attack Injection Rate (pps) | Attack Samples | Normal Samples | Targets | Sample Rate (samples/s) |
|---|---|---|---|---|---|---|
| 1 | DoS | – | 2,244,041 | 2,369,868 | CAN bus traffic | – |
| 1 | Fuzzy | – | | | | – |
| 1 | Impersonation | – | | | | – |
| 2 | DoS | 3333.3 | 2,331,517 | 15,226,830 | ECUs, gear, RPM gauge | – |
| 2 | Fuzzy | 2000.0 | | | | 2563.28 |
| 2 | Spoofing (Gear) | 1000.0 | | | | – |
| 2 | Spoofing (RPM) | 1000.0 | | | | 1922.46 |
| 3 | Flooding | – | 1,253,508 | 8,114,265 | ECUs, random and specific CAN IDs | – |
| 3 | Fuzzy | – | | | | – |
| 3 | Malfunction | – | | | | – |
| 3 | Replay | – | | | | – |
| 4 | DoS | 4000.0 | 500,000 | 2,452,620 | Multimedia communication devices | 13,667.88 |
| 4 | Fuzzing | 10,000.0 | | | | |
| 5 | DoS | 4000.0 | 500,000 | 7,530,786 | Low-speed communication devices | 5576.93 |
| 5 | Fuzzing | 10,000.0 | | | | |
| 6 | Flooding | 10,000.0 | 1,630,473 | 5,490,129 | CAN bus | 1977.945 |
| 6 | Fuzzing | 5000.0 | | | | |
| 6 | Malfunction | 1000.0 | | | | |
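The feature values reported in the reconstruction tables all lie in [0, 1] and are consistent with single CAN payload bytes divided by 255 (for example, 192/255 ≈ 0.7529412 matches the leading value 0.7529411 reported later). The paper does not state its preprocessing explicitly, so the following is only a minimal sketch under that assumed byte-wise min–max scaling:

```python
import numpy as np

def normalize_payload(payload_bytes):
    """Scale raw CAN payload bytes (0-255) into [0, 1].

    Assumption: the paper's feature values come from byte-wise
    min-max scaling; this is inferred, not stated in the text.
    """
    return np.asarray(payload_bytes, dtype=np.float64) / 255.0

def denormalize_payload(values):
    """Map reconstructed values back to integer payload bytes."""
    return np.rint(np.asarray(values) * 255.0).astype(np.uint8)

# A hypothetical 5-byte payload slice; 192/255 reproduces the
# leading 0.7529411... value seen in Table 4.
frame = [192, 194, 195, 191, 193]
scaled = normalize_payload(frame)
```

Denormalizing a reconstructed frame with the inverse mapping would let the recovered values be written back onto the bus as valid payload bytes.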
Table 3. Simulation parameters for the proposed AM-DDAE mechanism.
| Parameter | Value |
|---|---|
| Total samples per dataset | 500,000 |
| Training samples per dataset | 325,000 |
| Test samples per dataset | 175,000 |
| Hidden layers | 3 |
| Total neurons in hidden layers | 104 |
| Latent space size | 8 |
| Loss function | MSE |
| Activation function (input/output) | ReLU |
| Initial momentum | 0.89 |
| Initial learning rate | 0.01 |
| Number of epochs | 30 |
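The parameters above can be illustrated with a minimal numpy sketch of a denoising autoencoder trained by momentum SGD. The latent size (8), ReLU activations, MSE loss, momentum (0.89), learning rate (0.01), and epoch count (30) follow Table 3; the 48–8–48 split of the 104 hidden neurons and the linear output layer are assumptions, since the paper does not give the per-layer sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

class MomentumDAE:
    """Minimal denoising autoencoder with momentum SGD (sketch only)."""

    def __init__(self, n_in, hidden=(48, 8, 48), lr=0.01, momentum=0.89):
        dims = [n_in, *hidden, n_in]
        # He-style initialization for ReLU layers.
        self.W = [rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b))
                  for a, b in zip(dims[:-1], dims[1:])]
        self.b = [np.zeros(b) for b in dims[1:]]
        self.vW = [np.zeros_like(w) for w in self.W]
        self.vb = [np.zeros_like(b) for b in self.b]
        self.lr, self.momentum = lr, momentum

    def forward(self, x):
        acts = [x]
        for i, (w, b) in enumerate(zip(self.W, self.b)):
            z = acts[-1] @ w + b
            # ReLU on hidden layers; linear output (an assumption).
            acts.append(relu(z) if i < len(self.W) - 1 else z)
        return acts

    def train_step(self, x_noisy, x_clean):
        acts = self.forward(x_noisy)
        loss = float(np.mean((acts[-1] - x_clean) ** 2))
        delta = 2.0 * (acts[-1] - x_clean) / x_clean.size  # dMSE/dz_out
        for i in reversed(range(len(self.W))):
            grad_w = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:  # backprop through the ReLU below, pre-update weights
                delta = (delta @ self.W[i].T) * (acts[i] > 0)
            # Momentum update: v <- mu*v - lr*grad; w <- w + v
            self.vW[i] = self.momentum * self.vW[i] - self.lr * grad_w
            self.vb[i] = self.momentum * self.vb[i] - self.lr * grad_b
            self.W[i] += self.vW[i]
            self.b[i] += self.vb[i]
        return loss

# Toy run: denoise synthetic CAN-like features in [0, 1].
x_clean = rng.uniform(0.0, 1.0, size=(256, 16))
model = MomentumDAE(n_in=16)
losses = []
for _ in range(30):  # 30 epochs, as in Table 3
    x_noisy = np.clip(x_clean + rng.normal(0.0, 0.1, x_clean.shape), 0.0, 1.0)
    losses.append(model.train_step(x_noisy, x_clean))
```

The "adaptive momentum" schedule of the actual AM-DDAE is not reproduced here; the sketch keeps momentum fixed at its initial value to show only the basic training loop.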
Table 4. A representation of the retrieved data for the best and worst cases from Dataset 1, including various attack designs.
DoS Intrusion

Optimal Reconstructed Data
Original Value: 0.7529411, 0.7607843, 0.7647058, 0.7490196, 0.7568627
Reconstructed Value: 0.7529412, 0.7607844, 0.7647061, 0.7490192, 0.7568622
Reconstruction Error: 6.53 × 10−8, 8.48 × 10−8, 2.69 × 10−7, 3.86 × 10−7, 4.64 × 10−7
Error Ratio: 8.67 × 10−8, 1.12 × 10−7, 3.52 × 10−7, 5.15 × 10−7, 6.14 × 10−7

Poor Reconstructed Data
Original Value: 0.2705882, 0.3098039, 0.2784313, 0.2745098, 0.8470588
Reconstructed Value: 0.2652617, 0.3044519, 0.2730492, 0.2690989, 0.8386531
Reconstruction Error: 5.32 × 10−3, 5.35 × 10−3, 5.38 × 10−3, 5.41 × 10−3, 8.41 × 10−3
Error Ratio: 0.019685, 0.017275, 0.01933, 0.019711, 0.0099234

Reconstruction Error (μ ± σ) = (1.3695 ± 1.7307) × 10−4

Fuzzy Intrusion

Optimal Reconstructed Data
Original Value: 0.8470588, 0.8470588, 0.6784313, 0.9686274, 0.5647058
Reconstructed Value: 0.8470588, 0.8078431, 0.6784314, 0.9686275, 0.5647059
Reconstruction Error: 4.23 × 10−9, 1.34 × 10−8, 2.54 × 10−8, 3.20 × 10−8, 3.43 × 10−8
Error Ratio: 4.99 × 10−9, 1.65 × 10−8, 3.74 × 10−8, 3.31 × 10−8, 6.08 × 10−8

Poor Reconstructed Data
Original Value: 0.0274509, 0.9294117, 0.0274509, 0.9843137, 0.0549019
Reconstructed Value: 0.0338043, 0.9228175, 0.0342877, 0.9752079, 0.0646703
Reconstruction Error: 6.35 × 10−3, 6.59 × 10−3, 6.83 × 10−3, 9.10 × 10−3, 9.76 × 10−3
Error Ratio: 0.23144, 0.0070951, 0.24905, 0.0092509, 0.17792

Reconstruction Error (μ ± σ) = (1.0999 ± 0.6405) × 10−4

Impersonation Intrusion

Optimal Reconstructed Data
Original Value: 0.0509803, 0.0431372, 0.8549019, 0.0627450, 0.0078431
Reconstructed Value: 0.0509803, 0.0431373, 0.8549018, 0.0627452, 0.0078429
Reconstruction Error: 4.63 × 10−8, 6.80 × 10−8, 1.38 × 10−7, 1.48 × 10−7, 1.82 × 10−7
Error Ratio: 9.08 × 10−7, 1.57 × 10−6, 1.62 × 10−7, 2.36 × 10−6, 2.32 × 10−5

Poor Reconstructed Data
Original Value: 0.9882352, 0.9843137, 0.9294117, 0.9960784, 0.9921568
Reconstructed Value: 0.9837410, 0.9897851, 0.9352317, 0.9785670, 0.9713379
Reconstruction Error: 4.49 × 10−3, 5.47 × 10−3, 5.82 × 10−3, 1.75 × 10−2, 2.08 × 10−2
Error Ratio: 0.0045478, 0.0055587, 0.006262, 0.01758, 0.020983

Reconstruction Error (μ ± σ) = (1.4127 ± 1.9858) × 10−4
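The Error Ratio rows are consistent with dividing each absolute reconstruction error by the original value (for the first optimal DoS sample, 6.53 × 10−8 / 0.7529411 ≈ 8.67 × 10−8, the tabulated ratio). A short sketch of the two per-sample metrics as they appear in these tables:

```python
import numpy as np

def reconstruction_metrics(original, reconstructed):
    """Per-feature reconstruction error (RE) and error ratio (ER).

    RE = |x - x_hat|; ER = RE / |x|, matching the relation between
    the RE and ER rows of the tables.
    """
    x = np.asarray(original, dtype=np.float64)
    x_hat = np.asarray(reconstructed, dtype=np.float64)
    re = np.abs(x - x_hat)
    return re, re / np.abs(x)

# First optimal DoS sample from Table 4. The printed values are
# truncated to seven decimals, so RE here only approximates the
# tabulated 6.53e-8.
re, er = reconstruction_metrics([0.7529411], [0.7529412])
```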
Table 5. A representation of the retrieved data for the best and worst cases from Dataset 2, including various attack designs.
DoS Intrusion

Optimal Reconstructed Data
Original Value: 0.0901960, 0.9686274, 0.0705882, 0.9843137, 0.9686274
Reconstructed Value: 0.0901961, 0.9686273, 0.0705885, 0.9843134, 0.9686271
Reconstruction Error: 1.0 × 10−7, 1.40 × 10−7, 2.65 × 10−7, 2.65 × 10−7, 2.82 × 10−7
Error Ratio: 1.11 × 10−6, 1.44 × 10−7, 3.75 × 10−6, 2.69 × 10−7, 2.91 × 10−6

Poor Reconstructed Data
Original Value: 0.0627450, 0.0862745, 0.0941176, 0.1058823, 0.0823529
Reconstructed Value: 0.0779733, 0.1016976, 0.1106752, 0.1228766, 0.1009537
Reconstruction Error: 0.015228, 0.015423, 0.016558, 0.016994, 0.018601
Error Ratio: 0.2427, 0.17877, 0.17592, 0.1605, 0.22587

Reconstruction Error (μ ± σ) = (2.5156 ± 5.1545) × 10−4

Fuzzy Intrusion

Optimal Reconstructed Data
Original Value: 0.0941176, 0.1333333, 0.5882352, 0.9803921, 0.9490196
Reconstructed Value: 0.0941176, 0.1333333, 0.5882352, 0.9803921, 0.9490196
Reconstruction Error: 3.94 × 10−11, 5.34 × 10−11, 1.07 × 10−10, 1.40 × 10−10, 1.88 × 10−10
Error Ratio: 4.18 × 10−10, 4.0 × 10−10, 1.83 × 10−10, 1.43 × 10−10, 1.98 × 10−10

Poor Reconstructed Data
Original Value: 0.1529411, 0.0392156, 0.0274509, 0.0745098, 0.0470588
Reconstructed Value: 0.1556089, 0.0433465, 0.0327977, 0.0829002, 0.0592805
Reconstruction Error: 0.0026678, 0.0041309, 0.0053468, 0.0083904, 0.012222
Error Ratio: 0.017443, 0.10534, 0.19478, 0.11261, 0.25971

Reconstruction Error (μ ± σ) = 3.9921 × 10−6 ± 4.0126 × 10−5

Spoofing (Gear) Intrusion

Optimal Reconstructed Data
Original Value: 0.0627450, 0.0352941, 0.1254901, 0.0549019, 0.0274509
Reconstructed Value: 0.0627450, 0.0352941, 0.1254902, 0.0549021, 0.0274511
Reconstruction Error: 1.34 × 10−10, 2.48 × 10−8, 7.49 × 10−8, 1.50 × 10−7, 1.67 × 10−7
Error Ratio: 2.14 × 10−9, 7.03 × 10−7, 5.97 × 10−7, 2.74 × 10−6, 6.11 × 10−6

Poor Reconstructed Data
Original Value: 0.9921568, 0.9960784, 0.8980392, 0.9647058, 0.9686274
Reconstructed Value: 0.9977037, 1.0017604, 0.9037346, 0.9707119, 0.9600952
Reconstruction Error: 0.0055469, 0.005682, 0.0056954, 0.0060061, 0.0085322
Error Ratio: 0.0055907, 0.0057044, 0.006342, 0.0062258, 0.0088085

Reconstruction Error (μ ± σ) = (1.6033 ± 1.4613) × 10−4

Spoofing (RPM) Intrusion

Optimal Reconstructed Data
Original Value: 0.0431372, 0.0352941, 0.1254901, 0.0980392, 0.0745098
Reconstructed Value: 0.0431373, 0.0352941, 0.1254902, 0.0980392, 0.0745097
Reconstruction Error: 4.84 × 10−8, 5.20 × 10−8, 5.45 × 10−8, 6.95 × 10−8, 7.67 × 10−8
Error Ratio: 1.12 × 10−6, 1.47 × 10−6, 4.34 × 10−7, 7.09 × 10−7, 1.03 × 10−6

Poor Reconstructed Data
Original Value: 0.0705882, 0.9490196, 0.9843137, 0.9333333, 1.0
Reconstructed Value: 0.0663771, 0.9446970, 0.9793825, 0.9333333, 0.9946695
Reconstruction Error: 0.0042111, 0.0043225, 0.0049312, 0.0051985, 0.0053304
Error Ratio: 0.059657, 0.0045547, 0.0050098, 0.0055698, 0.0053304

Reconstruction Error (μ ± σ) = (1.0828 ± 1.3339) × 10−4
Table 6. A representation of the retrieved data for the best and worst cases from Dataset 3, including various attack designs.
Flooding Intrusion

Optimal Reconstructed Data
Original Value: 0.9529411, 0.9568627, 0.9294117, 0.8470588, 0.9411764
Reconstructed Value: 0.9529411, 0.9568627, 0.9294117, 0.8470588, 0.9411764
Reconstruction Error: 1.6 × 10−12, 1.9 × 10−12, 2.0 × 10−12, 6.2 × 10−12, 8.7 × 10−12
Error Ratio: 1.7 × 10−12, 2.0 × 10−12, 2.1 × 10−12, 7.3 × 10−12, 9.2 × 10−12

Poor Reconstructed Data
Original Value: 0.9686274, 0.0196078, 0.0039215, 0.0235294, 0.9960784
Reconstructed Value: 0.9686236, 0.0196120, 0.0039340, 0.0235914, 0.9944053
Reconstruction Error: 3.78 × 10−6, 4.17 × 10−6, 1.25 × 10−5, 6.20 × 10−5, 1.67 × 10−3
Error Ratio: 3.91 × 10−6, 0.00021285, 0.0031925, 0.0026376, 0.0016797

Reconstruction Error (μ ± σ) = 2.871 × 10−7 ± 1.2391 × 10−5

Fuzzy Intrusion

Optimal Reconstructed Data
Original Value: 0.6078431, 0.8470588, 0.6901960, 0.6274509, 0.8666666
Reconstructed Value: 0.6078431, 0.8470588, 0.6901960, 0.6274509, 0.8666666
Reconstruction Error: 1.0 × 10−11, 2.2 × 10−11, 3.4 × 10−11, 6.0 × 10−11, 6.4 × 10−11
Error Ratio: 1.7 × 10−11, 2.6 × 10−11, 4.9 × 10−11, 9.6 × 10−11, 7.3 × 10−11

Poor Reconstructed Data
Original Value: 0.0235294, 0.1254901, 0.0784313, 0.0117647, 0.0235294
Reconstructed Value: 0.0179177, 0.1313583, 0.0864377, 0.0211943, 0.0363473
Reconstruction Error: 0.012818, 0.0094297, 0.0080064, 0.0058681, 0.0056116
Error Ratio: 0.54476, 0.80152, 0.10208, 0.046762, 0.23849

Reconstruction Error (μ ± σ) = 1.1844 × 10−6 ± 9.7876 × 10−5

Malfunction Intrusion

Optimal Reconstructed Data
Original Value: 0.9215686, 0.8705882, 0.3294117, 0.7647058, 0.3764705
Reconstructed Value: 0.9215686, 0.8705882, 0.3294117, 0.7647058, 0.3764705
Reconstruction Error: 1.9 × 10−11, 1.1 × 10−10, 1.9 × 10−11, 2.1 × 10−10, 2.2 × 10−10
Error Ratio: 2.1 × 10−11, 1.3 × 10−10, 5.9 × 10−11, 2.8 × 10−10, 5.8 × 10−11

Poor Reconstructed Data
Original Value: 0.9921568, 0.1333333, 0.8470588, 0.0078431, 0.0705882
Reconstructed Value: 0.9911699, 0.1343709, 0.8456669, 0.0110356, 0.0952084
Reconstruction Error: 0.00098688, 0.0010376, 0.0013919, 0.0031925, 0.02462
Error Ratio: 0.00099468, 0.0077824, 0.0016432, 0.40705, 0.34879

Reconstruction Error (μ ± σ) = 8.9644 × 10−7 ± 1.1129 × 10−4

Replay Intrusion

Optimal Reconstructed Data
Original Value: 0.0313725, 0.0627450, 0.0196078, 0.2705882, 0.0352941
Reconstructed Value: 0.0313724, 0.0627452, 0.0196077, 0.2705880, 0.0352939
Reconstruction Error: 8.47 × 10−8, 1.08 × 10−7, 1.12 × 10−7, 1.63 × 10−7, 1.78 × 10−7
Error Ratio: 2.70 × 10−6, 1.73 × 10−6, 5.75 × 10−7, 6.04 × 10−7, 5.07 × 10−6

Poor Reconstructed Data
Original Value: 0.9921568, 0.0431372, 0.9960784, 0.0745098, 0.0156862
Reconstructed Value: 0.9980645, 0.0491685, 1.0028459, 0.0940016, 0.0395505
Reconstruction Error: 5.90 × 10−3, 6.03 × 10−3, 6.76 × 10−3, 1.94 × 10−2, 2.38 × 10−2
Error Ratio: 0.0059544, 0.13982, 0.0067941, 0.2616, 1.5213

Reconstruction Error (μ ± σ) = (2.8144 ± 2.9713) × 10−4
Table 7. A representation of the retrieved data for the best and worst cases from Dataset 4, including various attack designs.
DoS Intrusion

Optimal Reconstructed Data
Original Value: 0.0313725, 0.0039215, 0.0274509, 0.3294117, 0.0352941
Reconstructed Value: 0.0313739, 0.0039248, 0.0274466, 0.3294051, 0.0353012
Reconstruction Error: 1.43 × 10−6, 3.32 × 10−6, 4.28 × 10−6, 6.63 × 10−6, 7.15 × 10−6
Error Ratio: 4.57 × 10−5, 8.47 × 10−4, 1.55 × 10−4, 2.01 × 10−5, 2.02 × 10−4

Poor Reconstructed Data
Original Value: 0.0117647, 0.0039215, 0.9921568, 0.0235294, 0.9843137
Reconstructed Value: 0.0141285, 0.0011375, 0.9954768, 0.0298712, 0.9959440
Reconstruction Error: 0.0023639, 0.002784, 0.00332, 0.0063418, 0.01163
Error Ratio: 0.20093, 0.70993, 0.0033462, 0.26953, 0.011816

Reconstruction Error (μ ± σ) = (1.2052 ± 2.9518) × 10−4

Fuzzing Intrusion

Optimal Reconstructed Data
Original Value: 0.0392156, 0.0196078, 0.3647058, 0.0980392, 0.4274509
Reconstructed Value: 0.0392156, 0.0196078, 0.3647058, 0.0980392, 0.4274510
Reconstruction Error: 3.33 × 10−9, 3.71 × 10−9, 8.45 × 10−9, 1.32 × 10−8, 3.09 × 10−8
Error Ratio: 8.51 × 10−8, 1.89 × 10−7, 2.31 × 10−8, 1.35 × 10−7, 7.23 × 10−8

Poor Reconstructed Data
Original Value: 0.1058823, 0.0705882, 0.0862745, 0.0509803, 0.0431372
Reconstructed Value: 0.0995527, 0.0771858, 0.0959314, 0.0608457, 0.0280398
Reconstruction Error: 0.0063297, 0.0065976, 0.0096569, 0.0098653, 0.015097
Error Ratio: 0.05978, 0.093466, 0.11193, 0.19351, 0.34999

Reconstruction Error (μ ± σ) = 2.6575 × 10−5 ± 1.1231 × 10−4
Table 8. A representation of the retrieved data for the best and worst cases from Dataset 5, including various attack designs.
DoS Intrusion

Optimal Reconstructed Data
Original Value: 0.2, 0.2078431, 0.2666666, 0.7215686, 0.6352941
Reconstructed Value: 0.1999999, 0.2078431, 0.2666666, 0.7215686, 0.6352941
Reconstruction Error: 1.5 × 10−11, 1.2 × 10−10, 6.3 × 10−10, 1.13 × 10−9, 1.55 × 10−9
Error Ratio: 7.3 × 10−11, 5.6 × 10−10, 2.36 × 10−9, 1.56 × 10−9, 2.44 × 10−9

Poor Reconstructed Data
Original Value: 0.1137254, 0.01568627, 0.0509803, 0.4980392, 0.2509803
Reconstructed Value: 0.1161385, 0.0183532, 0.0562650, 0.4920709, 0.2346066
Reconstruction Error: 0.002413, 0.002667, 0.0052846, 0.0059682, 0.016374
Error Ratio: 0.021218, 0.17002, 0.10366, 0.011983, 0.065239

Reconstruction Error (μ ± σ) = 9.9053 × 10−6 ± 4.4062 × 10−5

Fuzzing Intrusion

Optimal Reconstructed Data
Original Value: 0.0156862, 0.4941176, 0.0431372, 0.0392156, 0.2549019
Reconstructed Value: 0.0156862, 0.4941176, 0.0431372, 0.0392156, 0.2549019
Reconstruction Error: 7.8 × 10−10, 1.50 × 10−9, 2.56 × 10−9, 3.19 × 10−9, 3.76 × 10−9
Error Ratio: 5.0 × 10−8, 3.04 × 10−9, 5.93 × 10−8, 8.15 × 10−8, 1.47 × 10−8

Poor Reconstructed Data
Original Value: 0.0039215, 0.0117647, 0.0039215, 0.01568627, 0.0274509
Reconstructed Value: 0.0086751, 0.0170816, 0.0100046, 0.0221609, 0.0171910
Reconstruction Error: 0.0047536, 0.0053169, 0.0060831, 0.0064746, 0.01026
Error Ratio: 1.2122, 0.45194, 1.5512, 0.41276, 0.37375

Reconstruction Error (μ ± σ) = (1.5741 ± 5.9155) × 10−5
Table 9. A representation of the retrieved data for the best and worst cases from Dataset 6, including various attack designs.
Flooding Intrusion

Optimal Reconstructed Data
Original Value: 0.9372549, 0.3960784, 0.40, 0.8313725, 0.6117647
Reconstructed Value: 0.9372549, 0.3960784, 0.3999999, 0.8313725, 0.6117647
Reconstruction Error: 4.66 × 10−11, 3.50 × 10−10, 3.75 × 10−10, 9.26 × 10−10, 2.66 × 10−9
Error Ratio: 4.97 × 10−11, 8.83 × 10−10, 9.39 × 10−10, 1.11 × 10−9, 4.35 × 10−9

Poor Reconstructed Data
Original Value: 0.1411764, 0.1411764, 0.0666666, 0.3725490, 0.8509803
Reconstructed Value: 0.0271798, 0.1466601, 0.0731802, 0.3830386, 0.8329448
Reconstruction Error: 0.0036504, 0.0054836, 0.0065136, 0.01049, 0.018036
Error Ratio: 0.15514, 0.038842, 0.097704, 0.028156, 0.021194

Reconstruction Error (μ ± σ) = (4.9234 ± 7.7279) × 10−5

Fuzzing Intrusion

Optimal Reconstructed Data
Original Value: 0.2039215, 0.2274509, 0.5960784, 0.3411764, 0.1019607
Reconstructed Value: 0.2039215, 0.2274509, 0.5960784, 0.3411764, 0.1019607
Reconstruction Error: 5.43 × 10−9, 6.24 × 10−9, 6.52 × 10−9, 7.03 × 10−9, 8.65 × 10−9
Error Ratio: 2.66 × 10−8, 2.74 × 10−8, 1.09 × 10−8, 2.06 × 10−8, 8.48 × 10−8

Poor Reconstructed Data
Original Value: 0.0078431, 0.0980392, 0.0470588, 0.0313725, 0.9490196
Reconstructed Value: 0.0155906, 0.1067975, 0.0652327, 0.05395768, 0.9110684
Reconstruction Error: 0.0077475, 0.0087583, 0.018174, 0.022585, 0.037951
Error Ratio: 0.98781, 0.089335, 0.3862, 0.7199, 0.03999

Reconstruction Error (μ ± σ) = (5.4577 ± 1.7709) × 10−4

Malfunction Intrusion

Optimal Reconstructed Data
Original Value: 0.3372549, 0.5921568, 0.2705882, 0.1176470, 0.7450980
Reconstructed Value: 0.3372549, 0.5921568, 0.2705882, 0.1176470, 0.7450980
Reconstruction Error: 3.23 × 10−10, 4.2 × 10−10, 5.9 × 10−10, 1.12 × 10−9, 1.24 × 10−9
Error Ratio: 9.6 × 10−10, 7.1 × 10−10, 2.18 × 10−9, 9.53 × 10−9, 1.66 × 10−9

Poor Reconstructed Data
Original Value: 0.0235294, 0.0666666, 0.0039215, 0.0078431, 0.8196078
Reconstructed Value: 0.0261796, 0.0696458, 0.0077227, 0.0144744, 0.7989199
Reconstruction Error: 0.0026502, 0.0029792, 0.0038012, 0.0066313, 0.020688
Error Ratio: 0.11263, 0.044688, 0.96929, 0.84549, 0.025241

Reconstruction Error (μ ± σ) = (8.7937 ± 4.3163) × 10−5
Table 10. Percentage reconstruction error in data reconstruction by the proposed model.
| Dataset | Intrusion | Percentage RE (%) |
|---|---|---|
| 1 | DoS | 7.5182 × 10−5 |
| 1 | Fuzzy | 7.1871 × 10−5 |
| 1 | Impersonation | 4.3145 × 10−5 |
| 2 | DoS | 4.267 × 10−5 |
| 2 | Fuzzy | 9.0666 × 10−5 |
| 2 | Spoofing (Gear) | 3.9214 × 10−5 |
| 2 | Spoofing (RPM) | 2.2653 × 10−5 |
| 3 | Flooding | 1.3271 × 10−5 |
| 3 | Fuzzy | 4.6938 × 10−5 |
| 3 | Malfunction | 9.0317 × 10−5 |
| 3 | Replay | 1.6254 × 10−5 |
| 4 | DoS | 2.1384 × 10−5 |
| 4 | Fuzzing | 2.0139 × 10−5 |
| 5 | DoS | 2.0518 × 10−5 |
| 5 | Fuzzing | 3.7491 × 10−5 |
| 6 | Flooding | 1.4678 × 10−5 |
| 6 | Fuzzing | 1.7889 × 10−5 |
| 6 | Malfunction | 7.5370 × 10−5 |
Table 11. Execution and amortized time by the proposed model for different attack designs under study.
| Dataset | Intrusion | Execution Time (s) | Amortized over 50 Uses (s) | Over 100 Uses (s) | Over 200 Uses (s) |
|---|---|---|---|---|---|
| 1 | DoS | 0.128403 | 0.002568 | 0.001284 | 0.000642 |
| 1 | Fuzzy | 0.118190 | 0.002364 | 0.001182 | 0.000591 |
| 1 | Impersonation | 0.136582 | 0.002732 | 0.001366 | 0.000683 |
| 2 | DoS | 0.119429 | 0.002389 | 0.001194 | 0.000597 |
| 2 | Fuzzy | 0.140763 | 0.002815 | 0.001408 | 0.000704 |
| 2 | Spoofing (Gear) | 0.085770 | 0.001715 | 0.000858 | 0.000429 |
| 2 | Spoofing (RPM) | 0.115697 | 0.002314 | 0.001157 | 0.000578 |
| 3 | Flooding | 0.106050 | 0.002121 | 0.001061 | 0.000530 |
| 3 | Fuzzy | 0.118164 | 0.002363 | 0.001182 | 0.000591 |
| 3 | Malfunction | 0.151347 | 0.003027 | 0.001513 | 0.000757 |
| 3 | Replay | 0.117029 | 0.002341 | 0.001170 | 0.000585 |
| 4 | DoS | 0.207910 | 0.004158 | 0.002079 | 0.001040 |
| 4 | Fuzzing | 0.132068 | 0.002641 | 0.001321 | 0.000660 |
| 5 | DoS | 0.094069 | 0.001881 | 0.000941 | 0.000470 |
| 5 | Fuzzing | 0.148530 | 0.002971 | 0.001485 | 0.000743 |
| 6 | Flooding | 0.145511 | 0.002910 | 0.001455 | 0.000728 |
| 6 | Fuzzing | 0.273372 | 0.005467 | 0.002734 | 0.001367 |
| 6 | Malfunction | 0.280700 | 0.005614 | 0.002807 | 0.001404 |
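The amortized figures in Table 11 are consistent with dividing one recovery run's execution time by the estimated number of uses over which it is amortized (e.g., 0.128403 s / 50 ≈ 0.002568 s). A one-line sketch of that calculation:

```python
def amortized_time(execution_time_s, estimated_usage):
    """Per-invocation cost when one recovery run is spread over
    `estimated_usage` uses, as tabulated in Table 11."""
    return execution_time_s / estimated_usage

# Dataset 1, DoS row: 0.128403 s amortized over 50, 100, and 200 uses.
row = {u: round(amortized_time(0.128403, u), 6) for u in (50, 100, 200)}
```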
Table 12. Representation of original data reconstruction by AM-DDAE tested on new unseen data and attack type.
Unseen Exploits Intrusion

Best Reconstructed Data
Original Value: 0.058823529, 0.019607843, 0.039215686, 0.3333348715754, 0.352941176
Reconstructed Value: 0.058823592572301, 0.019608034694900, 0.039215225788863, 0.333334871575410, 0.352949442591866
Reconstruction Error: 6.3572 × 10−8, 1.9169 × 10−7, 4.6021 × 10−7, 1.5386 × 10−6, 8.266 × 10−6
Error Ratio: 1.0807 × 10−6, 9.7764 × 10−6, 1.1735 × 10−5, 4.6157 × 10−6, 2.3422 × 10−5

Worst Reconstructed Data
Original Value: 0.176470588, 0.294117647, 0.274509804, 0.078431373, 0.901960784
Reconstructed Value: 0.164413727257618, 0.281943574129103, 0.262118984920315, 0.092223280780675, 0.883810436014474
Reconstruction Error: 0.012057, 0.012174, 0.012391, 0.013792, 0.01815
Error Ratio: 0.068322, 0.041392, 0.045138, 0.17585, 0.020123

Reconstruction Error (μ ± σ) = 7.9501 × 10−4 ± 4.8 × 10−2
Table 13. Performance comparison between known Seen and new Unseen Data.
| Parameter | Seen Data | Unseen Data |
|---|---|---|
| Mean Reconstruction Error | <1% | <1% |
| Execution Time (s) | 0.145532 | 0.058369 |
Table 14. A representation of the retrieved data for the best and worst cases for Adversarial Machine Learning Attack.
Adversarial Machine Learning Attack

Best Reconstructed Data
Original Value: 0.051755242375452, 0.010063551905973, 0.013296405755303, 0.010571772791250, 0.005831583183491
Reconstructed Value: 0.051755242377456, 0.010063551430136, 0.013296406299568, 0.013296406299568, 0.005831583896953
Reconstruction Error: 2.0032 × 10−12, 4.7584 × 10−10, 5.4427 × 10−10, 6.2462 × 10−10, 7.1346 × 10−10
Error Ratio: 3.8705 × 10−11, 4.7283 × 10−8, 4.0933 × 10−8, 5.9083 × 10−8, 1.2234 × 10−7

Worst Reconstructed Data
Original Value: 0.904889486417633, 0.921277743274121, 0.951833868463914, 0.048913922043000, 0.900990803938693
Reconstructed Value: 0.903207826749840, 0.919451094624141, 0.949769494945811, 0.051268595843600, 0.897338065778924
Reconstruction Error: 0.0016817, 0.0018266, 0.0020644, 0.0023547, 0.0036527
Error Ratio: 0.0018584, 0.0019827, 0.0021688, 0.048139, 0.0040541

Reconstruction Error (μ ± σ) = (4.2401 ± 7.1352) × 10−5
Table 15. Performance comparison of the proposed model with GANs (Mean RE).
| Dataset | Attack Type | Mean RE: AM-DDAE (Proposed) | Mean RE: GANs | Difference |
|---|---|---|---|---|
| 1 | DoS | 0.00013695 | 0.1024 | 0.102263 |
| 1 | Fuzzy | 0.00010999 | 0.1812 | 0.18109 |
| 1 | Impersonation | 0.00014127 | 0.1664 | 0.166259 |
| 2 | DoS | 0.00025156 | 0.1473 | 0.147048 |
| 2 | Fuzzy | 0.000039921 | 0.2023 | 0.20226 |
| 2 | Spoofing (Gear) | 0.00016033 | 0.1867 | 0.18654 |
| 2 | Spoofing (RPM) | 0.00010828 | 0.1664 | 0.166292 |
| 3 | Flooding | 0.00002871 | 0.1820 | 0.181971 |
| 3 | Fuzzy | 0.000011844 | 0.1792 | 0.179188 |
| 3 | Malfunction | 0.00089644 | 0.1498 | 0.148904 |
| 3 | Replay | 0.00028144 | 0.1862 | 0.185919 |
| 4 | DoS | 0.00012052 | 0.1197 | 0.119579 |
| 4 | Fuzzing | 0.00026575 | 0.1330 | 0.132734 |
| 5 | DoS | 0.000099053 | 0.1649 | 0.164801 |
| 5 | Fuzzing | 0.000015741 | 0.1485 | 0.148484 |
| 6 | Flooding | 0.000049234 | 0.1672 | 0.167151 |
| 6 | Fuzzing | 0.00054577 | 0.1656 | 0.165054 |
| 6 | Malfunction | 0.000087937 | 0.1894 | 0.189312 |
Table 16. Performance comparison of the proposed model with GANs (Execution T).
| Dataset | Attack Type | Execution Time: AM-DDAE (Proposed) (s) | Execution Time: GANs (s) | Difference (s) |
|---|---|---|---|---|
| 1 | DoS | 0.128403 | 16.844 | 16.7156 |
| 1 | Fuzzy | 0.118190 | 13.786 | 13.66781 |
| 1 | Impersonation | 0.136582 | 11.081 | 10.94442 |
| 2 | DoS | 0.119429 | 11.191 | 11.07157 |
| 2 | Fuzzy | 0.140763 | 11.012 | 10.87124 |
| 2 | Spoofing (Gear) | 0.085770 | 11.208 | 11.12223 |
| 2 | Spoofing (RPM) | 0.115697 | 14.445 | 14.3293 |
| 3 | Flooding | 0.106050 | 10.743 | 10.63695 |
| 3 | Fuzzy | 0.118164 | 10.913 | 10.79484 |
| 3 | Malfunction | 0.151347 | 10.985 | 10.83365 |
| 3 | Replay | 0.117029 | 12.500 | 12.38297 |
| 4 | DoS | 0.207910 | 10.985 | 10.77709 |
| 4 | Fuzzing | 0.132068 | 12.382 | 12.24993 |
| 5 | DoS | 0.094069 | 10.717 | 10.62293 |
| 5 | Fuzzing | 0.148530 | 10.816 | 10.66747 |
| 6 | Flooding | 0.145511 | 10.690 | 10.54449 |
| 6 | Fuzzing | 0.273372 | 11.038 | 10.76463 |
| 6 | Malfunction | 0.280700 | 10.877 | 10.5963 |
Table 17. Representation of original data reconstruction by GANs for Dataset 1.
DoS Intrusion

Best Reconstructed Data
Original Value: 0.003921569, 0.019607843, 0.011811024, 0.015686275, 0.007843137
Reconstructed Value: 0.0039357, 0.0196396, 0.0117115, 0.0155016, 0.0074881
Reconstruction Error: 0.000014093, 0.000031763, 0.000099567, 0.00018467, 0.00035506
Error Ratio: 0.0035936, 0.0016199, 0.0084, 0.011773, 0.04527

Worst Reconstructed Data
Original Value: 0.984313725, 0.976470588, 0.976377953, 0.980392157, 0.996078431
Reconstructed Value: 0.0118349, 0.0037626, 0.0025530, 0.0039900, 0.0050469
Reconstruction Error: 0.97248, 0.97271, 0.97382, 0.9764, 0.99103
Error Ratio: 0.98798, 0.99615, 0.99739, 0.99593, 0.99493

Reconstruction Error (μ ± σ) = 0.1024 ± 0.1356

Fuzzy Intrusion

Best Reconstructed Data
Original Value: 0.019607843, 0.615686275, 0.298039216, 0.31372549, 0.635294118
Reconstructed Value: 0.0196316, 0.6157715, 0.2977831, 0.3134525, 0.6355679
Reconstruction Error: 0.000023804, 0.000085197, 0.00025608, 0.00027301, 0.00027379
Error Ratio: 0.001214, 0.00013838, 0.00085921, 0.00087021, 0.00043096

Worst Reconstructed Data
Original Value: 0.815686275, 0.964705882, 0.878431373, 0.803921569, 0.894117647
Reconstructed Value: 0.1180484, 0.2283716, 0.1354989, 0.0301402, 0.0755604
Reconstruction Error: 0.69764, 0.73633, 0.74293, 0.77378, 0.81856
Error Ratio: 0.85528, 0.76327, 0.84575, 0.96251, 0.91549

Reconstruction Error (μ ± σ) = 0.1812 ± 0.0665

Impersonation Intrusion

Best Reconstructed Data
Original Value: 0.049107143, 0.062745098, 0.040178571, 0.070588235, 0.074509804
Reconstructed Value: 0.0490225, 0.0628323, 0.0403882, 0.0708314, 0.0747580
Reconstruction Error: 0.000084636, 0.000087153, 0.00020959, 0.00024314, 0.00024823
Error Ratio: 0.0017235, 0.001389, 0.0052165, 0.0034445, 0.0033316

Worst Reconstructed Data
Original Value: 0.928571429, 0.976470588, 0.925490196, 0.915178571, 0.988235294
Reconstructed Value: 0.0540381, 0.0994046, 0.0355721, 0.0195887, 0.0857598
Reconstruction Error: 0.87453, 0.87707, 0.88992, 0.89559, 0.90248
Error Ratio: 0.94181, 0.8982, 0.96156, 0.9786, 0.91322

Reconstruction Error (μ ± σ) = 0.1664 ± 0.0695
Table 18. Representation of original data reconstruction by GANs for Dataset 2.
DoS Intrusion

Best Reconstructed Data
Original Value: 0.007843137, 0.04705882, 0.004784689, 0.019607843, 0.003968254
Reconstructed Value: 0.0075861, 0.0465696, 0.0037233, 0.0183325, 0.0024771
Reconstruction Error: 0.00025706, 0.00048924, 0.0010614, 0.0012754, 0.0014912
Error Ratio: 0.032775, 0.010396, 0.22182, 0.065043, 0.37578

Worst Reconstructed Data
Original Value: 0.968253968, 0.984313725, 0.979057592, 0.988235294, 0.996078431
Reconstructed Value: 0.0089633, 0.0064030, 0.000980555, 0.0061639, 0.0012052
Reconstruction Error: 0.95929, 0.97791, 0.97808, 0.98207, 0.99487
Error Ratio: 0.99074, 0.99349, 0.999, 0.99376, 0.99879

Reconstruction Error (μ ± σ) = 0.1473 ± 0.1349

Fuzzy Intrusion

Best Reconstructed Data
Original Value: 0.117647059, 0.11372549, 0.035294118, 0.066666667, 0.074509804
Reconstructed Value: 0.1176597, 0.1137034, 0.0352124, 0.0665476, 0.0746459
Reconstruction Error: 1.2592 × 10−5, 2.2083 × 10−5, 8.1761 × 10−5, 1.191 × 10−4, 1.3612 × 10−4
Error Ratio: 0.00010703, 0.00019418, 0.0023166, 0.0017865, 0.0018268

Worst Reconstructed Data
Original Value: 0.980392157, 0.949019608, 0.929411765, 0.945098039, 0.976470588
Reconstructed Value: 0.1312901, 0.0984343, 0.0726554, 0.0552933, 0.0850940
Reconstruction Error: 0.8491, 0.85059, 0.85676, 0.8898, 0.89138
Error Ratio: 0.86608, 0.89628, 0.92183, 0.94149, 0.91286

Reconstruction Error (μ ± σ) = 0.2023 ± 0.0922

Spoofing (Gear) Intrusion

Best Reconstructed Data
Original Value: 0.004784689, 0.019607843, 0.178010471, 0.004784689, 0.062745098
Reconstructed Value: 0.0046932, 0.0197156, 0.1781214, 0.0045282, 0.0623568
Reconstruction Error: 9.1474 × 10−5, 1.0777 × 10−4, 1.1096 × 10−4, 2.5652 × 10−4, 3.8834 × 10−4
Error Ratio: 0.019118, 0.0054963, 0.00062334, 0.053613, 0.0061891

Worst Reconstructed Data
Original Value: 0.91372549, 0.929411765, 0.964705882, 0.976470588, 0.945098039
Reconstructed Value: 0.0092011, 0.0049451, 0.0322466, 0.0401472, 0.0066587
Reconstruction Error: 0.90452, 0.92447, 0.93246, 0.93632, 0.93844
Error Ratio: 0.98993, 0.99468, 0.96657, 0.95889, 0.99295

Reconstruction Error (μ ± σ) = 0.1867 ± 0.0849

Spoofing (RPM) Intrusion

Best Reconstructed Data
Original Value:
Reconstructed Value:
Reconstruction Error: 0.000051594, 0.00006444, 0.00017554, 0.00019388, 0.00023951
Error Ratio: 0.00036546, 0.000073032, 0.0013989, 0.00618, 0.0016965

Worst Reconstructed Data
Original Value: 0.980392157, 0.956862745, 0.984313725, 0.988235294, 0.996078431
Reconstructed Value: 0.1335045, 0.1035433, 0.1278326, 0.0753057, 0.0731756
Reconstruction Error: 0.84689, 0.85332, 0.85648, 0.91293, 0.9229
Error Ratio: 0.86383, 0.89179, 0.87013, 0.9238, 0.92654

Reconstruction Error (μ ± σ) = 0.1664 ± 0.0739
Table 19. Representation of original data reconstruction by GANs for Dataset 3.
Flooding Intrusion

Best Reconstructed Data
Original Value: 0.035294118, 0.090196078, 0.11372549, 0.031372549, 0.058823529
Reconstructed Value: 0.0352911, 0.0903348, 0.1132890, 0.0318339, 0.0593143
Reconstruction Error: 2.979 × 10−6, 0.00013872, 0.00043649, 0.00046135, 0.00049074
Error Ratio: 8.4404 × 10−5, 0.001538, 0.0038381, 0.014706, 0.0083427

Worst Reconstructed Data
Original Value: 0.976470588, 0.94488189, 0.925490196, 0.878431373, 0.933333333
Reconstructed Value: 0.1155326, 0.0826848, 0.0591295, 0.0064016, 0.0128098
Reconstruction Error: 0.86094, 0.8622, 0.86636, 0.87203, 0.92052
Error Ratio: 0.88168, 0.91249, 0.93611, 0.99271, 0.98628

Reconstruction Error (μ ± σ) = 0.1820 ± 0.0704

Fuzzy Intrusion

Best Reconstructed Data
Original Value: 0.11372549, 0.235294118, 0.011764706, 0.11372549, 0.074509804
Reconstructed Value: 0.1138461, 0.2354272, 0.0119260, 0.1135059, 0.0747396
Reconstruction Error: 0.00012063, 0.00013307, 0.00016134, 0.00021959, 0.00022975
Error Ratio: 0.0010607, 0.00056554, 0.013714, 0.0019309, 0.0030835

Worst Reconstructed Data
Original Value: 0.925490196, 0.921568627, 0.937254902, 0.97254902, 0.956862745
Reconstructed Value: 0.0305022, 0.0160121, 0.0173264, 0.0422058, 0.0169412
Reconstruction Error: 0.89499, 0.90556, 0.91993, 0.93034, 0.93992
Error Ratio: 0.96704, 0.98263, 0.98151, 0.9566, 0.9823

Reconstruction Error (μ ± σ) = 0.1792 ± 0.0819

Malfunction Intrusion

Best Reconstructed Data
Original Value: 0.003921569, 0.011764706, 0.047058824, 0.023529412, 0.007843137
Reconstructed Value: 0.0039237, 0.0117493, 0.0469612, 0.0233983, 0.0076827
Reconstruction Error: 2.1172 × 10−6, 1.545 × 10−5, 9.7598 × 10−5, 1.3112 × 10−4, 1.6041 × 10−4
Error Ratio: 0.00053989, 0.0013132, 0.002074, 0.0055727, 0.020453

Worst Reconstructed Data
Original Value: 0.980392157, 0.984313725, 0.988235294, 0.992156863, 0.996078431
Reconstructed Value: 0.0053127, 0.0014639, 0.00021569, 0.000684863, 0.000068583
Reconstruction Error: 0.97508, 0.98285, 0.98802, 0.99147, 0.99601
Error Ratio: 0.99458, 0.99851, 0.99978, 0.99931, 0.99993

Reconstruction Error (μ ± σ) = 0.1498 ± 0.1219

Replay Intrusion

Best Reconstructed Data
Original Value: 0.078431373, 0.066666667, 0.145098039, 0.10980392, 0.188235294
Reconstructed Value: 0.0784392, 0.0666569, 0.1451117, 0.1098243, 0.1882974
Reconstruction Error: 7.8329 × 10−6, 9.7348 × 10−6, 1.3626 × 10−5, 2.0415 × 10−5, 6.2127 × 10−5
Error Ratio: 9.9869 × 10−5, 1.4602 × 10−4, 9.391 × 10−5, 1.8592 × 10−4, 3.3005 × 10−4

Worst Reconstructed Data
Original Value: 0.9372549, 0.97254902, 0.956862745, 0.968253968, 0.988235294
Reconstructed Value: 0.0148576, 0.0498330, 0.0227760, 0.0302223, 0.0374357
Reconstruction Error: 0.9224, 0.92272, 0.93409, 0.93803, 0.9508
Error Ratio: 0.98415, 0.94876, 0.9762, 0.96879, 0.96212

Reconstruction Error (μ ± σ) = 0.1862 ± 0.0866
Table 20. Representation of original data reconstruction by GANs for Dataset 4.
DoS Intrusion

Best Reconstructed Data
Original Value: 0.003921569, 0.007843137, 0.023529, 0.015686275, 0.011764706
Reconstructed Value: 0.0038629, 0.0073847, 0.023081, 0.0148526, 0.0133527
Reconstruction Error: 0.000058637, 0.00045848, 0.00044805, 0.00083371, 0.001588
Error Ratio: 0.014952, 0.058456, 0.019042, 0.053149, 0.13498

Worst Reconstructed Data
Original Value: 0.93333333, 0.941176471, 0.97254902, 0.980392157, 0.992156863
Reconstructed Value: 0.0038465, 0.000056349585, 0.0053684, 0.0025628, 0.00071916
Reconstruction Error: 0.92949, 0.94112, 0.95106, 0.97783, 0.99144
Error Ratio: 0.99588, 0.99994, 0.9779, 0.99739, 0.99928

Reconstruction Error (μ ± σ) = 0.1197 ± 0.2124

Fuzzing Intrusion

Best Reconstructed Data
Original Value: 0.015686275, 0.066666667, 0.011764706, 0.078431373, 0.039215686
Reconstructed Value: 0.0157229, 0.0665290, 0.0119625, 0.0786558, 0.0396165
Reconstruction Error: 0.000036624, 0.00013769, 0.0001978, 0.00022444, 0.00040083
Error Ratio: 0.0023348, 0.0020654, 0.016813, 0.0028617, 0.010221

Worst Reconstructed Data
Original Value: 0.984313725, 0.925490196, 0.980392157, 0.941176471, 0.949019608
Reconstructed Value: 0.0874406, 0.0202050, 0.0626758, 0.0117404, 0.0030840
Reconstruction Error: 0.89687, 0.90529, 0.91772, 0.92944, 0.94594
Error Ratio: 0.91117, 0.97817, 0.93607, 0.98753, 0.99675

Reconstruction Error (μ ± σ) = 0.1330 ± 0.1266
Table 21. Representation of original data reconstruction by GANs for Dataset 5.
DoS Intrusion

Best Reconstructed Data
Original Value: 0.062745098, 0.015686275, 0.125490196, 0.247058824, 0.02745098
Reconstructed Value: 0.0630626, 0.0160570, 0.1250065, 0.2475464, 0.0269227
Reconstruction Error: 0.0003175, 0.00037074, 0.00048373, 0.00048758, 0.00052826
Error Ratio: 0.0050601, 0.023635, 0.0038547, 0.0019735, 0.019244

Worst Reconstructed Data
Original Value: 0.988235294, 0.941176471, 0.949019608, 0.97254902, 0.976470588
Reconstructed Value: 0.0518627, 0.0024055, 0.0085331, 0.0178499, 0.0201059
Reconstruction Error: 0.93637, 0.93877, 0.94049, 0.9547, 0.95636
Error Ratio: 0.94752, 0.99744, 0.99101, 0.98165, 0.97941

Reconstruction Error (μ ± σ) = 0.1649 ± 0.1176

Fuzzing Intrusion

Best Reconstructed Data
Original Value: 0.019607843, 0.003921569, 0.007843137, 0.035294118, 0.02745098
Reconstructed Value: 0.0196479, 0.0038376, 0.0075930, 0.0355900, 0.0271260
Reconstruction Error: 4.0042 × 10−5, 8.4005 × 10−5, 2.5012 × 10−4, 2.9588 × 10−4, 3.2499 × 10−4
Error Ratio: 0.0020421, 0.021421, 0.03189, 0.0083833, 0.011839

Worst Reconstructed Data
Original Value: 0.976470588, 0.952941176, 0.949019608, 0.941176471, 0.996078431
Reconstructed Value: 0.0588685, 0.0346383, 0.0145934, 0.0025264, 0.0112039
Reconstruction Error: 0.92848, 0.92939, 0.93443, 0.93865, 0.98487
Error Ratio: 0.95085, 0.97529, 0.98462, 0.99732, 0.98875

Reconstruction Error (μ ± σ) = 0.1485 ± 0.1205
Table 22. Representation of original data reconstruction by GANs for Dataset 6.
Table 22. Representation of original data reconstruction by GANs for Dataset 6.
Flooding Intrusion
Best Reconstructed Data
Original Value          0.010309278     0.203921569     0.254901961     0.51372549      0.062992126
Reconstructed Value     0.0103211       0.2039485       0.2549708       0.5139307       0.0627497
Reconstruction Error    1.1863 × 10⁻⁵   2.6959 × 10⁻⁵   6.8798 × 10⁻⁵   2.0519 × 10⁻⁴   2.4246 × 10⁻⁴
Error Ratio             1.1507 × 10⁻³   1.322 × 10⁻⁴    2.699 × 10⁻⁴    3.9941 × 10⁻⁴   3.849 × 10⁻³
Worst Reconstructed Data
Original Value          0.925490196     0.996062992     0.968627451     0.968627451     0.994845361
Reconstructed Value     0.1642768       0.1514729       0.1238152       0.1041814       0.0170993
Reconstruction Error    0.76121         0.84459         0.84481         0.86445         0.97775
Error Ratio             0.8225          0.84793         0.87217         0.89244         0.98281
Reconstruction Error (μ ± σ) = 0.1672 ± 0.0667
Fuzzing Intrusion
Best Reconstructed Data
Original Value          0.050980392     0.48627451      0.007843137     0.003921569     0.482352941
Reconstructed Value     0.0510123       0.4863441       0.0079223       0.0038279       0.4824857
Reconstruction Error    3.1953 × 10⁻⁵   6.9589 × 10⁻⁵   7.9206 × 10⁻⁵   9.3645 × 10⁻⁵   1.328 × 10⁻⁴
Error Ratio             0.00062676      0.00014311      0.010099        0.023879        0.00027532
Worst Reconstructed Data
Original Value          0.945098039     0.984313725     0.992156863     0.988235294     0.996078431
Reconstructed Value     0.0688975       0.0748087       0.0565799       0.0374701       0.0152394
Reconstruction Error    0.8762          0.90951         0.93558         0.95077         0.98084
Error Ratio             0.9271          0.924           0.94297         0.96208         0.9847
Reconstruction Error (μ ± σ) = 0.1656 ± 0.0715
Malfunction Intrusion
Best Reconstructed Data
Original Value          0.42745098      0.321568627     0.262745098     0.010152284     0.403921569
Reconstructed Value     0.4274471       0.3215783       0.2628300       0.0100435       0.4038029
Reconstruction Error    3.8994 × 10⁻⁶   9.667 × 10⁻⁶    8.4951 × 10⁻⁵   1.0877 × 10⁻⁴   1.1867 × 10⁻⁴
Error Ratio             9.1224 × 10⁻⁶   3.0062 × 10⁻⁵   3.2332 × 10⁻⁴   1.0713 × 10⁻²   2.9379 × 10⁻⁴
Worst Reconstructed Data
Original Value          0.952941176     0.980392157     0.984313725     0.960784314     0.97254902
Reconstructed Value     0.0479626       0.0742492       0.0692697       0.0453411       0.0385420
Reconstruction Error    0.90498         0.90614         0.91504         0.91544         0.93401
Error Ratio             0.94967         0.92427         0.92963         0.95281         0.96037
Reconstruction Error (μ ± σ) = (8.7937 ± 4.3163) × 10⁻⁵
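The μ ± σ summary reported under each intrusion type aggregates the per-sample reconstruction errors into a mean and standard deviation. A small illustration with made-up error values (whether the population or sample standard deviation is used is not stated in the text; the population form is assumed here):

```python
import statistics

def summarize_errors(errors):
    """Mean and population standard deviation of a list of per-sample
    reconstruction errors, in the 'mu +/- sigma' form used in the tables."""
    return statistics.mean(errors), statistics.pstdev(errors)

mu, sigma = summarize_errors([0.10, 0.15, 0.20])  # illustrative values only
print(f"Reconstruction Error = {mu:.4f} ± {sigma:.4f}")
# → Reconstruction Error = 0.1500 ± 0.0408
```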
Table 23. Performance comparison of the proposed mechanism with the existing studies.
Reference                 System                    Method                        Mitigation Strategy                       Percentage Error   Execution Time (s)
Hidalgo et al. [21]       IVN—smart intersection    GNN-MLP                       Block, deflect intruder to decoy system   –                  0.0466
Sontakke & Chopade [20]   Vehicular ad hoc network  DNN-BAIT approach             Node isolation                            –                  –
Khanapuri et al. [25]     Vehicle platoon           CNN—Routh–Hurwitz criterion   Gap widening between vehicles             –                  –
Shirazi et al. [32]       CAN bus                   LSTM                          Data reconstruction                       <6                 –
This study                CAN bus                   AM-DDAE                       Data reconstruction                       <1                 0.08577–0.2807

Share and Cite

MDPI and ACS Style

Kousar, A.; Ahmed, S.; Khan, Z.A. A Deep Learning Approach for Real-Time Intrusion Mitigation in Automotive Controller Area Networks. World Electr. Veh. J. 2025, 16, 492. https://doi.org/10.3390/wevj16090492
