A Hybrid CNN–LSTM–Attention Framework for Intrusion Detection in Smart Mobility Networks

Ekpo, Otuekong; Casola, Valentina; De Benedictis, Alessandra; Asuquo, Philip; Agbor, Bright

doi:10.3390/fi18040210


by Otuekong Ekpo ¹,*, Valentina Casola ², Alessandra De Benedictis ², Philip Asuquo ³ and Bright Agbor ⁴

¹ IMT School for Advanced Studies, 55100 Lucca, Italy

² Department of Electrical and Information Technology Engineering, University of Naples Federico II, 80138 Naples, Italy

³ TETFUND Center of Excellence in Computational Intelligence Research, University of Uyo, Uyo 520101, Nigeria

⁴ Department of Computer Engineering, University of Uyo, Uyo 520101, Nigeria

* Author to whom correspondence should be addressed.

Future Internet 2026, 18(4), 210; https://doi.org/10.3390/fi18040210

Submission received: 28 February 2026 / Revised: 30 March 2026 / Accepted: 8 April 2026 / Published: 15 April 2026

(This article belongs to the Special Issue Cybersecurity in the Era of Smart Cities)


Abstract

Smart cities are increasingly dependent on interconnected transportation systems; however, this connectivity exposes smart mobility networks to significant cybersecurity risks. Traditional Intrusion Detection Systems are ill-equipped for this environment, as they are designed for isolated systems or fixed network boundaries. Thus, they struggle to secure the complex and heterogeneous smart mobility networks, where various protocols and resource-constrained edge devices require more adaptive solutions. To address this limitation, we propose a novel hybrid deep learning framework that combines convolutional neural networks for spatial feature extraction, long short-term memory networks for temporal pattern recognition, and an attention mechanism for adaptive feature weighting, together forming a context-aware Intrusion Detection System. Our approach is evaluated across six benchmark datasets spanning vehicular networks, IoT ecosystems, cloud computing, and 5G environments—VeReMi Extension, CICIoV2024, Edge-IIoTset, UNSW-NB15, Car Hacking, and 5G-NIDD—a deliberately diverse selection that represents the heterogeneous nature of real-world smart mobility networks. Empirical evaluation using three different random seeds reveals the proposed framework achieves detection accuracy exceeding 98% on each dataset, a mean F1 score of 98.94%, and an inference latency of just 4.96 ms per sample. Our results show that the proposed model achieves consistently high detection performance across six heterogeneous benchmark datasets, making it a potentially robust candidate for real-time intrusion detection in smart mobility systems.

Keywords: cybersecurity; smart city; smart mobility; deep learning; attention mechanism; autonomous vehicles; connected vehicles; VANETs

1. Introduction

Smart cities are rapidly reshaping urban life, driven by growing real-world deployments and expanding research interest in intelligent urban development [1]. At their core, smart cities integrate social capital with both traditional and modern Information and Communication Technology (ICT) infrastructure to foster sustainable economic growth and improve quality of life [2]. This transformation has been accelerated by the proliferation of emerging technologies—Artificial Intelligence (AI), Big Data, and the Internet of Things (IoT)—which have collectively redefined how cities manage governance, healthcare, energy, transportation, safety, infrastructure, and education [3].

Central to this transformation is smart mobility, a critical component of a smart city. Smart mobility leverages IoT technologies, wireless networks, and real-time communication to optimize urban transportation, enabling improved route planning, reduced congestion and emissions, shorter travel times, and enhanced overall efficiency [4]. At its core, smart mobility includes Intelligent Transportation Systems (ITSs), which integrate communication networks, Big Data, and sensing devices into transportation infrastructure to enable data-driven decision-making. It also extends to Connected and Automated Vehicles (CAVs), smart road infrastructure, cloud-based data services, and demand-driven platforms such as Mobility-as-a-Service (MaaS) [5], together forming an ecosystem characterized by heterogeneity, spatio-temporality, high interconnectivity, and real-time data dependency [6].

The rapid adoption of smart mobility is further underscored by projections that 85% of the world’s population will reside in urban areas by 2050 [7], thereby placing unprecedented demand on city infrastructure and services. Smart mobility directly addresses this challenge by integrating private vehicles, ride-sharing, public transit, electric vehicles, and on-demand services into unified digital ecosystems that optimize the movement of people and goods. Beyond reducing congestion and improving traffic flow, smart mobility supports broader sustainability objectives, most notably the United Nations Sustainable Development Goals, through measurable reductions in carbon emissions and more efficient utilization of urban resources [8,9].

Existing research has predominantly focused on the behavioural aspects of sustainable transportation and the enhancement of mobility solutions [2], while cybersecurity for smart mobility remains an evolving and insufficiently addressed concern. Karopoulos et al. [10] highlight that breaches in cyberspace resulting from cyber-physical attacks can lead to harmful consequences in the physical domain. Moreover, many in-vehicle networks (IVNs), particularly the Controller Area Network (CAN) bus, were originally designed with minimal security considerations because earlier IVNs operated in isolated environments with little external connectivity [11]. The rapid integration of these legacy systems into broader smart mobility ecosystems has therefore introduced unaddressed attack surfaces.

Existing studies on Intrusion Detection Systems (IDSs) for smart mobility have largely focused on isolated components rather than the ecosystem in general. At the network level, studies have proposed intrusion detection for vehicular ad hoc networks [12], edge-based detection for transportation IoT [13], and IoV-specific frameworks, including federated learning approaches such as CVAR-FL [14] and deep-learning-based models [15]. At the in-vehicle level, contributions include GAN-based intrusion detection for CAN-FD buses [16], voltage signal analysis for CAN bus attacks [17], threat detection for autonomous vehicles [18], supervised learning for CAV security [19], federated learning for ITS misbehavior detection [20], and cloud–vehicle collaborative detection for IoV [21].

While these contributions represent meaningful progress, they share a common limitation: models are developed and evaluated within a single component, typically validated on one or two datasets. As a result, they struggle to generalize across the extreme heterogeneity, diverse communication protocols, and spatio-temporal characteristics of real-world smart mobility networks. This growing complexity and expanding threat landscape necessitate the development of an advanced, intelligent intrusion detection framework capable of operating across the heterogeneous and dynamically interconnected components of smart mobility networks [22].

To address this gap, we propose a hybrid deep learning framework that integrates Convolutional Neural Networks (CNNs), long short-term memory (LSTM) networks, and attention mechanisms for intrusion detection in heterogeneous smart mobility networks. The primary novelty of this research lies in its synergistic integration of a CNN, an LSTM, and an attention mechanism to create a comprehensive and resilient intrusion detection framework tailored for heterogeneous smart mobility networks. This hybrid architecture uniquely addresses the limitations of prior component-specific models by employing CNNs to extract spatial features from high-dimensional data sources, including Roadside Unit (RSU) sensor and vehicle telemetry, while leveraging LSTM networks to model temporal dependencies in time-series data such as traffic streams and network communications, capturing evolving intrusion patterns over time. The attention mechanism further enhances this architecture by dynamically weighting the most discriminative features across diverse data streams, suppressing irrelevant signals and enabling precise, context-aware intrusion detection in smart mobility networks.

The proposed framework targets intrusion detection across three core layers of smart mobility infrastructure. The first is the vehicle layer, encompassing in-vehicle networks such as Controller Area Network (CAN) buses, Electronic Control Units (ECUs), On-Board Units (OBUs), and Connected and Automated Vehicles (CAVs). The second is the infrastructure layer, covering Roadside Units (RSUs) supporting vehicle-to-infrastructure and vehicle-to-network (V2I/V2N) communications, 5G/6G networks, and sensor-based smart parking systems. The third is the digital services layer, comprising cloud platforms, IoT-based traffic and parking sensors, Mobility-as-a-Service (MaaS) platforms, Vehicular Ad Hoc Networks (VANETs), and the broader Internet of Vehicles (IoV) ecosystem. By operating across all three layers, the framework overcomes the limitations of component-specific Intrusion Detection Systems, offering consistent cross-dataset performance, scalability, improved detection performance, and enhanced resilience against new and evolving threats.

This study makes the following contributions:

1. We propose a novel hybrid deep learning framework that integrates CNNs, LSTM networks, and an attention mechanism for spatial feature extraction, temporal dependency modeling, and adaptive feature weighting, respectively, achieving enhanced intrusion detection accuracy across heterogeneous smart mobility data streams.
2. We take into account the heterogeneous nature of the smart mobility network by considering threats occurring in diverse components such as in-vehicle Controller Area Network (CAN) buses, vehicular ad hoc networks, Internet of Vehicles (IoV) and 5G-connected vehicles, IoT smart parking systems, and cloud-based mobility services.
3. We conduct a rigorous and comprehensive evaluation of the proposed model on multiple datasets spanning key smart mobility domains: VeReMi Extension (vehicular misbehavior), Car Hacking (in-vehicle network attacks), 5G-NIDD (5G network intrusions), Edge-IIoTset (IoT and edge threats), UNSW-NB15 (cloud-based attacks), and CICIoV2024 (Internet of Vehicles anomalies).

The remainder of this paper is structured as follows: Section 2 reviews the literature and highlights its gaps. Section 3 outlines the methodology adopted in this study. Section 4 presents and discusses the findings. Section 5 concludes the paper and offers recommendations for future studies.

2. Related Work

The integration of connected and autonomous vehicles (CAVs), intelligent transportation systems (ITSs), and digital platforms in smart mobility has raised cybersecurity concerns, resulting in growing research interest in Intrusion Detection Systems (IDSs) for these components.

For vehicular components, particularly the Controller Area Network (CAN) bus, several intrusion detection approaches have been proposed. Wang et al. [16] developed a GAN-based IDS for CAN-FD bus nodes, achieving a 99.93% detection rate and a 0.15 ms response time, surpassing baseline methods by 1.2%. Khan et al. [23] proposed an LSTM-FCN model enhanced with squeeze-and-excite layers and attention mechanisms for automotive theft detection, reporting accuracies of 99.36% and 96.36% on the HCRL and test datasets, respectively. Wei et al. [24] introduced an attention-based autoencoder model for binary CAN message processing, achieving AUC scores of 0.923 and 0.915 on benchmark datasets, while Kang et al. [25] combined time-interval likelihood and signal-based analysis in the CANival framework, reporting true positive rates of up to 0.960. At the physical layer, Levy et al. [17] employed voltage-based spoofing detection, achieving 100% accuracy in intrusion localization, and Yin et al. [26] proposed an LSTM autoencoder for voltage-based attack filtering, reducing attack success rates to 0.18% with 99.4% ECU identification accuracy. Further contributions include a control-system-level analysis for J1939 buses via Simulink-CANoe simulations [27] and the IDS-DEC framework [28], which combines LSTM-CNN autoencoders with entropy-based clustering to achieve 99% accuracy on Car Hacking datasets.

Beyond in-vehicle security, existing literature also highlights cybersecurity challenges across broader smart mobility infrastructure, including electric vehicle charging stations (EVCS), roadside units (RSUs), and ITS components. For EVCS, Almadhor et al. [29] proposed a transfer-learning-based IDS combining deep neural networks and LSTM-RNN, achieving 93% accuracy on the CICEVSE2024 dataset for cyber-physical attack detection, while risk analyses by Hamdare et al. [30] and Skarga-Bandurova et al. [31] further highlight protocol-centric vulnerabilities inherent to charging infrastructure. Within ITS, Usha et al. [32] employed adaptive neuro-fuzzy inference systems (ANFISs) for DDoS detection, achieving 94.3% accuracy across the UNSW-NB15 and CICDDoS2019 datasets, outperforming SVM and CNN baselines in dynamic vehicular settings. Weerasinghe et al. [33] explored threshold cryptography for securing V2X communications in 5G-enabled ITSs, and Chowdhury et al. [34] applied Kalman filters and Dempster–Shafer theory for anomaly detection in urban traffic simulations. At the infrastructure level, Li et al. [35] proposed a hybrid RSU deployment strategy leveraging parked vehicles as temporary units to enhance network coverage, though their focus remains on physical deployment rather than intrusion detection. Channamallu et al. [36] further identified persistent cybersecurity gaps in IoT-based smart parking systems, particularly within wireless sensor networks and VANETs.

Cybersecurity and privacy challenges also extend to data and digital platforms within smart mobility ecosystems, including cloud-based systems and Mobility-as-a-Service (MaaS) platforms. These platforms function as systems of systems, integrating backend infrastructures, third-party providers, endpoints, and data producers to deliver comprehensive mobility solutions [37], making them particularly attractive and consequential targets for cyberattacks. Several studies have reviewed MaaS-specific cybersecurity risks, including insider threats and AI-driven attacks, proposing countermeasures such as overlay networking and blockchain-based ticketing [38,39,40,41]. For cloud-based intrusion detection, Attou et al. [42] proposed a random forest model achieving 98.3–99.9% accuracy on Bot-IoT and NSL-KDD datasets, while hybrid approaches including a Bi-SC-CBALSTM model [43] and a variational autoencoder Wasserstein GAN [44] have further advanced detection performance in digital platform environments.

Recent studies have begun exploring large-scale AI models for intrusion detection across IoT, vehicular, and cloud environments. For example, transformer-based approaches such as SecurityBERT proposed by Ferrag et al. [45] achieved 98.2% accuracy across fourteen attack types, outperforming earlier hybrid models, including GAN–Transformer and CNN–LSTM architectures. Building on this trend, generative AI frameworks leveraging large language models (LLMs) have been applied to detect zero-day attacks in electric vehicle ecosystems, achieving detection accuracies of around 98% with lower false positive rates compared to traditional IDS methods [46]. Similarly, LLM-based neuro-symbolic agents have shown promising results for cloud anomaly detection, achieving F1-scores above 92% on benchmark datasets [47]. However, despite these advances, challenges remain regarding computational cost, scalability, and explainability, particularly when fine-tuning LLMs for specific network environments [48]. Isgandarov et al. [49] further contributed an Isolation-Forest-based approach for interpretable anomaly detection in shared mobility systems, whereas Zhang et al. [50] proposed a federated learning framework that achieved a 92.10% F1-score while preserving data privacy.

Security in the Internet of Vehicles (IoV) and vehicular communication networks, particularly VANETs and V2X communications, including V2V and V2I, has been widely studied [51]. At the network level, Kong et al. [52] proposed a reinforcement quantile spatial CNN (RQSCNN) for V2V cybersecurity in 6G networks, achieving high throughput and low latency, while Karim et al. [53] introduced a blockchain-based framework using elliptic curve cryptography for secure 5G IoV data exchange, validated via the Scyther tool. For V2X threat mitigation, Sedar et al. [54] explored attack vectors and proposed AI-driven countermeasures for adversarial threats, while lightweight models, including UltraADV and a VAE-based approach [55,56], achieved up to a 99% F1-score on VeReMi-Extension datasets. Several ML-based IDS studies have similarly targeted replay and DDoS attacks in VANETs [57,58,59,60,61], consistently achieving over 99% accuracy on VeReMi datasets. More recently, Fu et al. [62] introduced IoV-BERT-IDS, a hybrid LLM-based model capable of detecting both in-vehicle and extra-vehicular threats, demonstrating generalization across CICIDS and Car-Hacking datasets.

To contextualize the contributions of the proposed attention-enhanced CNN–LSTM framework, Table 1 provides a chronological comparison of cutting-edge studies on securing smart mobility components from 2023 to 2025, outlining their primary areas of focus and emphasizing the specific limitations that our system seeks to overcome.

As summarized in Table 1, a critical gap persists across the existing literature: proposed IDS solutions remain narrowly scoped to isolated components—vehicular networks, infrastructure, digital platforms, or IoV—without accounting for the interdependent nature of smart mobility ecosystems. Specifically, they fail to capture the interdependencies among physical infrastructure (RSUs, smart roads, EV charging stations, and 5G backbones), IoT components (sensors, telematics, and drones), data platforms (MaaS and cloud/edge computing), and vehicular communication networks (V2X and C-V2X), fundamentally limiting their ability to detect and respond to cross-domain threats.

To address this, we propose a novel hybrid framework that integrates CNNs, LSTM networks, and an attention mechanism to enable real-time, scalable intrusion detection across the smart mobility stack, encompassing in-vehicle networks, mobility infrastructure, digital platforms, and IoV, thereby enhancing security in smart city ecosystems.

3. Research Methodology

This section describes the workflow of the proposed intrusion detection framework for smart mobility networks, as depicted in Figure 1. More specifically, it is structured into four sequential stages, (1) dataset selection, (2) data preprocessing, (3) model learning, and (4) performance evaluation, each designed to address the heterogeneous and dynamic nature of smart mobility traffic while ensuring reproducibility and rigor in the experimental process.

3.1. Dataset Insights

The experimental evaluation of the proposed intrusion detection framework relies on a deliberate collection of datasets representative of the typical smart mobility networks. This provides a cross-infrastructure cybersecurity perspective, capturing attacks and anomalies from diverse environments such as in-vehicle networks, the Internet of Vehicles (IoV), vehicular ad hoc networks (VANETs), 5G-enabled vehicular communications, ground IoT sensors for intelligent parking, and cloud-based mobility services and platforms. These datasets reflect the interconnected nature of smart mobility networks, where vehicles, roadside units (RSUs), IoT devices, and cloud-based services interoperate. A detailed description of the selected datasets is presented below, and their respective statistical distributions are outlined in Table 2.

3.1.1. VeReMi Extension Dataset

Introduced by Kamel et al. [63], this dataset focuses on V2X misbehavior detection in cooperative ITS (C-ITS). It primarily captures communication data from RSUs, VANETs, and V2X infrastructures. Attack scenarios include Data Replay, Denial of Service (DoS), Eventual Stop, Disruptive Attacks, Random DoS, and Traffic Congestion Sybil. The dataset has been extensively employed for misbehavior detection [64], anomaly detection [65], and intrusion detection [66].

3.1.2. Car Hacking Dataset

Published by Song et al. [67], this dataset captures traffic from Controller Area Networks (CAN), the de facto standard for in-vehicle communications. It has been widely used to evaluate IDS for CAVs, ECUs, and CAN Bus attacks [68,69]. The dataset contains a combination of normal traffic and hostile attack events such as Fuzzy, DoS, and Spoofing (gear/RPM manipulation).

3.1.3. 5G-NIDD Dataset

Developed by Samarakoon et al. [70], this dataset was generated on a 5G testbed at the University of Oulu, Finland. It contains realistic network traffic from 5G environments, covering HTTP Flood, UDP Flood, SYN Flood, ICMP Flood, and Slow-Rate DoS, as well as port scan attacks (TCP connect scan, UDP scan, and SYN scan). It has been widely adopted for DDoS detection in 5G environments [71,72] and for intrusion detection in next-generation mobile networks [73].

3.1.4. Edge-IIoTset Dataset

Proposed by Ferrag et al. [74], this dataset targets threats in vehicular IoT and edge-assisted applications, such as smart parking systems and IoT-enabled sensors. It provides a diverse set of cyberattacks categorized into five classes: malware, DoS/DDoS, injection, MITM, and information gathering. Its breadth makes it highly relevant for evaluating IoT- and edge-related threats in smart mobility infrastructures.

3.1.5. UNSW-NB15 Dataset

To evaluate the framework’s resilience against attacks targeting cloud-based infrastructures and MaaS platforms, the UNSW-NB15 dataset [75] was incorporated. Generated using the IXIA PerfectStorm platform at the Australian Centre for Cyber Security (ACCS), the dataset combines real-world benign traffic with synthesized attack scenarios spanning generic attacks, fuzzers, backdoors, exploits, DoS, reconnaissance, worms, and shellcode. Its broad coverage of diverse and realistic attack vectors across cloud and enterprise environments has established UNSW-NB15 as a widely adopted benchmark for intrusion detection research.

3.1.6. CICIoV2024 Dataset

Neto et al. [76] introduced a dataset to address the absence of critical features in existing datasets for IoV. The dataset was generated by capturing network traffic from multiple real ECUs within an IoV environment, with the primary aim of fostering the design of sophisticated cybersecurity mechanisms for IoV. It includes five distinct attack scenarios, categorized as spoofing and DoS, executed through CAN protocol against the intact internal architecture of a 2019 Ford automobile. This dataset has since been adopted in subsequent intrusion detection studies, including those by Merzouk et al. [77] and Mahdi et al. [78].

3.2. Dataset Preprocessing

Data preprocessing is a crucial stage in the proposed framework (see Figure 1), ensuring that the heterogeneous datasets are standardized into a model-ready format. For consistent model training across diverse datasets capturing different attack scenarios in smart mobility environments, all malicious traffic classes are merged into a single ATTACK label, reducing the detection task to a binary classification problem (Benign vs. Attack). Preprocessing involves converting IP-related features to integer values using the Python ipaddress library, removing constant features, applying label encoding to categorical variables with many unique values, and applying one-hot encoding to those with fewer categories. Missing values are imputed with zeros for IP features, mean values for numerical features, and the placeholder unknown for categorical attributes. Dataset-specific refinements were also introduced: for instance, in the UNSW-NB15 dataset, numeric attributes stored as strings were converted to integers, while in the 5G-NIDD dataset, features with more than 60% missing values were discarded to reduce sparsity. Finally, all numerical features across datasets were normalized using the Standard Scaler (Equation (1)) so that each feature has zero mean and unit variance. This eliminates scale disparities among features, facilitates stable gradient convergence, and prevents features with larger numeric ranges from dominating the learning process.

$$x' = \frac{x - \mu}{\sigma} \qquad (1)$$

where $x$ represents the original feature value, $\mu$ denotes the mean of the feature, and $\sigma$ is the corresponding standard deviation.
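The IP conversion, mean imputation, and Equation (1) can be illustrated with a minimal standard-library sketch. The helper names `ip_to_int` and `standardize` are hypothetical and are not taken from the authors' code; the sketch also assumes the population standard deviation, and it returns zeros for a constant column, whereas the paper removes such columns.

```python
import ipaddress
import statistics


def ip_to_int(value):
    """Map an IPv4/IPv6 string to an integer; missing or invalid values become 0."""
    try:
        return int(ipaddress.ip_address(value))
    except (ValueError, TypeError):
        return 0


def standardize(column):
    """Apply Equation (1) to one numeric column; None entries are imputed with the mean."""
    observed = [v for v in column if v is not None]
    mu = statistics.fmean(observed)
    filled = [mu if v is None else v for v in column]
    sigma = statistics.pstdev(filled)  # population standard deviation (assumption)
    if sigma == 0:
        # Constant column: the paper drops these; this sketch returns zeros instead.
        return [0.0 for _ in filled]
    return [(v - mu) / sigma for v in filled]
```

Mean imputation leaves the column mean unchanged, so the standardized column still has zero mean after the missing entries are filled.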

3.3. Proposed Model for Intrusion Detection

A hybrid CNN–LSTM–Attention architecture (illustrated in Figure 1) forms the core of the proposed framework, which is designed to improve intrusion detection performance in smart mobility environments through binary classification. The design integrates convolutional layers for local feature extraction, recurrent layers for long-term temporal modeling, and an attention mechanism to enhance the interpretability and relevance of the learned representations. Algorithmically, the design of the proposed model is represented in Algorithm 1.

3.4. Problem Formulation

The proposed framework classifies preprocessed input sequences as benign or malicious, leveraging a CNN–LSTM–Attention architecture for intrusion detection across smart mobility networks. This section formally defines the problem and presents the mathematical formulation of each model component, with Algorithm 1 providing a detailed procedural representation of the proposed model.

3.5. Problem Definition

Given a labeled dataset of network traffic from different smart mobility components, $D = \{(x_{i}, y_{i})\}_{i=1}^{N}$, each input sample is $x_{i} \in \mathbb{R}^{T \times F}$. Here, $T$ denotes the sequence length ($T = 1$ in this study, corresponding to a single observation), and $F$ represents the number of features. The target labels are $y_{i} \in \{0, 1\}$, where $y_{i} = 0$ represents benign traffic and $y_{i} = 1$ represents malicious traffic. The objective is to learn a mapping function $f_{\theta} : \mathbb{R}^{T \times F} \to \{0, 1\}$, parameterized by $\theta$, that minimizes classification error on unseen data, enabling robust intrusion detection across smart mobility components. The individual model components are presented below.

3.6. CNN Layer for Feature Extraction

The network begins with an input layer that accommodates both one-dimensional and multi-dimensional input sequences. The first stage of feature extraction employs two 1D convolutional layers with 128 and 16 filters, respectively. Both layers use a kernel size of three and the ReLU activation function, and each is followed by batch normalization and max-pooling. This convolutional block captures local patterns and reduces dimensionality while retaining essential information.

Algorithm 1 CNN–LSTM with Attention Pseudocode for Binary Classification

Input: Training $X_{train}, y_{train}$; Validation $X_{val}, y_{val}$; Test $X_{test}, y_{test}$
Set $cnn\_filters = [128, 16]$, $cnn\_kernel\_size = 7$, $learning\_rate = 0.001$
Set hyperparameters: $lstm\_units = [256, 96, 48]$, $dropout\_rate = 0.2$, $dense\_units = [32, 48, 1]$
Initialize input layer with $input\_shape = X_{train}.shape[1:]$
Normalize the dataset between $(0, 1)$
for each training epoch do
    for each batch in training data do
        Apply Conv1D: $C_{1}(t) = \mathrm{ReLU}(W_{c1} * x(t) + b_{c1})$, where $W_{c1}$ = CNN weight matrix
        Apply BatchNorm and MaxPooling
        Apply Conv1D: $C_{2}(t) = \mathrm{ReLU}(W_{c2} * C_{1}(t) + b_{c2})$
        Apply BatchNorm and MaxPooling
        Apply LSTM Layer 1: $h_{1}(t) = \mathrm{LSTM}_{256}(C_{2}(t), h_{1}(t-1))$
        Apply LSTM Layer 2: $h_{2}(t) = \mathrm{LSTM}_{96}(h_{1}(t), h_{2}(t-1))$
        Apply LSTM Layer 3: $h_{3}(t) = \mathrm{LSTM}_{48}(h_{2}(t), h_{3}(t-1))$
        Compute Attention: $\alpha(t) = \mathrm{Softmax}(\mathrm{Dense}(h_{3}(t)))$
        Context Vector: $c = \sum_{t=1}^{T} \alpha(t) \times h_{3}(t)$
        Dense Layer 1: $d_{1} = \mathrm{ReLU}(W_{d1} \times c + b_{d1})$
        Apply Dropout and BatchNorm
        Dense Layer 2: $d_{2} = \mathrm{ReLU}(W_{d2} \times d_{1} + b_{d2})$
        Output: $\hat{y} = \mathrm{Sigmoid}(W_{out} \times d_{2} + b_{out})$
        Update weights using Adam optimizer
    end for
    Validate on $X_{val}$, $y_{val}$
end for
Evaluate on test set $(X_{test}, y_{test})$
Generate predictions: $predictions = (\hat{y} > 0.5)$

The 1D CNN layers extract local spatial correlations from the input sequence. For a filter $k$ with kernel $W^{(k)}$, the convolutional output at time step $t$ is

$$h_{t}^{(k)} = \sigma\left(\sum_{j=1}^{F} W_{j}^{(k)} \cdot x_{t+j} + b^{(k)}\right), \qquad (2)$$

where $\sigma(\cdot)$ is the ReLU activation function. Batch normalization stabilizes learning, and max-pooling reduces dimensionality:

$$\tilde{h}_{t}^{(k)} = \mathrm{BN}\left(h_{t}^{(k)}\right), \quad z_{j}^{(k)} = \max_{t \in \mathrm{pool}(j)} \tilde{h}_{t}^{(k)}. \qquad (3)$$

These operations capture spatial patterns in multidimensional data from vehicular networks, infrastructure sensors, and digital platforms.
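The convolution, ReLU, and max-pooling steps of Equations (2) and (3) can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' Keras implementation: the function names, the unpadded ("valid") convolution over a time window, the pool size of 2, and the omission of batch normalization are all assumptions made for brevity.

```python
import numpy as np


def conv1d_relu(x, W, b):
    """Valid 1D convolution over time followed by ReLU.

    x: (T, F) input sequence, W: (K, F, C) kernels, b: (C,) biases.
    Returns an array of shape (T - K + 1, C).
    """
    K = W.shape[0]
    T_out = x.shape[0] - K + 1
    out = np.stack([
        np.tensordot(x[t:t + K], W, axes=([0, 1], [0, 1])) + b
        for t in range(T_out)
    ])
    return np.maximum(out, 0.0)  # ReLU


def max_pool1d(h, pool=2):
    """Non-overlapping max-pooling along time; a trailing remainder is dropped."""
    T = (h.shape[0] // pool) * pool
    return h[:T].reshape(-1, pool, h.shape[1]).max(axis=1)


# Toy usage: 6 time steps, 2 features, one all-ones kernel of width 3.
x = np.arange(12, dtype=float).reshape(6, 2)
h = conv1d_relu(x, np.ones((3, 2, 1)), np.zeros(1))
z = max_pool1d(h)
```

Each pooled value in `z` keeps the largest activation of its window, which is how the block reduces the sequence length while retaining the strongest local response.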

3.7. LSTM Layers for Sequential Modeling

The pooled outputs from the CNN layers are processed by stacked LSTM layers to learn and represent time-dependent correlations across sequential observations. In this research, the third LSTM layer is configured such that the layer returns full sequences, enabling the subsequent attention mechanism to operate across the full temporal dimension. Overall, at time step t, the LSTM updates are

$$\begin{aligned}
f_{t} &= \sigma(W_{f}[h_{t-1}, z_{t}] + b_{f}) && \text{(forget gate)} \\
i_{t} &= \sigma(W_{i}[h_{t-1}, z_{t}] + b_{i}) && \text{(input gate)} \\
\tilde{c}_{t} &= \tanh(W_{c}[h_{t-1}, z_{t}] + b_{c}) && \text{(candidate cell state)} \\
c_{t} &= f_{t} \odot c_{t-1} + i_{t} \odot \tilde{c}_{t} && \text{(cell state update)} \\
o_{t} &= \sigma(W_{o}[h_{t-1}, z_{t}] + b_{o}) && \text{(output gate)} \\
h_{t} &= o_{t} \odot \tanh(c_{t}) && \text{(hidden state)}
\end{aligned} \qquad (4)$$

where $\sigma$ denotes the logistic sigmoid in this equation, $z_{t}$ represents the pooled output from the CNN layers, $h_{t-1}$ and $c_{t-1}$ are the previous hidden and cell states, and $\odot$ denotes element-wise multiplication. The three stacked LSTM layers capture temporal patterns, which makes it possible to identify evolving intrusion patterns in network traffic.
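A single LSTM update from Equation (4) can be written out directly in NumPy. This is a minimal sketch for illustration only; the dictionary layout of the weights and the function names are assumptions, and a framework implementation such as the Keras `LSTM` layer fuses these matrices for efficiency.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid used by the forget, input, and output gates."""
    return 1.0 / (1.0 + np.exp(-a))


def lstm_step(z_t, h_prev, c_prev, W, b):
    """One LSTM time step following Equation (4).

    z_t: (D,) CNN output at step t; h_prev, c_prev: (H,) previous states.
    W: dict with keys 'f', 'i', 'c', 'o', each of shape (H, H + D).
    b: dict with the same keys, each of shape (H,).
    """
    concat = np.concatenate([h_prev, z_t])           # [h_{t-1}, z_t]
    f = sigmoid(W["f"] @ concat + b["f"])            # forget gate
    i = sigmoid(W["i"] @ concat + b["i"])            # input gate
    c_tilde = np.tanh(W["c"] @ concat + b["c"])      # candidate cell state
    c = f * c_prev + i * c_tilde                     # cell state update
    o = sigmoid(W["o"] @ concat + b["o"])            # output gate
    h = o * np.tanh(c)                               # hidden state
    return h, c
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved; this makes a convenient sanity check for the gate wiring.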

3.8. Attention Mechanism

Attention is applied to the sequences returned by the last LSTM layer to emphasize informative time steps among the LSTM hidden states $\{h_{t}\}_{t=1}^{T}$. Attention scores are computed via a dense layer, normalized using the softmax function, and multiplied element-wise with the LSTM outputs (see Equation (5)). A summation operation, as shown in Equation (6), aggregates the weighted sequence across all time steps to form a context vector, which represents the most informative features across time. Overall, the attention mechanism enhances interpretability by indicating which portions of the input sequence contribute most to the model decision.

$$e_{t} = v^{\top} \tanh(W_{a} h_{t} + b_{a}), \quad \alpha_{t} = \frac{\exp(e_{t})}{\sum_{j=1}^{T} \exp(e_{j})} \qquad (5)$$

$$c = \sum_{t=1}^{T} \alpha_{t} h_{t}, \qquad (6)$$

where $\alpha_{t}$ represents the attention weights and $c$ the context vector summarizing key sequence information. This weighting is intended to focus the classifier on critical anomalies in heterogeneous smart mobility data.
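Equations (5) and (6) amount to a softmax-weighted average of the hidden states, as the following NumPy sketch shows. The function name and array shapes are assumptions made for illustration; subtracting the maximum score before exponentiation is a standard numerical safeguard that leaves the softmax output unchanged.

```python
import numpy as np


def attention_pool(H, W_a, b_a, v):
    """Additive attention pooling following Equations (5) and (6).

    H: (T, d) hidden states; W_a: (k, d); b_a: (k,); v: (k,).
    Returns the attention weights alpha (T,) and the context vector c (d,).
    """
    e = np.tanh(H @ W_a.T + b_a) @ v      # scores e_t, shape (T,)
    e = e - e.max()                        # stabilise the softmax
    alpha = np.exp(e) / np.exp(e).sum()    # attention weights, sum to 1
    c = alpha @ H                          # weighted sum over time steps
    return alpha, c


# Toy usage: zero scoring weights give uniform attention over two steps.
H = np.array([[1.0, 2.0], [3.0, 4.0]])
alpha, c = attention_pool(H, np.zeros((4, 2)), np.zeros(4), np.ones(4))
```

When the scores are equal, the context vector reduces to the plain mean of the hidden states; learned scores shift that average towards the more informative time steps.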

3.9. Classification Decision

The refined context vector c produced by the attention mechanism is processed through fully connected layers with dropout and L2 regularization to prevent overfitting:

u = σ (W_{1} c + b_{1}), v = σ (W_{2} u + b_{2}) .

(7)

In the final stage of the proposed framework, the sigmoid function in Equation (8) predicts the probability of malicious behavior.

\hat{y} = σ (W_{o} v + b_{o}),

(8)

where $\hat{y} \in [0, 1]$ represents the probability of malicious ($\hat{y} = 1$) or benign ($\hat{y} = 0$) behavior.
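As an illustration of Equations (7) and (8), the forward pass of the classification head can be sketched as follows. This follows the equations literally, with σ applied at every layer. Dropout is omitted because it is inactive at inference time, and the decision threshold of 0.5 is an assumption, since the text does not state one.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(c, params, threshold=0.5):
    """Apply Equations (7) and (8) to a context vector c.

    params holds W1, b1, W2, b2 (hidden layers) and Wo, bo (output unit),
    where Wo is a 1-D vector so the output is a scalar probability.
    """
    u = sigmoid(params["W1"] @ c + params["b1"])
    v = sigmoid(params["W2"] @ u + params["b2"])
    y_hat = float(sigmoid(params["Wo"] @ v + params["bo"]))
    label = int(y_hat >= threshold)   # assumed threshold: 1 = malicious, 0 = benign
    return y_hat, label
```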

3.10. Learning Objective

The network is trained by minimizing the binary cross-entropy loss (see Equation (9)), with L2 regularization helping to reduce the risk of overfitting. Together, this effectively optimizes decision boundaries for binary classification tasks.

L (θ) = - \frac{1}{N} \sum_{i = 1}^{N} [y_{i} log {\hat{y}}_{i} + (1 - y_{i}) log (1 - {\hat{y}}_{i})] + λ {∥ θ ∥}_{2}^{2},

(9)

where $λ$ is the regularization parameter.
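Equation (9) can be evaluated by hand for small inputs. The sketch below is a plain-Python version for illustration only; the clipping constant `eps` is an assumption added to keep the logarithm defined, and `weights` stands for the flattened parameter vector θ.

```python
import math

def bce_l2_loss(y_true, y_prob, weights, lam, eps=1e-12):
    """Binary cross-entropy plus an L2 penalty, as in Equation (9)."""
    n = len(y_true)
    bce = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip so log() never receives 0
        bce -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    l2 = sum(w * w for w in weights)      # squared L2 norm of theta
    return bce / n + lam * l2
```

For two samples predicted at 0.5 each, the cross-entropy term equals ln 2, so with weights (1, 2) and λ = 0.01 the loss is ln 2 + 0.05.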

4. Experimental Results and Analysis

This section presents the empirical evaluation of the proposed CNN–LSTM–Attention framework across six benchmark datasets, 5G-NIDD, UNSW-NB15, Edge-IIoTset, Car-Hacking, CICIoV2024, and VeReMi Extension, spanning diverse attack scenarios and network environments representative of cloud, vehicular, IoT, and other mobility platforms.

4.1. Experimental Setup

All experiments were conducted using an NVIDIA Tesla T4 GPU with 32 GB of RAM. The CNN–LSTM–Attention model was implemented using Scikit-learn (v1.6.1), Keras (v3.10.0), Pandas (v2.3.3), and TensorFlow (v2.19). Each dataset was preprocessed as described in Section 3.2 and then split into training, validation, and test sets at a ratio of 70%:15%:15%. Random seeds 0, 1, and 42 were used across the NumPy (v2.0.2), TensorFlow, and Python (v3.12.12) environments to ensure reproducibility during three independent experimental runs.
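A seeded 70%:15%:15% split can be sketched with the standard library alone. This is an illustrative shuffle split, not the authors' pipeline: whether the original split was stratified is not stated, and the function name `split_indices` is hypothetical.

```python
import random

def split_indices(n_samples, seed, train=0.70, val=0.15):
    """Shuffle sample indices with a fixed seed and cut them 70/15/15."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # local RNG, so the global state is untouched
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# One split per seed reported in the text
for seed in (0, 1, 42):
    tr, va, te = split_indices(1000, seed)
    print(seed, len(tr), len(va), len(te))
```

Re-running with the same seed reproduces the same partition, which is the property the three seeded runs rely on.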

To improve model efficiency and to limit overfitting and wasted computation, Keras Tuner with Bayesian optimization was used for hyperparameter tuning. Bayesian optimization builds a surrogate model, specifically a Gaussian process, to model and explore the hyperparameter space. Unlike grid search or random search, it concentrates evaluations on the most promising hyperparameter configurations and is therefore well suited to deep learning architectures that are expensive to train. The tuning objective was to maximize validation accuracy. Thirty trials were run, each with a maximum of 10 epochs and a batch size of 4096. To control overfitting, early stopping was applied with a patience of five epochs, together with learning rate reduction on plateau (factor of 0.5, patience of three epochs) to speed up convergence.

The search space included a wide array of flexible hyperparameters to optimize model trade-off between capacity, regularization and computation:

CNN configuration: Filters in Layer 1 (32–128, step = 32), Filters in Layer 2 (16–64, step = 16); Kernel size $\in \{3, 5, 7\}$.
LSTM units: Layer 1 (64–256, step = 64), Layer 2 (32–128, step = 32), Layer 3 (16–64, step = 16).
Dropout rates: Standard dropout (0.2–0.5, step = 0.1); Recurrent dropout (0.2–0.5, step = 0.1).
Dense layers: Units in Layer 1 (32–128, step = 32), Layer 2 (16–64, step = 16).
Regularisation and optimisation: L2 penalty $\in \{0.001, 0.01, 0.1\}$; Learning rate $\in \{1 \times 10^{-4}, 5 \times 10^{-4}, 1 \times 10^{-3}\}$.
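To give a sense of scale, the search space listed above can be enumerated as a discrete grid. The sketch below only counts configurations and does not call Keras Tuner; the dictionary keys are hypothetical names chosen for illustration.

```python
from math import prod

def int_range(lo, hi, step):
    """Inclusive integer range, e.g. 32-128 with step 32 -> [32, 64, 96, 128]."""
    return list(range(lo, hi + 1, step))

search_space = {
    "cnn_filters_1": int_range(32, 128, 32),
    "cnn_filters_2": int_range(16, 64, 16),
    "kernel_size": [3, 5, 7],
    "lstm_units_1": int_range(64, 256, 64),
    "lstm_units_2": int_range(32, 128, 32),
    "lstm_units_3": int_range(16, 64, 16),
    "dropout": [0.2, 0.3, 0.4, 0.5],
    "recurrent_dropout": [0.2, 0.3, 0.4, 0.5],
    "dense_units_1": int_range(32, 128, 32),
    "dense_units_2": int_range(16, 64, 16),
    "l2_penalty": [0.001, 0.01, 0.1],
    "learning_rate": [1e-4, 5e-4, 1e-3],
}

# Number of distinct configurations an exhaustive grid search would visit
n_configs = prod(len(values) for values in search_space.values())
print(n_configs)  # 7077888
```

Thirty trials therefore cover well under 0.001% of the grid, which is why a guided search strategy is preferable to exhaustive enumeration here.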

The resulting optimal hyperparameter configuration obtained through Bayesian optimization is summarized in Table 3. Additional training configurations, including batch size, number of training epochs, and regularization strategies (EarlyStopping and ReduceLROnPlateau), are also detailed in Table 3 to ensure reproducibility.

4.2. Metrics for Model Evaluation

Evaluating model performance is central to this research. A confusion matrix was used to assess the model's ability to distinguish the positive class (attack) from the negative class (benign) and to quantify classification errors, namely false positives and false negatives. Six classification metrics were employed to assess model performance: four threshold-based metrics (accuracy, precision, recall, and F1-score) and two threshold-independent metrics (ROC-AUC and PR-AUC). Additionally, the false negative rate (FNR) and the false positive rate (FPR) were used to quantify the proportion of errors relative to the actual classes, providing insight into the reliability of the model for each true outcome. The expressions for these metrics are given below.

Threshold-based metrics:

Accuracy = \frac{tPositives + tNegatives}{Total Samples (N)}

(10)

Precision = \frac{tPositives}{tPositives + fPositives}

(11)

Recall (TPR) = \frac{tPositives}{tPositives + fNegatives}

(12)

F 1 - score = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}

(13)

FPR = \frac{fPositives}{fPositives + tNegatives}

(14)

FNR = \frac{fNegatives}{fNegatives + tPositives}

(15)

Threshold-free metrics:

The ROC-AUC metric evaluates a model’s performance across all classification thresholds.

ROC - AUC = \int_{0}^{1} TPR (FPR) d (FPR)

(16)

The PR-AUC metric (see Equation (17)) measures the area under the precision–recall curve.

PR - AUC = \int_{0}^{1} Precision (Recall) d (Recall)

(17)

where tPositives represents the number of correctly identified attacks, tNegatives represents the number of correctly classified normal scenarios, fPositives represents misclassified normal traffic, and fNegatives is the number of misclassified attacks.
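The metrics in Equations (10) to (16) can be computed directly from confusion-matrix counts and prediction scores. The sketch below is illustrative: the function names are hypothetical, and the ROC-AUC is computed through its equivalent rank interpretation (the probability that a randomly chosen attack receives a higher score than a randomly chosen benign sample) rather than by numerical integration.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Threshold-based metrics from Equations (10) to (15)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }

def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (attack, benign) pairs ranked correctly; ties count one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, so it suits small checks. For full datasets, a sort-based implementation from a metrics library is the practical choice.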

4.3. Overall Performance

Table 4 and Table 5 present the averaged metrics across three random seeds along with their standard deviations. The proposed model achieved outstanding classification results, with accuracy greater than 98% in all datasets.

On the 5G-NIDD dataset, which simulates lightweight network intrusion patterns in 5G-enabled vehicular communication environments, the proposed model attains a mean accuracy of 99.90% ± 0.03%, a precision of 99.91% ± 0.05%, a recall of 99.83% ± 0.13%, and an F1-score of 99.87% ± 0.04%. Similarly, the AUC-ROC and PR-AUC scores both reach near-perfect values of 100.00% ± 0.00%, indicating outstanding discriminative power.

The model also performs well on the UNSW-NB15 dataset, a benchmark dataset for comprehensive network-based intrusions, achieving a mean accuracy of 98.97%, a precision of 96.25%, a recall of 95.61%, and an F1-score of 95.92%. The AUC-ROC and PR-AUC scores of 99.94% and 99.54%, respectively, demonstrate the capacity of the model to balance false positives and false negatives in attack detection.

On Edge-IIoTset, which is tailored to anomaly detection in IoT and IIoT network traffic for smart vehicle technology, the model yields a mean accuracy of 99.96%, 99.89% precision, 100.00% recall, and a 99.95% F1-score, with AUC-ROC and PR-AUC both at 100.00%. On the Car Hacking dataset, which focuses on CAN bus exploits, the model records 100.00% for accuracy, precision, recall, F1-score, AUC-ROC, and PR-AUC, indicating that attacks in these automotive hacking simulations did not evade detection.

The CICIoV2024 dataset provides comprehensive coverage of multiple attacks targeting the Internet of Vehicles (IoV) within mobility contexts. On this dataset, the model yields a mean accuracy of 100.00%, with all other metrics also at 100.00%, confirming its capability to cope with the dynamic stream of threats against the IoV. Lastly, on the VeReMi Extension dataset, which captures misbehavior in cooperative Intelligent Transportation Systems (C-ITSs), the model attained 98.24% mean accuracy, 99.25% precision, 95.99% recall, 97.88% F1-score, 98.46% AUC-ROC, and 98.79% PR-AUC.

Figure 2 presents the confusion matrices for the hybrid model, showing the numbers of true positives (attacks correctly detected), true negatives (normal traffic correctly classified), false positives (normal traffic classified as attacks), and false negatives (attacks missed). The proposed model performs perfectly on both CICIoV2024 and Car Hacking, with no misclassifications. On Edge-IIoTset, the proposed model misclassified only one out of 60,900 samples, with benign misclassifications totalling 47 out of 119,100. This indicates a mild bias toward benign classification but demonstrates improved sensitivity to attack traffic.

With regard to the VeReMi extension, 5G-NIDD, and UNSW-NB15, the proposed model has considerably improved false positive and false negative rates, with only 38 benign and 651 attack instances missed on VeReMi, 101 benign and 11 attack instances on 5G-NIDD, and 2349 benign and 1453 attack instances on UNSW-NB15.

The proposed CNN–LSTM–Attention framework has a mean F1-score of 98.94% with a standard deviation of 0.08% (Figure 3) across all datasets, showing a well-balanced trade-off between precision and recall. The results indicate that combining CNNs for spatial feature extraction, LSTMs for temporal modeling, and attention mechanisms for enhanced threat focus is effective. Compared with other frameworks, the proposed framework correctly classifies a greater proportion of both benign and attack traffic. This is an important consideration for real-world smart mobility deployments, where a high number of false negatives (missed attacks) can cause operational and safety issues. Additionally, the standard deviations of accuracy, F1-score, ROC-AUC, and PR-AUC are all below 0.3% and 0.5%, showing the model’s stability and the limited impact of random initialization and data shuffling on the CNN–LSTM–Attention hybrid model.
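The reported mean F1-score can be cross-checked against the six per-dataset F1-scores quoted earlier in this section. The snippet below simply averages those published values. It does not attempt to reproduce the standard deviation, which depends on per-seed results.

```python
# Per-dataset mean F1-scores (%) as quoted in Section 4.3
f1_by_dataset = {
    "5G-NIDD": 99.87,
    "UNSW-NB15": 95.92,
    "Edge-IIoTset": 99.95,
    "Car Hacking": 100.00,
    "CICIoV2024": 100.00,
    "VeReMi Extension": 97.88,
}

mean_f1 = sum(f1_by_dataset.values()) / len(f1_by_dataset)
print(f"{mean_f1:.2f}")  # 98.94
```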

4.3.1. Analysis of Perfect-Score Results

To further investigate the perfect classification performance observed in some datasets, and to address concerns regarding potential overfitting or optimistic evaluation, we conducted an additional validation using stratified k-fold cross-validation (k = 5) across all datasets. This evaluation complements the previously reported train/validation/test split and provides a more statistically robust assessment of the proposed model’s generalization capability. The mean and standard deviation of accuracy, precision, recall, F1-score, ROC-AUC, and PR-AUC across the five folds are reported in Table 6.
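Stratified k-fold cross-validation keeps the benign-to-attack ratio roughly constant in every fold. The following standard-library sketch illustrates the idea under simple assumptions (round-robin assignment within each class, a hypothetical function name). In practice, scikit-learn's `StratifiedKFold` provides the same functionality.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full set."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)          # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test
```

Each sample appears in exactly one test fold, so the five test-fold scores together cover the whole dataset once.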

The cross-validation results confirm the stability of the model performance across different data partitions. As shown in Table 6, the Car Hacking dataset maintains near-perfect performance across all folds. This dataset contains CAN bus traffic with structurally distinct attack patterns, including fuzzy attacks, DoS attacks, and spoofing via RPM and gear manipulation. These attacks introduce message patterns that are significantly different from benign CAN traffic, resulting in highly separable feature distributions. Previous studies such as IDS-DEC [28] and Wang et al. [16] have similarly reported near-perfect detection results on this dataset, suggesting that the classification task is inherently separable due to the clear distinction between normal and malicious CAN messages.

A similar trend is observed for the CICIoV2024 dataset [76], which was collected from real electronic control units (ECUs) within a controlled Internet of Vehicles (IoV) testbed. The dataset contains clearly defined CAN protocol attack scenarios with well-separated traffic patterns, which also contributes to the high separability between benign and attack classes. The consistent results obtained across multiple folds further confirm that the observed performance is not caused by a favorable random data split but reflects the intrinsic structure of the dataset.

Furthermore, Figure 4 provides a visual comparison of the mean F1-scores across all evaluated datasets, reinforcing the statistical consistency of the results reported in Table 6. Specifically, the model achieves a mean F1-score of 100% on the Car Hacking dataset, 99.99% on CICIoV2024, 99.95% on Edge-IIoTset, 99.78% on 5G-NIDD, 97.31% on the VeReMi Extension, and 95.74% on the UNSW-NB15 dataset.

4.3.2. Efficiency and Practical Considerations

In addition to accuracy and the F1-score, Table 5 reports the false positive rate (FPR), the false negative rate (FNR), inference latency, the number of parameters, and training time. As depicted in Figure 5, the model produces its highest error rates on two datasets. On UNSW-NB15, the FNR is roughly 4.4% and the FPR approximately 0.5%. On the VeReMi Extension dataset, the FNR is also high at around 4.0%, while the FPR is negligible. The model achieved near-perfect error rates on the Edge-IIoTset, Car Hacking, and CICIoV2024 datasets, where FPR and FNR were near 0% (or exactly 0% in some cases). The 5G-NIDD dataset also performs well, with both error rates below 0.2%.

To maintain low operational overhead, which is imperative for resource-constrained devices in mobility applications, we also measured inference latency, which averages 4.96 ms ± 1.26 ms per sample, allowing for real-time detection at over 200 inferences per second on conventional hardware. Inference latency is typically low, as illustrated in Figure 6a, though the Edge-IIoTset dataset represents a significant inference time bottleneck. In contrast, the VeReMi Extension and Car Hacking datasets are very quick to train, while the other datasets have average training times of 50 s to 60 s per epoch due to the number of samples in their training sets.
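The throughput figure follows from the per-sample latency by simple arithmetic. The sketch below shows the conversion and a generic way to time a prediction function. The helper names are hypothetical, and the timing loop is an assumption about how such a measurement could be taken, not a description of the authors' procedure.

```python
import time

def throughput_per_second(latency_ms):
    """Convert a per-sample latency in milliseconds to inferences per second."""
    return 1000.0 / latency_ms

def mean_latency_ms(predict, samples):
    """Time predict() on each sample and return the mean latency in milliseconds."""
    start = time.perf_counter()
    for x in samples:
        predict(x)
    return (time.perf_counter() - start) * 1000.0 / len(samples)

print(round(throughput_per_second(4.96), 1))  # 201.6
```

At one standard deviation above the mean (4.96 ms + 1.26 ms = 6.22 ms), the same arithmetic gives roughly 161 inferences per second, so the 200-per-second figure applies to the mean latency.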

The hybrid model also has a modest size of 0.28 M ± 0.06 M parameters and 3.72 MB, as shown in Figure 6b, which supports its deployment on embedded systems, such as in-vehicle ECUs and other resource-constrained systems. The model demonstrates good training efficiency, with an average epoch duration of 39.71 s ± 11.54 s. However, epoch duration varies considerably, by up to 23%, which indicates that efficiency and stability are strongly dependent on the dataset.

The CNN–LSTM–Attention mechanism maintains state-of-the-art performance with solid statistical certainty across all six smart mobility datasets that we investigated, as illustrated in Table 4 and Table 5. The model’s performance consistency and stability make it a strong generalizable model, and it is an adaptable option for the protection of smart mobility ecosystems from a wide variety of cyber threats.

4.4. Ablation Experiment

To assess the contribution of each component of the proposed CNN–LSTM–Attention model for intrusion detection, an ablation study was performed. Four variants of the model were studied: CNN-only, LSTM-only, Attention-only, and the complete model that combines all components. Table 7 summarizes the benchmark results across all six smart mobility datasets. Among the reduced configurations, the CNN-only model performed strongly, achieving an F1-score of over 97% on the majority of the datasets, which is attributed to its ability to model local spatial dependencies in the traffic flow data. The LSTM-only model, although designed to model temporal dependencies, achieved considerably lower F1-scores (e.g., 63.48% on UNSW-NB15 and 91.35% on 5G-NIDD), which indicates that temporal features in isolation are not sufficient for high detection performance. The Attention-only model performed worst (e.g., a 55.96% F1-score on 5G-NIDD), indicating that the attention mechanism depends on high-quality spatial and temporal features as input.

The proposed hybrid deep learning architecture, integrating convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and attention mechanisms, achieved the highest overall performance, with an F1-score of 98.94% ± 0.08%, surpassing all other configurations. This improvement supports the synergistic effect of combining CNN-based spatial feature extraction, LSTM-based temporal modeling, and attention-based feature refinement. In addition, the minimal standard deviations (< 0.3%) across datasets demonstrate the consistency and reliability of performance, emphasizing the versatility of our framework.

Additionally, the perfect scores reported in Table 7 should be interpreted with caution. The ablation study shows that simpler configurations, such as CNN-only and LSTM-only models, also achieve near-perfect performance on the Car Hacking and CICIoV2024 datasets, suggesting that the classification task may be inherently separable for these datasets. This can be attributed to the structurally distinct attack patterns in the Car Hacking dataset and the controlled data collection environment of CICIoV2024.

4.5. Comparison with the State of the Art

Table 8 compares the mean detection accuracy and F1-score of the proposed framework across six distinct datasets against existing works. However, most existing works are not directly aimed at smart mobility and are limited to specific facets of the smart mobility system. For example, some works only address vehicular CAN traffic using autoencoder- or GAN-based models [16,28], and some focus on the detection of misbehaving nodes in VANETs using the VeReMi Extension dataset [61]. The cloud and MaaS-based architectures of Zhang et al. [50], which focus on mobility security, address other verticals but remain within vertical silos without cross-integration with the IoV and other ecosystems. In the same vein, Usha et al. [32] focused on the detection of DDoS in smart transportation systems through an ANFIS-based IDS, which again covers only a specific area of smart mobility. Existing works are thus specific to certain datasets and domains and are infrequently tested in heterogeneous conditions that represent real-world mobility systems. In this regard, the proposed IDS has been tested on six distinct datasets (VeReMi Extension, Car Hacking, UNSW-NB15, Edge-IIoTset, CICIoV2024, and 5G-NIDD), allowing for a systematic evaluation of its detection capability across diverse network infrastructures and threat scenarios.

4.6. Discussion

The experimental results provide evidence for the efficacy of the proposed framework, as detailed in Section 4. The model performs strongly, consistently outperforming its competitors on six diverse datasets. In theory, this can be explained by the complementary nature of the model’s components (CNNs, LSTMs, and attention). The convolutional layers capture short-term patterns within the network traffic windows, such as byte-level anomalies or sudden spikes in packet traffic. The LSTM layers model longer temporal dependencies and help the system identify sequential attack behaviors that emerge over time. The attention mechanism helps to mitigate overfitting to irrelevant patterns by emphasizing significant temporal features and assigning them higher weights.

As shown in the performance metrics presented in Table 7, although the CNN-only and LSTM-only models score well on some evaluation metrics, their standalone performance is not consistent across all datasets, including the 5G-NIDD, VeReMi Extension, and CICIoV2024 datasets. The complementary inductive biases of the combined components therefore help the model improve across smart mobility environments and traffic distributions, which also explains why the model generalizes well across diverse smart mobility environments, as demonstrated in Table 4 and Table 5 and the confusion matrices depicted in Figure 2.

Nevertheless, the ablation results in Table 7 indicate that the contribution of each architectural component varies across datasets. On some datasets, such as 5G-NIDD and UNSW-NB15, the CNN-only configuration performs comparably to the full model. For instance, on 5G-NIDD, the CNN-only model achieves 99.83% accuracy compared to 99.90% for the full model, while on UNSW-NB15, the CNN-only model (99.03% accuracy, F1: 96.11%) is close to the full model (98.97%, F1: 95.92%). This suggests that for datasets where spatial feature patterns are already highly discriminative, the additional temporal and attention components provide only marginal gains.

However, the benefits of the hybrid architecture become more evident on datasets with stronger temporal dependencies or higher noise levels. For example, on the VeReMi Extension dataset, the proposed model slightly improves accuracy (98.24% vs. 98.17%) while substantially reducing the false positive rate (0.11% vs. 0.38%), indicating that the LSTM and attention mechanisms help suppress spurious detections and improve decision consistency. Similarly, on the Edge-IIoTset dataset, the full model achieves near-perfect performance (accuracy of 99.96%, F1-score of 99.95%, and AUC-ROC of 100%), demonstrating its effectiveness in complex IoT environments. Additionally, perfect scores observed on the Car Hacking and CICIoV2024 datasets (accuracy, F1-score, and AUC-ROC of 100%) should be interpreted with caution. The ablation results show that simpler models can also achieve near-perfect performance on these datasets, suggesting that the classification task may be inherently separable due to structurally distinct attack patterns and controlled data collection conditions.

Taken together, the results show that combining CNNs, LSTMs, and attention mechanisms improves the intrusion detection framework’s results across smart mobility datasets, achieving over 98% accuracy on all six datasets. The proposed CNN–LSTM–Attention framework demonstrates superior performance and robustness over various heterogeneous infrastructures. However, UNSW-NB15 remains the most challenging dataset due to increased false positives and false negatives, as evident in Figure 2f. This suggests that the proposed framework may need to be trained for more than 10 epochs, or that other approaches such as domain adaptation or transfer learning may need to be employed to mitigate the challenges posed by dynamic and complex vehicular communication environments. Overall, the results show that attention mechanisms improve the model’s focus on important traffic features, enabling better detection of less obvious, sophisticated attack behaviors while maintaining good benign traffic classification.

The proposed framework is effective for real-time intrusion detection and provides a robust and reliable defense against cyber attacks on smart mobility systems.

5. Conclusions

This study proposes a novel hybrid deep learning framework for intrusion detection across heterogeneous smart mobility networks, integrating CNNs, LSTM networks, and an attention mechanism to capture spatial features, temporal dependencies, and context-aware anomalies within diverse network traffic. The framework is evaluated on six benchmark datasets: VeReMi Extension, Car Hacking, 5G-NIDD, Edge-IIoTset, UNSW-NB15, and CICIoV2024; collectively comprising over one million instances representative of the heterogeneous nature of smart mobility networks. Across all datasets, the framework achieves a detection accuracy exceeding 98%, a mean F1-score of 98.94%, and minimal inference latency, demonstrating its suitability for real-time deployment. With a model size of 3.72 MB, the framework is further deployable on resource-constrained edge devices such as in-vehicle electronic control units, extending its practical applicability across the smart mobility stack. Future work will extend the framework to support both binary and multiclass classification for more comprehensive threat coverage. Real-time evaluation using live data streams from smart mobility systems will be explored to validate practical deployment, and the integration of Explainable AI (XAI) techniques will be pursued to enhance model interpretability, fostering greater transparency and stakeholder trust in model predictions.

Author Contributions

Conceptualization, V.C., A.D.B. and P.A.; methodology, O.E. and B.A.; software, P.A.; validation and formal analysis, O.E. and B.A.; investigation, P.A. and V.C.; resources, V.C.; data curation, O.E.; writing—original draft preparation, O.E.; writing—review and editing, O.E.; visualization, B.A.; supervision and project administration, V.C., A.D.B. and P.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by Hitachi Rail through a National Ph.D Scholarship in Cybersecurity, and partially supported by MOST—Centro Nazionale per la MObilita SosTenibile Project—Spoke 8 (MaaS e Servizi Innovativi).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets analyzed during this study are publicly available benchmark datasets. VeReMi Extension is available at https://github.com/josephkamel/VeReMi-Dataset (accessed on 2 December 2025); Car Hacking at https://ocslab.hksecurity.net/Datasets/car-hacking-dataset (accessed on 2 December 2025); 5G-NIDD at https://ieee-dataport.org/documents/5g-nidd-comprehensive-network-intrusion-detection-dataset-generated-over-5g-wireless (accessed on 6 December 2025); Edge-IIoTset at https://ieee-dataport.org/documents/edge-iiotset-new-comprehensive-realistic-cyber-security-dataset-iot-and-iiot-applications (accessed on 6 December 2025); UNSW-NB15 at https://research.unsw.edu.au/projects/unsw-nb15-dataset (accessed on 8 December 2025); and CICIoV2024 at https://www.unb.ca/cic/datasets/iov-dataset-2024.html (accessed on 10 December 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
ANFIS	Adaptive Neuro-Fuzzy Inference System
AUC	Area Under the Curve
CAN	Controller Area Network
CAV	Connected and Autonomous Vehicle
CNN	Convolutional Neural Network
DDoS	Distributed Denial of Service
DL	Deep Learning
DoS	Denial of Service
ECU	Electronic Control Unit
EV	Electric Vehicle
EVCS	Electric Vehicle Charging Station
FNR	False Negative Rate
FPR	False Positive Rate
GAN	Generative Adversarial Network
GPU	Graphics Processing Unit
IDS	Intrusion Detection System
IoT	Internet of Things
IoV	Internet of Vehicles
ITS	Intelligent Transportation System
IVN	In-Vehicle Network
LSTM	Long Short-Term Memory
MaaS	Mobility-as-a-Service
ML	Machine Learning
PR-AUC	Precision–Recall Area Under the Curve
RNN	Recurrent Neural Network
ROC	Receiver Operating Characteristic
RSU	Roadside Unit
STD	Standard Deviation
V2I	Vehicle-to-Infrastructure
V2V	Vehicle-to-Vehicle
V2X	Vehicle-to-Everything
VANET	Vehicular Ad Hoc Network

References

Zhu, J.; Gianoli, A.; Noori, N.; de Jong, M.; Edelenbos, J. How different can smart cities be? A typology of smart cities in China. Cities 2024, 149, 104992.
Šurdonja, S.; Giuffrè, T.; Deluka-Tibljaš, A. Smart mobility solutions—Necessary precondition for a well-functioning smart city. Transp. Res. Procedia 2020, 45, 604–611.
Gracias, J.S.; Parnell, G.S.; Specking, E.; Pohl, E.A.; Buchanan, R. Smart cities—A structured literature review. Smart Cities 2023, 6, 1719–1743.
Lee, D.; Camacho, D.; Jung, J.J. Smart mobility with Big Data: Approaches, applications, and challenges. Appl. Sci. 2023, 13, 7244.
EMQX Team. The Road to Smart Mobility: Opportunities and Challenges. EMQX Blog 2023. Available online: https://www.emqx.com/en/blog/the-road-to-smart-mobility (accessed on 29 November 2025).
Rocco Di Torrepadula, F.; Di Martino, S.; Mazzocca, N.; Sannino, P. A Reference Architecture for Data-Driven Intelligent Public Transportation Systems. IEEE Open J. Intell. Transp. Syst. 2024, 5, 469–482.
Ramírez-Moreno, M.A.; Keshtkar, S.; Padilla-Reyes, D.A.; Ramos-López, E.; García-Martínez, M.; Hernández-Luna, M.C.; Mogro, A.E.; Mahlknecht, J.; Huertas, J.I.; Peimbert-García, R.E.; et al. Sensors for sustainable smart cities: A review. Appl. Sci. 2021, 11, 8198.
Alam, T.; Gupta, R.; Nasurudeen Ahamed, N.; Ullah, A.; Almaghthwi, A. Smart mobility adoption in sustainable smart cities to establish a growing ecosystem: Challenges and opportunities. MRS Energy Sustain. 2024, 11, 304–316.
Paiva, S.; Ahad, M.A.; Tripathi, G.; Feroz, N.; Casalino, G. Enabling technologies for urban smart mobility: Recent trends, opportunities and challenges. Sensors 2021, 21, 2143.
Karopoulos, G.; Kambourakis, G.; Chatzoglou, E.; Hernández-Ramos, J.L.; Kouliaridis, V. Demystifying in-vehicle Intrusion Detection Systems: A survey of surveys and a meta-taxonomy. Electronics 2022, 11, 1072.
Lampe, B.; Meng, W. A survey of deep learning-based intrusion detection in automotive applications. Expert Syst. Appl. 2023, 221, 119771.
Arya, M.; Sastry, H.; Dewangan, B.K.; Rahmani, M.K.I.; Bhatia, S.; Muzaffar, A.W.; Bivi, M.A. Intruder detection in VANET data streams using federated learning for smart city environments. Electronics 2023, 12, 894.
Bhavsar, M.H.; Bekele, Y.B.; Roy, K.; Kelly, J.C.; Limbrick, D. FL-IDS: Federated Learning-Based Intrusion Detection System Using Edge Devices for Transportation IoT. IEEE Access 2024, 12, 52215–52226.
Zhao, J.; Rao, X.; Liu, J.; Guo, Y.; Yang, B. CVAR-FL IoV Intrusion Detection Framework. In Proceedings of the International Conference on Information Security Practice and Experience; Springer: Singapore, 2023; pp. 123–137.
Khan, H.; Tejani, G.G.; AlGhamdi, R.; Alasmari, S.; Sharma, N.K.; Sharma, S.K. A secure and efficient deep learning-based intrusion detection framework for the internet of vehicles. Sci. Rep. 2025, 15, 12236.
Wang, X.; Xu, Y.; Xu, Y.; Wang, Z.; Wu, Y. Intrusion Detection System for In-Vehicle CAN-FD Bus ID Based on GAN Model. IEEE Access 2024, 12, 82402–82412.
Levy, E.; Shabtai, A.; Groza, B.; Murvay, P.S.; Elovici, Y. CAN-LOC: Spoofing Detection and Physical Intrusion Localization on an In-Vehicle CAN Bus Based on Deep Features of Voltage Signals. IEEE Trans. Inf. Forensics Secur. 2023, 18, 4800–4814.
Onur, F.; Barışkan, M.A.; Gönen, S.; Kubat, C.; Tunay, M.; Yılmaz, E.N. Detection of Cyber Attacks Targeting Autonomous Vehicles Using Machine Learning. In Proceedings of the Advances in Intelligent Manufacturing and Service System Informatics (IMSS 2023); Lecture Notes in Mechanical Engineering; Şen, Z., Uygun, Ö., Erden, C., Eds.; Springer: Singapore, 2024.
Aloqaily, A.; Abdallah, E.E.; AbuZaid, H.; Abdallah, A.E.; Al-hassan, M. Supervised machine learning for real-time intrusion attack detection in connected and autonomous vehicles: A security paradigm shift. Informatics 2025, 12, 4.
Campos, E.M.; Hernandez-Ramos, J.L.; Vidal, A.G.; Baldini, G.; Skarmeta, A. Misbehavior detection in intelligent transportation systems based on federated learning. Internet Things 2024, 25, 101127.
Qin, J.; Xun, Y.; Liu, J. CVMIDS: Cloud–Vehicle Collaborative Intrusion Detection System for Internet of Vehicles. IEEE Internet Things J. 2024, 11, 321–332.
Kalpani, N.; Rodrigo, N.; Seneviratne, D.; Ariyadasa, S.; Senanayake, J. Cutting-edge approaches in Intrusion Detection Systems: A systematic review of deep learning, reinforcement learning, and ensemble techniques. Iran J. Comput. Sci. 2025, 8, 303–333.
Khan, J.A.; Lim, D.W.; Kim, Y.S. A Deep Learning-Based IDS for Automotive Theft Detection for In-Vehicle CAN Bus. IEEE Access 2023, 11, 112814–112829.
Wei, P.; Wang, B.; Dai, X.; Li, L.; He, F. A novel intrusion detection model for the CAN bus packet of in-vehicle network based on attention mechanism and autoencoder. Digit. Commun. Netw. 2023, 9, 14–21.
Kang, H.; Vo, T.; Kim, H.K.; Hong, J.B. CANival: A multimodal approach to intrusion detection on the vehicle CAN bus. Veh. Commun. 2024, 50, 100845.
Yin, L.; Xu, J.; Wang, C.; Wang, Q.; Zhou, F. Detecting CAN overlapped voltage attacks with an improved voltage-based in-vehicle Intrusion Detection System. J. Syst. Archit. 2023, 143, 102957.
Jichici, C.; Berdich, A.; Musuroi, A.; Groza, B. Control System Level Intrusion Detection on J1939 Heavy-Duty Vehicle Buses. IEEE Trans. Ind. Inform. 2024, 20, 2029–2041.
Shi, J.; Xie, Z.; Dong, L.; Jiang, X.; Jin, X. IDS-DEC: A novel intrusion detection for CAN bus traffic based on deep embedded clustering. Veh. Commun. 2024, 49, 100830.
Almadhor, A.; Alsubai, S.; Bouazzi, I.; Karovic, V.; Davidekova, M.; Al Hejaili, A.; Sampedro, G.A. Transfer learning for securing electric vehicle charging infrastructure from cyber-physical attacks. Sci. Rep. 2025, 15, 9331.
Hamdare, S.; Kaiwartya, O.; Aljaidi, M.; Jugran, M.; Cao, Y.; Kumar, S.; Mahmud, M.; Brown, D.; Lloret, J. Cybersecurity risk analysis of electric vehicles charging stations. Sensors 2023, 23, 6716.
Skarga-Bandurova, I.; Kotsiuba, I.; Biloborodova, T. Cyber Security of Electric Vehicle Charging Infrastructure: Open Issues and Recommendations. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17–20 December 2022; pp. 3099–3106. [Google Scholar] [CrossRef]
Usha, G.; Karthikeyan, H.; Gautam, K.; Pachauri, N. DDoS attack detection in intelligent transport systems using adaptive neuro-fuzzy inference system. Sci. Rep. 2025, 15, 20597. [Google Scholar] [CrossRef]
Weerasinghe, N.; Usman, M.A.; Hewage, C.; Pfluegel, E.; Politis, C. Threshold cryptography-based secure vehicle-to-everything (V2X) communication in 5G-enabled intelligent transportation systems. Future Internet 2023, 15, 157. [Google Scholar] [CrossRef]
Chowdhury, A.; Naha, R.; Kaisar, S.; Khoshkholghi, M.A.; Ali, K.; Galletta, A. Information Fusion-based Cybersecurity Threat Detection for Intelligent Transportation System. In Proceedings of the 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW), Bangalore, India, 1–4 May 2023; pp. 96–103. [Google Scholar] [CrossRef]
Li, H.; Ji, Y.; Wang, Z. A New Hybrid Hierarchical Roadside Unit Deployment Scheme Combined with Parking Cars. Appl. Sci. 2024, 14, 7032. [Google Scholar] [CrossRef]
Channamallu, S.S.; Kermanshachi, S.; Rosenberger, J.M.; Pamidimukkala, A. A review of smart parking systems. Transp. Res. Procedia 2023, 73, 289–296. [Google Scholar] [CrossRef]
Ekpo, O.; Casola, V.; De Benedictis, A. Security and Privacy Issues in Mobility-as-a-Service (MaaS): A Systematic Review. In Proceedings of the 2024 19th Annual System of Systems Engineering Conference (SoSE), Tacoma, WA, USA, 23–26 June 2024; pp. 300–307. [Google Scholar] [CrossRef]
Alderete Peralta, A.; Balta-Ozkan, N.; Li, S. The road not taken yet: A review of cyber security risks in mobility-as-a-service (MaaS) ecosystems and a research agenda. Res. Transp. Bus. Manag. 2024, 56, 101162. [Google Scholar] [CrossRef]
Belen-Saglam, R.; Yuan, H.; Heering, M.S.; Ashraf, R.; Li, S. A Systematic Literature Review on Cyber Security and Privacy Risks in MaaS (Mobility-as-a-Service) Systems. Information 2025, 16, 514. [Google Scholar] [CrossRef]
Chu, K.F.; Yuan, H.; Yuan, J.; Guo, W.; Balta-Ozkan, N.; Li, S. A Survey of Artificial Intelligence-Related Cybersecurity Risks and Countermeasures in Mobility-as-a-Service. IEEE Intell. Transp. Syst. Mag. 2024, 16, 37–55. [Google Scholar] [CrossRef]
Isik, G.K.; Eker, A.; Tryfonas, T.; Oikonomou, G. Security Risk Analysis of Logistical Support Solutions for MaaS and DLT-based Mitigations. In Proceedings of the 2025 IEEE International Conference on Cyber Security and Resilience (CSR), Chania, Crete, Greece, 4–6 August 2025; pp. 562–569. [Google Scholar] [CrossRef]
Attou, H.; Guezzaz, A.; Benkirane, S.; Azrour, M.; Farhaoui, Y. Cloud-Based Intrusion Detection Approach Using Machine Learning Techniques. Big Data Min. Anal. 2023, 6, 311–320. [Google Scholar] [CrossRef]
Ansari, L. Enhanced cloud security with Bi-Optimized Sand Cat Swarm and Conv-Bi-ALSTM deep learning models. Expert Syst. Appl. 2025, 286, 128128. [Google Scholar] [CrossRef]
Senthilkumar, G.; Tamilarasi, K.; Periasamy, J. Cloud intrusion detection framework using variational auto encoder Wasserstein generative adversarial network optimized with archerfish hunting optimization algorithm. Wirel. Netw. 2024, 30, 1383–1400. [Google Scholar] [CrossRef]
Ferrag, M.A.; Ndhlovu, M.; Tihanyi, N.; Cordeiro, L.C.; Debbah, M.; Lestable, T.; Thandi, N.S. Revolutionizing Cyber Threat Detection With Large Language Models: A Privacy-Preserving BERT-Based Lightweight Model for IoT/IIoT Devices. IEEE Access 2024, 12, 23733–23750. [Google Scholar] [CrossRef]
Tirulo, A.; Chauhan, S.; Shafie-khah, M. LLM-powered threat intelligence: Proactive detection of zero-day attacks in electric vehicle cyber-physical systems. Sustain. Energy Grids Netw. 2025, 43, 101877. [Google Scholar] [CrossRef]
Zou, X.; Jiang, X.; Huang, R.; He, H.; Kapoor, P.; Zhao, J. CloudAnoAgent: Anomaly Detection for Cloud Sites via LLM Agent with Neuro-Symbolic Mechanism. arXiv 2025. [Google Scholar] [CrossRef]
Feng, Y.; Sakurai, K. Network Intrusion Detection: Evolution from Conventional Approaches to LLM Collaboration and Emerging Risks. arXiv 2025, arXiv:2510.23313. [Google Scholar] [CrossRef]
Isgandarov, E.; Cederle, M.; Chiariotti, F.; Susto, G.A. Towards Explainable Anomaly Detection in Shared Mobility Systems. arXiv 2025, arXiv:2507.15643. [Google Scholar] [CrossRef]
Zhang, S.; Xu, T.; Zhu, J.; Sun, Y.; Jin, P.; Shi, B.; Pei, D. Privacy-preserving MTS anomaly detection for network devices through federated learning. Inf. Sci. 2025, 690, 121590. [Google Scholar] [CrossRef]
Khezri, E.; Hassanzadeh, H.; Yahya, R.O.; Mir, M. Security Challenges in Internet of Vehicles (IoV) for ITS: A Survey. Tsinghua Sci. Technol. 2025, 30, 1700–1723. [Google Scholar] [CrossRef]
Kong, S.; Wang, K.; Feng, C.; Wang, J. Smart cities and transportation based vehicle-to-vehicle communication and cyber security analysis using machine learning model in 6G network. Wirel. Pers. Commun. 2024, 1–19. [Google Scholar] [CrossRef]
Karim, S.M.; Habbal, A.; Chaudhry, S.A.; Irshad, A. BSDCE-IoV: Blockchain-Based Secure Data Collection and Exchange Scheme for IoV in 5G Environment. IEEE Access 2023, 11, 36158–36175. [Google Scholar] [CrossRef]
Sedar, R.; Kalalas, C.; Vázquez-Gallego, F.; Alonso, L.; Alonso-Zarate, J. A Comprehensive Survey of V2X Cybersecurity Mechanisms and Future Research Paths. IEEE Open J. Commun. Soc. 2023, 4, 325–391. [Google Scholar] [CrossRef]
Huang, W.; Xu, H.; Gong, Y.; Liu, Z.; Li, F.; Lin, Z.; Hu, B.J. UltraADV: An Unsupervised Deep Learning Lightweight Framework for Anomaly Detection in V2X. IEEE Internet Things J. 2025, 12, 12735–12747. [Google Scholar] [CrossRef]
Gebrezgiher, Y.T.; Jeremiah, S.R.; Gritzalis, S.; Park, J.H. VAE-Based Real-Time Anomaly Detection Approach for Enhanced V2X Communication Security. Appl. Sci. 2025, 15, 6739. [Google Scholar] [CrossRef]
Moushi, O.M.; Gunawardena, C.; Ye, F.; Hu, R.Q.; Qian, Y. Machine Learning-Based Detection of Data Replay and Data Replay Sybil Attacks for Vehicular Communication Networks. In Proceedings of the ICC 2024—IEEE International Conference on Communications, Denver, CO, USA, 9–13 June 2024; pp. 5202–5207. [Google Scholar] [CrossRef]
Kumar, A.; Shahid, M.A.; Jaekel, A.; Zhang, N.; Kneppers, M. Machine learning based detection of replay attacks in VANET. In Proceedings of the NOMS 2023—2023 IEEE/IFIP Network Operations and Management Symposium, Miami, FL, USA, 8–12 May 2023; pp. 1–6. [Google Scholar] [CrossRef]
Baharlouei, H.; Makanju, A.; Zincir-Heywood, N. Exploring Real-Time Malicious Behaviour Detection in VANETs. In Proceedings of the Int’l ACM Symposium on Design and Analysis of Intelligent Vehicular Networks and Applications, Montreal, QC, Canada, 30 October–3 November 2023; DIVANet ’23. pp. 1–8. [Google Scholar] [CrossRef]
Baharlouei, H.; Makanju, A.; Zincir-Heywood, N. Evaluating the Robustness of ADVENT on the VeReMi-Extension Dataset. In Proceedings of the 2024 20th International Conference on Network and Service Management (CNSM), Prague, Czech Republic, 28–31 October 2024; pp. 1–7. [Google Scholar] [CrossRef]
Slama, O.; Tarhouni, M.; Zidi, S.; Alaya, B. One versus all binary tree method to classify misbehaviors in imbalanced VeReMi dataset. IEEE Access 2023, 11, 135944–135958. [Google Scholar] [CrossRef]
Fu, M.; Wang, P.; Liu, M.; Zhang, Z.; Zhou, X. IoV-BERT-IDS: Hybrid Network Intrusion Detection System in IoV Using Large Language Models. IEEE Trans. Veh. Technol. 2025, 74, 1909–1921. [Google Scholar] [CrossRef]
Kamel, J.; Wolf, M.; van der Heijden, R.W.; Kaiser, A.; Urien, P.; Kargl, F. VeReMi Extension: A Dataset for Comparable Evaluation of Misbehavior Detection in VANETs. In Proceedings of the ICC 2020 - 2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
Sharma, P.; Liu, H. A Machine-Learning-Based Data-Centric Misbehavior Detection Model for Internet of Vehicles. IEEE Internet Things J. 2021, 8, 4991–4999.
Alladi, T.; Gera, B.; Agrawal, A.; Chamola, V.; Yu, F.R. DeepADV: A Deep Neural Network Framework for Anomaly Detection in VANETs. IEEE Trans. Veh. Technol. 2021, 70, 12013–12023.
Alladi, T.; Kohli, V.; Chamola, V.; Yu, F.R.; Guizani, M. Artificial Intelligence (AI)-Empowered Intrusion Detection Architecture for the Internet of Vehicles. IEEE Wirel. Commun. 2021, 28, 144–149.
Song, H.M.; Woo, J.; Kim, H.K. In-vehicle network intrusion detection using deep convolutional neural network. Veh. Commun. 2020, 21, 100198.
Alshathri, S.; Sayed, A.; Hemdan, E.E.D. An Intelligent Attack Detection Framework for the Internet of Autonomous Vehicles with Imbalanced Car Hacking Data. World Electr. Veh. J. 2024, 15, 356.
Alsaade, F.W.; Al-Adhaileh, M.H. Cyber Attack Detection for Self-Driving Vehicle Networks Using Deep Autoencoder Algorithms. Sensors 2023, 23, 4086.
Samarakoon, S.; Siriwardhana, Y.; Porambage, P.; Liyanage, M.; Chang, S.Y.; Kim, J.; Kim, J.; Ylianttila, M. 5G-NIDD: A Comprehensive Network Intrusion Detection Dataset Generated over 5G Wireless Network. arXiv 2022, arXiv:2212.01298.
Farzaneh, B.; Shahriar, N.; Muktadir, A.H.A.; Towhid, M.S.; Khosravani, M.S. DTL-5G: Deep transfer learning-based DDoS attack detection in 5G and beyond networks. Comput. Commun. 2024, 228, 107927.
Harshdeep, K.; Sumalatha, K.; Mathur, R. DeepTransIDS: Transformer-Based Deep learning Model for Detecting DDoS Attacks on 5G NIDD. Results Eng. 2025, 26, 104826.
Ilias, L.; Doukas, G.; Lamprou, V.; Ntanos, C.; Askounis, D. Convolutional Neural Networks and Mixture of Experts for Intrusion Detection in 5G Networks and beyond. arXiv 2025, arXiv:2412.03483.
Ferrag, M.A.; Friha, O.; Hamouda, D.; Maglaras, L.; Janicke, H. Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications: Centralized and Federated Learning. IEEE Access 2022, 10, 40281–40306.
Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6.
Neto, E.C.P.; Taslimasa, H.; Dadkhah, S.; Iqbal, S.; Xiong, P.; Rahman, T.; Ghorbani, A.A. CICIoV2024: Advancing realistic IDS approaches against DoS and spoofing attack in IoV CAN bus. Internet Things 2024, 26, 101209.
Merzouk, M.A.; Neal, C.; Delas, J.; Yaich, R.; Boulahia-Cuppens, N.; Cuppens, F. Adversarial robustness of deep reinforcement learning-based intrusion detection. Int. J. Inf. Secur. 2024, 23, 3625–3651.
Mahdi, Z.S.; Zaki, R.M.; Alzubaidi, L. Advanced hybrid techniques for cyberattack detection and defense in IoT networks. Secur. Priv. 2025, 8, e471.

Figure 1. Architecture of the proposed framework for intrusion detection in smart mobility systems. Here, ‘#Features’ denotes the number of input features provided to the model.

Figure 2. Confusion matrices for the hybrid model evaluated on (a) VeReMi Extension, (b) 5G-NIDD, (c) CICIoV2024, (d) Edge-IIoTset, (e) Car Hacking, and (f) UNSW-NB15.

Figure 3. F1-Score analysis across smart mobility datasets.

Figure 4. Cross-validation F1-score performance of the proposed model across six datasets, (a) 5G-NIDD, (b) Edge-IIoTset, (c) UNSW-NB15, (d) Car Hacking, (e) CICIoV2024 and (f) VeReMi Extension, showing consistent performance across five folds. The dashed line represents the mean F1-score.

Figure 5. False Positive Rate (FPR) and False Negative Rate (FNR) across smart mobility datasets.

Figure 6. System Performance and Model Characteristics.

Table 1. Summary of Related Work on IDSs for Smart Mobility Components.

Year	Ref.	Focus	Key Methodology	Limitations in Smart Mobility Context
2023	[24]	CAN bus anomalies	AMAEID (autoencoder + attention)	Binary message focus; no IoT/edge; limits scalability to broader smart mobility environments.
2023	[42]	Cloud IDS	Random Forest with feature engineering	Cloud-centric IDS; excludes IoV and real-time telematics integration
2023	[61]	VANET misbehavior	OVA-BT classifier ensemble	Focused on VANET misbehavior; lacks V2I integration and digital-twin validation
2023	[58]	VANET replay attacks	ML-based detection	Replay-attack detection limited to VANET; no 5G or cloud-based scalability considered
2024	[16]	CAN-FD bus ID attacks	Improved GAN	No validation across V2X/IoV domains or heterogeneous infrastructures
2024	[28]	Unsupervised CAN traffic	IDS-DEC (LCAE + entropy)	Effective on CAN datasets but excludes RSU/5G-enabled traffic and large-scale mobility scenarios
2024	[44]	Cloud intrusion detection	VAWGAN with archerfish hunting optimization	Cloud-centric; no MaaS or V2X coverage
2024	[60]	VANET attack detection	ADVENT with federated learning	VANET-focused; overlooks MaaS/cloud cooperation and broader cross-infrastructure security needs
2024	[57]	VANET replay/DDoS	ML with reformulated features	VANET-focused; lacks V2P integration and big-data scalability for ITS
2024	[16]	CAN-FD bus ID attacks	Improved GAN (pre-processing + detection modules)	Ignores V2X/IoV integration; vehicular-only
2024	[25]	Multimodal CAN intrusions	TIL model + optimized CANet	CAN-specific; overlooks V2I/V2P communication and broader IoV interoperability
2025	[32]	DDoS in ITS	ANFIS-based detection	Effective for vehicular DDoS, but excludes IoT and edge platforms
2025	[29]	EVCS cyber-physical attacks	Transfer learning (DNN, LSTM-RNN)	EVCS-only; ignores dependencies with MaaS, traffic systems, and cooperative ITS security layers
2025	[39]	MaaS security/privacy	Systematic review of 87 papers	Survey on MaaS; lacks a practical implementation of defense mechanisms
2025	[50]	MaaS anomaly detection	Federated learning (OmniFed)	MaaS privacy preserved, but excludes RSU/C-V2X latency and heterogeneous traffic integration
2025	[49]	Shared mobility anomalies	Isolation Forest, DIFFI	Shared mobility only; no V2G or edge computing
2025	[56]	V2X anomaly detection	VAE with CNN, sliding window	V2X IDS only; lacks interaction with urban IoT

Table 2. Distribution of benign and attack traffic across the six datasets employed for evaluating the proposed model.

Class	VeReMi Extension	5G-NIDD	UNSW-NB15	Car Hacking	CICIoV2024	Edge-IIoTset
Benign (Label = 0)	165,373	738,153	2,218,764	67,688	1,223,737	794,117
Attack (Label = 1)	118,783	477,737	321,283	1,132,312	184,482	405,883
Total	284,156	1,215,890	2,540,047	1,200,000	1,408,219	1,200,000
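
The class balance in Table 2 differs sharply between datasets, which matters when reading the accuracy figures later on. As a minimal sketch (not part of the authors' pipeline), the snippet below recomputes the totals and the attack share per dataset from the counts in Table 2. The names `TABLE_2` and `attack_share` are illustrative.

```python
# Counts copied from Table 2: (benign, attack) per dataset.
TABLE_2 = {
    "VeReMi Extension": (165_373, 118_783),
    "5G-NIDD": (738_153, 477_737),
    "UNSW-NB15": (2_218_764, 321_283),
    "Car Hacking": (67_688, 1_132_312),
    "CICIoV2024": (1_223_737, 184_482),
    "Edge-IIoTset": (794_117, 405_883),
}


def attack_share(benign: int, attack: int) -> float:
    """Fraction of samples labelled as attack (Label = 1)."""
    return attack / (benign + attack)


for name, (benign, attack) in TABLE_2.items():
    total = benign + attack
    share = attack_share(benign, attack)
    print(f"{name:17s} total={total:>9,d}  attack share={share:6.1%}")
```

On these counts, Car Hacking is roughly 94% attack traffic while UNSW-NB15 is roughly 13%, so a trivial majority-class predictor would already score very differently on each dataset.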

Table 3. Optimal Hyperparameters for the Proposed Hybrid Deep Learning Model.

Model Architecture	Hyperparameter	Value
CNN Layers	CNN Filters/Kernel Size	128 (1st Layer), 16 (2nd Layer)/7
CNN Layers	Padding/MaxPool Size	same/2
LSTM Layers	LSTM Units	256 (1st Layer), 96 (2nd Layer), 48 (3rd Layer)
LSTM Layers	Dropout	0.2 (standard), 0.4 (recurrent)
LSTM Layers	Return Sequences	True (all layers)
Attention & Dense	Attention	Dense Units = 1, Softmax Axis = 1
Attention & Dense	Dense Units/Activation	32 (1st Layer), 48 (2nd Layer)/ReLU
Attention & Dense	Output Layer	Units = 1, Activation = sigmoid, Threshold = 0.5
Regularization	Dropout Rate	0.2
Regularization	L2 Regularization Lambda	0.001
Training and Callbacks	Optimizer	Adam (learning rate = 0.001)
Training and Callbacks	Loss Function/Metrics	Binary crossentropy/Accuracy
Training and Callbacks	EarlyStopping	Restore Best Weights = True, Patience = 15, Monitor = val_loss/loss
Training and Callbacks	ReduceLROnPlateau	Factor = 0.5, Patience = 7, Min LR = $1 \times 10^{-6}$, Monitor = val_loss/loss
Miscellaneous	BatchNormalization	Default parameters
Miscellaneous	Activations	ReLU (hidden layers), None (attention score)
Miscellaneous	Lambda Reduce_Sum Axis	1
Miscellaneous	Epochs/Batch Size	10/4096
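
The attention rows in Table 3 (a Dense layer with one unit and no activation, a softmax over axis 1, and a reduce-sum over axis 1) describe a common attention-pooling pattern. The sketch below illustrates that pattern in plain Python for a single sequence. The function name `attention_pool`, the toy weights, and the toy inputs are assumptions for illustration, not the authors' code.

```python
import math
from typing import List, Tuple


def attention_pool(
    hidden: List[List[float]],  # LSTM outputs, shape (timesteps, units)
    weights: List[float],       # Dense(1) kernel, shape (units,)
    bias: float = 0.0,          # Dense(1) bias
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for one sequence."""
    # Dense(1) with no activation: one raw score per timestep.
    scores = [sum(w * h for w, h in zip(weights, step)) + bias for step in hidden]
    # Softmax over the time axis (axis 1 of a batched tensor).
    # Subtracting the maximum keeps the exponentials numerically stable.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Weighted sum over the time axis collapses the sequence to one vector.
    units = len(hidden[0])
    context = [
        sum(a * step[u] for a, step in zip(alphas, hidden)) for u in range(units)
    ]
    return alphas, context


# Toy example: 3 timesteps, 2 units (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = attention_pool(hidden_states, weights=[0.5, 0.5])
print("attention weights:", alphas)
print("context vector:", context)
```

The third timestep has the largest raw score, so it receives the largest weight and contributes most to the context vector that would be passed to the dense layers.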

Table 4. Comprehensive Multi-Dataset Performance Benchmarking of the Hybrid Model for Intrusion Detection. Results are presented as Mean ± Standard Deviation for both Threshold-based and Threshold-free Metrics.

Dataset	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	AUC-ROC (%)	PR-AUC (%)
	Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD
5G-NIDD	$99.90 \pm 0.03$	$99.91 \pm 0.05$	$99.83 \pm 0.13$	$99.87 \pm 0.04$	$100.00 \pm 0.00$	$100.00 \pm 0.01$
UNSW-NB15	$98.97 \pm 0.05$	$96.25 \pm 0.99$	$95.61 \pm 1.20$	$95.92 \pm 0.21$	$99.94 \pm 0.00$	$99.58 \pm 0.04$
Edge-IIoTset	$99.96 \pm 0.01$	$99.89 \pm 0.03$	$100.00 \pm 0.00$	$99.95 \pm 0.02$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
Car Hacking	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
CICIoV2024	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
VeReMi Extension	$98.24 \pm 0.17$	$99.85 \pm 0.09$	$95.99 \pm 0.49$	$97.88 \pm 0.21$	$98.46 \pm 0.42$	$98.79 \pm 0.27$
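
The abstract's mean F1 score of 98.94% and the 99.51% accuracy listed for this work in Table 8 appear to be unweighted averages of the per-dataset means in Table 4. A short check, assuming equal weight per dataset (the source does not state the averaging rule, and the name `TABLE_4` is illustrative):

```python
from statistics import mean

# Per-dataset means copied from Table 4: (accuracy %, F1-score %).
TABLE_4 = {
    "5G-NIDD": (99.90, 99.87),
    "UNSW-NB15": (98.97, 95.92),
    "Edge-IIoTset": (99.96, 99.95),
    "Car Hacking": (100.00, 100.00),
    "CICIoV2024": (100.00, 100.00),
    "VeReMi Extension": (98.24, 97.88),
}

# Unweighted (macro) average over the six datasets.
mean_accuracy = round(mean([acc for acc, _ in TABLE_4.values()]), 2)
mean_f1 = round(mean([f1 for _, f1 in TABLE_4.values()]), 2)

print(f"mean accuracy = {mean_accuracy:.2f}%")
print(f"mean F1-score = {mean_f1:.2f}%")
```

Under that assumption the averages come out at 99.51% accuracy and 98.94% F1, matching the headline figures.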

Table 5. Model Performance (FPR and FNR) and Efficiency Metrics Across Multiple Datasets.

Dataset	FPR (%)	FNR (%)	Inference Latency (ms/Sample)	#Params (K)	Size (MB)	AT/Epoch (s)
	Mean ± STD	Mean ± STD	Mean ± STD			Mean ± STD
5G-NIDD	$0.06 \pm 0.03$	$0.17 \pm 0.13$	$2.22 \pm 0.04$	$156$	$1.91$	$54.68 \pm 0.13$
UNSW-NB15	$0.54 \pm 0.16$	$4.39 \pm 1.20$	$1.75 \pm 0.05$	$156$	$1.91$	$60.29 \pm 1.81$
Edge-IIoTset	$0.05 \pm 0.02$	$0.00 \pm 0.00$	$21.40 \pm 7.25$	$156$	$1.91$	$50.29 \pm 0.41$
Car Hacking	$0.00 \pm 0.00$	$0.00 \pm 0.00$	$1.51 \pm 0.08$	$463$	$5.43$	$15.24 \pm 0.21$
CICIoV2024	$0.00 \pm 0.00$	$0.00 \pm 0.00$	$1.44 \pm 0.09$	$463$	$5.43$	$52.59 \pm 66.61$
VeReMi Extension	$0.11 \pm 0.07$	$4.01 \pm 0.49$	$1.42 \pm 0.05$	$491$	$5.73$	$5.19 \pm 0.09$

AT → Average Time.
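
The error rates in Table 5 follow from binary confusion-matrix counts, and the 4.96 ms per-sample latency quoted in the abstract appears to be the unweighted mean of the six latency values. The sketch below shows both calculations. The helper `error_rates` and its example counts are hypothetical, and equal weighting across datasets is an assumption.

```python
from statistics import mean
from typing import Tuple


def error_rates(tn: int, fp: int, fn: int, tp: int) -> Tuple[float, float]:
    """Return (FPR %, FNR %) from binary confusion-matrix counts."""
    fpr = 100.0 * fp / (fp + tn)  # benign traffic flagged as attack
    fnr = 100.0 * fn / (fn + tp)  # attacks missed; equals 100 - recall
    return fpr, fnr


# Hypothetical counts, for illustration only.
fpr, fnr = error_rates(tn=9_990, fp=10, fn=40, tp=960)
print(f"FPR = {fpr:.2f}%, FNR = {fnr:.2f}%")

# Mean inference latency (ms/sample) copied from Table 5.
LATENCY_MS = {
    "5G-NIDD": 2.22,
    "UNSW-NB15": 1.75,
    "Edge-IIoTset": 21.40,
    "Car Hacking": 1.51,
    "CICIoV2024": 1.44,
    "VeReMi Extension": 1.42,
}
mean_latency = round(mean(list(LATENCY_MS.values())), 2)
print(f"mean latency = {mean_latency:.2f} ms/sample")
```

Note that the average is pulled upward by Edge-IIoTset (21.40 ms); the other five datasets all sit between 1.42 and 2.22 ms per sample.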

Table 6. Stratified 5-Fold Cross-Validation Results Across All Datasets. Performance metrics are reported as the mean ± standard deviation across the five folds.

Dataset	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)	ROC-AUC (%)	PR-AUC (%)
	Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD
5G-NIDD	$99.82 \pm 0.18$	$99.86 \pm 0.17$	$99.70 \pm 0.47$	$99.78 \pm 0.23$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
Edge-IIoTset	$99.97 \pm 0.03$	$99.98 \pm 0.03$	$99.92 \pm 0.11$	$99.95 \pm 0.05$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
UNSW-NB15	$98.92 \pm 0.20$	$95.26 \pm 1.75$	$96.25 \pm 1.30$	$95.74 \pm 0.75$	$99.93 \pm 0.04$	$99.51 \pm 0.29$
Car Hacking	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
CICIoV2024	$99.98 \pm 0.04$	$100.00 \pm 0.00$	$99.98 \pm 0.04$	$99.99 \pm 0.02$	$100.00 \pm 0.00$	$100.00 \pm 0.00$
VeReMi Extension	$97.81 \pm 0.16$	$100.00 \pm 0.01$	$94.77 \pm 0.39$	$97.31 \pm 0.20$	$97.74 \pm 0.05$	$98.26 \pm 0.04$
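
Stratified k-fold cross-validation, as reported in Table 6, keeps the benign/attack ratio roughly constant in every fold. The following is a minimal pure-Python sketch of the fold assignment only. `stratified_folds` is a hypothetical helper; in practice a library routine such as scikit-learn's `StratifiedKFold` would typically be used, and the source does not say which implementation the authors relied on.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 42) -> List[List[int]]:
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets an equal share of it.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Toy labels: 70 benign (0) and 30 attack (1) samples.
labels = [0] * 70 + [1] * 30
folds = stratified_folds(labels, k=5)
for i, test_idx in enumerate(folds):
    attacks = sum(labels[j] for j in test_idx)
    print(f"fold {i}: size={len(test_idx)}, attacks={attacks}")
```

Each fold serves once as the held-out set while the remaining four are used for training, and the five resulting scores are summarized as mean ± standard deviation.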

Table 7. Ablation Experiment Results Showing the Impact of Individual Components (CNN, LSTM, and Attention) and Their Combination on Six Smart Mobility Datasets. Metrics are Reported as Mean ± Standard Deviation.

Dataset	Model	Accuracy (%)	F1-Score (%)	FPR (%)	FNR (%)
		Mean ± STD	Mean ± STD	Mean ± STD	Mean ± STD
5G-NIDD	Attention	$70.64 \pm 3.42$	$55.96 \pm 1.52$	$14.78 \pm 11.52$	$47.78 \pm 15.24$
	CNN	$99.83 \pm 0.09$	$99.78 \pm 0.04$	$0.34 \pm 0.02$	$0.04 \pm 0.04$
	LSTM	$93.22 \pm 3.31$	$91.35 \pm 1.57$	$6.12 \pm 0.04$	$2.40 \pm 0.12$
	Our Model	$99.90 \pm 0.03$	$99.87 \pm 0.04$	$0.05 \pm 0.03$	$0.17 \pm 0.13$
Edge-IIoTset	Attention	$93.25 \pm 3.31$	$89.22 \pm 6.06$	$2.67 \pm 2.92$	$14.75 \pm 12.39$
	CNN	$99.95 \pm 0.02$	$99.94 \pm 0.02$	$0.02 \pm 0.01$	$0.07 \pm 0.07$
	LSTM	$99.95 \pm 0.03$	$99.95 \pm 0.01$	$0.01 \pm 0.01$	$0.12 \pm 0.09$
	Our Model	$99.96 \pm 0.01$	$99.95 \pm 0.02$	$0.05 \pm 0.01$	$0.00 \pm 0.00$
UNSW-NB15	Attention	$87.95 \pm 1.03$	$12.41 \pm 21.49$	$0.55 \pm 0.95$	$91.52 \pm 14.69$
	CNN	$99.03 \pm 0.07$	$96.11 \pm 0.32$	$0.42 \pm 0.19$	$4.80 \pm 1.70$
	LSTM	$98.79 \pm 0.03$	$63.48 \pm 0.06$	$0.61 \pm 0.03$	$2.28 \pm 0.30$
	Our Model	$98.97 \pm 0.05$	$95.92 \pm 0.21$	$0.54 \pm 0.15$	$4.39 \pm 1.20$
Car Hacking	Attention	$100.00 \pm 0.01$	$99.95 \pm 0.00$	$0.00 \pm 0.04$	$0.00 \pm 0.01$
	CNN	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$0.00 \pm 0.00$	$0.00 \pm 0.00$
	LSTM	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$0.00 \pm 0.00$	$0.00 \pm 0.00$
	Our Model	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$0.00 \pm 0.00$	$0.00 \pm 0.00$
CICIoV2024	Attention	$93.16 \pm 0.85$	$97.63 \pm 2.05$	$62.03 \pm 31.67$	$20.38 \pm 34.07$
	CNN	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$0.00 \pm 0.00$	$0.00 \pm 0.00$
	LSTM	$99.76 \pm 0.36$	$99.87 \pm 0.21$	$0.00 \pm 0.00$	$0.27 \pm 0.41$
	Our Model	$100.00 \pm 0.00$	$100.00 \pm 0.00$	$0.00 \pm 0.00$	$0.00 \pm 0.00$
VeReMi Extension	Attention	$97.67 \pm 0.13$	$97.67 \pm 0.78$	$1.53 \pm 2.52$	$3.92 \pm 2.41$
	CNN	$98.17 \pm 0.12$	$97.84 \pm 0.10$	$0.38 \pm 0.28$	$3.78 \pm 0.18$
	LSTM	$97.84 \pm 0.13$	$97.38 \pm 0.16$	$0.00 \pm 0.00$	$5.11 \pm 0.31$
	Our Model	$98.24 \pm 0.17$	$97.88 \pm 0.21$	$0.11 \pm 0.07$	$4.01 \pm 0.49$
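
One way to read Table 7 is to compare the hybrid model's mean F1-score with the best single-component variant on each dataset. The snippet below does that arithmetic on values copied from the table. The names `TABLE_7_F1` and `f1_margin` are illustrative, and the comparison deliberately ignores the reported standard deviations.

```python
from typing import Dict

# Mean F1-score (%) copied from Table 7.
TABLE_7_F1: Dict[str, Dict[str, float]] = {
    "5G-NIDD": {"Attention": 55.96, "CNN": 99.78, "LSTM": 91.35, "Our Model": 99.87},
    "Edge-IIoTset": {"Attention": 89.22, "CNN": 99.94, "LSTM": 99.95, "Our Model": 99.95},
    "UNSW-NB15": {"Attention": 12.41, "CNN": 96.11, "LSTM": 63.48, "Our Model": 95.92},
    "Car Hacking": {"Attention": 99.95, "CNN": 100.00, "LSTM": 100.00, "Our Model": 100.00},
    "CICIoV2024": {"Attention": 97.63, "CNN": 100.00, "LSTM": 99.87, "Our Model": 100.00},
    "VeReMi Extension": {"Attention": 97.67, "CNN": 97.84, "LSTM": 97.38, "Our Model": 97.88},
}


def f1_margin(scores: Dict[str, float]) -> float:
    """Hybrid F1 minus the best single-component F1, in percentage points."""
    best_single = max(v for name, v in scores.items() if name != "Our Model")
    return round(scores["Our Model"] - best_single, 2)


for dataset, scores in TABLE_7_F1.items():
    print(f"{dataset:17s} margin = {f1_margin(scores):+.2f} pp")
```

On these figures the margins range from -0.19 to +0.09 percentage points, so they are best interpreted alongside the standard deviations reported in the table.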

Table 8. Comparison of Related IDS Studies Across Smart Mobility Datasets.

Study	Methodology	VeReMi Extension	Car Hacking	UNSW-NB15	Edge-IIoTset	CICIoV2024	5G-NIDD	Accuracy (%)	F1-Score (%)
Wang et al. [16]	GAN-based IDS for CAN-FD	✗	✓	✗	✗	✗	✗	99.31	✗
Slama et al. [61]	OVA-BT Ensemble (VANET misbehavior)	✓	✗	✗	✗	✗	✗	✗	76.80
Shi et al. [28]	IDS-DEC (LSTM-CNN Autoencoder + Clustering)	✗	✓	✗	✗	✗	✗	99.00	99.00
Usha et al. [32]	ANFIS IDS (DDoS detection)	✗	✗	✓	✗	✗	✗	94.30	93.90
Zhang et al. [50]	Federated MaaS IDS (OmniFed)	✗	✗	✗	✗	✓	✗	✗	92.10
This Work	CNN–LSTM–Attention (Hybrid IDS)	✓	✓	✓	✓	✓	✓	99.51	98.94

Note: ✓ indicates that the dataset or feature is used in the study, while ✗ indicates that it is not used or not reported.
