Article

Intelligent Handover Decision-Making for Vehicle-to-Everything (V2X) 5G Networks

by Faiza Rashid Ammar Al Harthi, Abderezak Touzene *, Nasser Alzidi and Faiza Al Salti
Department of Computer Science, Sultan Qaboos University, Muscat 123, Oman
* Author to whom correspondence should be addressed.
Telecom 2025, 6(3), 47; https://doi.org/10.3390/telecom6030047
Submission received: 27 May 2025 / Revised: 23 June 2025 / Accepted: 30 June 2025 / Published: 2 July 2025

Abstract

Fifth-generation Vehicle-to-Everything (V2X) networks have ushered in a new set of challenges that negatively affect seamless connectivity, specifically owing to high user equipment (UE) mobility and high density. As UE speed increases, transitions from one cell to another become frequent, and the triggered handovers (HOs) degrade network performance, increasing latency, energy consumption, and packet loss. Traditional HO mechanisms fail to handle such network conditions, requiring the development of Intelligent HO Decisions for V2X (IHD-V2X). By leveraging Q-Learning, the intelligent mechanism seamlessly adapts to real-time network congestion and varying UE speeds, thereby producing efficient handover decisions. Based on the results, IHD-V2X significantly outperforms the other mechanisms in high-density and high-mobility networks, reducing unnecessary handover operations by 73% and effective energy consumption by 18%. It also raises the success rate of necessary handovers above 80% and lowers packet loss for high-mobility UEs by 73%, while keeping latency at a minimum, with a 22% improvement for application-specific requirements. The proposed intelligent approach is particularly effective in high-mobility situations and ultra-dense networks, where excessive handovers can degrade the user experience.

1. Introduction

The integration of digital communication into transportation systems has been driven by a growing emphasis on human life and safety. According to statistics from the World Health Organization, approximately 1.19 million people die in road accidents annually, and car crashes have become the eighth leading cause of death globally [1]. In 1999, the Federal Communications Commission (FCC) of the United States designated a specific range of frequencies in the 5.9 GHz band explicitly for Intelligent Transport Services (ITSs). This decision subsequently prompted a significant amount of research on the practical implementation of Vehicle-to-Everything (V2X) communications. Dedicated Short-Range Communication (DSRC) and Cellular V2X (C-V2X) were the two key radio access technologies used for V2X. These technologies enable vehicles and infrastructure to exchange data using standardized protocols. Both operate in the 5.9 GHz band, although C-V2X can also operate in an operator's licensed band. DSRC is a broadcast-based system, while C-V2X enables both direct V2X communication through the PC5 interface using the Sidelink (SL) channel and cellular communication over the Uu interface. Currently, Device-to-Device (D2D) communication technology, the Proximity Services (ProSe) introduced in 3GPP Release 12 to enable public-safety UE communications [2], is used for the V2X Sidelink, but improved techniques are required to ensure Cellular-V2X communication. The current Radio Access Network (RAN) technologies for V2X communication are compared in Table 1.
Enhanced models of information exchange between vehicles using V2X can make travel much safer, reduce road congestion, and support eco-friendly environments. The first vehicular networks launched were limited to the objective of road safety, in essence, to reduce the number of deaths from traffic collisions. Vehicles can exchange information about their speed, position, and direction to avoid collisions and prevent accidents. Vehicles can also receive traffic signal timing information to optimize speed and reduce stops at intersections. Nowadays, more use cases have been considered for this technology. The advanced use cases for V2X include remote driving, intelligent platooning, unmanned aerial vehicles (UAVs), and satellite communication networks [3].
Achieving the potential of V2X not only helps drive the evolution of informatization but also promotes industrialization by transforming and upgrading the automotive industry into an enabler/provider of technology support for the realization of smart cities and smart environments that foster safe, comfortable, and intelligent urban living. With the advent of new vertical industrial applications such as autonomous vehicles, railways, and unmanned aerial vehicles (UAVs), the deployment of higher-density heterogeneous networks was touted as the ideal solution to accommodate the rapidly increasing quantities and dynamic user demands in cellular networks [4]. However, the dynamic nature of V2X communications and the variety of Quality of Service (QoS) levels required for its applications result in frequent changes in network topology and, consequently, short connection times. The current mobility management method, reliant on signal strength, is not effective for diverse V2X QoS requirements [5]. A solution is therefore needed to optimize the mobility management of V2X UEs, first and foremost, to effectively address the issues encountered in frequent connection changes in heterogeneous network environments, particularly in network/cell selection.
Cell selection in a cellular network is the process of establishing a connection between a vehicle or user equipment (UE) and the network. The process of selecting cells in 5G networks is influenced by significant improvements and considerations compared to previous generations. These advancements are driven by the new capabilities of 5G networks, including slicing, edge computing, and high-mobility users, with the aim of optimizing user experience and ensuring effective utilization of network resources. UEs in cellular networks seamlessly transfer their connections from one cell to another during an active call or data connection by transferring an active communication session using the handover (HO) procedure. An HO is initially triggered by the UE in response to low signal strength from the serving cell, requested by the serving cell, or at regular intervals [6]. This process usually results in frequent and high-volume connection disturbances and reconnections, especially when a UE moves close to the network edge and HOs are triggered in brief time intervals [7]. HOs can be classified into various types. Intra-frequency HOs occur between serving and target stations that operate at the same carrier frequency, whereas inter-frequency HOs occur between serving and target stations that operate on different carrier frequencies. In heterogeneous networks (HetNets), the capacity requirements are fulfilled by creating multiple cellular layers. An intra-layer HO is performed among cells of various sizes based on the received signal strength/quality from cells of similar layers. A cell from a certain layer can allow a UE to offload to a small cell, resulting in an inter-layer HO. HOs can also use the same Radio Access Technology (RAT), called Intra-RAT HOs, while vertical HOs occurring between different RATs are referred to as Inter-RAT HOs. HOs that occur among various systems and technologies under the same operator are called Intra-Operator HOs, whereas HOs that occur between different operators, such as during roaming, are called Inter-Operator HOs. HOs can also be categorized based on interfaces: Xn-based handover management and NG-based interfaces. Xn-based handover management uses the Xn interface between base stations without involving the core network, whereas the NG-based interface sends control messages to the network functions in the core network. X2/Xn interface-based handovers are six times faster than network-based handovers as the core network signaling load is reduced [6]. Performance reporting criteria can be event-triggered, periodic, or on-demand/blind [8]. HO actions are performed based on factors such as serving cell signal-level reduction, load balancing, and high error rates. Once these factors reach unacceptable levels, a UE connection must be transferred to a suitable, reliable, and stable cell to ensure seamless connectivity. This process, although occurring at a high frequency, can be challenging, especially for high-speed UEs such as drones [8]. High-speed UEs in V2X networks frequently move in and out of coverage areas, often from one network to another. This triggers frequent changes in the connection factors that necessitate HOs. A test conducted by 3GPP in 2013 to evaluate HO performance in a UDN showed that the number of handovers increased by 17% for HetNets compared to the macro-cell network alone [9].
Although an HO scheme for heterogeneous vehicular networks with varying access technologies has been suggested as necessary for a smooth handover [10], the deployment of mmWave in 5G creates additional issues in this regard because the coverage areas are smaller [11]. Future cellular networks require HOs to be configured properly, since they directly affect the QoS. In ultra-dense HetNets, efficient HO triggering techniques should feature self-optimization to minimize or eliminate network degradation. Self-optimization, which falls under radio resource management, focuses on optimizing mobility robustness and load balancing. Various techniques for mobility robustness optimization have focused on optimizing HO control parameters [12]. Architecture enhancements for 5G systems (5GSs) to support V2X services were discussed in 3GPP Release 16, reporting that self-optimized HO mechanisms can help maintain seamless connectivity [12].
On the other hand, improper cell selection decisions can lead to several issues that degrade the overall network services and, therefore, negatively impact the user experience. This network deficiency is a result of connection instability when the UE frequently hands off connections between small cells (SCs) in a UDN. The deployment scenario, mobility and coverage concepts, and traffic characteristics of SCs for Long-Term Evolution (LTE) networks were proposed in 3GPP TR 36.932 version 13.0.0 Release 13 [13] to optimize network performance. Several cell selection methods have been proposed to support the handover decision, including the RSS-based algorithm [14], context-aware algorithm [15], cost function algorithm [16], and fuzzy logic-based algorithm [17,18]. Although these methods demonstrate a level of acceptance in dealing with handover problems in mobile communication, more attention is required when implementing a vehicular network. With rigid parameters, there are limited opportunities to meet the new challenges posed by the complex and dynamic nature of V2X network environments, such as the need for instant decisions, increased network densification, and low latency [19]. Instead, the required and more reliable HO decisions can be achieved through the deployment of machine learning (ML). ML paradigms are classified as supervised learning, unsupervised learning, and reinforcement learning (RL). Supervised learning trains on labeled data sets to learn patterns, whereas unsupervised learning uses unlabeled ones. RL is a policy learning method in which the essential components are state, action, reward, and punishment for decision-making.
This paper presents an enhanced handover optimization method for 5G V2X networks. It integrates a Multi-Criteria Decision-Making (MCDM) approach using Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) with an intelligent component value-based RL algorithm in the form of Q-Learning. RL is recommended as a better approach to provide a solution for the unaddressed challenge of effective decision-making techniques to accommodate the temporal variation in the vehicular network environment [11,20,21]. The main aim is to improve handover efficiency, alleviate network congestion, and optimize performance.
The main contributions are as follows:
  • Develop a dual HO decision method for 5G V2X communications by integrating TOPSIS with a value-based RL algorithm in the form of Q-Learning to provide long-term network stability.
  • Use the TOPSIS method integrated with stay time (ST) and connection context requirements to improve decision quality.
  • The novelty of dynamically adjusting the TOPSIS weights based on the updated Q-values and accumulated experience guarantees better performance for the V2X UDN.
The remainder of this paper is organized as follows. Section 2 provides a review of recent related work, followed by Section 3, which presents the proposed handover Q-Learning-based technique. The results of the proposed solution are presented in Section 4. Conclusions and future work are presented in Section 5.

2. Related Work

Improper cell selection decisions can lead to several issues that can affect the overall network services and, therefore, negatively impact the user experience. Recent studies have investigated HO problems in V2X communication and suggested solutions. This section provides an overview of recent studies that consider machine learning for cell selection strategies, highlighting their contributions and limitations, and concludes with a brief outline of the proposed approach.
A hybrid self-organizing network (SON) model is adopted in [22]. A SON is a type of network management approach that can adjust network responses to user requirements using automated processes to improve network responsiveness, efficiency, and capacity. The SON architecture was defined by the 3GPP LTE Advanced Standards for centralized, distributed, and hybrid implementations [23]. The SON architecture focuses on self-configuration, self-optimization, and self-healing. The central SON transfers the estimated optimal handover parameters to the distributed SON to improve the learning process efficiency. The selection of the actions is based on increasing or decreasing the value of two parameters, namely, Time-To-Trigger (TTT) and Cell Individual Offset (CIO), aiming to enhance the overall network performance by reducing the number of link failures and ping-pong HOs. However, prior knowledge stored in centralized databases can become irrelevant or outdated due to the frequent changes in the V2X network topology, which can ultimately lead to suboptimal decisions. In [24], the study followed a centralized RL execution agent approach for base station (BS) selection decisions. They designed an ML-based HO algorithm for Vehicle-to-Network (V2N) using a real data set collected from the city of Glasgow, United Kingdom, providing UE location and trajectory. By using the relative information of the BS Reference Signal Received Power (RSRP) collected from the E-UTRAN measurement report, the reward function normalizes the target RSRP value from both the currently selected BS and the local maximum RSRP. This approach enabled the agent to choose the strongest possible signal in the environment, leading to a stable connection. Although the proposed algorithm significantly outperforms the 3GPP RSRP method, it requires reducing the state space and gathering additional relevant network information to improve the HO decision. The Deep Q-Learning (DQN) handover optimization method was employed in [25] for a UDN to predict the handover parameters/threshold while considering the signal fading conditions. Signal fading is a common wireless phenomenon that significantly affects the overall network performance as a result of the variation in signal strength as it travels through the medium. High path loss, signal propagation, Doppler shifting, and shadowing are the primary sources of signal fading. An LSTM-assisted digital twin is leveraged to optimize the handover parameters, thereby enhancing system efficiency and convergence, and making the handover process more robust in dynamic network environments. The reward function is designed to enhance the handover rate effectiveness, aiming to maintain seamless connectivity by reducing link failures and minimizing ping-pong HOs. The simulation results demonstrate that the enhanced DQN with a digital twin outperforms the baseline DQN, improving the effective handover rate by 2.7%. However, the provided module is not involved in the handover decision itself, which is delegated to the existing handover control system.
Despite the extensive literature on mobility load balancing in cellular networks, the amount of research within the scope of vehicular communications is limited. The UE encounters connection failures when the cell cannot accommodate more traffic. RL was adopted to enhance the efficacy of the load balancing algorithm [26]. The vehicular environment provides rewards as a prompt response to the UE for its actions/behavior in a particular state. The reward mechanism in this study was designed to incentivize the UE to balance the load between the two BSs (the current and neighboring BSs) to maintain spectral efficiency. The UE receives a reward based on the current BS load status and the CIO value. When the neighboring cells receive sufficient load, the UE receives a reward and the CIO value decreases to zero. An insufficient load causes the CIO to increase, transferring the UE from the overloaded cell and penalizing the UE. Although the mechanism demonstrates efficient load distribution between cells, it should consider the density of the vehicular network by incorporating multiple-cell scenarios. Another limitation is that limited contextual information was used, without considering the mobility of the UE. To enhance HO decisions, a mobility-adaptive handover mechanism was developed in [27], which reduces reliance on static handover parameters, thereby minimizing service interruptions caused by unnecessary HOs for high-speed users. The intelligent mobility management method (LIM2) utilizes an RSS-prediction technique to identify the future highest signal quality for the target cell. Subsequently, the State–Action–Reward–State–Action (SARSA) algorithm, an online reinforcement learning method, was employed to dynamically identify the optimal cell for the triggered handover. To improve the signal quality and power, the Reference Signal Received Quality (RSRQ) of the neighbor cell served as the reward function. Although the results indicated an improvement in the overall throughput and lower HO failure compared with the baseline mechanisms, neglecting load balancing can limit the network handover performance. A novel reinforcement learning-based approach, named Advanced Conditional Handover (ACHO), is investigated in [28], aiming to improve mobility robustness in 5G and future cellular networks. Conditional Handover (CHO) is a 3GPP feature designed to reduce handover failures by pre-configuring handover parameters for the UE. However, CHO still experiences failures due to non-optimal handover parameters, leading to radio link failure (RLF), and inappropriate TTT and Handover Margin (HOM) values lead to premature or delayed handovers. As 5G networks become denser, the frequency of these problems is expected to increase. In ACHO, epsilon-greedy Q-Learning is used to dynamically optimize handover decisions based on real-time network conditions, user speed, and previous handover results. ACHO allows the UE to determine its own handover parameters (HOM and TTT) based on the current network conditions, historical handover performance, user service requirements (e.g., eMBB and uRLLC), and mobility speed. Overall, the method enhances CHO by giving the UE more autonomy in making handover decisions, unlike traditional CHO, which relies on static configurations from base stations.
However, this approach provides an inadequate solution for dense networks characterized by high-mobility users and a dynamic network nature, because its reliance on a single parameter causes more unnecessary handovers. Table 2 presents a comparative analysis of RL-based approaches for HO optimization.
The aforementioned studies demonstrated various RL implementations to optimize the handover process in different network topologies. A summary of the strengths and weaknesses of different RL-based algorithms is presented in Table 3.
The existing RL handover strategy decision-making approaches overlooked specific application requirements as an important criterion in an ultra-dense network to improve the QoS. In dynamic and complex network contexts, the dual method of coupling TOPSIS with a Q-Learning decision guarantees high-quality, efficient, and adaptable handover control, improving the overall network performance and user experience. Q-Learning is selected due to its lower computational complexity compared to other RL methods, which accommodates the low-latency and energy-saving needs of V2X networks, whereas the TOPSIS method, stay time, and efficient resource usage reduce the search space, targeting long-term network stability.

3. Intelligent Handover Decision for V2X (IHD-V2X)

The fast-paced developments of 5G V2X networks have posed new problems in sustaining seamless connectivity caused by high UE mobility, high SC density, and varied application requirements. These problems result in frequent handovers, unnecessary network signaling, increased packet loss, latency, and wasteful resource utilization. Handling the dynamic network conditions caused by these problems renders the traditional HO mechanism insufficient, necessitating a change to the use of an adaptive and intelligent handover management approach. In this paper, we propose a novel HO approach that integrates Q-Learning with the MCDM TOPSIS technique to optimize the overall 5G V2X network performance metrics. The proposed intelligent distributed method works well for latency-sensitive applications such as AR [29]. TOPSIS offers a strong MCDM method for ranking SCs based on several parameters, guaranteeing that the network choice of possible handover targets is well-informed. The handover decision-making process is then optimized by Q-Learning using this ranked list to learn and adapt over time based on accumulated experience and rewards. Considering a variety of criteria, the combination of MCDM with Q-Learning enables the network to make more complex judgments and learn from past mistakes to make better decisions in the future. This method offers other benefits, including
  • Decreased exploration space for Q-Learning: The IHD-V2X algorithm efficiently narrows down the search space for Q-Learning by pre-filtering and ranking the SCs using TOPSIS rather than considering every potential small cell. Q-Learning concentrates on the top-ranked possibilities that TOPSIS has found. Because it works with a more focused collection of superior choices, the Q-Learning process can converge more quickly and effectively.
  • Enhanced performance and adaptability: High-quality SCs selection is made possible by the TOPSIS approach, which also incorporates Q-Learning for adaptive learning and handover strategy refinement, based on feedback and real-world network performance. With the help of this adaptive learning feature, the network may continuously improve the handover procedure in response to shifting circumstances, including traffic loads, user movement patterns, and signal conditions, thereby improving resilience and performance.
  • Balancing long-term optimization and immediate quality: TOPSIS prioritizes instant quality using performance measurements to rank SCs. By drawing lessons from the past and refining future handover procedures to maximize cumulative rewards, Q-Learning presents a long-term view. Collectively, they guarantee that the network makes choices that promote long-term network stability and efficiency, in addition to optimizing the quality of handovers.
  • Assistance in complex and dynamic environments: This method is especially well-suited for complex and dynamic environments such as 5G V2X networks, where quick decision-making and adaptive learning are essential, because it integrates TOPSIS and Q-Learning. TOPSIS guarantees sound decision-making in multi-criteria scenarios, whereas Q-Learning manages the dynamic nature of the environment by determining the best course of action under various circumstances. Figure 1 illustrates the key Q-Learning components, where a state is mapped to a sequence of actions while interacting with an environment to maximize rewards. The knowledge gained from the environment by the agent is represented by a Q-table (Q). It keeps track of the predicted utility, or Q-values, of performing particular tasks under particular conditions.
The following parameters are considered for SCs’ performance evaluation:
RSSI: It represents the traditional HO decision criterion and provides a measure of how good the received signal is at the UE.
SINR: It is a user-perceived metric that evaluates the ratio of signal level to noise level. A high SINR value represents good signal quality, while a value below 0 indicates more noise, which means low connection speed and a high probability of losing connection.
BER: It is one of the most used performance indicators in wireless networks. It is calculated by dividing the erroneous bits by the number of transmitted bits. It is used in decision-making in cognitive radio networks to determine channel quality. This parameter has been used to analyze the performance and achievable throughput of 5G uplink and downlink communications, and as a reference value for the vertical handover algorithm.
Data Transmission Rate: It represents the amount of data that is transmitted. Maintaining seamless and uninterrupted service is essential for upholding user satisfaction, and this is ensured by higher data rates. By choosing smaller cells with faster data rates, the network efficiency can be maximized by transferring data faster and freeing up resources for other users.
Packet loss: It represents the dropped packets when the data is transmitted over a network.
Delay: It is a crucial component of safety-related communications in V2X scenarios, as even a slight delay might have serious repercussions. It calculates the amount of time the data takes to transfer from source to destination.
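To make these criteria concrete, the candidate-cell measurements could be held in a simple record such as the following Python sketch (Python matches the simulation environment described in Section 4.1); the field names and example values are illustrative assumptions, not the implementation used in this work.

```python
from dataclasses import dataclass

@dataclass
class SmallCellMeasurement:
    """Per-candidate measurements used as TOPSIS criteria (illustrative fields)."""
    cell_id: int
    rssi_dbm: float        # Received Signal Strength Indicator (dBm)
    sinr_db: float         # Signal-to-Interference-Noise Ratio (dB)
    ber: float             # Bit Error Rate (fraction of erroneous bits)
    data_rate_mbps: float  # achievable data transmission rate (Mbps)
    packet_loss: float     # packet loss ratio (0..1)
    delay_ms: float        # end-to-end delay (ms)

# Example candidate list for one UE position (hypothetical values).
candidates = [
    SmallCellMeasurement(1, -72.0, 18.5, 1e-5, 120.0, 0.01, 8.0),
    SmallCellMeasurement(2, -85.0, 6.0, 3e-4, 45.0, 0.08, 15.0),
]
```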
The following discusses the flow of the algorithm.
1. 
Define the State (sv)
A state $s_v$ represents the connection context currently loaded by the UE according to the application requirements. The state consists of network parameters such as the Received Signal Strength Indicator (RSSI), Signal-to-Interference-Noise Ratio (SINR), Bit Error Rate (BER), data rate, delay, and packet loss. With each movement, a UE in the network changes its state.
2. 
Select an Action (av)
At each state, Q-Learning must choose an action $a_v$ [30]:
Stay: The UE remains connected to its current small cell.
Handover: The UE switches to another small cell.
Equation (1) prescribes the set of actions for the UE at movement m:
$a_v^m = \{a_0, a_1, a_2, \ldots, a_{Z-1}\}, \quad \text{s.t.} \quad \sum_{z=0}^{Z-1} a_z = 1$
where
$a_0$ represents staying at the current small cell.
$a_z$ for $z \neq 0$ represents the UE changing its connection to a better network.
Determining a balance between exploration and exploitation is a challenging task in Q-Learning. The agent might not find better actions that ultimately result in larger rewards if it always exploits (chooses the most well-known action). However, if the agent never settles on the best course of action and keeps trying different things, it might never maximize its total benefits. The decision follows an ε-greedy policy to ensure that the system learns from experience while still testing new handover policies. In this tactic, (1) the agent chooses a random action with probability ε (e.g., 0.1 or 10%) and tries different handover actions to discover new strategies. (2) The agent exploits with probability $1 - \varepsilon$ (e.g., 0.9 or 90%), selecting the action for the small cell with the highest Q-value. This strategy allows the agent to learn effectively by exploring new actions while leveraging the learned knowledge to optimize handovers.
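A minimal Python sketch of this ε-greedy selection is given below; the Q-table is assumed to be a dictionary keyed by (state, action) pairs, and the state/action encodings are hypothetical.

```python
import random

def select_action(q_table, state, candidate_actions, epsilon=0.1):
    """Epsilon-greedy selection: explore with probability epsilon,
    otherwise exploit the action with the highest learned Q-value."""
    if random.random() < epsilon:
        # Exploration: try a random stay/handover action to discover new strategies.
        return random.choice(candidate_actions)
    # Exploitation: pick the action with the highest Q-value seen so far (default 0.0).
    return max(candidate_actions, key=lambda a: q_table.get((state, a), 0.0))

# Example (hypothetical identifiers): stay on the current SC, or hand over to SC 7 or SC 12.
q_table = {}
action = select_action(q_table, state=("sector_3", "AR"),
                       candidate_actions=[("stay",), ("handover", 7), ("handover", 12)])
```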
3. 
Compute the Reward ($r_{v+1}$)
After executing an action, the system receives a reward $r_{v+1}$ based on the outcome, as shown in Figure 2.
The agent is motivated to learn how to choose small cells that maximize network performance and minimize low-quality handovers using this incentive mechanism:
Successful HO ($S_{HO}$) represents the positive reward of better QoS: +20 points.
Failed HO ($F_{HO}$) represents the negative reward of worse performance: −10 points.
The ping-pong effect ($P_{HO}$) represents the negative reward of an unnecessary handover: −20 points.
Blocked HO ($B_{HO}$) represents the positive reward where the Stay decision is better: +5 points.
$r_{v+1} = P_r + A_r \;\big|\; +S_{HO} + B_{HO} - F_{HO} - P_{HO},$
where
$P_r$ represents the penalty points.
$A_r$ represents the added points.
$r_{v+1}$ is the set of possible rewards.
The reward function reinforces favorable decisions, assigning positive points for successful handovers and for the prevention of unnecessary HOs, which requires no weight adjustment. The penalties correspond to inefficient handovers caused by failures and the ping-pong effect, and the assigned negative points trigger an adjustment of the weights. As a result, this procedure modifies the state to obtain a more favorable outcome in the next handover.
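The reward mapping can be expressed directly in code; the following sketch only restates the point values listed above, with the outcome labels chosen for illustration.

```python
def compute_reward(outcome):
    """Reward values taken from the text: successful +20, blocked (stay) +5,
    failed -10, ping-pong -20."""
    rewards = {
        "successful_ho": +20,  # handover delivered better QoS
        "blocked_ho": +5,      # staying on the current cell was the better decision
        "failed_ho": -10,      # handover attempt failed
        "ping_pong_ho": -20,   # unnecessary back-and-forth handover
    }
    return rewards[outcome]
```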
4. 
Update the Q-values
By implementing the Bellman equation, the system updates the Q-value for the current (state, action) pair:
$Q(S, A) \leftarrow Q(S, A) + \alpha \left[ R + \gamma \max_{A'} Q(S', A') - Q(S, A) \right]$
where
$Q(S, A)$ represents the expected reward for action A in state S.
$\alpha$ represents the learning rate (adjusting the influence of new information), given $0 < \alpha \le 1$.
$\gamma$ represents the discount factor (prioritizing immediate vs. future rewards), given $0 \le \gamma < 1$.
$R$ represents the immediate reward for the chosen action.
$\max_{A'} Q(S', A')$ represents the best future reward in the next state.
This ensures that the system learns from previous experiences and continuously improves.
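A compact sketch of this update, using the learning rate and discount factor values listed later in Algorithm 1, is shown below; the dictionary-based Q-table is an assumption for illustration.

```python
def update_q_value(q_table, state, action, reward, next_state, next_actions,
                   alpha=0.3, gamma=0.7):
    """Bellman update: Q(S,A) <- Q(S,A) + alpha*(R + gamma*max_A' Q(S',A') - Q(S,A)).
    alpha and gamma default to the values listed in Algorithm 1."""
    best_next = max((q_table.get((next_state, a), 0.0) for a in next_actions), default=0.0)
    old_q = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old_q + alpha * (reward + gamma * best_next - old_q)
```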
5. 
Adjust the TOPSIS weights
After updating the Q-values, the mechanism adjusts the weight (importance) of each network parameter to dynamically optimize TOPSIS decision-making. These adjustments ensure that handover decisions align with real-world network performance. For example,
If a high SINR consistently leads to successful handovers, increase the SINR weight.
If a low delay is critical, increase the delay weight.
If high packet loss leads to failures, reduce the weight on unreliable connections.
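The exact weight-update rule is not spelled out here, so the following Python sketch shows one plausible heuristic consistent with the description: criteria associated with rewarded decisions are boosted, criteria associated with penalized decisions are reduced, and the weight vector is renormalized. The step size and criterion names are assumptions.

```python
def adjust_topsis_weights(weights, outcome, contributing_criteria, step=0.05):
    """Heuristic weight adaptation (an assumption; not the authors' exact rule):
    boost criteria tied to rewarded decisions, reduce those tied to penalized
    decisions, then renormalize so the weights sum to 1."""
    delta = step if outcome in ("successful_ho", "blocked_ho") else -step
    for c in contributing_criteria:
        weights[c] = max(0.0, weights[c] + delta)
    total = sum(weights.values()) or 1.0
    return {c: w / total for c, w in weights.items()}

# Example: a high-SINR cell led to a successful handover, so the SINR weight grows.
weights = {"rssi": 0.2, "sinr": 0.2, "ber": 0.15, "rate": 0.15, "loss": 0.15, "delay": 0.15}
weights = adjust_topsis_weights(weights, "successful_ho", ["sinr"])
```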
The fundamentals of the work can be explored in Algorithm 1.
Algorithm 1: IHD-V2X Algorithm
Model the network grid
Identify UEs and plot their movements in the network grid sector
Inputs:
• User movement in sector Secn
• Distance between each movement D
• UE velocity V
• List of SCs in UE sector, ID No (m, n) = [x/20], [y/20], ranked using TOPSIS
• Q-table Q(S, A) initialized with zero values
• Q-Learning parameters:
  ○ Learning rate α = 0.3
  ○ Discount factor γ = 0.7
  ○ Exploration rate ε = 1.0
Outputs:
• Successful and unsuccessful handover attempts
• Small cell selection and stay time assignment
• Q-Learning adjusted weights
• Network performance metrics
while List of SCs[i] ≤ List of SCs[max] do
    Extract and normalize parameter values N_ij = X_ij / sqrt(Σ X_ij²)
    Compute weighted normalized values W_ij = N_ij × W_j
    Compute Ideal Positive and Negative Solutions (IPS, INS)
    Compute Performance Index PI = D⁻ / (D⁺ + D⁻)
    Rank SCs Rn = arg max PI(i)
    Filter top n% SCs LSC = {Si | Si ∈ Sorted Rn ∧ Rn × C/100}
    Compute ST value for sector STV = D / V
    Assign ST ST_SC = STV − (PI + R)
end while
for each UE movement do
    while Handover required = True do
        Query SC load
        if SC load < Max SC load then
            Choose action (A): Stay or Handover using the ε-greedy strategy (1):
                ▪ Exploration: select a random action with probability ε
                ▪ Exploitation: select the action with the highest Q(S, A) value
            Compute reward (R) (2):
                ▪ Successful handover R = +20
                ▪ Failed handover R = −10
                ▪ Unnecessary handover (ping-pong) R = −20
                ▪ Blocked handover R = +5
            Update Q-table:
                Q(S, A) ← Q(S, A) + α [R + γ max_{A′} Q(S′, A′) − Q(S, A)]
            Adjust TOPSIS weights dynamically based on learned Q-values
            if ST ≤ STV then
                ▪ Assign current SC = SC
                ▪ Increment SC load by 1
                ▪ Decrement ST by 1
            else
                ▪ Current SC = None
            end if
    end while
end for
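For readers who wish to reproduce the TOPSIS ranking and stay-time steps of Algorithm 1, the following NumPy sketch follows the listed formulas (N_ij, W_ij, IPS/INS, PI = D⁻/(D⁺ + D⁻), STV = D/V, ST = STV − (PI + R)); the function signatures, the benefit/cost split of the criteria, and the unit handling are assumptions for illustration.

```python
import numpy as np

def topsis_rank(criteria_matrix, weights, benefit_mask):
    """TOPSIS ranking as outlined in Algorithm 1 (a sketch, not the authors' code).
    criteria_matrix: rows = candidate SCs, columns = criteria.
    benefit_mask: True where larger is better (e.g., SINR, data rate),
    False where smaller is better (e.g., delay, packet loss, BER)."""
    X = np.asarray(criteria_matrix, dtype=float)
    # Vector normalization: N_ij = X_ij / sqrt(sum_i X_ij^2)
    N = X / np.sqrt((X ** 2).sum(axis=0))
    # Weighted normalized matrix: W_ij = N_ij * W_j
    W = N * np.asarray(weights, dtype=float)
    # Ideal positive/negative solutions per criterion
    ips = np.where(benefit_mask, W.max(axis=0), W.min(axis=0))
    ins = np.where(benefit_mask, W.min(axis=0), W.max(axis=0))
    d_pos = np.sqrt(((W - ips) ** 2).sum(axis=1))
    d_neg = np.sqrt(((W - ins) ** 2).sum(axis=1))
    # Performance index PI = D- / (D+ + D-); larger is better
    pi = d_neg / (d_pos + d_neg)
    return np.argsort(-pi), pi  # ranking (best first) and PI values

def stay_time(distance, velocity, pi, rank):
    """Stay-time assignment following Algorithm 1: STV = D / V, ST = STV - (PI + R)."""
    stv = distance / velocity
    return stv - (pi + rank)
```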

4. Performance Evaluation

In this section, the simulation parameters and the results of the analysis are discussed. The performance of the proposed IHD-V2X algorithm is compared with that of other state-of-the-art handover mechanisms, including the Advanced Conditional Handover in 5G and Beyond with Q-Learning (ACHQ) presented in [28], and the non-Q-Learning algorithm named Context-aware Enhanced Application-Specific Handover in 5G V2X Networks (CAP) [30].

4.1. Simulation Settings

The effectiveness of the mechanism is measured by considering different network performance metrics. In addition to handover Key Performance Indicators (KPIs), measurements of packet loss, delay, and energy consumption are considered to better capture all facets of network performance and provide a comprehensive picture of how the network behaves in various scenarios. The simulation was coded in Python version 3.10.13, leveraging libraries such as NumPy, Matplotlib, OpenPyXL, and SQLite3. The experiment was conducted within a network environment with dimensions of 100 × 100 km². The network grid is divided into 25 network sectors. The simulation was run on an Intel(R) Xeon(R) machine with a 3.10 GHz clock, 4 cores, 8 logical processors, and 64 GB of memory. The simulations were run for 1200 iterations. The simulation parameters are listed in Table 4.

4.2. Performance Measures

In addition to HO KPIs, including HO attempts, successful HOs, failed HOs, delayed HOs, ping-pong HOs, and prevented unnecessary HOs, packet loss, latency, and energy consumption were used as the QoS parameters involved in measuring the handover performance of V2X networks [31]. In wireless network communication, it is essential to keep packet loss and delay low to decrease the probability of handover issues (unwanted HOs) and increase the proportion of necessary handovers (wanted handovers). The specifications of the performance measurements are as follows.

4.2.1. HO KPIs

The dense deployment of SCs can increase the probability of handovers. In this study, various handover metrics are considered to evaluate overall network performance.
  • HO attempts: The total number of times the handover process is initiated to transfer the UE connection from the source cell to the target cell.
  • Successful HOs: Occur when the UE seamlessly changes its connection between two cells. The handover success rate is obtained as the ratio of the number of successful handovers to the total number of HO attempts.
  • Failed HOs: Occur when the handover between two cells is unsuccessful owing to insufficient resources. The failed HO rate is defined as the ratio of the number of failed handovers to the total number of HO attempts.
  • Ping-pong HOs: Describe an unwanted situation where a UE rapidly transitions back and forth between SCs, causing unnecessary power consumption, signaling overhead, and inefficient resource utilization. The ping-pong HO rate quantifies the ratio of the number of ping-pong HO occurrences to the total number of handovers performed.
  • Prevented unnecessary HOs: An unnecessary handover occurs when the UE attempts to change its current connection even though it is sufficient for communication; this metric counts the unnecessary handovers that the mechanism prevents.

4.2.2. Packet Loss

The signal quality can be degraded owing to a low SINR, causing packet loss; when the signal strength is low, the probability of packet loss is high [32]. Packet loss refers to the number of packets lost during transmission. The SINR, measured in dB, is the ratio of the signal level to the noise level. A high SINR value represents good signal quality, whereas a value below 0 indicates more noise, which implies a low connection speed and a high probability of losing the connection [33]. The SINR of the UE must be measured and evaluated during the HO process. A UE at the cell boundary has a low SINR and significant interference, which causes an HO to the neighboring cell [34]. The Packet Error Rate (PER) is used to estimate this degradation and evaluate network reliability [35]. The PER denotes the ratio of erroneous packets that reach the destination, resulting in packet retransmissions and, therefore, increasing the transmission latency.
The SINR calculation for 5G NR is presented in Formula (4):
$\gamma_{dB} = 10 \log_{10}\left(\frac{p_s}{p_i + p_n}\right)$
where
$p_s$ represents the signal power.
$p_i$ represents the interference power.
$p_n$ represents the noise power.
Equation (5) models the packet loss based on the SINR $\gamma_{dB}$ and the PER $\bar{\varepsilon}$:
$\bar{\varepsilon}(\gamma_{dB}) = \begin{cases} 1, & \text{if } \gamma_{dB} < 0 \text{ (complete packet loss)} \\ 0.8, & \text{if } 0 \le \gamma_{dB} < 5 \\ 0.3, & \text{if } 5 \le \gamma_{dB} < 10 \\ 0.1, & \text{if } 10 \le \gamma_{dB} < 20 \\ 0.01, & \text{if } \gamma_{dB} \ge 20 \end{cases}$
Each packet is received with a probability of $1 - \bar{\varepsilon}$, and the expected number of received packets ($N_r$) out of the original number of packets ($N_t$) is
$N_r = N_t \times \left(1 - \bar{\varepsilon}(\gamma_{dB})\right)^n$
where n is the packet length.
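The SINR, PER, and expected-packet computations of Formula (4) and Equations (5) and (6) can be sketched as follows; the function names are illustrative.

```python
import math

def sinr_db(p_signal, p_interference, p_noise):
    """SINR in dB, following Formula (4): 10*log10(p_s / (p_i + p_n))."""
    return 10.0 * math.log10(p_signal / (p_interference + p_noise))

def per_from_sinr(gamma_db):
    """Packet Error Rate from the SINR thresholds of Equation (5);
    gamma_db < 0 corresponds to complete packet loss."""
    if gamma_db < 0:
        return 1.0
    if gamma_db < 5:
        return 0.8
    if gamma_db < 10:
        return 0.3
    if gamma_db < 20:
        return 0.1
    return 0.01

def expected_received_packets(n_transmitted, gamma_db, packet_length):
    """Expected received packets, Equation (6): N_r = N_t * (1 - PER)^n."""
    return n_transmitted * (1.0 - per_from_sinr(gamma_db)) ** packet_length
```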

4.2.3. Latency

In the context of V2X communications, latency plays a vital role in enabling real-time safety applications [36] and ensuring service continuity. The reduction in the packet loss ratio, as discussed earlier, can directly affect the E2E latency. However, it can be subject to significant QoS changes owing to the high mobility of the UE. The latency model for 5G V2X E2E for time-critical applications deployment should consider variations in traffic load and transmission delay [37]. In this study, the latency is modeled by considering the queuing delay Dq, propagation delay Dp, and handover delay Dh. The queuing delay is the time taken by the packets waiting in the queue before transmission.
The total end-to-end latency ($L_{total}$) of $n$ packets is obtained by minimizing the following delay components, as expressed below:
$L_{total} = D_q + D_p + D_h$
where
$D_q$ represents the queuing delay, set to 5 ms.
$D_p$ represents the propagation delay = distance/speed of light.
$D_h$ represents the handover delay = radio resource control time + path switch time + procedure time.
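A direct sketch of this latency model is given below; the constant for the speed of light and the parameter names are assumptions.

```python
SPEED_OF_LIGHT_M_S = 3.0e8  # propagation speed assumed for computing D_p

def handover_delay_s(rrc_time_s, path_switch_time_s, procedure_time_s):
    """D_h = radio resource control time + path switch time + procedure time."""
    return rrc_time_s + path_switch_time_s + procedure_time_s

def total_latency_s(distance_m, d_h_s, d_q_s=0.005):
    """L_total = D_q + D_p + D_h, with the 5 ms queuing delay stated in the text
    and D_p = distance / speed of light."""
    d_p_s = distance_m / SPEED_OF_LIGHT_M_S
    return d_q_s + d_p_s + d_h_s
```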

4.2.4. Energy Consumption

The proposed method was formulated to minimize energy consumption by incorporating only the most effective cells into the handover process. The energy consumption represents the approximated energy usage for the period of time a UE attempted to hand over to a cell. The algorithm contributes to lower energy consumption through the restricted scanning of cells within UE movements, the role of ST in preventing unnecessary HOs, and the reinforcement of positive rewards. The consumed energy is approximated as the transmission power multiplied by the duration of the handover attempt (in milliseconds, for instance). Optimizing energy consumption is critical in this mechanism, since the handover should be rapid to prevent wasteful power utilization, hence prolonging the battery life (a limited power source) of UE devices. The less time used for the handover, the less energy is consumed, which demonstrates the power efficiency of the algorithm.
To calculate the total energy expenditure during the handover, the following formulas are applied:
$E_{total} = E_{base} + E_{retransmissions}$
where
$E_{base} = P_t \times T$.
$E_{retransmissions} = N_r \times E_{base}$.
Thus, the total energy expenditure is
$E_{total} = E_{base} + N_r \times (P_t \times T)$
where
$P_t$ is the transmit power (W).
$T$ is the time on air (s).
$N_r$ represents the number of retransmissions.
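The energy model reduces to a few lines of code; the following sketch restates the formulas above with illustrative parameter names and example values.

```python
def total_energy_j(transmit_power_w, time_on_air_s, n_retransmissions):
    """E_base = P_t * T; E_retransmissions = N_r * E_base;
    E_total = E_base + N_r * (P_t * T)."""
    e_base = transmit_power_w * time_on_air_s
    return e_base + n_retransmissions * e_base

# Example: 0.2 W transmit power, 50 ms on air, 2 retransmissions (hypothetical values).
energy_j = total_energy_j(0.2, 0.05, 2)
```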

4.3. Result Analysis

This study focused on the evaluation and comparison of handover decision strategies, including ACHQ, CAP, and the proposed IHD-V2X-based algorithm, for varying UE velocity and SC density. Note that the network needs of UEs vary based on application priorities, including basic services (audio, video streaming, and general traffic), data offloading, augmented reality, and relay; these priorities are implemented in the CAP and IHD-V2X mechanisms but are not considered in ACHQ. The effects of velocity and network density on HO efficacy are important for different application types with distinct network performance requirements.

4.3.1. Performance Comparison in Terms of Varying UE Velocity

The velocity of the UEs significantly affects the HO performance in a 5G V2X network. Increasing the velocity raises the number of UE transitions from cell to cell, leading to degraded network performance. HOs at lower speeds are usually stable, but higher speeds result in an increased probability of handover issues, such as HO failures, unnecessary HOs, ping-pongs, and noticeable delays. Figure 3 illustrates UE movement [X, Y] at different velocities in a (100 × 100) network environment for the proposed algorithm (IHD-V2X).
A sample output of the most suitable cell for the handover decision, for a UE traversing at random velocities, is presented in Table 5.
Further analyses are conducted to evaluate the different handover mechanisms at velocities ranging from low to medium to high in a network with 30 UEs and 100 SCs per sector. Figure 4, Figure 5 and Figure 6 show the individual simulation results of the algorithms' handover performance when the UEs move at random velocities. To facilitate the results representation, the figures are produced for a range of UE velocities from 20 km/h to 140 km/h.
It is observed that as the velocity increases, the number of HO attempts increases noticeably for all algorithms. ACHQ shows the greatest number of HO attempts across all velocities, resulting in unnecessary signaling expenditure. CAP effectively decreases unnecessary HOs by ensuring a minimum stay time before transitioning to other SCs. The best performance is observed with TOPSIS with ST and Q-Learning, the IHD-V2X approach, which further reduces the number of attempts by learning from previous HO experiences and predicting optimal handover moments.

Meanwhile, ACHQ somewhat improves the success rate but is unable to adapt to fast mobility and suffers from a significant drop in successful handovers, falling below 50% at 140 km/h; its failure rate worsens due to inefficient HO decisions and unnecessary HO attempts. CAP further improves the success rate by stabilizing connections before triggering an HO. IHD-V2X achieves the highest success rates, exceeding 80% for the necessary HOs, by adapting to varying velocities and dynamically tuning HO decisions.

Handover delays become a serious issue, mainly for ACHQ, which incurs excessive delays due to inefficient HO execution. CAP reduces delays by optimizing HO timing and averting recurrent disruptions in connections. The best performance is shown by IHD-V2X, which minimizes HO delays through machine learning to predict and optimize HO execution, even at higher velocities. The inclusion of ST in IHD-V2X and CAP contributes to reducing the number of ping-pong HOs by prohibiting HO triggering when the current cell is sufficient to meet UE requirements, with IHD-V2X achieving the lowest average of 15% among all. The ACHQ algorithm exhibited the highest number of ping-pong HOs owing to the absence of stay time and resource allocation parameters.

The average performance of the different algorithms is shown in Figure 7. ACHQ fails to prevent unnecessary HOs, which account for approximately 20% of the total number of handover attempts, leading to degraded network performance and unnecessary signaling overhead. CAP significantly improves this aspect, relative to the total number of HO attempts, by implementing a minimum stay period before starting an HO. The most notable improvement is observed with IHD-V2X, which dynamically adjusts to mobility settings and prevents the highest percentage of unnecessary HOs, approximately 75% of the total HO attempts, optimizing the overall network performance. These results indicate the effectiveness of IHD-V2X in supporting seamless connections in a dynamic network environment.

4.3.2. Performance Comparison in Terms of Varying Network Sector Density

In 5G V2X networks, the network density, which is the number of SCs per sector, has a significant impact on the efficiency of the handover process. A higher density of SCs improves network coverage and capacity. However, it also creates challenges such as unnecessary handovers and ping-pong effects.
This analysis evaluated three handover mechanisms for network sector densities ranging from 50 to 250 SCs per sector at a vehicle velocity of 80 km/h. Figure 8, Figure 9 and Figure 10 show the individual simulation results of three algorithms relative to SC density changes.
More frequent switches occur in ACHQ owing to the higher number of SCs per sector and the lack of ST in its implementation, leading to excessive HO signaling. CAP meaningfully decreases unnecessary HOs by ensuring a minimum ST before switching, and the handover decision in IHD-V2X is further optimized by Q-Learning, which intelligently learns from past HOs. Recurrent and unnecessary HOs lead to higher failure rates, and the network struggles to process the increased volume of HOs. ACHQ exhibits the lowest success rate because of the high occurrence of ping-pong HOs; hence, it remains inefficient in highly dense situations. CAP exhibits a considerable increase in successful handovers by optimizing the duration for which a UE stays connected to a cell before switching. IHD-V2X attains the highest success rate, exceeding 80% for the necessary HOs, owing to its adaptive learning approach that selects the optimal HO and minimizes failures. At higher SC densities, handover delays become a serious issue for ACHQ because the network struggles with excessive HOs. CAP notably lessens delays by guaranteeing more stable connectivity before handover. The best performance is shown by IHD-V2X, which intelligently adjusts to network conditions and minimizes HO delays while optimizing the HOs. Similarly, ping-pong HOs were highest in ACHQ, modest in CAP, and lowest in IHD-V2X. In aggregate, as shown in Figure 11, ACHQ slightly lessens unnecessary HOs but does not completely remove them, leading to a severe network bottleneck and inefficient resource usage. CAP notably improves this aspect by implementing a minimum stay period before HO initiation. The most significant improvement was observed for IHD-V2X: by dynamically predicting the handover necessity, the highest percentage of unnecessary HOs is prevented, resulting in optimal network performance.

4.3.3. Performance Comparison in Terms of Packet Losses and Latency Ratio

At higher speeds, packet loss surges severely, particularly in ACHQ, where frequent HO failures cause data disturbances; the mechanism remains ineffective at high velocities. Figure 12 shows that CAP noticeably lowers packet loss by guaranteeing more stable connections before starting HOs, demonstrating the effectiveness of integrating TOPSIS with ST in the handover decision. The best performance is notable with IHD-V2X, which effectively minimizes packet loss by intelligently foreseeing when an HO should be executed, ensuring a seamless connection. TOPSIS with ST and Q-Learning, the IHD-V2X method, attains the best results by guaranteeing efficient HO execution and reducing failed changeovers. This performance improvement benefits video, augmented reality, and audio applications, which require low latency and minimal interruption. Notably, in the context of Q-Learning, both IHD-V2X and ACHQ outperformed the non-Q-Learning method, CAP, and the total latency was the lowest for the ACHQ handover at all velocity levels. This difference occurs because ACHQ did not consider application-specific requirements in its implementation.
At higher SC densities, packet loss increases significantly in the ACHQ handover, particularly at high densities, owing to excessive failed HOs and frequent network HOs. In Figure 13, CAP noticeably lowers packet loss by ensuring connection stability before initiating HOs. The best performance is shown by IHD-V2X, which minimizes packet loss by guaranteeing efficient handover implementation and steady data transmission. In contrast, latency is the highest for CAP, while ACHQ and IHD-V2X achieve the lowest latency at all density levels because of their adaptive learning-based methods.

4.3.4. Performance Comparison in Terms of Energy Consumption

Energy efficiency is a key concern in high-mobility situations, as recurrent handovers increase power use due to unnecessary signaling. ACHQ showed the highest energy consumption because all SCs participate in the evaluation of the conditional HO decision. Frequent handovers require extra signaling and power, resulting in high energy consumption in ACHQ, as illustrated in Figure 14. As the UE velocity increases, the context-aware feature enables energy conservation, resulting in lower energy consumption in the CAP (TOPSIS, non-Q-Learning) and IHD-V2X algorithms. Moreover, when the network is dense, both significantly enhance energy efficiency by reducing unnecessary HOs, because they ensure that only the SCs within the UE movement path participate in the handover decision, limiting the number of scanned SCs and, therefore, saving energy. The most energy-efficient mechanism is IHD-V2X, which optimizes HO execution, lessening the overall signaling expenditure and keeping energy usage steady at all network density levels. It enhances energy usage even further, resulting in up to 6% less energy consumption compared with CAP and ACHQ. This is mainly useful for relay applications, where UEs act as mediators and need to preserve battery life while guaranteeing connectivity. The energy consumption of IHD-V2X increases progressively to approximately 18%, which is minimal compared with ACHQ, which peaks at almost 76% as UE mobility and density increase steadily.
Based on the above simulation results, it is concluded that the proposed algorithm significantly enhances the performance of the handover process by fulfilling the user requirements in 5G V2X networks. The findings indicate that UE mobility and SC density are critical factors that affect HO effectiveness and network performance. As the velocity and density increase, handover occurrences increase, resulting in higher failure rates, higher packet loss, increased latency, and unnecessary energy consumption, which motivated the proposed handover optimization strategy.

5. Conclusions and Future Works

The high velocity of UEs and the high density of SCs in a 5G V2X network have ushered in significant challenges in maintaining stable connections and efficient handovers. Frequent connection transfers between SCs due to high speeds lead to a high number of unnecessary HOs and HO failures, while increasing latency and packet loss. This study demonstrated that integrating TOPSIS with ST and Q-Learning in IHD-V2X can efficiently manage HOs in high-velocity and high-density network environments. The deployment of this mechanism prevents unnecessary handovers by effectively predicting optimum transitions, reducing signaling overhead and, consequently, improving the handover success rate by learning from prior HO experiences and adapting to changing requirements and conditions. Furthermore, this method minimizes packet loss and latency, which significantly improves support for application-specific requirements. Moreover, it optimizes energy usage by lowering unnecessary consumption in highly mobile and highly dense networks.
Future research directions include exploring AI-driven mobility prediction models to further enhance the network performance and scalability in next-generation V2X implementations. Realistic environmental constraints can be considered to enhance the algorithm.

Author Contributions

Conceptualization, F.R.A.A.H., A.T., N.A. and F.A.S.; Methodology, F.R.A.A.H. and A.T.; Software, F.R.A.A.H.; Validation, F.R.A.A.H.; Formal analysis, F.R.A.A.H., A.T. and F.A.S.; Investigation, F.R.A.A.H., A.T., N.A. and F.A.S.; Data curation, F.R.A.A.H.; Writing—original draft, F.R.A.A.H.; Writing—review and editing, F.R.A.A.H., A.T. and F.A.S.; Supervision, A.T., N.A. and F.A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
5G: Fifth Generation
ACHO: Advanced Conditional Handover
BER: Bit Error Rate
BS: Base Station
C-V2X: Cellular V2X
CHO: Conditional Handover
CIO: Cell Individual Offset
FCC: Federal Communications Commission
D2D: Device-to-Device
DQN: Deep Q-Learning
DDQN: Double DQN
DSRC: Dedicated Short Range Communication
eMBB: Enhanced Mobile Broadband
E-UTRAN: Evolved Universal Terrestrial Radio Access Network
HOM: Handover Margin
HetNets: Heterogeneous Networks
HO: Handover
ITS: Intelligent Transportation System
KPIs: Key Performance Indicators
LSTM: Long Short-Term Memory
LTE: Long-Term Evolution
MCDM: Multiple Criteria Decision-Making
ML: Machine Learning
NG: New Generation
PER: Packet Error Rate
PI: Performance Index
Q-Learning: Quality Learning
QoS: Quality of Service
RAT: Radio Access Technology
RL: Reinforcement Learning
RSRQ: Reference Signal Received Quality
RSSI: Received Signal Strength Indicator
SARSA: State–Action–Reward–State–Action
SINR: Signal-to-Interference-Noise Ratio
SCs: Small Cells
SL: Sidelink
SON: Self-Organizing Network
ST: Stay Time
TTT: Time-To-Trigger
TOPSIS: Technique for Order Preference by Similarity to Ideal Solution
UDN: Ultra-Dense Network
URLLC: Ultra-Reliable Low-Latency Communication
V2N: Vehicle-to-Network
UE: User Equipment
V2X: Vehicle-to-Everything

References

  1. WHO Road Traffic Injuries. Available online: https://www.who.int/news-room/fact-sheets/detail/road-traffic-injuries (accessed on 26 January 2024).
  2. Universal Mobile Telecommunications System (UMTS); LTE; Proximity-Services (ProSe) Management Objects. ETSI TS 124 333 V12.5.0 (2017-04). Available online: https://www.etsi.org/deliver/etsi_ts/124300_124399/124333/12.05.00_60/ts_124333v120500p.pdf (accessed on 13 February 2025).
  3. 5G Automotive Association (5GAA). 5GAA P-180106 (White Paper). 5GAA. 2019. Available online: https://5gaa.org/content/uploads/2019/07/5GAA_191906_WP_CV2X_UCs_v1-3-1.pdf (accessed on 6 October 2024).
  4. Andreev, S.; Petrov, V.; Dohler, M.; Yanikomeroglu, H. Future of Ultra-Dense Networks Beyond 5G: Harnessing Heterogeneous Moving Cells. IEEE Commun. Mag. 2019, 57, 86–92. [Google Scholar] [CrossRef]
  5. Si, Q.; Cheng, Z.; Lin, Y.; Huang, L.; Tang, Y. Network Selection in Heterogeneous Vehicular Network: A One-to-Many Matching Approach. In Proceedings of the 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring), Antwerp, Belgium, 25–28 May 2020; IEEE: Antwerp, Belgium, 2020; pp. 1–5. [Google Scholar] [CrossRef]
  6. Haghrah, A.; Abdollahi, M.P.; Azarhava, H.; Niya, J.M. A survey on the handover management in 5G-NR cellular networks: Aspects, approaches and challenges. EURASIP J. Wirel. Commun. Netw. 2023, 2023, 52. [Google Scholar] [CrossRef]
  7. Thakkar, M.K.; Agrawal, L.; Rangisetti, A.K.; Tamma, B.R. Reducing ping-pong handovers in LTE by using A1-based measurements. In Proceedings of the 2017 Twenty-third National Conference on Communications (NCC), Chennai, India, 2–4 March 2017; IEEE: Chennai, India, 2017; pp. 1–6. [Google Scholar]
  8. Shayea, I.; Dushi, P.; Banafaa, M.; Rashid, R.A.; Ali, S.; Sarijari, M.A.; Daradkeh, Y.I.; Mohamad, H. Handover Management for Drones in Future Mobile Networks—A Survey. Sensors 2022, 22, 6424. [Google Scholar] [CrossRef]
  9. 3GPP Tdoc R2-131233 Frequent Handovers and Signaling Load Aspects in Heterogeneous Networks 2013. Available online: https://www.3gpp.org/ftp/tsg_ran/WG2_RL2/TSGR2_81bis/Docs (accessed on 9 June 2024).
  10. 3GPP TS 22.185 Version 14.3.0 Release 14 LTE; Service Requirements for V2X Services 2017. Available online: https://www.etsi.org/deliver/etsi_ts/122100_122199/122185/14.03.00_60/ts_122185v140300p.pdf (accessed on 15 April 2023).
  11. Mollel, M.S.; Abubakar, A.I.; Ozturk, M.; Kaijage, S.F.; Kisangiri, M.; Hussain, S.; Imran, M.A.; Abbasi, Q.H. A Survey of Machine Learning Applications to Handover Management in 5G and Beyond. IEEE Access 2021, 9, 45770–45802. [Google Scholar] [CrossRef]
  12. Tashan, W.; Shayea, I.; Aldirmaz-Colak, S.; Aziz, O.A.; Alhammadi, A.; Daradkeh, Y.I. Advanced Mobility Robustness Optimization Models in Future Mobile Networks Based on Machine Learning Solutions. IEEE Access 2022, 10, 111134–111152. [Google Scholar] [CrossRef]
  13. ETSI TR 136 932 V13.0.0. Scenarios and Requirements for Small Cell Enhancements for E-UTRA and E-UTRAN. Available online: https://www.etsi.org/deliver/etsi_tr/136900_136999/136932/13.00.00_60/tr_136932v130000p.pdf (accessed on 16 August 2024).
  14. Jain, A.; Tokekar, S. Application Based Vertical Handoff Decision in Heterogeneous Network. Procedia Comput. Sci. 2015, 57, 782–788. [Google Scholar] [CrossRef]
  15. Satapathy, P.; Mahapatro, J. An adaptive context-aware vertical handover decision algorithm for heterogeneous networks. Comput. Commun. 2023, 209, 188–202. [Google Scholar] [CrossRef]
  16. Jiang, D.; Huo, L.; Lv, Z.; Song, H.; Qin, W. A Joint Multi-Criteria Utility-Based Network Selection Approach for Vehicle-to-Infrastructure Networking. IEEE Trans. Intell. Transport. Syst. 2018, 19, 3305–3319. [Google Scholar] [CrossRef]
  17. Guo, X.; Omar, M.H.; Zaini, K.M.; Liang, G.; Lin, M.; Gan, Z. Multiattribute Access Selection Algorithm for Heterogeneous Wireless Networks Based on Fuzzy Network Attribute Values. IEEE Access 2022, 10, 74071–74081. [Google Scholar] [CrossRef]
  18. Kaur, R.; Mittal, S. Handoff parameter selection and weight assignment using fuzzy and non-fuzzy methods. In Proceedings of the 2021 2nd International Conference on Secure Cyber Computing and Communications (ICSCCC), Jalandhar, India, 21–23 May 2021; IEEE: Jalandhar, India, 2021; pp. 388–393. [Google Scholar] [CrossRef]
  19. Tan, K. Adaptive Vehicular Networking with Deep Learning. Ph.D. Thesis, University of Glasgow, Glasgow, UK, 2023. [Google Scholar]
  20. Ye, H.; Liang, L.; Li, G.Y.; Kim, J.; Lu, L.; Wu, M. Machine Learning for Vehicular Networks. arXiv 2018, arXiv:1712.07143. [Google Scholar] [CrossRef]
  21. Mekrache, A.; Bradai, A.; Moulay, E.; Dawaliby, S. Deep reinforcement learning techniques for vehicular networks: Recent advances and future trends towards 6G. Veh. Commun. 2022, 33, 100398. [Google Scholar] [CrossRef]
  22. Nguyen, M.-T.; Kwon, S. Machine Learning–Based Mobility Robustness Optimization Under Dynamic Cellular Networks. IEEE Access 2021, 9, 77830–77844. [Google Scholar] [CrossRef]
  23. Peng, M.; Liang, D.; Wei, Y.; Li, J.; Chen, H.-H. Self-Configuration and Self-Optimization in LTE-Advanced Heterogeneous Networks. IEEE Commun. Mag. 2013, 51, 36–45. [Google Scholar] [CrossRef]
  24. Tan, K.; Bremner, D.; Le Kernec, J.; Sambo, Y.; Zhang, L.; Imran, M.A. Intelligent Handover Algorithm for Vehicle-to-Network Communications With Double-Deep Q-Learning. IEEE Trans. Veh. Technol. 2022, 71, 7848–7862. [Google Scholar] [CrossRef]
  25. He, J.; Xiang, T.; Wang, Y.; Ruan, H.; Zhang, X. A Reinforcement Learning Handover Parameter Adaptation Method Based on LSTM-Aided Digital Twin for UDN. Sensors 2023, 23, 2191. [Google Scholar] [CrossRef]
  26. Ortiz, M.T.; Sallent, O.; Camps-Mur, D.; Escrig, J.; Nasreddine, J.; Pérez-Romero, J. On the Application of Q-learning for Mobility Load Balancing in Realistic Vehicular Scenarios. In Proceedings of the 2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring), Florence, Italy, 20–23 June 2023; IEEE: Florence, Italy, 2023; pp. 1–7. [Google Scholar] [CrossRef]
  27. Karmakar, R.; Kaddoum, G.; Chattopadhyay, S. Mobility Management in 5G and Beyond: A Novel Smart Handover With Adaptive Time-to-Trigger and Hysteresis Margin. IEEE Trans. Mob. Comput. 2023, 22, 5995–6010. [Google Scholar] [CrossRef]
  28. Sundararaju, S.C.; Ramamoorthy, S.; Basavaraj, D.P.; Phanindhar, V. Advanced Conditional Handover in 5G and Beyond Using Q-Learning. In Proceedings of the 2024 IEEE Wireless Communications and Networking Conference (WCNC), Dubai, United Arab Emirates, 21–24 April 2024; pp. 1–6. [Google Scholar] [CrossRef]
  29. Thillaigovindhan, S.K.; Roslee, M.; Mitani, S.M.I.; Osman, A.F.; Ali, F.Z. A Comprehensive Survey on Machine Learning Methods for Handover Optimization in 5G Networks. Electronics 2024, 13, 3223. [Google Scholar] [CrossRef]
  30. Al Harthi, F.R.A.; Touzene, A.; Alzidi, N.; Al Salti, F. Context-Aware Enhanced Application-Specific Handover in 5G V2X Networks. Electronics 2025, 14, 1382. [Google Scholar] [CrossRef]
  31. Tayyab, M.; Gelabert, X.; Jantti, R. A Survey on Handover Management: From LTE to NR. IEEE Access 2019, 7, 118907–118930. [Google Scholar] [CrossRef]
  32. Kassler, A.; Castro, M.; Dely, P. VoIP Packet Aggregation based on Link Quality Metric for Multihop Wireless Mesh Networks. Int. J. Comput. Sci. Eng. 2011, 3, 2323–2331. [Google Scholar]
  33. Signal Quality [LTE/5G ]—LTE and 5G Signal Quality Parameters, Zyxel Support Campus EMEA. 2019. Available online: https://support.zyxel.eu/hc/en-us/articles/360005188999-Signal-quality-LTE-5G-LTE-and-5G-signal-quality-parameters (accessed on 20 February 2025).
  34. Ullah, Y.; Roslee, M.B.; Mitani, S.M.; Khan, S.A.; Jusoh, M.H. A Survey on Handover and Mobility Management in 5G HetNets: Current State, Challenges, and Future Directions. Sensors 2023, 23, 5081. [Google Scholar] [CrossRef] [PubMed]
  35. BER vs. PER: What’s the Difference? Available online: https://www.test-and-measurement-world.com/measurements/general/ber-vs-per-understanding-bit-error-rate-and-packet-error-rate (accessed on 20 February 2025).
  36. Clancy, J.; Mullins, D.; Ward, E.; Denny, P.; Jones, E.; Glavin, M.; Deegan, B. Investigating the Effect of Handover on Latency in Early 5G NR Deployments for C-V2X Network Planning. IEEE Access 2023, 11, 129124–129143. [Google Scholar] [CrossRef]
  37. Coll-Perales, B.; Lucas-Estañ, M.C.; Shimizu, T.; Gozalvez, J.; Higuchi, T.; Avedisov, S.; Altintas, O.; Sepulcre, M. End-to-End V2X Latency Modeling and Analysis in 5G Networks. IEEE Trans. Veh. Technol. 2023, 72, 5094–5109. [Google Scholar] [CrossRef]
Figure 1. Q-Learning for V2X communications.
Figure 2. Flowchart of the Q-Learning implementation.
Figure 3. Sample scenario of a network map for a UE moving at random velocities.
Figure 4. Handover performance results of ACHQ algorithm vs. UE velocities.
Figure 5. Handover performance results of CAP algorithm vs. UE velocities.
Figure 6. Handover performance results of IHD-V2X algorithm vs. UE velocities.
Figure 7. Average results analysis for different algorithms for UEs moving at various velocities.
Figure 8. Handover performance results of ACHQ algorithm vs. SC density.
Figure 9. Handover performance results of CAP algorithm vs. SC density.
Figure 10. Handover performance results of IHD-V2X algorithm vs. SC density.
Figure 11. Comparative results analysis for different algorithms with diverse SC densities.
Figure 12. Comparative analysis of packet losses and latency vs. various UE mobilities.
Figure 13. Comparative analysis of packet losses and latency vs. various SC densities.
Figure 14. Average energy expenditure against velocity and SC density.
Table 1. V2X access technologies.
DSRC (Dedicated Short Range Communication) | C-V2X (Cellular V2X)
Released in 2010 and deployed in 2017. | Deployed for LTE direct C-V2X in 2016 and for indirect C-V2X in 2024.
Standardized by IEEE 802.11p and IEEE 802.11bd. | Defined by 3GPP.
Primarily operates in the 5.9 GHz band. | Can operate in the 5.9 GHz band and in the operator-licensed band.
Supports Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communications. | Supports inter-vehicle (V2V), pedestrian (V2P), infrastructure (V2I), and network (V2N) communications.
Table 2. Comparative analysis of HO RL-based approaches.
Columns: Ref. | Algorithm | Rewarding | Network type | Input parameters | Optimized parameters | HO decision | Mobility model | Performance KPIs | Result/performance | Key contributions | Limitations
[22] (IEEE 2021) | Distributed RL | Adjustment of TTT and CIO | 5G (Xn interface) | RSRP | TTT and hysteresis | Hybrid | Random waypoint | RLF and PP | Adapted twenty-four times faster and improved the user satisfaction rate by 417.7% compared with the non-ML algorithm. | Proposed an ML-based mobility robustness optimization algorithm to minimize unnecessary handovers. | The UDN paradigm is not considered, and infrequently updated centralized databases can lead to wrong decisions.
[24] (IEEE 2022) | Double DQN (DDQN) | Maximize cumulative RSRP-based reward | LTE | RSRP | TTT and hysteresis | Centralized | Google Maps Directions API | Packet loss (PL) | Outperformed the traditional network approach, reducing packet loss by 25.72% per HO. | Designed an ML-based HO algorithm for the V2N network using a real dataset collected from the city of Glasgow (UK). | The execution agent is in the core network, adding signaling overhead to the network.
[25] (Sensors 2023) | DQN integrated with LSTM | Coefficient between RLF and PP | UDN | RSRP | TTT and hysteresis | Mobile terminal | NS | RLF and PP | Enhanced DQN with a digital twin outperforms the baseline DQN by achieving an effective handover rate of 2.7%. | Proposed an ML optimization method to predict handover parameters/thresholds based on the conditions of the environment. | Not involved in the HO decision, leading to insufficient information for critical dynamic use cases such as V2X.
[26] (IEEE 2023) | Q-Learning | Load reward adjustment strategy | LTE | RSRP | TTT and hysteresis | Distributed (RAT) | Poisson distribution centered | Throughput and PLR | Reduced the average overload time by 91.87% and distributed the load sufficiently among cells. | Proposed a Q-Learning strategy that addresses the cell overload problem while serving the QoS needs. | Implemented only for two neighboring cells; needs dynamic environmental factors (such as user mobility).
[27] (IEEE 2023) | SARSA | RSRQ | 5G NR module | RSRP and speed | TTT and hysteresis | Centralized | Constant speed | Throughput and RLF | Minimized HO failures to between 6 and 10% and maintained throughput for 80% of the total connections. | Designed an adaptive online learning-based handover mechanism (LIM2). | Load balancing is not considered.
[28] (IEEE 2024) | Q-Learning | RSRP | 5G networks | RSRP and speed | TTT and hysteresis | Centralized | Uniform distribution | Handover rate | Reduced the HO rate by 40%. | Enhanced 5G mobility by giving the UE the ability to self-optimize handover decisions. | Load balancing is not considered.
IHD-V2X (this work) | Q-Learning integrated with TOPSIS decision | Parameter weightage adjustment | 5G V2X | RSSI, SINR, Bit Error Rate (BER), data rate, delay, PL | Handover decision | Distributed | Random waypoint | HO KPIs, PLR, latency, and energy consumption | Minimizes HO issues and improves PLR, latency, and energy consumption. | Intelligence-based approach considering QoS requirements. | The algorithm's effectiveness remains to be evaluated with real-time traffic.
NS: not specified; HO KPIs: HO attempts, failed HOs, delayed HOs, ping-pong HOs, prevented HOs.
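To make the TOPSIS step referenced in the IHD-V2X row of Table 2 concrete, the sketch below scores candidate small cells from weighted, normalized metrics and picks the closest to the ideal point. It is a minimal illustration only: the metric values, weights, and benefit/cost designations are invented for the example and are not taken from the paper's implementation.

```python
import numpy as np

# Candidate small cells (rows) scored on example metrics (columns).
# Metric order: SINR, data rate, delay, packet loss -- illustrative values only.
X = np.array([
    [18.0, 250.0, 12.0, 0.02],   # SC A
    [22.0, 180.0, 20.0, 0.01],   # SC B
    [15.0, 320.0,  9.0, 0.05],   # SC C
])

weights = np.array([0.3, 0.3, 0.2, 0.2])          # assumed weightage
benefit = np.array([True, True, False, False])    # higher-is-better flags (assumed)

# 1. Vector normalization of the decision matrix.
R = X / np.linalg.norm(X, axis=0)

# 2. Apply criterion weights.
V = R * weights

# 3. Ideal (best) and anti-ideal (worst) points per criterion.
ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

# 4. Closeness coefficient: distance to anti-ideal over total distance.
d_pos = np.linalg.norm(V - ideal, axis=1)
d_neg = np.linalg.norm(V - anti, axis=1)
closeness = d_neg / (d_pos + d_neg)

best = int(np.argmax(closeness))
print(f"TOPSIS closeness per SC: {closeness.round(3)}, best candidate index: {best}")
```

In a handover setting, the closeness coefficients could feed the reward shaping or the final target-cell ranking; how IHD-V2X couples this score to the Q-Learning agent is described in the paper itself, not in this sketch.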
Table 3. Strengths and weaknesses of RL algorithms.
RL Algorithm | Strengths | Weaknesses
Q-Learning
Strengths: A model-free algorithm that identifies the optimal policy by learning through trial and error. The epsilon-greedy policy is applied to balance exploration and exploitation.
Weaknesses:
  • Scalability problems: because the Q-table grows with the number of states and actions, Q-Learning has trouble handling vast or continuous state and action spaces.
  • Restricted generalization: Q-Learning does not generalize across states. Since every state-action pair is handled separately, inefficiencies arise when similar states are present.
Deep Q-Learning
Strengths: Combines an Artificial Neural Network (ANN) with Q-Learning and is suitable for decision-making in complex environments.
Weaknesses:
  • Complexity: DQN is more difficult to implement and requires substantially more processing and memory capacity. DQN converges more slowly because a neural network must be trained, particularly in situations with sparse input.
  • Sample inefficiency: DQN frequently needs a large amount of training data to perform well, which is difficult in settings where data collection is costly or slow. DQN may also experience stability problems during training, such as fluctuating policies or divergent Q-values.
Distributed Reinforcement Learning
Strengths: The training process is accelerated because the tasks of acting and learning are decoupled.
Weaknesses:
  • Coordination complexity: coordination between different agents can slow down the learning process.
  • Communication overhead: large amounts of information sharing reduce network efficiency and increase communication costs.
  • Communication complexity: can result in intensive resource utilization.
State–Action–Reward–State–Action (SARSA)
Strengths: Simple to implement and does not require modeling the environment.
Weaknesses:
  • Suboptimal decisions: the on-policy learning algorithm follows the current policy and depends on the agent's current behavior, which can limit its ability to explore better actions.
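As a concrete illustration of the exploration-exploitation balance noted for Q-Learning in Table 3, the minimal sketch below performs one epsilon-greedy action selection and one tabular Q-value update. The learning rate, discount factor, and exploration rate follow the values listed in Table 4; the state and action sizes, the encodings, and the reward value are placeholders invented for the example and do not reproduce the paper's state or reward design.

```python
import random

alpha, gamma, epsilon = 0.1, 0.9, 0.9   # values as listed in Table 4
n_states, n_actions = 4, 2              # toy sizes for the sketch only
Q = [[0.0] * n_actions for _ in range(n_states)]

def choose_action(state: int) -> int:
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[state][a])

def update(state: int, action: int, reward: float, next_state: int) -> None:
    """Standard tabular Q-Learning update rule."""
    best_next = max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

# One illustrative transition; the reward and states are arbitrary placeholders.
s = 0
a = choose_action(s)
update(s, a, reward=1.0, next_state=1)
print(Q[s])
```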
Table 4. Simulation settings.
Parameter | Value
Network area | 100 × 100 km²
Network sectors | 25
Data rate | 100–400 Mbps (mid-band 5G)
Q-Learning parameters | Learning rate α = 0.1; discount factor γ = 0.9; exploration rate ε = 0.9
Network configuration | Queue delay = 5 ms; Radio Resource Control (RRC) time = 20 ms; path switch time = 0.03 ms; RACH procedure time = 0.01 ms
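For orientation, the sketch below combines the timing components listed in Table 4 into a single per-handover control-plane delay. Treating the total as a simple sum of the four configured values is an assumption made for this illustration; it is not the paper's exact latency model.

```python
# Timing components as configured in Table 4 (milliseconds).
QUEUE_DELAY_MS = 5.0
RRC_TIME_MS = 20.0
PATH_SWITCH_MS = 0.03
RACH_TIME_MS = 0.01

def handover_control_delay_ms() -> float:
    """Sum the configured timing components for one handover event (assumed model)."""
    return QUEUE_DELAY_MS + RRC_TIME_MS + PATH_SWITCH_MS + RACH_TIME_MS

if __name__ == "__main__":
    print(f"Per-handover control delay: {handover_control_delay_ms():.2f} ms")  # 25.04 ms
```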
Table 5. IHD-V2X sample output of the handover decision for a UE moving at random velocities.
Mov ID (M): 4
Sector ID No (Secn): 24
No. of SCs in Secn: 30
Maximum Load: 20
Velocity (V): 117.84
Context: Audio
SC | PI | SC load | ST | Status | Action
SC90.258674345151Successful HOsNecessary
SC30.29638139680.9915143Unnecessary
SC210.30872382690.9830286Unnecessary
SC230.309375018160.9745429Unnecessary
SC260.37587870430.9660573Unnecessary
SC10.406930550.9575716Unnecessary
SC50.40962158440.9490859Unnecessary
SC160.41833932760.9406003Unnecessary
SC290.428719489100.9321146Unnecessary
SC190.43514181350.9236289Unnecessary
SC40.457167959130.9151433Unnecessary
SC140.521482773180.9066576Unnecessary
SC220.53638102200.8981719Unnecessary
SC110.5505552640.8896862Unnecessary
SC120.58260824680.8812006Unnecessary
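The sketch below mirrors the ordering visible in Table 5, where the candidate with the lowest Performance Index (PI) is the one associated with a necessary handover and the remaining candidates are flagged as unnecessary. The PI values are rounded examples, and the "lowest PI wins" rule is an assumption made for illustration, not the published decision logic.

```python
# Hypothetical re-ranking of candidate small cells by Performance Index (PI).
candidates = {"SC9": 0.2587, "SC3": 0.2964, "SC21": 0.3087}  # example PI values

ranked = sorted(candidates.items(), key=lambda kv: kv[1])    # ascending PI
decisions = [
    (sc, "Necessary" if rank == 0 else "Unnecessary")
    for rank, (sc, _) in enumerate(ranked)
]
print(decisions)  # [('SC9', 'Necessary'), ('SC3', 'Unnecessary'), ('SC21', 'Unnecessary')]
```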
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
