Location-Based Handover with Particle Filter and Reinforcement Learning (LBH-PRL) for Mobility and Service Continuity in Non-Terrestrial Networks (NTN)

Chen, Li-Sheng; Liao, Shu-Han; Cho, Hsin-Hung

doi:10.3390/electronics14081494


by Li-Sheng Chen ¹,*, Shu-Han Liao ² and Hsin-Hung Cho

¹ Department of Computer Science and Information Engineering, National Ilan University, Yilan 26047, Taiwan

² Department of Electronic and Computer Engineering, Tamkang University, New Taipei City 251301, Taiwan

* Author to whom correspondence should be addressed.

Electronics 2025, 14(8), 1494; https://doi.org/10.3390/electronics14081494

Submission received: 7 January 2025 / Revised: 21 March 2025 / Accepted: 26 March 2025 / Published: 8 April 2025

(This article belongs to the Special Issue New Advances in Machine Learning and Its Applications)


Abstract

In high-mobility non-terrestrial networks (NTN), the reference signal received power (RSRP)-based handover (RBH) mechanism is often unsuitable due to its limitations in handling dynamic satellite movements. RSRP, a key metric in cellular networks, measures the received power of reference signals from a base station or satellite and is widely used for handover decision-making. However, in NTN environments, the high mobility of satellites causes frequent RSRP fluctuations, making RBH ineffective in managing handovers, often leading to excessive ping-pong handovers and a high handover failure rate. To address this challenge, we propose an innovative approach called location-based handover with particle filter and reinforcement learning (LBH-PRL). This approach integrates a particle filter to estimate the distance between user equipment (UE) and NTN satellites, combined with reinforcement learning (RL), to dynamically adjust hysteresis, time-to-trigger (TTT), and handover decisions to better adapt to the mobility characteristics of NTN. Unlike the location-based handover (LBH) approach, LBH-PRL introduces adaptive parameter tuning based on environmental dynamics, significantly improving handover decision-making robustness and adaptability, thereby reducing unnecessary handovers. Simulation results demonstrate that the proposed LBH-PRL approach significantly outperforms conventional LBH and RBH mechanisms in key performance metrics, including reducing the average number of handovers, lowering the ping-pong rate, and minimizing the handover failure rate. These improvements highlight the effectiveness of LBH-PRL in enhancing handover efficiency and service continuity in NTN environments, providing a robust solution for intelligent mobility management in high-mobility NTN scenarios.

Keywords:

low earth orbit (LEO); non-terrestrial networks (NTN); handover; particle filter; reinforcement learning

1. Introduction

The upcoming era of mobile communication, 6G, aims to address the constraints of its predecessor. Recognizing the need for a more inclusive approach, organizations such as the International Telecommunication Union (ITU) and the 3rd Generation Partnership Project (3GPP) are intensively working on establishing standards for non-terrestrial networks (NTN). Satellite-based communication has emerged as a key area of interest, offering the potential to revolutionize global connectivity and extend the reach of 6G beyond Earth’s surface. Low earth orbit (LEO) satellites stand out in particular due to their reduced development and launch costs, facilitated by satellite miniaturization and reusable launch vehicles. Additionally, LEO satellites offer significantly lower propagation delays compared to geostationary orbit (GEO) satellites, making them a viable alternative for communication systems [1,2]. However, unlike GEO satellites, LEO satellites face frequent handover events due to their higher velocity and lower altitude. This presents unique challenges for mobility management within LEO satellite networks.

NTN encompasses a broad range of communication scenarios, including satellite-to-terrestrial communication, where user equipment (UE) connects directly to satellites, and satellite-to-satellite communication, which facilitates inter-satellite data transfer. This study focuses specifically on satellite-to-terrestrial communication, where UE interacts with LEO satellites for mobility management and service continuity. Hybrid architectures integrating NTN with terrestrial networks (TN) represent a significant research area; however, they fall outside the scope of this work. By narrowing our focus to satellite-to-terrestrial communication, we aim to analyze and optimize handover mechanisms that directly impact NTN mobility performance. Despite the frequent handovers they incur, LEO satellites are gaining prominence for their shorter round-trip time relative to both GEO and medium Earth orbit (MEO) satellites. Recent research has focused on improving handover mechanisms for LEO-based communication networks. For instance, in [3], a handover algorithm was proposed to optimize satellite selection and handover strategy. Additionally, ref. [4] analyzed the impact of time-to-trigger (TTT) and handover margin on LEO-based NTN handover performance. In [5], a method for integrating TN with NTN during handover was introduced. Furthermore, ref. [6] outlined a handover procedure between high altitude platforms (HAPs) and LEO satellites, proposing a strategy for performance optimization.

Various algorithms and mechanisms for LEO satellite handover have been studied. For example, ref. [7] reviewed the latest trends in handover algorithms for LEO satellites, while [8] evaluated handover performance using the A4 event for measurement reporting. Notably, ref. [9] proposed a novel mobility management approach that eliminates traditional handover mechanisms by introducing a cell-free, on-demand coverage model, addressing the cost inefficiencies inherent in conventional cellular architectures. In [10], the authors presented an Area-Based Mobility Management (ABMM) scheme specifically tailored for LEO satellite networks. This scheme leverages the Global Positioning System (GPS) to mitigate the challenges posed by LEO satellites’ high velocity, which often results in elevated mobility management costs and degraded Quality of Service (QoS) compared to terrestrial networks. Their algorithm effectively addressed these issues. Additionally, ref. [11] investigated LEO-based NTN deployments, focusing on innovative mobility solutions and evaluating performance under diverse scenarios, including urban macro environments and high-speed train use cases. To counteract frequent handovers induced by moving satellite beams, ref. [12] proposed a fixed-duration strategy tailored for LEO Earth-fixed configurations. Similarly, ref. [13] introduced a learning-based auction handover algorithm that optimizes handover decisions by considering factors such as received signal strength and service time. Meanwhile, ref. [14] proposed a reimagined core network architecture that embeds mobility management functionality directly into the satellite, enhancing the adaptability and efficiency of Satellite-Terrestrial Integrated Networks (STIN). Mobility management is critical for ensuring service continuity and optimizing handover performance. 
Despite the work in [15] addressing mobility management challenges in NTNs, most previous studies have not thoroughly considered multiple parameters for handover mobility management in these networks. Additionally, the A3 event for measurement reporting has been extensively studied, as NTN environments often do not experience significant drops in signal strength. In certain NTN scenarios, location-based handovers may outperform reference signal received power (RSRP)-based handovers. Artificial intelligence (AI) techniques have gained significant attention in NTN due to their capability to handle dynamic and complex mobility scenarios. A comprehensive survey on AI-driven approaches for satellite communication and NTNs is provided in [16], where various machine learning (ML) and deep learning (DL) techniques are explored for enhancing network operations, including mobility management. The study highlights the growing role of AI-based solutions in optimizing satellite handover strategies, beamforming, and resource allocation. Unlike conventional rule-based methods, AI models can adapt to real-time environmental changes and improve handover decision-making. This underscores the potential of integrating AI-driven approaches, such as reinforcement learning (RL), with traditional mobility management techniques to enhance the adaptability and robustness of NTN systems.

The NTN is designed to leverage LEO satellites operating at altitudes between 600 km and 1200 km, traveling at an impressive speed of approximately 7.5 km/s relative to the Earth’s surface. The primary aim is to extend 5G New Radio (NR) services globally by utilizing multiple satellite beams. This introduces a unique and dynamic scenario that demands significant adaptations to conventional mobility management strategies, as discussed in [17]. A key concept in this context is Earth Moving Cells (EMC), where each satellite beam’s coverage area shifts along the satellite’s trajectory across the Earth’s surface. This continuous motion leads to frequent handover events; for instance, a satellite orbiting at 600 km altitude with a beam diameter of 50 km may necessitate a handover approximately every 5 s. Such frequent handovers exacerbate signaling overhead, and when combined with measurement inaccuracies and negligible signal strength differences between overlapping cells, the risk of service failures and extended interruptions increases significantly. As shown in [18], the UE-assisted network-controlled baseline handover (BHO) struggles to maintain robust service quality due to delayed handover decisions. Meanwhile, findings in [19] highlight that the Rel-16 conditional handover (CHO) reduces radio link failures (RLFs) and handover failures through earlier handover preparations. However, this comes at the expense of a 60% increase in signaling overhead and measurement reporting compared to the BHO. Both approaches rely solely on UE-provided measurements, which directly affect the timing and accuracy of handover triggers and decisions. While these measurement-based handover methods are effective and widely adopted in terrestrial networks due to their simplicity, they fall short of meeting the unique demands of NTN’s dynamic environment. 
Nevertheless, for NTN implementations, the 3GPP has recommended in reference [17] the incorporation of additional triggering criteria, as a purely RSRP-based handover (RBH) triggering—where RSRP is defined as the received power of reference signals—may not suffice to achieve suitable mobility performance.
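As a rough check on the handover interval quoted above for a 50 km beam, the following back-of-the-envelope estimate may help. It assumes a circular beam footprint of diameter D = 50 km sweeping over a stationary UE at the v ≈ 7.5 km/s speed stated in the text; the uniformly distributed lateral offset of the UE from the beam centre line is an assumption introduced here purely for illustration.

```latex
% Longest dwell time: the UE lies on the beam centre line and sees the full diameter.
T_{\max} = \frac{D}{v} = \frac{50\ \text{km}}{7.5\ \text{km/s}} \approx 6.7\ \text{s}

% Mean chord length for a lateral offset x uniform on [-R, R], with R = D/2:
\bar{L} = \frac{1}{2R} \int_{-R}^{R} 2\sqrt{R^{2} - x^{2}}\, dx
        = \frac{\pi R}{2} = \frac{\pi D}{4} \approx 39.3\ \text{km}

% Mean dwell time per beam:
\bar{T} = \frac{\bar{L}}{v} \approx \frac{39.3\ \text{km}}{7.5\ \text{km/s}} \approx 5.2\ \text{s}
```

Under these assumptions the mean dwell time per beam is close to the 5 s figure, while a UE on the centre line would stay in the beam slightly longer.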

The successful implementation of ML solutions for mobile network automation has driven its automated applications in 5G technology, as it provides a solution for achieving the desired adaptability with low management costs. The authors in [20] applied a method called ε-greedy Q-learning aimed at learning the suitable handover strategy that maximizes the expected future throughput based on pedestrian location and speed. The authors in [21] proposed a method to improve conditional HO techniques by preparing the target cell for potential upcoming handover through predictions. Their results indicate promising improvements in reducing the Mitigation Time (MIT). The work in [22] focuses on using Long Short-Term Memory (LSTM) supervised machine learning techniques for multi-user, multi-step trajectory prediction. In [23], the authors proposed a method to enhance handover performance in millimeter-wave vehicular networks by leveraging historical handover data combined with basic machine learning algorithms. This approach aimed to identify patterns and correlations between vehicle status parameters at the time of the handover request and the resulting handover decision. While effective for the targeted vehicular network scenarios, their solution is highly scenario-specific and lacks the adaptability required for more complex and diverse contexts, such as broader 5G, 6G, or NTN environments. The integration of NTNs into 6G enables global connectivity but introduces challenges such as latency, mobility, and resource allocation. To address these, ref. [24] reviews AI techniques, including ML, DL, and DRL, for optimizing NTN operations like network planning and interference management while also discussing security concerns. Similarly, ref. [25] highlights AI’s role in mitigating satellite NTN challenges such as Doppler effects, handovers, and spectrum sharing, emphasizing AI-driven task offloading and network slicing. 
Both studies underscore AI’s potential to enhance NTN efficiency while acknowledging ongoing research challenges. In [26], Wang et al. proposed a deep reinforcement learning-based satellite handover scheme to optimize handover decisions in satellite communications. Their work demonstrated the potential of DRL in improving handover efficiency by learning from past mobility patterns. However, their approach did not refine the obtained location information, which could lead to inaccuracies in handover decisions. Compared to methods utilizing Kalman filters or particle filters for precise location estimation, their approach may suffer from reduced stability in high-mobility NTN environments. Wang et al. investigated inter-satellite mobility management and network topology design and proposed an inter-satellite laser link planning framework to ensure reliable topology in optical satellite networks [27]. While their study makes significant contributions to NTN network architecture design, it does not directly address the handover challenges between LEO satellites and UE in satellite-to-terrestrial communication, nor does it tackle mobility management issues in ground-to-satellite connectivity. In [28], Juan et al. provided a comprehensive survey on handover solutions for 5G LEO satellite networks, analyzing various handover mechanisms, including RSRP-based handover and location-based handover. They also proposed an antenna gain-based handover, which leverages the predictability of satellite trajectories and the antenna gain characteristics of satellite beams, allowing UE to operate without relying on traditional radio measurements. The particle filters, also known as sequential Monte Carlo (SMC) methods, are a powerful tool for estimating dynamic systems that evolve over time [29]. They are widely used in scenarios where the system state is not directly observable and measurements are noisy or uncertain. 
A particle filter represents the probability distribution of a system’s state using a set of weighted particles, which are propagated and updated iteratively based on system dynamics and measurement data [30]. RL is a branch of machine learning that enables an agent to learn and adapt its decision-making process through continuous interaction with its environment. By exploring different actions and observing their outcomes, the agent aims to develop a strategy that maximizes the total reward accumulated over time [31]. Unlike supervised learning, RL does not require labeled data; instead, it relies on feedback from the environment in the form of rewards or penalties [32]. Combining particle filters with reinforcement learning leverages the strengths of both methods: particle filters provide accurate system state estimates, which serve as inputs to the RL framework, and RL uses these estimates to optimize decisions such as handover parameter tuning in NTN. This integration enhances the adaptability and robustness of systems operating in dynamic and uncertain environments.

Building on the AI-driven NTN mobility management research outlined in [16], this work integrates machine learning techniques, specifically reinforcement learning, into the handover decision-making process to enhance mobility performance in NTN. Our study leverages AI’s adaptability to real-time mobility patterns, aiming to minimize unnecessary handovers while ensuring seamless service continuity in dynamic satellite environments. In existing research on NTN mobility management, RL and particle filtering (PF) have been applied separately in different scenarios. However, the LBH-PRL approach proposed in this study differs significantly in both methodology and application. Compared to conventional RL-driven NTN handover strategies, our approach not only employs RL for optimizing handover decisions but also introduces an adaptive parameter adjustment mechanism that dynamically adjusts hysteresis, TTT, and handover decisions based on environmental variations, making it more suitable for different NTN scenarios. Existing RL methods typically use fixed or offline-optimized parameters, which limits their adaptability in dynamic environments. By contrast, our approach enhances robustness and improves decision-making efficiency in highly dynamic NTN conditions. Additionally, while particle filtering is commonly used in navigation and tracking systems, its application in NTN handover decision-making has been limited. This study is the first to integrate PF with RL to enhance distance estimation accuracy and reduce unnecessary handovers in NTN networks. PF provides more accurate position estimations for RL, while RL dynamically adjusts the handover strategy based on PF estimations, thereby improving decision accuracy and efficiency in NTN environments. Furthermore, the LBH approaches primarily rely on static thresholds for handover decisions, which lack adaptability to varying network conditions. 
Our approach, however, employs RL for adaptive adjustments, improving its ability to respond to changing NTN conditions and enhancing robustness in decision-making. Compared to conventional RSRP-based or static-threshold LBH approaches, LBH-PRL exhibits superior adaptability across diverse NTN network conditions, maintaining high handover efficiency while significantly reducing ping-pong handovers and increasing handover success rates. Thus, the primary innovation of this approach lies in the integration of PF and RL to develop an adaptive NTN handover strategy, where PF enhances RL learning efficiency by providing precise location estimations, ultimately leading to a dynamic and highly efficient NTN mobility management solution.

To enhance the performance of NTN handovers, we propose an approach called location-based handover with particle filter and reinforcement learning (LBH-PRL). This approach uses a particle filter to estimate the distance between the UE and NTN satellites, and then applies reinforcement learning to determine suitable hysteresis, TTT, and handover decisions for location-based handover (LBH) mechanisms in NTN networks. In simulations, LBH-PRL outperforms RSRP-based handover and location-based handover approaches across key performance metrics, including the average number of handovers, average ping-pong rate, and handover failure rate. The remainder of this paper is organized as follows: Section 2 describes the handover issues in TN and NTN networks, Section 3 explains the proposed machine learning-based NTN handover approach, Section 4 covers the simulation and performance evaluation, and Section 5 concludes the paper.

2. Handover Issues in TN and NTN Networks

Handover is a crucial mechanism in wireless communication systems that ensures seamless connectivity as UE moves between different network cells. In TN, handovers are primarily triggered based on signal strength measurements, while in NTN, additional challenges arise due to satellite mobility, long propagation delays, and frequent cell transitions. The process of handover has been well standardized by the 3GPP to maintain service continuity across network boundaries. However, the suitability of these standardized handover mechanisms varies depending on the environment in which they are applied. In conventional cellular networks, 3GPP specifies a structured handover procedure to facilitate smooth transitions between cells. As shown in Figure 1, the process begins when the serving cell continuously monitors the UE’s measurements of neighboring cell signals. The UE periodically reports signal quality indicators, including the RSRP, to the serving cell. When the signal strength of a neighboring cell surpasses that of the serving cell by a predefined margin, and this condition persists beyond a TTT threshold, the network initiates a handover decision. The serving cell then sends a handover request to the target cell, which evaluates the request and responds with a handover acknowledgment if resources are available. Upon approval, the serving cell issues a handover command to the UE, instructing it to disconnect from the current cell and establish a new connection with the target cell. Following a successful handover, the target cell confirms completion, and the serving cell releases the context of the UE’s previous session. This sequence ensures a controlled and efficient transition, minimizing service disruptions.

Despite its effectiveness in terrestrial networks, the 3GPP handover procedure faces several challenges in NTN scenarios. Unlike stationary terrestrial base stations, NTN employs moving satellites that introduce dynamic signal variations, resulting in frequent handovers. The reliance on RBH mechanisms in such an environment can lead to inefficient mobility management due to the fluctuating nature of signal strength in satellite communications. To address these limitations, LBH has been explored as an alternative approach, leveraging satellite trajectory information to predict handover events rather than relying solely on signal strength measurements. RSRP-based handover is a widely adopted approach in terrestrial networks, where handover decisions are made based on the UE’s received signal strength. As illustrated in Figure 1, the UE continuously measures RSRP from the serving and neighboring cells, reporting this information to the network. When the RSRP of a neighboring cell becomes stronger than that of the serving cell for a sustained period, the network triggers a handover. This approach is effective in TN environments, where signal propagation is relatively stable, and handover decisions are predominantly influenced by physical obstructions and distance-based path loss.

RSRP is a key metric used in long-term evolution (LTE) and 5G NR networks to assess signal quality and determine the optimal cell selection and handover timing. It represents the received power of reference signals transmitted by the base station (or satellite in NTN). Mathematically, RSRP is defined as the linear average of the received power from N reference signal resource elements (RSREs) within the same subcarrier and is expressed as follows:

{RSRP} = \frac{1}{N} \sum_{i = 1}^{N} P_{i}

(1)

where the following apply:

P_{i}: Power of the i-th reference signal resource element.

N: Total number of resource elements used for transmitting the reference signal.

RSRP is typically measured in dBm and is used as a fundamental input for mobility decisions. The higher the RSRP value, the better the received signal strength from a particular cell. A handover is usually triggered when the difference in RSRP values between the target and serving cells exceeds a handover margin (hysteresis) and remains stable for the TTT period.
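The averaging in Equation (1) and the hysteresis/TTT trigger rule described above can be sketched as follows. This is a minimal illustration rather than the 3GPP procedure: the function names, the fixed sampling period, and the choice to reset the timer whenever the condition is violated are assumptions made for this example.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw):
    """Convert a power level from milliwatts to dBm."""
    return 10 * math.log10(p_mw)


def rsrp_dbm(re_powers_dbm):
    """Eq. (1): linear average of per-resource-element powers, returned in dBm."""
    linear = [dbm_to_mw(p) for p in re_powers_dbm]
    return mw_to_dbm(sum(linear) / len(linear))


def handover_triggered(serving_dbm, target_dbm, hysteresis_db, ttt_ms, sample_period_ms):
    """Return the index of the sample at which the handover fires, or None.

    The condition target - serving > hysteresis must hold on consecutive
    samples for at least ttt_ms; any violation resets the timer (assumption).
    """
    elapsed_ms = None  # None means the TTT timer is not running
    for idx, (s, t) in enumerate(zip(serving_dbm, target_dbm)):
        if t - s > hysteresis_db:
            # Start the timer on the first satisfying sample, then accumulate.
            elapsed_ms = 0 if elapsed_ms is None else elapsed_ms + sample_period_ms
            if elapsed_ms >= ttt_ms:
                return idx
        else:
            elapsed_ms = None
    return None
```

Averaging is done in the linear (mW) domain because Equation (1) is a linear average; averaging dBm values directly would give a different result. Time is kept in integer milliseconds to avoid floating-point drift in the timer.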

However, in NTN environments, RSRP-based handover encounters significant limitations. The rapid movement of satellites, particularly in LEO systems, causes frequent changes in signal strength. As satellites pass over a given location, the RSRP of the serving satellite may decline rapidly while that of a neighboring satellite increases. These rapid transitions often result in frequent handovers, sometimes occurring within seconds, leading to excessive signaling overhead and increased network congestion. Additionally, the inherent propagation delay in NTN networks exacerbates the issue, as handover decisions based on outdated RSRP measurements may no longer be optimal by the time they are executed. The ping-pong effect, where a UE rapidly switches back and forth between satellites due to fluctuating RSRP levels, further degrades network efficiency and increases the likelihood of service interruptions. Another major concern with RSRP-based handover in NTN is the impact of the Doppler shift, which alters the perceived frequency of received signals as satellites move relative to the UE. This frequency shift affects the accuracy of RSRP measurements, potentially leading to incorrect handover decisions. As demonstrated in Figure 1, the reliance on signal-based triggers alone results in a reactive mobility management approach, which is inefficient in the presence of frequent satellite transitions. As a result, while RSRP-based handover remains a viable solution in TN networks, its application in NTN is challenged by satellite mobility, frequent signal variations, and long propagation delays, necessitating alternative handover strategies. To address these challenges, LBH offers an alternative approach by utilizing geographical positioning and satellite trajectory information to determine the optimal handover timing. 
Unlike RSRP-based handover, which reacts to signal strength changes, LBH proactively schedules handovers based on predicted UE movement and satellite coverage areas. By leveraging GNSS (Global Navigation Satellite System) positioning and predefined coverage maps, handovers can be performed before signal degradation occurs, reducing the reliance on real-time RSRP fluctuations. In NTN networks, LBH is particularly advantageous due to the predictable orbital paths of satellites. Since the movement of satellites follows well-defined trajectories, network operators can pre-calculate handover events based on UE location and satellite visibility. This predictive capability reduces unnecessary handovers and minimizes the ping-pong effect observed in RSRP-based approaches. While RSRP-based handover is widely used in terrestrial networks, its dependence on real-time signal strength makes it less effective in NTN due to frequent satellite movement, long propagation delays, and Doppler shift effects. In contrast, location-based handover offers a more predictive mechanism, leveraging satellite trajectory data to optimize transition timing. These findings underscore the need for hybrid handover strategies that integrate signal-based metrics, predictive mobility models, and adaptive learning techniques to optimize NTN mobility management. In the following sections, we describe the more advanced handover optimization approach we have proposed, including reinforcement learning-based adaptive handover strategies to enhance NTN network efficiency and service reliability.

3. Proposed Approach

In this section, we propose a hybrid algorithm for NTN handover optimization. First, we dynamically estimate the distance between the UE and the satellite using a particle filter-based method. Subsequently, we employ RL to adaptively optimize the handover parameters based on the estimated distance and location data, specifically including hysteresis, TTT, and handover decisions. The architecture of the proposed approach is shown in Figure 2.

To improve the performance of NTN handovers, the approach first uses a particle filter to accurately estimate the distance between UE and NTN satellites. Reinforcement learning then dynamically optimizes key handover parameters, including hysteresis, TTT, and handover decision-making, within the location-based handover mechanism of NTN networks. The pseudocode of the proposed approach is presented in Algorithm 1. A mathematical model was developed to simulate the NTN handover process, complemented by a reinforcement learning framework designed to maximize throughput while minimizing handover failures and ping-pong effects. NTNs leverage LEO satellites to provide global communication coverage. However, the high-speed motion of satellites and the dynamic nature of network topologies make handover management a critical challenge in NTN systems.

Algorithm 1. LBH-PRL Algorithm

Input: Satellite and UE positions, velocities, network configuration
Output: Optimized hysteresis (H) and TTT, handover decisions
Step 1. Initialize:
    A. Particle filter parameters: Number of particles K, initial distance range [d_{\min}, d_{\max}]
    B. Reinforcement learning parameters: Initial hysteresis (H), TTT, policy π_{θ}
    C. Learning rate α, discount factor γ
    D. State space S (t) = {Estimated distance, RSSI, UE position, velocity}
Step 2. For each time step t:
    A. Particle Filter for Distance Estimation:
        a. Initialize particles:
            For k = 1 to K: d_{k}^{(0)} \sim U (d_{\min}, d_{\max})
        b. Predict particle states:
            For k = 1 to K: d_{k}^{(t + 1)} = d_{k}^{(t)} + v_{j} \cdot Δ t + ξ_{k}
        c. Update weights:
            For k = 1 to K: w_{k}^{(t + 1)} = \frac{\exp (- \frac{{({RSSI}_{measured} - {RSSI}_{k}^{(t + 1)})}^{2}}{2 σ^{2}})}{\sum_{j = 1}^{K} \exp (- \frac{{({RSSI}_{measured} - {RSSI}_{j}^{(t + 1)})}^{2}}{2 σ^{2}})}
        d. Resample particles based on weights.
        e. Estimate distance: {\hat{d}}_{i, j} (t) = \sum_{k = 1}^{K} w_{k}^{(t)} d_{k}^{(t)}
    B. Evaluate Handover Conditions:
        If UE is at the edge of the current satellite’s coverage area:
            Calculate {RSSI}_{target} and {RSSI}_{current}.
            If ({RSSI}_{target} - {RSSI}_{current} > H):
                Start TTT timer.
                If (Condition holds for TTT seconds):
                    Perform handover and update state.
            Else:
                Stay in the current cell.
    C. Reinforcement Learning Policy Update:
        a. Observe state: S (t) = {\hat{d}, RSSI, UE position, velocity}.
        b. Select action A (t) = {hysteresis, TTT} using policy π_{θ} (A | S).
        c. Compute cumulative reward: G (t) = \sum_{k = 0}^{\infty} γ^{k} R (t + k)
        d. Update policy parameters: θ \leftarrow θ + α \nabla_{θ} E [G]
Return optimized hysteresis (H), TTT, and final handover decisions.
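Steps C.c and C.d of Algorithm 1 can be illustrated with a small sketch. The source does not specify how the gradient of E[G] is estimated, so the example below swaps in a REINFORCE-style Monte Carlo estimate with a softmax policy over a small discrete action set. The action values in `ACTIONS`, the state-independent policy, and all function names are hypothetical simplifications for illustration; the policy in the paper conditions on the state S(t).

```python
import math

# Hypothetical discrete action set: (hysteresis in dB, TTT in ms) pairs.
ACTIONS = [(1.0, 0), (2.0, 100), (3.0, 256)]


def discounted_returns(rewards, gamma):
    """G(t) = sum_k gamma^k R(t + k), computed backwards over a finite episode."""
    g, out = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    return out[::-1]


def softmax(prefs):
    """Action probabilities from preferences, shifted by the max for stability."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def reinforce_update(theta, episode, alpha, gamma):
    """One REINFORCE step for a state-independent softmax policy (assumption).

    episode is a list of (action_index, reward) pairs. For a softmax policy,
    d/d theta_b log pi(a) = 1[a == b] - pi(b).
    """
    returns = discounted_returns([r for _, r in episode], gamma)
    probs = softmax(theta)  # probabilities under the pre-update parameters
    new_theta = list(theta)
    for (a, _), g in zip(episode, returns):
        for b in range(len(theta)):
            grad_log_pi = (1.0 if a == b else 0.0) - probs[b]
            new_theta[b] += alpha * g * grad_log_pi
    return new_theta


def greedy_action(theta):
    """Return the (hysteresis, TTT) pair with the highest preference."""
    best = max(range(len(theta)), key=lambda i: theta[i])
    return ACTIONS[best]
```

Actions followed by high discounted return have their preference raised, so repeated updates shift probability mass toward the hysteresis/TTT pairs that were rewarded.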

Handover mechanisms in NTN typically rely on parameters like hysteresis and TTT, which are often configured statically. These static configurations are unable to adapt to the constantly changing network conditions, resulting in the following issues:

Handover failures: Frequent or poorly timed handovers can lead to connection drops.
Ping-pong effects: UEs may oscillate between satellites within a short time, leading to inefficient resource utilization.
Performance degradation: Unsuitable parameter settings can result in reduced throughput and increased latency.

Moreover, accurately estimating the distance between UE and satellites, which is a critical element for location-based handovers, is challenging due to factors such as multipath effects and interference.

Satellite motion:

Assume N LEO satellites, each moving at a constant orbital velocity v_{s}. The position of satellite i at time t is represented as:

s_{i} (t) = (x_{i} (t), y_{i} (t), z_{i} (t))

(2)

where the following apply:

s_{i} (t): Position vector of satellite i at time t.

x_{i} (t), y_{i} (t), z_{i} (t): Cartesian coordinates of satellite i at time t.

UE motion:

The position of UE j at time t is represented as follows:

p_{j} (t) = (x_{j} (t), y_{j} (t), z_{j} (t))

(3)

where the following apply:

p_{j} (t): Position vector of UE j at time t.

x_{j} (t), y_{j} (t), z_{j} (t): Cartesian coordinates of UE j at time t.

The distance between UE j and satellite i is given by:

d_{i, j}(t) = ‖ s_{i}(t) - p_{j}(t) ‖

(4)

where ‖ · ‖ denotes the Euclidean norm.

Signal Model:

The received signal strength indicator (RSSI) from satellite i to UE j is as follows:

{RSSI}_{i, j}(t) = P_{t} - 20 \log_{10}(d_{i, j}(t)) + η,

(5)

where the following apply:

{RSSI}_{i, j}(t): Signal strength received by UE j from satellite i at time t.

P_{t}: Transmit power of the satellite.

d_{i, j}(t): Distance between UE j and satellite i at time t.

η: Gaussian noise.
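Equations (4) and (5) can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the use of kilometres for distance, and the value 30 for P_t are assumptions made for the example, since the text does not fix units for Equation (5).

```python
import math
import random


def distance(sat_pos, ue_pos):
    """Euclidean distance between satellite and UE positions, Eq. (4)."""
    return math.dist(sat_pos, ue_pos)


def rssi(p_t, d, noise_std=0.0, rng=random):
    """RSSI per Eq. (5): P_t - 20*log10(d) + eta, eta ~ N(0, noise_std^2)."""
    eta = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
    return p_t - 20.0 * math.log10(d) + eta


# Hypothetical geometry: satellite directly overhead at 650 km, UE at origin.
sat = (0.0, 0.0, 650.0)  # km (assumed unit)
ue = (0.0, 0.0, 0.0)     # km (assumed unit)
d = distance(sat, ue)
print(d, rssi(30.0, d))  # noise-free RSSI for an assumed P_t of 30
```

Because the path-loss term is 20 log10(d), doubling the distance lowers the noise-free RSSI by about 6 dB regardless of P_t.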

Particle filter for distance estimation:

Particle filtering is employed to estimate the distance d_{i, j}(t) based on RSSI measurements.

Particle initialization:

Initialize K particles with distances sampled from a uniform distribution:

d_{k}^{(0)} \sim U(d_{\min}, d_{\max})

(6)

where the following apply:

d_{k}^{(0)}: Initial distance of particle k.

U: Uniform distribution over [d_{\min}, d_{\max}].

State prediction:

Update particle distances using UE velocity v_{j} and random noise:

d_{k}^{(t + 1)} = d_{k}^{(t)} + v_{j} \cdot Δt + ξ_{k},

(7)

where the following apply:

v_{j}: Velocity of UE j.

Δt: Time step.

ξ_{k} \sim N(0, σ^{2}): Gaussian noise.

Weight update:

Update particle weights based on the likelihood of the measured RSSI:

w_{k}^{(t + 1)} = \frac{\exp\left(- \frac{({RSSI}_{measured} - {RSSI}_{k}^{(t + 1)})^{2}}{2 σ^{2}}\right)}{\sum_{j = 1}^{K} \exp\left(- \frac{({RSSI}_{measured} - {RSSI}_{j}^{(t + 1)})^{2}}{2 σ^{2}}\right)}

(8)

where the following apply:

w_{k}^{(t + 1)}: Weight of particle k at time t + 1.

{RSSI}_{measured}: Observed signal strength.

{RSSI}_{k}^{(t + 1)}: Signal strength predicted for particle k from its distance d_{k}^{(t + 1)} using Equation (5).

Resampling:

Resample particles based on their weights to concentrate around high-likelihood regions.

Distance estimation:

The estimated distance is the weighted average:

{\hat{d}}_{i, j}(t) = \sum_{k = 1}^{K} w_{k}^{(t)} d_{k}^{(t)}

(9)

where the following apply:

{\hat{d}}_{i, j}(t): Estimated distance between UE j and satellite i at time t.

w_{k}^{(t)}: Weight of particle k at time t.

d_{k}^{(t)}: Distance of particle k at time t.
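The particle filter steps in Equations (6) to (9) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and all numeric values are hypothetical, the process noise (`sigma_d`) and measurement noise (`sigma_rssi`) are kept as separate parameters although the text uses a single σ, the velocity argument is treated as the rate of change of the distance, and particles are clamped at `d_min` so the logarithm stays defined. The estimate is taken before resampling, because resampling makes the weights uniform.

```python
import math
import random


def rssi_model(p_t, d):
    """Noise-free RSSI from Eq. (5)."""
    return p_t - 20.0 * math.log10(d)


def particle_filter_step(particles, rssi_measured, p_t, v, dt,
                         sigma_d, sigma_rssi, d_min, rng):
    """One predict / weight / estimate / resample cycle, Eqs. (7)-(9)."""
    # State prediction, Eq. (7): drift by v*dt plus Gaussian noise.
    # Clamping at d_min is an added safeguard, not part of Eq. (7).
    predicted = [max(d_min, d + v * dt + rng.gauss(0.0, sigma_d))
                 for d in particles]
    # Weight update, Eq. (8): Gaussian likelihood of the measured RSSI.
    errors = [rssi_measured - rssi_model(p_t, d) for d in predicted]
    likelihoods = [math.exp(-e * e / (2.0 * sigma_rssi ** 2)) for e in errors]
    total = sum(likelihoods)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [lik / total for lik in likelihoods]
    # Distance estimate, Eq. (9): weighted average of the particles.
    estimate = sum(w * d for w, d in zip(weights, predicted))
    # Resampling: draw K particles with probability proportional to weight.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


# Hypothetical scenario: a fixed true distance observed through noisy RSSI.
rng = random.Random(0)
K, d_min, d_max = 500, 600.0, 2000.0
particles = [rng.uniform(d_min, d_max) for _ in range(K)]  # Eq. (6)
true_d, p_t = 1000.0, 30.0
for _ in range(20):
    measured = rssi_model(p_t, true_d) + rng.gauss(0.0, 0.5)
    particles, d_hat = particle_filter_step(
        particles, measured, p_t, v=0.0, dt=1.0,
        sigma_d=5.0, sigma_rssi=0.5, d_min=d_min, rng=rng)
print(round(d_hat, 1))  # estimated distance after 20 measurements
```

Multinomial resampling via `random.Random.choices` is used here for brevity; systematic or stratified resampling would be a common lower-variance alternative.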

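Based on the estimated distance {\hat{d}}_{i, j}(t) and location data, the handover decision is made as follows:

Hysteresis condition:

{RSSI}_{i^{'}, j}(t) - {RSSI}_{i, j}(t) > H

(10)

Time-to-trigger condition:

ΔT > TTT

(11)

where i^{'} denotes the candidate target satellite, i the serving satellite, and ΔT the time for which the hysteresis condition has held continuously.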
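The two conditions in Equations (10) and (11) can be combined into a simple trigger loop. The sketch below is an illustration only: the function name, the sampled RSSI traces, and the parameter values are hypothetical, and the timer is assumed to start at the first sample where the hysteresis condition holds and to reset whenever it is violated.

```python
def first_handover_index(rssi_serving, rssi_target, hysteresis, ttt, dt):
    """Return the first sample index at which a handover is triggered.

    The hysteresis condition (Eq. 10) must hold continuously for longer
    than `ttt` seconds (Eq. 11). Returns None if that never happens.
    """
    elapsed = None  # None means the TTT timer is not running
    for n, (serving, target) in enumerate(zip(rssi_serving, rssi_target)):
        if target - serving > hysteresis:
            # Start the timer on the first qualifying sample, else advance it.
            elapsed = 0.0 if elapsed is None else elapsed + dt
            if elapsed > ttt:
                return n
        else:
            elapsed = None  # condition broken: cancel the timer
    return None


# Hypothetical traces sampled every 0.1 s (values in dBm).
serving = [-100.0] * 8
target = [-99.0, -97.0, -97.0, -99.0, -97.0, -97.0, -97.0, -97.0]
print(first_handover_index(serving, target, hysteresis=2.0, ttt=0.25, dt=0.1))  # 7
```

The short burst at samples 1 and 2 does not trigger a handover because the condition is violated at sample 3 before the timer exceeds TTT; this is the mechanism by which a longer TTT suppresses ping-pong handovers.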

Reinforcement learning framework: elements of reinforcement learning

State space S(t), which includes the following:

S(t) = \{{\hat{d}}_{i, j}(t), {RSSI}_{i, j}(t), p_{j}(t), v_{j}(t)\}

(12)

where the following apply:

{\hat{d}}_{i, j}(t): Estimated distance.

{RSSI}_{i, j}(t): Observed signal strength.

p_{j}(t): UE position vector.

v_{j}(t): UE velocity.

Action space A(t): actions involve adjusting hysteresis and TTT:

A(t) = \{H, TTT\}

(13)

where the following apply:

H: Hysteresis threshold.

TTT: Time-to-trigger.

Reward function R(t): combines throughput, handover failures, and ping-pong effects:

R(t) = w_{1} \cdot Throughput(t) - w_{2} \cdot HandoverFailures(t) - w_{3} \cdot PingPongRate(t)

(14)

where the following apply:

w_{1}, w_{2}, w_{3}: The weights for each performance metric.
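Equation (14) translates directly into code. The weights and input values below are hypothetical, chosen only to show how the two penalty terms offset throughput; the text does not report the weights used in the simulations.

```python
def reward(throughput, handover_failures, ping_pong_rate,
           w1=1.0, w2=1.0, w3=1.0):
    """Eq. (14): weighted throughput minus failure and ping-pong penalties."""
    return w1 * throughput - w2 * handover_failures - w3 * ping_pong_rate


# Hypothetical step: normalized throughput 0.8, one failure, 10% ping-pong.
print(reward(0.8, 1, 0.1, w1=1.0, w2=0.5, w3=2.0))
```

In practice the three terms would need to be normalized to comparable scales, otherwise the weights mostly compensate for unit differences.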

Objective: Maximize long-term cumulative reward:

G(t) = \sum_{k = 0}^{\infty} γ^{k} R(t + k)

(15)

where γ is the discount factor.
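In a simulation the infinite sum in Equation (15) is truncated to the episode length. A minimal sketch, with a hypothetical function name and example rewards, accumulates the return backwards so that each step costs one multiplication:

```python
def discounted_return(rewards, gamma):
    """Finite-horizon form of Eq. (15): sum over k of gamma**k * R(t + k)."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


print(discounted_return([1.0, 0.0, -2.0], gamma=0.5))  # 1 + 0 + 0.25 * (-2) = 0.5
```

For a constant reward of 1 per step, the return approaches 1 / (1 - γ), so a discount factor of 0.95 corresponds to an effective horizon of roughly 20 steps.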

The policy is updated using a policy gradient method:

θ \leftarrow θ + α \nabla_{θ} E[G]

(16)

where the following apply:

θ: Policy parameters.

α: Learning rate.

G: Discounted cumulative reward.

γ: Discount factor.

In the proposed LBH-PRL approach, we adopt a policy gradient method to optimize handover decisions dynamically. Unlike value-based methods such as Q-learning, which select actions according to discrete Q-values, policy gradient methods learn the policy directly by adjusting the probability distribution over actions according to observed rewards. This makes them well suited to NTN environments, where handover decisions require continuous tuning of parameters such as hysteresis and TTT. To promote stable convergence and robustness, we designed a reward-shaping mechanism that penalizes unnecessary handovers while prioritizing service continuity. The accumulated reward accounts for throughput, handover failure rate, and ping-pong rate.

To improve training efficiency and policy stability, we tune the key hyperparameters. The learning rate is set to 0.01 to balance convergence speed and stability, and the discount factor is set to 0.95 to prioritize long-term optimization of handover decisions. The policy is stochastic: action selection probabilities are refined continuously through gradient updates, which lets the model adapt to different NTN environments and makes it more flexible than traditional rule-based or value-based handover mechanisms. In this way, LBH-PRL learns a handover strategy that improves service continuity and reduces unnecessary handovers in high-mobility NTN environments.
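The update in Equation (16) can be illustrated with a REINFORCE-style sketch. Everything specific here is an assumption made for the example: the text does not describe the policy parameterization, so a state-independent softmax over a small hypothetical grid of (H, TTT) pairs is used, the environment is a placeholder reward function rather than an NTN simulator, and a running-average baseline is added to reduce variance. Only the learning rate of 0.01 is taken from the text.

```python
import math
import random

# Hypothetical candidate actions: (hysteresis in dB, TTT in seconds).
ACTIONS = [(h, ttt) for h in (1.0, 2.0, 3.0) for ttt in (0.1, 0.2, 0.4)]


def softmax(theta):
    """Numerically stable softmax over the policy parameters."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    s = sum(exps)
    return [e / s for e in exps]


def toy_reward(action, rng):
    """Placeholder environment: reward peaks at H = 2.0, TTT = 0.2."""
    h, ttt = action
    return -abs(h - 2.0) - 5.0 * abs(ttt - 0.2) + rng.gauss(0.0, 0.1)


def train(episodes=20000, alpha=0.01, seed=0):
    """Softmax policy trained with theta <- theta + alpha * grad, Eq. (16)."""
    rng = random.Random(seed)
    theta = [0.0] * len(ACTIONS)
    baseline = 0.0  # running average of observed returns
    for _ in range(episodes):
        probs = softmax(theta)
        a = rng.choices(range(len(ACTIONS)), weights=probs, k=1)[0]
        g = toy_reward(ACTIONS[a], rng)
        baseline += 0.05 * (g - baseline)
        advantage = g - baseline
        for i in range(len(theta)):
            # Gradient of log pi(a) w.r.t. theta_i for a softmax policy.
            grad_log = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += alpha * advantage * grad_log
    return softmax(theta)


probs = train()
best = max(range(len(ACTIONS)), key=probs.__getitem__)
print(ACTIONS[best], round(probs[best], 3))
```

A full implementation would condition the policy on the state S(t) from Equation (12), for example through a neural network, and would use the discounted return G(t) in place of the single-step reward.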

4. Performance Evaluation

In this section, we use LBH and RBH as comparison benchmarks for our proposed approach, which integrates particle filters and reinforcement learning. The evaluation focuses on four key performance metrics: (1) average number of handovers, (2) average ping-pong rate, (3) handover failure rate, and (4) computational complexity. By comparing these metrics, we aim to comprehensively assess the effectiveness of our proposed approach in optimizing handover efficiency, stability, and reliability in NTN scenarios. The parameters used for the simulation experiments are summarized in Table 1.

(1): The average number of handovers

As shown in Figure 3, the LBH approach significantly reduces the number of handovers compared to the RBH in NTN applications. By utilizing the geographical location of the UE and satellite trajectory data, LBH can proactively predict and trigger handovers, avoiding unnecessary transitions. In contrast, RBH depends solely on signal strength comparisons, which are vulnerable to frequent fluctuations in the dynamic NTN environment, leading to more handovers and reduced efficiency. Our proposed approach, which combines particle filters for accurate distance estimation and reinforcement learning for dynamic optimization of hysteresis and TTT, further reduces handover counts compared to LBH. This integration of precise predictions and adaptive decision-making ensures even fewer unnecessary handovers and improved overall efficiency.

(2): The average ping-pong rate

Figure 4 illustrates that LBH achieves a much lower ping-pong rate compared to RBH in NTN scenarios. The predictive nature of LBH, which utilizes UE location and satellite trajectory data, prevents frequent handover reversals caused by short-term signal variations. In contrast, RBH’s reliance on real-time signal strength makes it more susceptible to the rapid fluctuations in NTN, resulting in a higher ping-pong rate. Our proposed approach surpasses LBH by further reducing ping-pong occurrences. Through the use of particle filters for precise distance estimation and RL for adaptive parameter tuning, our approach not only inherits the predictive strengths of LBH but also intelligently adjusts to network dynamics, enabling more stable and efficient handover decisions.

(3): The handover failure rate

The experimental results, as depicted in Figure 5, highlight that LBH effectively lowers the handover failure rate compared to RBH when applied to NTN. By leveraging UE location data and satellite trajectories, LBH predicts and executes handovers proactively, mitigating failures caused by signal degradation or delays. On the other hand, RBH’s dependence on real-time signal comparisons makes it prone to disruptions caused by signal fluctuations and latency, leading to a higher failure rate. Our proposed approach further enhances handover reliability by integrating particle filters for precise distance estimation with RL to optimize hysteresis and TTT parameters. This approach ensures more accurate and adaptive handover decisions, achieving a significant reduction in failure rates compared to both LBH and RBH. These results demonstrate the effectiveness of combining advanced estimation and adaptive learning techniques for robust handover management in NTN environments.

(4): The computational complexity

In this study, we analyze the computational cost of the different handover approaches and evaluate their time complexity. The RBH approach relies solely on received signal strength measurements, and its primary computation is comparing the RSRP values of the candidate satellites. As a result, its time complexity is O(N), where N is the number of candidate satellites. In contrast, the LBH approach determines handovers based on the distance between the UE and the satellites. While the distance calculation itself is relatively simple, LBH requires historical position information to predict future mobility trajectories, leading to a slightly higher computational cost, though it remains at the O(N) level. In the proposed LBH-PRL approach, a particle filter estimates the distance between the UE and each satellite. Assuming M particles are used (the particle count denoted K in Section 3), the weight update and resampling operations cost O(M) per satellite, making the overall particle filter estimation cost O(N M). Additionally, LBH-PRL incorporates RL to dynamically adjust handover parameters such as hysteresis and TTT, where the computational complexity depends on the learning algorithm used (for example, Q-learning or Actor–Critic). In each time step, the RL update complexity is O(A), where A represents the size of the action space. Overall, the total time complexity of LBH-PRL is O(N M + A), making it computationally more demanding than RBH and LBH. However, since particle filter computations can be parallelized and RL-based decision-making can be executed at a lower cost after training, LBH-PRL balances computational efficiency and decision accuracy. Despite its higher computational cost, LBH-PRL improves the robustness of handover decisions, reducing unnecessary handovers and the handover failure rate in high-mobility NTN environments.

While the proposed LBH-PRL approach achieves superior performance in mobility management, its practical implementation overhead must also be considered, especially in user terminals with constrained computational resources. The integration of the particle filter and reinforcement learning involves iterative processing and parameter tuning that may burden devices with limited memory and processing power, such as IoT terminals or lightweight mobile equipment. To address this, lightweight versions of the RL model can be developed using model compression or quantization techniques. Additionally, the inference stage of the RL model can be offloaded to edge servers in real-world deployments, allowing the UE to benefit from adaptive handover strategies without handling complex computations locally. These considerations suggest that the LBH-PRL approach remains feasible for practical NTN applications with proper system-level optimization.
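As a rough illustration of the gap between O(N) and O(N M + A), the per-decision operation counts can be compared for assumed values. The figures N = 10 candidate satellites, M = 500 particles, and A = 20 actions are hypothetical and are not taken from the simulation setup.

```latex
\underbrace{N}_{\text{RBH, LBH}} = 10,
\qquad
\underbrace{N M + A}_{\text{LBH-PRL}} = 10 \times 500 + 20 = 5020 .
```

Under these assumed values the particle filter term N M dominates the total, so reducing or parallelizing the particle updates would have far more effect on the cost than shrinking the action space.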

The performance comparison is summarized in Table 2, along with relevant discussion of how the proposed LBH-PRL approach compares with conventional handover approaches. LBH-PRL reduces the average number of handovers, the ping-pong rate, and the handover failure rate, demonstrating improved stability and efficiency. RBH relies solely on signal strength, making it susceptible to frequent handovers and high ping-pong effects due to satellite mobility. LBH mitigates unnecessary handovers by using UE location but lacks adaptive optimization. LBH-PRL extends LBH by integrating a particle filter to estimate the UE-to-satellite distance and reinforcement learning to dynamically adjust the handover parameters (hysteresis and TTT). This results in a more efficient and adaptive handover mechanism in NTN environments.

5. Conclusions

This paper presents a novel approach that combines particle filters and reinforcement learning to optimize handover mechanisms in NTN. By using particle filters to estimate the distance between the UE and NTN satellites and reinforcement learning to dynamically optimize the hysteresis, TTT, and handover decisions in LBH mechanisms, the proposed approach achieves significant improvements in handover efficiency. Simulation results demonstrate that our approach outperforms the LBH and RBH approaches in terms of the average number of handovers, average ping-pong rate, and handover failure rate. These findings highlight the effectiveness of our approach in enhancing NTN handover stability and performance, offering a reliable and efficient solution for mobility management in NTN environments.

In future work, the proposed LBH-PRL approach can be further evaluated in more diverse and complex NTN scenarios to test its robustness and adaptability. Potential application environments include dense LEO constellations, high-mobility conditions such as high-speed trains and aircraft, and maritime and polar regions where traditional terrestrial connectivity is limited. These environments often involve more frequent handovers, variable link quality, or signal obstruction, and therefore demand greater adaptability from the handover algorithm. Evaluating the LBH-PRL approach in such conditions would provide deeper insights into its practical deployment value across the full range of NTN use cases.

Author Contributions

Conceptualization, L.-S.C., S.-H.L. and H.-H.C.; Methodology, L.-S.C., S.-H.L. and H.-H.C.; Formal analysis, L.-S.C.; Investigation, L.-S.C., S.-H.L. and H.-H.C.; Writing—original draft, L.-S.C. and S.-H.L.; Writing—review & editing, H.-H.C.; Supervision, L.-S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Acknowledgments

This work was supported by the National Science and Technology Council of Taiwan (R.O.C.), under Contract NSTC 113-2222-E-197-001.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. The handover procedure.

Figure 2. The architecture of the proposed approach.

Figure 3. Average number of handovers.

Figure 4. Average ping-pong rate.

Figure 5. Handover failure rate.

Table 1. Simulation parameters.

Category	Parameter	Assumption
Satellite Configuration	Altitude of satellite	650 km
	Orbital inclination angle	53°
	Satellite velocity	7.5 km/s
	Tx Gain of NTN Satellite	30 dBi
	Number of beams	13
	Satellite beam diameter	60 km
	Tx power of UE	30 dBm
UE Configuration	UE velocity profile	3 km/h (pedestrian), 120 km/h (vehicle), 300 km/h (high-speed train)
	Mobility model	Random Walk, Highway Mobility Model
	Carrier frequency	2 GHz (S-Band FDD)
Radio and Network Configuration	Frequency reuse	FR1
	System Bandwidth	10 MHz
	Load of Traffic	30% PRBs

Table 2. Summary of performance comparison.

Handover Approach	Average Number of Handovers	Ping-Pong Rate (%)	Handover Failure Rate (%)	Computational Complexity	Key Features
RBH	High	High	High	$O (N)$	RSRP, frequent handovers due to satellite mobility
LBH	Medium	Medium	Medium	$O (N)$	Use UE location, reduces unnecessary handovers but lacks dynamic optimization
Proposed LBH-PRL	Low	Low	Low	$O (N M + A)$	Integrates particle filter for distance estimation and RL for adaptive optimization

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
