A Comparative Study of Machine Learning and Deep Learning Models for Real-Time UAV Positioning Error Estimation

Yang, Mei; Zhuo, Hua; Ma, Jun-Gang; Niu, Guo-Hui; Mamtimin, Zulmira; Tao, Mei; Zhu, Ya-Qiong; Li, Jun; Abdughani, Murat; Sidike, Aihemaitijiang

doi:10.3390/drones10030172


by Mei Yang ^1,2, Hua Zhuo ^1,2, Jun-Gang Ma ^1,2,*, Guo-Hui Niu ^1,2, Zulmira Mamtimin ^1,2, Mei Tao ^1,2, Ya-Qiong Zhu ^1,2, Jun Li ^3, Murat Abdughani ^3 and Aihemaitijiang Sidike ^3

^1 Xinjiang Agricultural Unmanned Aircraft Performance and Safety Key Laboratory, Urumqi 830011, China
^2 Xinjiang Uygur Autonomous Region Research Institute of Measurement and Testing, Urumqi 830011, China
^3 School of Physical Science and Technology, Xinjiang University, Urumqi 830017, China
^* Author to whom correspondence should be addressed.

Drones 2026, 10(3), 172; https://doi.org/10.3390/drones10030172

Submission received: 10 January 2026 / Revised: 23 February 2026 / Accepted: 27 February 2026 / Published: 2 March 2026

(This article belongs to the Topic AI and Data-Driven Advancements in Industry 4.0, 2nd Edition)


Highlights

What are the main findings?

The horizontal errors are generally larger and more variable than vertical errors.
Some features (‘gnss_L5Q_mean’ and ‘gnss_L1C_std’) dominate the positioning error of the UAV.

What is the implication of the main finding?

UAV positioning systems should prioritize improving horizontal accuracy.
UAV navigation systems should prioritize real-time monitoring of these specific features (‘gnss_L5Q_mean’ and ‘gnss_L1C_std’).

Abstract

Accurate real-time positioning of Unmanned Aerial Vehicles (UAVs) is critical for navigation and mapping but remains challenging in complex environments due to signal blockages and multipath effects. This study presents a comparative framework for real-time error prediction of the Global Navigation Satellite System (GNSS), evaluating two machine learning models (Random Forest and XGBoost) and a deep learning model (Long Short-Term Memory network) against an Extended Kalman Filter baseline. A high-precision total station provides ground-truth coordinates, enabling the derivation of positioning error labels from synchronized GNSS raw data. Among the evaluated models, the tree-based XGBoost model achieves a significantly lower Mean Squared Error (MSE) and a considerably higher Coefficient of Determination ($R^{2}$) score than other models in predicting positioning deviations. The high-accuracy error predictions from the optimal model establish the core of a software-only solution for positioning integrity. The framework demonstrates that reliable, real-time error estimates can be derived directly from observation data, providing the essential input required for future compensation systems without necessitating additional hardware.

Keywords:

unmanned aerial vehicle; real-time positioning; deep learning

1. Introduction

Recent studies have highlighted the transformative potential of large AI models in enabling UAV [1,2,3,4,5] networks to support the rapidly developing low-altitude economy [6,7]. These works envision AI-powered UAV systems capable of complex tasks such as communication optimization, autonomous navigation, and real-time decision-making in dynamic low-altitude airspace. As UAV-based applications continue to expand across aerial mapping, inspection, logistics, and environmental monitoring, the demand for reliable and high-precision positioning becomes increasingly critical—serving as the foundational layer upon which these advanced AI-enabled services depend.

Accurate and reliable positioning is fundamental to the safe and efficient operation of UAVs. GNSSs [8,9,10,11,12] have become the primary source of positioning information for UAVs due to their low cost and global coverage. However, GNSS positioning accuracy is highly susceptible to environmental conditions, such as satellite blockage, multipath effects, ionospheric disturbances, and signal-to-noise variations. In complex and dynamic environments, significant deviations from the true trajectory are likely to arise, compromising both navigation safety and mission accuracy. To mitigate the errors arising from such conditions, the positioning system therefore requires real-time assessment based on GNSS performance indicators. This capability is a critical prerequisite for realizing the full potential of AI-enabled UAV services in low-altitude economy scenarios.

Traditional methods often rely on differential corrections, such as real-time kinematic (RTK) positioning [13,14,15,16,17] or Precise Point Positioning (PPP) [17,18,19], which require additional infrastructure or reference stations. These methods improve positioning accuracy, but they are not always feasible in dynamic or large-scale UAV operations where communication links may be unstable or unavailable. Consequently, there is a growing need for data-driven approaches capable of autonomously evaluating and correcting GNSS positioning errors using onboard sensor data.

Recent advances in artificial intelligence, particularly deep learning [20,21,22,23], have demonstrated remarkable potential in modeling nonlinear and time-dependent errors in navigation systems. Long short-term memory (LSTM) [24,25,26] networks capture temporal dependencies in sequential data, which makes them well suited for GNSS error prediction and dynamic modeling. By learning from GNSS observations and their corresponding deviations, an LSTM-based model can infer the relationship between signal characteristics (e.g., satellite geometry, signal-to-noise ratio, and elevation angle) and the resulting positioning errors.

The strength of LSTM and other deep learning models lies in a shared capacity for hierarchical feature learning. This common mechanism, inspired by biological neural processing, enables diverse architectures to automatically extract complex patterns from raw multi-source data. Through progressive nonlinear transformations across network layers, these models integrate heterogeneous pseudorange, carrier-phase, angular, and distance measurements to establish direct end-to-end mappings to high-precision coordinates. Thus, while LSTM specializes in temporal dependencies, a wider array of deep learning approaches collectively addresses the multifaceted nature of GNSS and total station error modeling.

While deep learning offers powerful representational capacity, a rigorous evaluation necessitates comparison against established, high-performance benchmarks. To this end, we employ the Extended Kalman Filter (EKF) [27,28,29] as a conventional baseline, given its widespread adoption in GNSS/INS integration and positioning applications. In addition, we incorporate two robust tree ensemble methods—Random Forest [30,31,32] and XGBoost [33,34]—as machine learning baselines. Random Forest excels at modeling complex nonlinear interactions through its ensemble of decision trees without requiring extensive feature engineering, while XGBoost’s gradient-boosting framework is renowned for its predictive accuracy and efficiency in handling structured data. This combination of baselines—spanning classical filtering, ensemble learning, and deep sequential modeling—enables a comprehensive evaluation of the proposed approach for GNSS error prediction across fundamentally different paradigms.

In this study, a high-precision total station is used to obtain ground-truth coordinates of a UAV equipped with a prism reflector, providing a reliable reference for error quantification. Simultaneously, a GNSS receiver on board the UAV records raw observation data, including the pseudorange, carrier phase, signal-to-noise ratio (SNR), and satellite elevation. After rigorous time synchronization and coordinate transformation, the true positioning errors are derived by comparing the GNSS-derived positions with total station measurements. These labeled data are then used to train machine learning models designed to predict real-time GNSS positioning deviations.

The proposed framework aims to establish a foundation for assessing and potentially enhancing real-time UAV positioning performance using onboard sensor data. Its core is a predictive model capable of estimating GNSS errors with high accuracy, which is the critical first step toward any software-based correction. Experimental validation in both open-sky and obstructed environments is designed to evaluate the accuracy of these error predictions across diverse conditions. This study explores a practical, data-driven pathway toward intelligent UAV navigation, offering new insights into positioning error modeling as a fundamental component of system integrity assurance.

This paper is organized as follows: Section 2 describes the setup of the experimental environment and the data acquisition process. In Section 3, we analyze the Random Forest, XGBoost, and LSTM models, along with their respective hyperparameter configurations. Section 4 presents the fitting results obtained from the models. Finally, we conclude in Section 5.

2. Experiment and Data Acquisition

This study aims to quantify the real-time kinematic positioning errors of UAVs under low-speed flight conditions and to explore the applicability of various machine learning models for error analysis and attribution. To achieve this objective, we design a hardware-in-the-loop simulation experiment using a high-precision total station. The total station is set up at a control point with known CGCS2000 coordinates, ensuring the absolute accuracy and traceability of the external reference coordinate system. With the UAV rigidly connected to a high-precision prism, the total station’s automatic tracking and real-time scanning capabilities are used to capture the instantaneous three-dimensional position of the UAV. To ensure continuous ground truth, the flight path is designed to remain within the line of sight of the total station at all times. The maximum flight distance is limited to approximately 150 m, and the flight altitude is kept at 42.6 m. The accuracy of this measurement chain is traced through metrological verification, ensuring the reliability of the “truth” values. Finally, through precise time synchronization algorithms and spatial coordinate system transformations, the real-time coordinates calculated by the UAV’s onboard GNSS/INS integrated navigation system are matched epoch by epoch with the total station’s truth values, which yields the original dataset for error analysis.

Regarding model selection, we employ three typical algorithms (Random Forest, XGBoost, and LSTM) for comparative analysis, aiming to evaluate the fitting performance and interpretability of traditional machine learning versus deep learning models on the small-sample, high-noise UAV dynamic error dataset.

2.1. GNSS/INS Integrated System

In this study, the equipment comprises a DJI Phantom 4 RTK UAV (DJI Innovations, Shenzhen, China) [35,36,37], a Huilide GRZ101 360° prism (Zhonghui Surveying Instrument, Nantong, China) for the total station, a Hi-Target iRTK10 GNSS receiver (Hi-Target Surveying Instrument, Guangzhou, China), a total station and a circular prism. We present their parameters in Table 1. This is followed by a discussion of their individual performance and the mechanisms of their collaborative operation.

The navigation system [38,39] comprises an Inertial Navigation System (INS) [40,41,42,43] and a GNSS [44] unit. The INS is a self-contained navigation system that uses accelerometers and gyroscopes to estimate position, velocity, and attitude through dead reckoning, without requiring external signals. An INS comprises an IMU (Inertial Measurement Unit) and a processing unit. The IMU is a sensor device primarily consisting of accelerometers and gyroscopes, which measure linear acceleration and the angular rate, respectively. It provides raw motion data but does not perform any position computation. The INS operates autonomously in environments where external signals are absent, weak, or interrupted. However, its inherent error accumulates over time, leading to a gradual degradation in positional accuracy.

To address the inherent positioning error drift of the INS, it is necessary to integrate it with the GNSS for cooperative positioning, also known as INS/GNSS integrated navigation. The GNSS utilizes satellite signals to provide global positioning services, including systems such as the Global Positioning System (GPS) [45,46,47], GLONASS [8,48,49,50], BeiDou [51,52], and Galileo [53,54,55,56]. The GNSS can directly deliver high-precision absolute positions as well as RTK timing. Through INS/GNSS integrated navigation, the accumulated errors of the INS can be periodically corrected, enabling the provision of high-precision absolute positioning.

To obtain accurate ground-truth data for UAV positioning error analysis, a high-precision total station was employed to establish the reference trajectory. A reflective prism was rigidly mounted on the UAV, enabling the total station to continuously track the UAV’s three-dimensional position during flight. The total station operated in tracking mode, recording the spatial coordinates of the prism in the local Cartesian coordinate frame (Easting, Northing, and Up components) at a sampling interval of one second. Prior to data collection, two control points are precisely measured to define the local reference coordinate system and ensure geometric stability of the total station setup. The total station’s instrument height and prism height are carefully calibrated to minimize systematic biases. All measurements are conducted under clear weather conditions to reduce the influence of atmospheric refraction and signal attenuation.

The total station is first set up over the first CORS (Continuously Operating Reference Station) [57], with the GNSS receiver deployed on the second CORS base station and a circular prism mounted on the third station. All three base stations have precisely known coordinates within the CGCS2000 (China Geodetic Coordinate System 2000) [58] reference frame. The total station’s own CGCS2000 position is determined through a known-point backsight orientation procedure. The UAV is then modified by securely mounting a 360° prism to its underside. Flight parameters, including the planned trajectory, altitude, and speed, are pre-programmed using the UAV’s remote controller. Upon takeoff, the total station is adjusted to automatically and continuously track the 360° prism on the UAV. The total station records the UAV’s real-time coordinates, while the UAV’s own flight data (e.g., from its onboard GNSS and IMU) are logged simultaneously. The experiment involved repeated flight trials until a sufficient volume of synchronized data was collected.

The data flow of the entire system is illustrated in Figure 1. GPRMC [59,60] is a standard NMEA 0183 [61] sentence providing essential navigation parameters, including position, velocity over ground, true course, and UTC time with status indicators. GPGGA [62], another core NMEA sentence, delivers detailed fix information such as position, time, the number of satellites used, the GPS quality indicator, altitude, and geoid separation, factors crucial for assessing positioning quality. RTCM v3 (Radio Technical Commission for Maritime Services, version 3) [63] is a binary data format standard primarily used for transmitting real-time differential GNSS corrections, enabling high-precision applications like RTK positioning. It supports multiple GNSS constellations and modern signals, serving as the fundamental protocol for network RTK and precise point positioning services. PPS [64] stands for Pulse Per Second, a high-precision timing signal where each pulse’s rising edge is synchronized with the start of a UTC second. It is commonly generated by GNSS receivers and provides nanosecond-level timing accuracy for synchronizing electronic systems. For each timestamp, we aggregate the signal metrics from all satellites tracked at that moment to capture the instantaneous spatial characteristics of the GNSS constellation. To ensure data quality, epochs with fewer than four tracked satellites are considered invalid and excluded from the analysis. Any residual outliers exceeding three standard deviations are removed to mitigate the impact of gross errors.
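The GGA quality fields and the four-satellite validity rule described above can be illustrated with a short sketch. It is a minimal example, not the authors' processing code: the function names (`nmea_checksum`, `parse_gga`, `is_valid_epoch`) are hypothetical, and the rule that a fix quality of 0 marks an invalid epoch is combined here with the satellite-count threshold as an assumption about how the two checks might be applied together.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def parse_gga(sentence: str) -> dict:
    """Extract the time and quality-related fields of a GGA sentence."""
    body, _, checksum = sentence.strip().lstrip("$").partition("*")
    if checksum.upper() != nmea_checksum(body):
        raise ValueError("NMEA checksum mismatch")
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    return {
        "utc": fields[1],                       # hhmmss(.ss) UTC time
        "fix_quality": int(fields[6] or 0),     # 0 means no valid fix
        "num_satellites": int(fields[7] or 0),  # satellites used in the fix
    }


def is_valid_epoch(gga: dict, min_satellites: int = 4) -> bool:
    """Assumed validity rule: a non-zero fix quality and at least four satellites."""
    return gga["fix_quality"] != 0 and gga["num_satellites"] >= min_satellites


# Usage with an illustrative sentence; the checksum is computed, not hard-coded.
body = "GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,"
print(is_valid_epoch(parse_gga(f"${body}*{nmea_checksum(body)}")))
```

Epochs that fail this check would simply be dropped before feature aggregation.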

The acquired total station coordinates are subsequently used as the true reference positions for each corresponding GNSS observation epoch. This reference trajectory serves as the benchmark for evaluating GNSS-derived positions and for computing the true positioning errors in the Easting, Northing, and Up directions. The resulting high-precision dataset provides the foundation for model training and quantitative assessment of UAV positioning performance.

2.2. Data Analysis

The GNSS observations used in this study are recorded in Receiver Independent Exchange (RINEX) format version 3.02 [65]. The dataset was collected on 25 September 2025, from 08:56:50 to 09:56:45 UTC, and on 26 September 2025, from 16:46:06 to 17:52:19 UTC, with a sampling interval of $\Delta t = 1.0$ s. The receiver is an iRTK10 (serial number: 13772976, firmware version: 3.1) equipped with a HITIRTK10 antenna. The antenna phase center offsets (in the local topocentric frame) are applied as $\Delta U = +0.0229$ m, $\Delta E = -0.0014$ m, and $\Delta N = +0.0025$ m. The approximate station coordinates in the Earth-centered, Earth-fixed (ECEF) frame are

$$X_{approx} = \begin{pmatrix} -2{,}393{,}525.9357 \\ 5{,}384{,}548.4233 \\ 2{,}432{,}886.9380 \end{pmatrix} \text{m}.$$

(1)

The observation file contains multi-constellation, multi-frequency code pseudorange (P), carrier-phase ($\Phi$), Doppler (D), and signal strength (S) measurements. The tracked signals include:

GPS (G): L1 C/A (C1C), L1 P-code (C1P and L1P), L2C (C2W and L2W), and L5 (C5Q and C5I);
GLONASS (R): L1 (C1C) and L2 (C2C) on frequency channel numbers;
Galileo (E): E1 (C1B, C1C), E5a (C5I), E5b (C7I), and E5 AltBOC (C5Q and C7Q);
BDS-3 (C): B1I (C1I), B1C (C1Q), B2a (L7), B2b (C7I), and B3I (C6I);
QZSS (J): L1 C/A (C1C), L2C (C2X), and L5 (C5Q and C5X);
SBAS (S): L1 (C1C) and L5 (C5I).

These parameters represent the raw code and carrier-phase observation types standardized within the RINEX format for multi-constellation GNSS data processing. Each entry, such as C1C or L2W, is a three-character code in which the first letter denotes the observation type (e.g., C for the code pseudorange and L for the carrier phase), the second character is a digit identifying the frequency band (e.g., 1 for L1/E1/B1 and 5 for L5/E5a), and the third character specifies the signal modulation and tracking attribute (e.g., C for the C/A code, Q for the quadrature phase, and I for in-phase). This comprehensive tracking capability, encompassing signals from the GPS (L1 C/A, L2C, and L5), GLONASS (FDMA-based L1 and L2), Galileo (E1, E5a, E5b, and AltBOC), BeiDou-3 (B1I, B1C, B2a/B2b, and B3I), QZSS, and SBAS, enables high-precision positioning through multi-frequency ionospheric error correction, robust signal redundancy, and enhanced resistance to interference and multipath effects in complex environments.
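As an illustration of the three-character observation code structure, the following minimal sketch splits a code into its observation type, band, and attribute. The function name `parse_obs_code` and the returned dictionary layout are hypothetical choices for this example; the type letters follow the RINEX 3 convention (C, L, D, and S for pseudorange, carrier phase, Doppler, and signal strength).

```python
# Observation-type letters used by RINEX 3 observation codes.
OBSERVATION_TYPES = {
    "C": "code pseudorange",
    "L": "carrier phase",
    "D": "Doppler",
    "S": "signal strength",
}


def parse_obs_code(code: str) -> dict:
    """Split a code such as 'C1C' into observation type, band digit, and attribute."""
    if len(code) != 3 or code[0] not in OBSERVATION_TYPES or not code[1].isdigit():
        raise ValueError(f"not a RINEX 3 observation code: {code!r}")
    return {
        "type": OBSERVATION_TYPES[code[0]],
        "band": int(code[1]),
        "attribute": code[2],
    }


for code in ("C1C", "L2W", "S5Q"):
    print(code, parse_obs_code(code))
```

Keeping the three parts separate makes it straightforward to group observations by type or by band when building features.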

For each satellite and each frequency band f, the data block contains

[P_{f} (m), Φ_{f} (cycles), D_{f} (Hz), S_{f} (dB-Hz)] .

For instance, satellite G05 provides an L1 pseudorange of $P_{L1} = 22{,}226{,}002.367$ m, a carrier phase of $\Phi_{L1} = 116{,}798{,}434.453$ cycles, and a Doppler shift of $D_{L1} = -1015.406$ Hz. This dataset represents a kinematic trajectory through high-frequency observations, making it suitable for applications requiring dynamic spatial accuracy assessments. This rich multi-GNSS dataset enables advanced processing strategies such as multi-frequency precise point positioning (PPP) and RTK solutions while supporting ionospheric and tropospheric delay modeling [66].
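To make the units of the G05 example concrete, the sketch below converts the carrier phase from cycles to metres and the Doppler shift to a range rate using the GPS L1 wavelength. It is a worked illustration under stated assumptions: the helper names are hypothetical, the L1 frequency of 1575.42 MHz is the nominal GPS value, and the sign convention (positive Doppler for an approaching satellite) is the one used by RINEX. The carrier phase still contains an unknown integer ambiguity, so its metric value is not an absolute range.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s
GPS_L1_FREQUENCY = 1_575.42e6   # Hz, nominal GPS L1 carrier frequency


def wavelength(frequency_hz: float) -> float:
    """Carrier wavelength in metres."""
    return SPEED_OF_LIGHT / frequency_hz


def phase_to_metres(phase_cycles: float, frequency_hz: float) -> float:
    """Carrier phase expressed in metres (integer ambiguity not resolved)."""
    return phase_cycles * wavelength(frequency_hz)


def doppler_to_range_rate(doppler_hz: float, frequency_hz: float) -> float:
    """Range rate in m/s; positive Doppler (approaching) gives a negative rate."""
    return -doppler_hz * wavelength(frequency_hz)


# Values quoted for satellite G05 in the text.
phase_m = phase_to_metres(116_798_434.453, GPS_L1_FREQUENCY)
range_rate = doppler_to_range_rate(-1015.406, GPS_L1_FREQUENCY)
print(f"L1 wavelength: {wavelength(GPS_L1_FREQUENCY):.6f} m")
print(f"carrier phase: {phase_m:.3f} m, range rate: {range_rate:.3f} m/s")
```

The negative Doppler value corresponds to a positive range rate, that is, a satellite receding from the receiver at that epoch.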

The kinematic coordinate solutions derived from the total station are structured as precise three-dimensional positions. Each observation record consists of the following items:

id: It represents a unique identifier for the surveyed point. The dataset includes a fixed reference station (denoted as “Base”) and a series of sequentially numbered rover points.
N, E, and U: They represent the Northing, Easting, and Height (Up) coordinate components in a local or projected Cartesian system, respectively. Coordinates are recorded in meters with a precision of $1 \times 10^{- 4}$ m (0.1 mm).
time: It represents the precise UTC timestamp for each observation, formatted as MM/DD/YYYY HH:MM:SS.ss. The data exhibit a consistent sampling interval, enabling high-temporal-resolution trajectory analysis.

The objective of the proposed model is to predict the instantaneous GNSS positioning errors in the Easting, Northing, and Up directions using raw GNSS observation features. For each epoch $t$, the GNSS-derived coordinates $(E_{GNSS}, N_{GNSS}, U_{GNSS})$ are paired with the total station’s reference position $(E_{true}, N_{true}, U_{true})$, yielding the ground-truth error vector

$$\begin{bmatrix} dE \\ dN \\ dU \end{bmatrix} = \begin{bmatrix} E_{GNSS}(t) - E_{true}(t) \\ N_{GNSS}(t) - N_{true}(t) \\ U_{GNSS}(t) - U_{true}(t) \end{bmatrix}.$$

(2)

Concurrently, the total error (three-dimensional error) is defined as

$$dR = \sqrt{(dE)^{2} + (dN)^{2} + (dU)^{2}}.$$

(3)
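Equations (2) and (3) translate directly into a few lines of code. The following is a minimal sketch; the function name `positioning_error` and the tuple layout (Easting, Northing, Up) are illustrative assumptions rather than part of the original processing chain.

```python
import math


def positioning_error(gnss, truth):
    """Return (dE, dN, dU, dR) for one epoch.

    gnss and truth are (Easting, Northing, Up) tuples in metres,
    already expressed in the same coordinate frame.
    """
    d_e, d_n, d_u = (g - t for g, t in zip(gnss, truth))
    d_r = math.sqrt(d_e ** 2 + d_n ** 2 + d_u ** 2)
    return d_e, d_n, d_u, d_r


# Illustrative numbers only, not measured data.
print(positioning_error((103.0, 204.0, 62.0), (100.0, 200.0, 50.0)))
```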

In this study, we employ interpolation to temporally align the total station data with the navigation system data. Following data alignment, the distribution of the resulting alignment errors is presented in Figure 2. We observe an apparent systematic horizontal bias in the North direction. Therefore, we filter out all data points satisfying $dR > 25$ before the following analysis. It is important to distinguish between challenging but valid measurements and physically invalid epochs. The removed samples correspond to the latter, namely instances where the GNSS receiver had insufficient satellites (<4) or reported a fix quality indicator of 0 (invalid). These epochs would not be used for position estimation in any operational system, as they do not represent a valid GNSS solution.

It is important to clarify the definition of “real-time” within the context of this study. The proposed method is designed to process data at the native update rate of the GNSS receiver, which is 1 Hz. Therefore, in this work, “real-time capability” is defined as the ability to complete the inference for a single epoch (i.e., generating a positioning error estimate) in less than the inter-epoch interval of 1000 ms. This study focuses on the feasibility of machine learning models for instantaneous error estimation at this specific rate, rather than addressing the higher frequencies associated with IMU data integration.
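The temporal alignment step can be sketched as a linear interpolation of the total station track at each GNSS epoch. The text does not state which interpolation scheme was used, so linear interpolation is an assumption here, and the function name `interpolate_track` is hypothetical. Epochs outside the reference time span return `None` instead of being extrapolated.

```python
from bisect import bisect_left


def interpolate_track(ref_times, ref_values, query_time):
    """Linearly interpolate one reference coordinate at query_time.

    ref_times must be sorted in ascending order (seconds);
    ref_values holds the matching coordinate values (metres).
    """
    if not ref_times[0] <= query_time <= ref_times[-1]:
        return None  # no extrapolation beyond the tracked interval
    i = bisect_left(ref_times, query_time)
    if ref_times[i] == query_time:
        return ref_values[i]
    t0, t1 = ref_times[i - 1], ref_times[i]
    weight = (query_time - t0) / (t1 - t0)
    return ref_values[i - 1] + weight * (ref_values[i] - ref_values[i - 1])


# Illustrative 1 Hz reference track sampled between GNSS epochs.
times = [0.0, 1.0, 2.0]
easting = [10.0, 12.0, 16.0]
print(interpolate_track(times, easting, 1.25))
```

The same function would be applied separately to the Easting, Northing, and Up components before the error vector is formed.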

3. Models

In this study, the Long Short-Term Memory (LSTM) network serves as the core architecture for analysis. We posit that LSTM offers distinct advantages—such as its gated mechanism for effectively capturing long-term dependencies in time-series data, its robustness to noisy inputs, and its architectural suitability for processing variable-length sequential data—making it particularly well-suited for analyzing the real-time dynamic errors of UAVs.

3.1. Benchmark Models

We adopt two high-performing tree-ensemble methods, Random Forest and XGBoost, as benchmark models to establish a solid performance baseline for comparison. Random Forest and XGBoost have been widely used in GNSS-related prediction tasks due to their robustness to noise and ability to handle high-dimensional features [30,32,33,34]. LSTM is specifically designed for time-series data and has shown promise in capturing temporal dependencies in positioning errors [25]. These three models represent diverse paradigms (bagging, boosting, and deep learning), enabling a comprehensive comparison of different approaches. These models are selected for their proven effectiveness in regression tasks with tabular data, strong interpretability through feature importance metrics, and resistance to overfitting [67,68].

Both models are trained separately for each positioning error component ($dN$, $dE$, $dU$, and $dR$).

To avoid temporal data leakage, we employed a chronological split based on the timestamp of each measurement. The data were ordered by time, with the first 80% used for training and the remaining 20% for testing. This ensures that the model is evaluated on future, unseen data, simulating real-world deployment scenarios. Model performance is evaluated on the held-out test set using the Mean Squared Error (MSE) and the Coefficient of Determination ($R^{2}$).
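The chronological split and the two evaluation metrics can be written out in plain Python. This is a minimal sketch under the assumption that each sample is a `(timestamp, features, target)` tuple; the function names are illustrative, and the metric definitions are the standard ones for MSE and $R^{2}$.

```python
def chronological_split(samples, train_fraction=0.8):
    """Order samples by timestamp and cut once, so the test set lies in the future."""
    ordered = sorted(samples, key=lambda sample: sample[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative samples: (timestamp, features, target).
samples = [(t, None, 0.1 * t) for t in (5, 1, 4, 2, 3, 10, 9, 8, 7, 6)]
train, test = chronological_split(samples)
print(len(train), len(test), train[-1][0], test[0][0])
```

Because the cut is made after sorting, no test timestamp precedes any training timestamp.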

The key hyperparameters for both models, which control the trade-off between bias and variance, are carefully selected based on preliminary experiments and are detailed in Table 2. The configuration aims to provide a strong baseline without extensive hyperparameter tuning, ensuring a fair comparison focused on architectural differences rather than exhaustive optimization.

For the Random Forest model, the feature importance scores are extracted post-training to provide insights into the most influential input variables for each error component. This offers an additional layer of interpretability complementary to the predictive performance metrics.

To provide a benchmark against conventional navigation techniques, we implemented an EKF as a baseline model. The EKF is widely adopted in GNSS/INS integration and positioning applications due to its ability to handle nonlinear systems through first-order linearization. The state vector is defined as

$$x = [dE, dN, dU, v_{E}, v_{N}, v_{U}]^{T},$$

(4)

where $v_{E}$, $v_{N}$, and $v_{U}$ are the corresponding velocity errors. A constant velocity motion model is adopted for state prediction, with

$$x_{k|k-1} = F x_{k-1} + w_{k},$$

(5)

where $F$ is the state transition matrix and $w_{k} \sim N(0, Q)$ is the process noise. The measurement model uses the GNSS-derived position errors as observations, with

$$z_{k} = H x_{k} + v_{k},$$

(6)

where $H = [I_{3 \times 3}, 0_{3 \times 3}]$ extracts the position components and $v_{k} \sim N(0, R)$ is the measurement noise. The process noise covariance $Q$ and measurement noise covariance $R$ were empirically tuned using the training data to optimize performance.
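The predict and update recursion implied by Equations (5) and (6) can be sketched for a single axis. Because both models are linear, the recursion coincides with the standard linear Kalman filter. Several points are assumptions made for this illustration and are not taken from the paper: the three axes are treated as decoupled (diagonal $Q$ and $R$), the state transition is the usual constant-velocity form with $\Delta t = 1$ s, and the noise values are placeholders rather than the tuned ones.

```python
def kalman_step(x, P, z, dt=1.0, q=(1e-4, 1e-4), r=0.01):
    """One predict/update cycle for a single axis.

    x = [error, velocity], P = 2x2 covariance, z = observed position error.
    q holds the diagonal of Q and r is the scalar measurement variance;
    both are placeholder values, not the tuned ones.
    """
    # Predict with F = [[1, dt], [0, 1]]: x_pred = F x, P_pred = F P F^T + Q.
    x_pred = [x[0] + dt * x[1], x[1]]
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q[0]
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q[1]

    # Update with H = [1, 0]: innovation variance, gain, state, covariance.
    s = p00 + r
    k0, k1 = p00 / s, p10 / s
    innovation = z - x_pred[0]
    x_new = [x_pred[0] + k0 * innovation, x_pred[1] + k1 * innovation]
    P_new = [
        [(1.0 - k0) * p00, (1.0 - k0) * p01],
        [p10 - k1 * p00, p11 - k1 * p01],
    ]
    return x_new, P_new


# Illustrative run: a constant 0.5 m error observed for 200 epochs.
x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    x, P = kalman_step(x, P, 0.5)
print(x)
```

Running the same step independently for $dE$, $dN$, and $dU$ reproduces the six-state filter of Equation (4) under the decoupling assumption.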

3.2. LSTM

The LSTM network is adopted to capture complex temporal dependencies in sequential GNSS observation data. The network employs a hierarchical feature extraction strategy through three sequentially connected LSTM layers, followed by fully connected dense layers for regression output. The specific layer configuration is listed in Table 3.

The progressive reduction in LSTM units (128 → 64 → 32) follows a funnel-like architecture that extracts increasingly abstract temporal features while reducing dimensionality. The model is optimized using the Adam algorithm [69] with a learning rate of $\alpha = 0.001$.
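For readers unfamiliar with the optimizer, the Adam update rule with $\alpha = 0.001$ can be illustrated on a toy one-dimensional problem. This sketch is not the training code of the LSTM: the decay rates $\beta_1 = 0.9$ and $\beta_2 = 0.999$ and the constant $\epsilon = 10^{-8}$ are the commonly used defaults and are assumed here, and the quadratic objective is purely illustrative.

```python
import math


def adam_minimize(grad, x0, steps, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run `steps` Adam updates on a scalar parameter and return its final value."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)         # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def toy_gradient(x):
    """Gradient of the toy objective (x - 0.5)^2."""
    return 2.0 * (x - 0.5)


print(adam_minimize(toy_gradient, x0=1.0, steps=1))
print(adam_minimize(toy_gradient, x0=1.0, steps=100))
```

After bias correction, the first update has a magnitude close to the learning rate regardless of the gradient scale, which is one reason a small fixed value such as 0.001 is a common starting point.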

4. Results

4.1. Model Performance Assessment

A comprehensive comparative analysis of model performance is presented in this section, evaluating the Random Forest, XGBoost, and LSTM models on the task of GNSS positioning error prediction. The assessment is based on three principal metrics: the MSE, the Root Mean Squared Error (RMSE), and $R^{2}$. These metrics are calculated on a held-out test set comprising 20% of the total data, ensuring a fair comparison across different models.

To obtain statistically robust performance estimates, we employed a repeated cross-validation strategy. Specifically, we performed 5-fold cross-validation repeated five times with different random seeds, resulting in 25 independent evaluation results for each model. The mean and standard deviation of each metric were then computed to reflect both the central tendency and the variability of model performance. The comparative results, summarized in Table 4, reveal distinct performance characteristics across the three models. Notably, the XGBoost algorithm achieved the most favorable results overall, demonstrating superior predictive accuracy in terms of both error minimization and variance explanation.

The residual analysis plots of XGBoost for the $dR$ position error prediction are presented in Figure 3.

The QQ plot (left panel) reveals significant deviation from normality in the upper and lower quantiles, where residuals systematically fall above the theoretical line, suggesting that the model underestimates extreme positioning errors, which is particularly relevant for GNSS applications requiring high precision. The residual-versus-predicted scatterplot (middle panel) demonstrates a characteristic funnel-shaped pattern, revealing heteroscedastic behavior where prediction uncertainty scales with error magnitude. This phenomenon is consistent with GNSS error characteristics where larger positioning uncertainties correlate with more challenging satellite geometry. Finally, the actual-versus-predicted plot (right panel) shows a strong linear correlation ($R^{2} = 0.898$). For actual errors above 0.4 m, however, predicted values plateau around 0.45 m, while actual errors reach 0.8 m. This indicates that the XGBoost model captures typical error patterns effectively but requires enhancement for extreme scenarios, possibly through targeted sampling of outlier conditions or ensemble methods to better model the heavy-tailed distribution characteristic of GNSS positioning errors.

The XGBoost model achieved the best overall performance, yielding an RMSE of 0.132 m (MSE = $0.0174 \pm 0.0011\ \text{m}^{2}$). This represents an 8.3% improvement in RMSE over the EKF baseline, and XGBoost also outperforms the Random Forest and LSTM models by 10.8% and 6.4%, respectively. XGBoost attains the highest $R^{2}$ score of 0.898, indicating that approximately 89.8% of the variance in the positioning error can be explained by its predictions. This represents improvements of 1.4, 2.6, and 0.8 percentage points over the LSTM ($R^{2} = 0.884$), Random Forest ($R^{2} = 0.872$), and EKF ($R^{2} = 0.890$) models, respectively.
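The relationship between the reported figures can be checked with a few lines of arithmetic. The sketch below uses only the values quoted in this section: the RMSE is recomputed as the square root of the XGBoost MSE, and the $R^{2}$ gaps are expressed in percentage points per model. The variable names are illustrative.

```python
import math

# Values quoted in the text for the dR prediction task.
mse_xgboost = 0.0174  # m^2
r2 = {"XGBoost": 0.898, "LSTM": 0.884, "Random Forest": 0.872, "EKF": 0.890}

rmse_xgboost = math.sqrt(mse_xgboost)

# Gap between XGBoost and each other model, in percentage points of R^2.
r2_gain_pp = {
    name: round((r2["XGBoost"] - value) * 100.0, 1)
    for name, value in r2.items()
    if name != "XGBoost"
}

print(f"RMSE from MSE: {rmse_xgboost:.3f} m")
print(r2_gain_pp)
```

The recomputed RMSE agrees with the reported value to three decimal places.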

4.2. Key Features

A comparative analysis of feature importance is conducted to identify the principal signal characteristics driving the predictive performance of the Random Forest, XGBoost, and LSTM models. For each GNSS measurement epoch, statistical features (mean and standard deviation) are computed across all satellites tracked at that epoch. This cross-satellite aggregation captures the instantaneous distribution of signal quality and geometry, providing input features that reflect the current state of the GNSS constellation. Remarkably, all three architectures consistently identify the same set of five GNSS observables as the most influential features, despite their fundamentally different learning mechanisms. This consensus underscores the critical role of specific signal strength and quality metrics in positioning error estimation.
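The cross-satellite aggregation described above can be sketched as follows. The input layout (a dictionary of per-satellite observations keyed by observation code) and the function name `aggregate_epoch` are assumptions for this example, the `gnss_<code>_<stat>` naming is inferred from the feature names reported in this paper, and the use of the population standard deviation is a choice made here because the text does not specify which estimator was used.

```python
from statistics import mean, pstdev


def aggregate_epoch(observations):
    """Compute per-code mean and standard deviation across satellites.

    observations maps a satellite ID to {observation_code: value};
    missing observations may be given as None and are skipped.
    """
    by_code = {}
    for satellite_obs in observations.values():
        for code, value in satellite_obs.items():
            if value is not None:
                by_code.setdefault(code, []).append(value)

    features = {}
    for code, values in sorted(by_code.items()):
        features[f"gnss_{code}_mean"] = mean(values)
        features[f"gnss_{code}_std"] = pstdev(values)  # population std (assumed)
    return features


# Illustrative epoch with three satellites; values are not measured data.
epoch = {
    "G05": {"S1C": 40.0, "S5Q": 44.0},
    "G12": {"S1C": 44.0, "S5Q": None},
    "E11": {"S1C": 42.0},
}
print(aggregate_epoch(epoch))
```

Each epoch then yields one fixed-length feature row regardless of how many satellites were tracked.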

The feature importance scores for the three evaluated models—Random Forest, XGBoost, and LSTM (using a permutation-based method)—are comparatively visualized in Figure 4. A consistent pattern emerges across all models, highlighting a set of core predictive drivers for GNSS positioning error.
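A permutation-based importance score of the kind mentioned for the LSTM can be computed for any model that exposes a prediction function. The sketch below is a generic illustration, not the authors' implementation: the function names, the use of MSE as the scoring metric, the number of repeats, and the fixed seed are all assumptions.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn.

    predict takes one feature row (a list) and returns a scalar prediction.
    """
    rng = random.Random(seed)
    baseline = mse(targets, [predict(row) for row in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)]
            increases.append(mse(targets, [predict(row) for row in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy model that depends only on the first feature.
rows = [[float(i), float((7 * i) % 20)] for i in range(20)]
targets = [2.0 * row[0] for row in rows]
print(permutation_importance(lambda row: 2.0 * row[0], rows, targets))
```

A feature the model ignores receives a score of zero, while shuffling an informative feature raises the error.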

The top five features, ranked by their aggregate importance score across all models, are:

1. gnss_L5Q_mean: The mean signal power of the L5-band quadrature-phase component, which exhibited the strongest overall predictive influence. The L5 band’s high chipping rate and modernized signal structure provide superior noise resistance and multipath mitigation, making its average power a robust indicator of measurement quality.
2. gnss_L1I_mean: The mean signal power of the L1 in-phase component. As the primary civilian signal, its strength is directly correlated with pseudorange precision and serves as a fundamental indicator of line-of-sight signal availability.
3. gnss_L1C_mean: The mean power of the modernized L1C signal. Its importance highlights the value of next-generation GNSS signals, which are designed with advanced spreading codes and pilot channels for improved tracking robustness in challenging environments.
4. gnss_L1C_std: The standard deviation of the L1C signal power. This temporal variability metric captures signal stability, with higher fluctuations often indicative of multipath interference, partial obstructions, or receiver tracking instabilities.
5. gnss_D5Q_mean: The mean signal power of the D5-band quadrature component (specific to certain satellite systems). Its prominence suggests that leveraging signals across multiple frequencies (L1, L5, and D5) provides complementary information essential for error characterization.
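One plausible way to obtain an aggregate ranking across models is to normalise each model's importance scores so that they sum to one and then average them per feature. The text does not state how the aggregate score was formed, so the sketch below is an assumption; the feature names and numbers are placeholders and are not values read from Figure 4.

```python
def aggregate_ranking(model_scores, top_k=5):
    """Rank features by their mean normalised importance across models."""
    totals = {}
    for scores in model_scores.values():
        norm = sum(scores.values())
        for feature, value in scores.items():
            # Normalising makes scores comparable across models whose
            # importance measures live on different scales.
            totals[feature] = totals.get(feature, 0.0) + value / norm
    mean_scores = {f: s / len(model_scores) for f, s in totals.items()}
    return sorted(mean_scores, key=mean_scores.get, reverse=True)[:top_k]

# Placeholder scores on deliberately different scales.
placeholder_scores = {
    "random_forest": {"feature_a": 0.5, "feature_b": 0.3, "feature_c": 0.2},
    "xgboost": {"feature_a": 40.0, "feature_b": 10.0, "feature_c": 50.0},
    "lstm": {"feature_a": 0.02, "feature_b": 0.06, "feature_c": 0.02},
}
ranking = aggregate_ranking(placeholder_scores, top_k=2)
```

Averaging ranks instead of normalised scores would be an equally defensible choice and is less sensitive to a single model assigning one feature a very large score.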

The consistent identification of these features—particularly the mean and standard deviation metrics across multiple frequency bands—reveals a critical insight: positioning error is predominantly governed by the absolute strength and temporal stability of specific GNSS signal components, rather than by derived navigation parameters alone. While the relative weighting of these five features varied between models (with XGBoost attributing the highest relative importance to gnss_L5Q_mean and gnss_L1C_std), their unanimous selection validates their fundamental role in error prediction.

This finding has direct practical implications for receiver design and data preprocessing: focusing on the quality metrics of these specific signal components, particularly L5Q, L1I, and L1C, provides a highly efficient feature set for real-time error modeling. The convergence in feature importance across diverse algorithms suggests that these five observables form a robust, model-agnostic basis for GNSS positioning error estimation.

5. Conclusions

In this study, we developed Random Forest, XGBoost, and LSTM frameworks for GNSS positioning error estimation in UAV applications. The comparative analysis demonstrates that the XGBoost algorithm achieves superior predictive accuracy (the lowest MSE of $0.0174\ \mathrm{m}^{2}$ and the highest $R^{2}$ of 0.898) compared to the Random Forest and LSTM models and the EKF baseline while maintaining favorable computational efficiency. The feature importance analysis reveals a critical insight: positioning errors are predominantly governed by the absolute strength and temporal stability of specific GNSS signal components, particularly the L5Q, L1I, and L1C signals, rather than by derived navigation parameters alone. This provides a model-agnostic, signal-centric foundation for error estimation.

The analysis of error distribution indicates that horizontal errors (North and East components) are generally larger and more variable than vertical errors, a pattern observed across all evaluated models. This asymmetry necessitates tailored improvement strategies for UAV positioning systems.

The dominance of signal quality metrics (gnss_L5Q_mean and gnss_L1C_std) suggests that UAV navigation systems should prioritize real-time monitoring of these specific observables. An adaptive weighting mechanism can be implemented in the fusion filter, dynamically reducing reliance on GNSS measurements when these key signal indicators degrade below empirically determined thresholds while increasing dependence on alternative sensors (e.g., visual odometry and LiDAR).
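A common way to realise such adaptive weighting in a Kalman-type fusion filter is to inflate the GNSS measurement-noise covariance when the signal indicators degrade, which lowers the Kalman gain applied to GNSS updates. The sketch below illustrates this under stated assumptions: the threshold values, the linear inflation rule, the gain factor, and all function names are hypothetical and would have to be determined empirically.

```python
def gnss_noise_scale(l5q_mean, l1c_std, l5q_min, l1c_std_max,
                     gain=10.0, max_scale=100.0):
    """Factor (>= 1) by which to inflate the GNSS measurement-noise covariance.

    The factor is 1 while gnss_L5Q_mean stays above `l5q_min` and
    gnss_L1C_std stays below `l1c_std_max`; it grows linearly with the
    relative violation of either threshold (hypothetical rule).
    """
    deficit = max(0.0, (l5q_min - l5q_mean) / l5q_min)
    excess = max(0.0, (l1c_std - l1c_std_max) / l1c_std_max)
    return min(max_scale, 1.0 + gain * (deficit + excess))

def kalman_gain(p_prior, r_meas):
    """Scalar Kalman gain for a direct position measurement."""
    return p_prior / (p_prior + r_meas)

# Hypothetical thresholds and indicator values.
nominal = gnss_noise_scale(l5q_mean=45.0, l1c_std=2.0, l5q_min=40.0, l1c_std_max=3.0)
degraded = gnss_noise_scale(l5q_mean=30.0, l1c_std=6.0, l5q_min=40.0, l1c_std_max=3.0)

r_base = 1.0  # baseline GNSS measurement variance (illustrative)
gain_nominal = kalman_gain(p_prior=1.0, r_meas=r_base * nominal)
gain_degraded = kalman_gain(p_prior=1.0, r_meas=r_base * degraded)
```

With a smaller gain, the filter state leans more on its prediction and on other sensors, which matches the behaviour described in the paragraph above.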

The prominence of multi-frequency features (gnss_L5Q_mean and gnss_D5Q_mean) underscores the value of multi-GNSS, multi-frequency receivers. To specifically address the larger horizontal errors, the positioning algorithm should employ tight-coupling fusion that directly utilizes raw measurements from these frequency bands to enhance geometry resolution and accelerate ambiguity resolution for horizontal components.

The trained XGBoost model itself can be deployed as a lightweight, onboard Predictive Integrity Monitor. By feeding real-time streams of the five key features into the model, the UAV’s flight controller can obtain continuous, ahead-of-time predictions of expected positioning error. This allows for proactive safety measures, such as initiating a controlled hover or executing a precautionary landing when predicted horizontal error exceeds mission-specific tolerances.
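The decision logic of such a monitor can be sketched independently of the model itself. In the example below, `predict_error` stands for any callable that wraps the trained regressor, and the two tolerance values are hypothetical mission parameters; neither is specified in this study.

```python
from enum import Enum

class Action(Enum):
    CONTINUE = "continue"
    HOVER = "hover"
    LAND = "land"

def integrity_action(predicted_error_m, hover_tol_m, land_tol_m):
    """Map a predicted horizontal error (metres) to a safety action."""
    if predicted_error_m > land_tol_m:
        return Action.LAND
    if predicted_error_m > hover_tol_m:
        return Action.HOVER
    return Action.CONTINUE

def monitor(predict_error, feature_stream, hover_tol_m=0.3, land_tol_m=0.6):
    """Yield one action per epoch of key-feature values."""
    for features in feature_stream:
        yield integrity_action(predict_error(features), hover_tol_m, land_tol_m)

# Stub predictor that reads a precomputed value; a deployed system would
# call the trained model on the five key features instead.
def stub_predictor(features):
    return features["predicted_error_m"]

stream = [
    {"predicted_error_m": 0.1},
    {"predicted_error_m": 0.4},
    {"predicted_error_m": 0.9},
]
actions = list(monitor(stub_predictor, stream))
```

A practical monitor would likely add hysteresis or require several consecutive exceedances before commanding a landing, so that a single noisy prediction does not abort a mission.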

Although the experimental design effectively evaluated UAV dynamic positioning accuracy under controlled conditions, certain limitations must be acknowledged. This study was conducted using only a single UAV platform (the DJI Phantom 4 RTK listed in Table 1) and a specific low-speed flight profile. Differences in UAV models, flight speeds, the vibration characteristics of power systems, and onboard GNSS receiver performance may lead to varying dynamic error patterns. Consequently, the generalizability of our findings to other platforms or complex flight maneuvers requires further investigation. Future work could explore the integration of L1 and L5 dual-frequency signals to better mitigate ionospheric delays and improve positioning accuracy, particularly as GNSS receivers on board UAVs continue to evolve.

In summary, this work establishes targeted, signal-quality-based machine learning approaches for accurate error estimation. These approaches deliver actionable insights for designing robust UAV navigation systems capable of real-time self-diagnosis and compensation. Future research will focus on integrating the identified key features into a real-time, adaptive Kalman Filter framework to create a closed-loop correction system. Furthermore, the generalizability of this signal-centric error model will be validated across diverse UAV platforms, operational environments, and multi-constellation GNSS receivers.

Despite the promising results, this study has certain limitations that open avenues for future research. Firstly, the validation of real-time performance was conducted on a standard laptop (Intel i5 + RTX 2060). While the measured inference times (e.g., <5 ms) are significantly faster than the 1 Hz data rate, they do not fully represent performance on resource-constrained onboard flight computers, which may have limited processing power. Secondly, the definition of “real-time” in this work is specifically tied to the 1 Hz GNSS epoch interval. Future work should focus on deploying and optimizing the proposed models on embedded platforms (e.g., NVIDIA Jetson series or STM32-based flight controllers) to verify their performance under actual hardware constraints. Furthermore, the applicability of these models to higher-frequency data streams (e.g., 5–20 Hz) and more dynamic flight maneuvers (e.g., sharp turns or sudden accelerations) requires further investigation to generalize the real-time claims beyond the current experimental scope.
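Whether a model meets a given real-time budget can be checked by timing repeated single-epoch inferences against the epoch interval. The helper below is a minimal sketch of such a check; the function name, the use of the 99th percentile as the decision statistic, and the stand-in predictor are assumptions, and results obtained on a laptop would not transfer to an embedded flight computer.

```python
import time

def latency_budget_report(predict, sample, epoch_interval_s=1.0, n_runs=200):
    """Time `predict(sample)` repeatedly and compare it with the epoch interval."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        predict(sample)
        durations.append(time.perf_counter() - start)
    durations.sort()
    p99 = durations[min(len(durations) - 1, int(0.99 * len(durations)))]
    return {
        "median_ms": 1e3 * durations[len(durations) // 2],
        "p99_ms": 1e3 * p99,
        "budget_used": p99 / epoch_interval_s,  # fraction of the interval consumed
        "meets_deadline": p99 < epoch_interval_s,
    }

# Stand-in predictor; replace with a call to the trained model.
def stand_in_predict(features):
    return sum(features) / len(features)

report_1hz = latency_budget_report(stand_in_predict, [0.0] * 5, epoch_interval_s=1.0)
report_20hz = latency_budget_report(stand_in_predict, [0.0] * 5, epoch_interval_s=1.0 / 20)
```

Passing `epoch_interval_s=1.0 / 20` evaluates the same model against a 20 Hz stream, which is the upper end of the higher-frequency range mentioned above.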

Another limitation of this study is the restricted scope of data collection, which was conducted over a two-day period in a single environment with controlled flight conditions (low speed and a straight-line trajectory). While this controlled setup was appropriate for initial model comparison and proof-of-concept validation, it does not fully represent the diverse operational conditions encountered in real-world UAV applications. Factors such as urban multipath, atmospheric variability, different flight dynamics, and weather conditions may affect both feature importance and model performance. Therefore, the robustness claims in this study should be interpreted within this constrained context.

Future work will address the onboard-hardware limitation by deploying the models on representative onboard computers (e.g., NVIDIA Jetson series and STM32F4) and measuring actual inference latency, power consumption, and memory usage during real-time flight operations. Model optimization techniques such as pruning, quantization, and hardware-specific compilation will also be explored to ensure efficient deployment in resource-constrained environments.

Future work will also focus on validating the proposed models across more diverse datasets, including data collection in different environments such as urban canyons, rural areas, and mountainous terrain, as well as experiments under varying weather conditions and times of day. Additionally, evaluation on different flight profiles, such as high-speed maneuvers and curved trajectories, will be conducted, along with an analysis of feature importance stability across these diverse conditions. Such extensive validation will be necessary to confirm the generalizability and practical applicability of the proposed approach in real-world GNSS-challenging environments.

Author Contributions

Methodology, M.Y., G.-H.N. and Z.M.; Software, M.Y. and Z.M.; Formal analysis, M.T.; Investigation, M.Y., H.Z., Y.-Q.Z. and J.L.; Resources, M.Y., H.Z. and J.-G.M.; Data curation, M.Y., J.L. and A.S.; Writing—original draft, J.L.; Writing—review & editing, J.L.; Supervision, J.-G.M. and M.A.; Funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Xinjiang Natural Science Foundation, grant number 2025D01C287, and Xinjiang Talent Development Fund’s Second Round of 2025 Funding—Special Program for Talent Team Support of Scientific Research and Innovation Platforms.

Data Availability Statement

All data involved in this paper are available for researchers and have been made publicly accessible alongside the article. For further inquiries, please contact the authors.

Acknowledgments

The authors would like to thank Zhi-Xiang Luo and Wei Wang for their valuable assistance during the experiments.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Data flow architecture of the experimental platform. It illustrates the integration of the GNSS receiver, total station, and time synchronization module. The diagram highlights the key data formats (GPRMC, GPGGA, and RTCM) and timing signals (PPS) used to acquire raw satellite observations and reference truth values. The process of epoch-by-epoch data aggregation and quality control is also conceptually represented.

Figure 2. Distribution of alignment errors. The horizontal positioning error, particularly in the North direction, exhibits a larger magnitude than the error in the Up direction. This discrepancy may indicate the presence of a systematic bias rather than random noise.

Figure 3. Residual analysis plots of XGBoost provide a comprehensive diagnostic toolkit for evaluating regression model performance through four interconnected visualizations. Each point represents an individual test sample. The red dashed line indicates the zero residual line. The red solid line in the QQ plot represents the theoretical normal distribution line.

Figure 4. Top 15 most important features for predicting $dR$ with the Random Forest, XGBoost, and LSTM models. The top five features are consistently ranked across models with varying relative weights.

Table 1. Specifications of the experimental platform and data acquisition devices.

Device Description	Brand/Version	Primary Parameters
Small multi-rotor, high-precision aerial surveying UAV	DJI Phantom 4 RTK	Hovering accuracy (RTK enabled and functioning normally): vertical $\pm 0.1$ m; horizontal $\pm 0.1$ m
Remote Controller	–	Maximum Operational Range: 7 km
High-precision total station	Leica	Accuracy: single measurement 2 mm + 2 ppm (3 s); continuous measurement 3 mm + 1.5 ppm (0.15 s)
High-precision circular prism	Leica	Prism Constant: 0 mm
$360^{\circ}$ prism	Huilide GRZ101	Prism Constant: 23.1 mm
GNSS receiver	iRTK10	Positioning Output Rate: 1 Hz–20 Hz

Table 2. Hyperparameter configurations for the benchmark models.

Parameter (Description)	Random Forest	XGBoost
Number of estimators ( $n_{estimators}$ )	100	200
Maximum tree depth ( $d_{\max}$ )	10	8
Learning rate ( $η$ )	–	0.1
Minimum samples split	2	–
Subsample ratio	1.0 (Bootstrap)	1.0
Objective/criterion	Squared Error	reg:squarederror
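The settings in Table 2 can be written down as keyword dictionaries. The sketch below assumes the scikit-learn `RandomForestRegressor` and the xgboost `XGBRegressor` interfaces; the article does not name the Random Forest implementation, so that choice is an assumption, and any parameter not listed in Table 2 is left at the library default.

```python
# Values taken from Table 2; keyword names follow scikit-learn and xgboost.
RF_PARAMS = {
    "n_estimators": 100,
    "max_depth": 10,
    "min_samples_split": 2,
    "bootstrap": True,             # "1.0 (Bootstrap)" in Table 2
    "criterion": "squared_error",
}

XGB_PARAMS = {
    "n_estimators": 200,
    "max_depth": 8,
    "learning_rate": 0.1,
    "subsample": 1.0,
    "objective": "reg:squarederror",
}

def build_models():
    """Instantiate both regressors; requires scikit-learn and xgboost."""
    # Imported lazily so the dictionaries above stay usable without them.
    from sklearn.ensemble import RandomForestRegressor
    from xgboost import XGBRegressor
    return RandomForestRegressor(**RF_PARAMS), XGBRegressor(**XGB_PARAMS)
```

Keeping the hyperparameters in plain dictionaries makes it straightforward to log them alongside results or to reuse them in a cross-validation loop.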

Table 3. Summary of LSTM model architecture and training hyperparameters.

Component	Description	Details
Network Architecture
Input Shape	Sequence length and features	(samples, timesteps = 10, features = n)
First LSTM Layer	128 units, return sequences enabled	Dropout ( $p = 0.2$ ), batch normalization
Second LSTM Layer	64 units, return sequences enabled	Dropout ( $p = 0.2$ ), batch normalization
Third LSTM Layer	32 units, return sequences disabled	Dropout ( $p = 0.1$ )
First Dense Layer	64 units, ReLU activation	Dropout ( $p = 0.2$ )
Second Dense Layer	32 units, ReLU activation	Dropout ( $p = 0.1$ )
Output Layer	Linear dense layer	d units ( $d = 1$ for single-value prediction)
Training Configuration
Optimizer	Adam	Learning rate = 0.001
Loss Function	Mean squared error (MSE)	-
Batch Size	32	-
Epochs	100	Early stopping with patience = 10
Validation Split	80% training/20% validation	-
Weight Initialization	Glorot uniform	-
Random Seed	42	For reproducibility
Feature Normalization	Z-score normalization	Based on training set statistics

Table 4. A comparative analysis of the three learning models and the EKF baseline in terms of the $dR$ value and key performance metrics.

Model	MSE (m²)	RMSE (m)	$R^{2}$
EKF	0.0209 ± 0.0013	0.144 ± 0.004	0.890 ± 0.006
Random Forest	0.0219 ± 0.0015	0.148 ± 0.005	0.872 ± 0.012
XGBoost	0.0174 ± 0.0011	0.132 ± 0.004	0.898 ± 0.009
LSTM	0.0198 ± 0.0013	0.141 ± 0.005	0.884 ± 0.010
