Article

A Hybrid Approach to Enhanced SGP4 for Galileo Constellations

Scientific Computation Research Institute (SCRIUR), University of La Rioja, 26006 Logroño, La Rioja, Spain
*
Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(5), 2214; https://doi.org/10.3390/app16052214
Submission received: 3 January 2026 / Revised: 9 February 2026 / Accepted: 23 February 2026 / Published: 25 February 2026
(This article belongs to the Special Issue Application of Machine Learning in Space Engineering)

Abstract

Maintaining an up-to-date catalog of resident space objects orbiting Earth requires a careful balance between computational efficiency and orbital prediction precision. This work presents HSGP4, a hybrid orbit propagator specifically tailored for Galileo-type orbits that enhances the classical SGP4 analytical model using Artificial Neural Networks. The methodology centers on a non-invasive hybridization process that utilizes high-fidelity pseudo-observations to forecast SGP4 error residuals. A core contribution is the introduction of the Hybrid Two-Line Element format, which encapsulates neural model parameters alongside traditional orbital elements, ensuring seamless integration with existing catalog infrastructure. The development process involved comprehensive Exploratory Data Analysis and sensitivity analysis, which identified the argument of latitude as the most influential variable for correcting SGP4 errors in the MEO region. To ensure statistical robustness, a hierarchical selection strategy was implemented. This reduced an exhaustive search space of 32,256 candidate architectures to a final subset of optimized configurations. Validated against a decade of TLE data, the results confirm that HSGP4 effectively captures missing dynamic patterns and significantly improves ephemeris accuracy. By forecasting SGP4 error residuals, this hybrid approach provides a high-fidelity correction layer. It compensates for the limitations of analytical theories without requiring complex numerical integration.

1. Introduction

Maintaining a running catalog of space objects orbiting the Earth is essential for effective management of the near-Earth space environment. Ephemerides information is publicly accessible via the NORAD catalog, while other organizations, such as the European Space Agency (ESA), also provide data derived from their own observations. Sustaining these catalogs over time requires both comprehensive observational data and accurate predictions of object trajectories, which are achieved through orbit-propagation methods.
Orbit propagation refers to computer software that implements a solution to the dynamical systems governing the motion of Resident Space Objects (RSOs). Such software determines the state of an RSO at a specified time, given an initial condition. The accuracy and computational efficiency of an orbit propagator depend primarily on the force models used in the dynamical system and the integration technique employed to obtain the solution.
The primary force considered is the gravitational attraction of an ideally spherical Earth. However, other forces, such as Earth’s nonsphericity, atmospheric drag, gravitational perturbations from other celestial bodies, and solar radiation pressure, can significantly modify an RSO’s trajectory. An important characteristic of this dynamical system is that not all perturbations need to be included in the equations of motion; typically, only those most relevant to the specific orbiter, its orbital characteristics, and the mission’s scientific objectives are considered. Integration techniques further categorize orbit propagators into general perturbations, special perturbations, and semianalytic theories.
General perturbation or analytical theories [1,2,3,4,5] are derived from direct analytic integration of the equations of motion and are primarily characterized by their ability to preserve the essential qualitative behavior of orbital motion. These approaches are based on truncated series expansions [6,7,8,9,10,11,12,13]. The resulting solutions are valid for arbitrary initial conditions and are formulated as explicit functions of time, physical parameters, and integration constants. Once the RSO state is known at a given time, its future state can be determined through a single evaluation of the analytical solution. The accuracy of a general perturbation theory depends directly on the fidelity of its force model and the order of the truncated expansion employed.
On the other hand, special perturbation theories [14,15] refer to the direct numerical integration of the equations of motion, including any external forces. In these approaches, introducing a new force into the equations of motion is achieved by simply expressing the perturbation as a function of time and the object’s state. However, achieving high accuracy with these methods requires small integration steps, which penalizes their computational efficiency. As a result, special perturbation theories are usually much slower than their analytical counterparts, though they generally provide greater accuracy.
Finally, semianalytical techniques [16,17,18] were developed to leverage both general and special perturbation theories. These methods seek to combine the accuracy of numerical techniques with the speed of analytical techniques. This is achieved by separating the long-periodic terms from the short-periodic terms, since the latter are what limit the integration step size of numerical methods. Long-periodic terms correspond to cycles exceeding one orbital period, while short-periodic terms are related to cycles shorter than one orbital period. Step sizes of up to 1 day can be employed, maintaining accuracy comparable to special perturbation methods while substantially reducing computation time.
Currently, improving the accuracy or computational efficiency of the aforementioned techniques requires actions such as enhancing the force model, implementing more precise integration methods, parallelizing the code, or making refinements such as selecting the adequate variables or reference systems. Each of these approaches is considered an invasive procedure.
In 2008, Dr. San-Juan introduced a fourth non-invasive alternative approach, the hybrid methodology [19]. The hybrid methodology enhances predictive capabilities by combining classical propagation methods with advanced forecasting approaches, such as statistical time-series analysis [20] and Artificial Intelligence (AI) techniques [21]. Recent surveys in the field [22,23] emphasize that traditional physics-based propagators are increasingly limited by unmodeled perturbations and environmental uncertainties. As highlighted by Caldas and Soares [22], data-driven techniques such as Artificial Neural Networks (ANNs) provide superior capacity to model highly complex nonlinear systems. Furthermore, Kazemi et al. [23] argue that the integration of AI into orbit determination is essential for modern Space Situational Awareness (SSA) to balance computational efficiency with the required accuracy for collision avoidance. Our approach aligns with these current scientific trends by implementing a hybrid framework that corrects SGP4 analytical outputs using a tailored ANN architecture.
The forecasting component of the method increases the accuracy of the classical orbit propagator by modeling the dynamic effects that the propagator does not capture. Such effects may include uncertainties inherent to the problem, limitations of the force model, or the integration method. To further improve accuracy, the forecasting component requires additional information to compensate for the missing effects. Apart from the RSO state at an initial time, the forecaster must also have accurate knowledge of additional RSO states over a specified time interval. This additional information can be obtained from direct observations or pseudo-observations generated by a highly accurate orbit propagator. The combination of the selected orbit propagator with a predictive module constitutes what we have called a hybrid propagator. A hybrid propagator assumes that the current trajectory can be split into two components: an approximate solution, provided by the orbit propagator, which accounts for the primary dynamical effects, and a predictive module that captures the residual dynamics and provides learned corrections to the solution.
Given the large number of objects in the space catalog, with over 40,000 RSOs requiring propagation, balancing accuracy and computational efficiency is essential. High-fidelity models require step-by-step numerical integration with small step sizes, increasing computational demands. Simplified models may allow analytical solutions, reducing computational burden. In both cases, orbit propagation relies on initial conditions, such as Two Line Elements (TLEs), and the propagation model, such as SGP4 [24,25,26]. TLE serves as the standard orbital representation employed by the North American Aerospace Defense Command (NORAD) Catalog, which maintains and distributes the orbital parameters of tracked space objects in TLE format. This approach facilitates consistent identification, tracking, and orbital propagation. The TLE format is intended specifically for use with the SGP4 propagator, which calculates the time evolution of orbits using TLE parameters.
This paper introduces a hybrid version of the well-known SGP4 orbit propagator, named HSGP4, that is based on an artificial neural network model specifically trained for Galileo-type orbits. The HSGP4 propagator operates in conjunction with a Hybrid TLE (HTLE), an extension of the classical TLE that encapsulates both TLE information and model parameters. Section 2 details the methodology used to develop the hybrid orbit propagator. Section 3 describes the data preparation process. Section 4 discusses the selection of optimal neural network architectures. Section 5 evaluates the robustness of the selected architectures using a new dataset. The paper concludes with a summary of this work in Section 6.

2. Methodology

This section describes the process for hybridizing any orbit propagator. Achieving high accuracy with the hybrid orbit propagator (HOP) depends on a comprehensive understanding of the initial orbit propagator’s behavior within a defined spatial region. To support this, a methodology based on Exploratory Data Analysis is first introduced. This approach culminates in selecting the statistical or artificial intelligence method implemented in the predictive module of the HOP.

2.1. Hybrid Orbit Propagator

Like any other orbit propagator, a hybrid propagator computes the position and velocity of an RSO, $\hat{x}_f$, at a future instant $t_f$. In the first stage of the method, an initial approximation to $\hat{x}_f$ is obtained by applying any type of orbit propagator $\mathrm{OP}$ to the initial conditions $x_1$ at an initial instant $t_1$:
$$x_f^{\mathrm{OP}} = \mathrm{OP}(t_f, x_1).$$
$x_f^{\mathrm{OP}}$ is an approximate value because the integration techniques and the force model of $\mathrm{OP}$ include some uncertainties or simplifications. The aim of the hybrid methodology’s second stage, the forecasting technique, is to model and reproduce the missing dynamics and the shortcomings of the integration technique. To achieve this, the forecasting technique must be adjusted to correct both sources of error. A set of precise observations, or accurately determined positions and velocities, $x_1^{O}, \ldots, x_T^{O}$ during a control interval $[t_1, t_T]$, with $t_T < t_f$, is necessary for that purpose. Alternatively, if real observations are not available, pseudo-observations generated by a high-precision numerical propagator can be used. By means of those values, the error of the $\mathrm{OP}$, that is, its difference with respect to the real behavior of the RSO, can be determined for any instant $t_i$ in the control interval as
$$\varepsilon_i = x_i^{O} - x_i^{\mathrm{OP}}.$$
The time series of $\varepsilon_i$ data in the control interval, $\varepsilon_1, \ldots, \varepsilon_T$, which we call control data, contains the error sources that the forecasting technique must model and reproduce, and is therefore the data used to train it. It is worth noting that the time series vector can be expressed in any set of canonical or non-canonical variables. Once that process has been performed, an estimation of the error at the final instant $t_f$, $\hat{\varepsilon}_f$, can be determined, which allows for the calculation of the desired value of $\hat{x}_f$ as
$$\hat{x}_f = x_f^{\mathrm{OP}} + \hat{\varepsilon}_f.$$
Therefore, a hybrid propagator $\mathrm{HOP} = (\mathrm{OP}, \hat{\varepsilon})$ combines an arbitrary propagator with a data-driven predictive module. The hybrid propagator is not unique, as it depends on the number of time-series error components that are incorporated into the predictive module.
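The two-stage scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, states are plain lists of floats, and the forecaster is a placeholder linear extrapolation of the last two residuals rather than the statistical or neural model used in this work.

```python
from typing import Callable, List, Sequence

State = List[float]
Propagator = Callable[[float, State], State]


def control_residuals(op: Propagator, x1: State, times: Sequence[float],
                      observations: Sequence[State]) -> List[State]:
    """Control data: eps_i = x_i^O - x_i^OP for each epoch of the control interval."""
    residuals = []
    for t, obs in zip(times, observations):
        predicted = op(t, x1)
        residuals.append([o - p for o, p in zip(obs, predicted)])
    return residuals


def extrapolate_linear(times: Sequence[float], eps: Sequence[State],
                       t_f: float) -> State:
    """Placeholder forecaster (assumption): extend the last two residuals linearly."""
    dt = times[-1] - times[-2]
    slope = [(b - a) / dt for a, b in zip(eps[-2], eps[-1])]
    return [e + s * (t_f - times[-1]) for e, s in zip(eps[-1], slope)]


def hybrid_propagate(op: Propagator, x1: State, times: Sequence[float],
                     observations: Sequence[State], t_f: float,
                     forecaster=extrapolate_linear) -> State:
    """Hybrid estimate: x_hat_f = x_f^OP + eps_hat_f."""
    eps = control_residuals(op, x1, times, observations)
    eps_hat = forecaster(times, eps, t_f)
    return [x + e for x, e in zip(op(t_f, x1), eps_hat)]
```

Any callable with the same signature as `extrapolate_linear` can be swapped in, which mirrors the non-invasive character of the method: the base propagator is never modified.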

2.2. Orbit Propagator Behavior

A comprehensive understanding of the behavior of the RSO in a specific region of space requires a substantial dataset of real observations or ephemerides. However, extensive historical records of such observations are rarely available. Consequently, this study can be developed using pseudo-observations generated by a high-accuracy numerical propagator (HAOP), based on freely available initial conditions.
The initial conditions are propagated over a specified period using $\mathrm{OP}$ and $\mathrm{HAOP}$. Subsequently, the time series error is calculated as
$$\varepsilon_t^{x} = x_t^{\mathrm{HAOP}} - x_t^{\mathrm{OP}},$$
where $x$ denotes any set of variables, such as Cartesian, orbital, Delaunay, polar-nodal, or equinoctial elements. The term $x_t^{\mathrm{HAOP}}$ refers to the pseudo-observation provided by the HAOP at epoch $t$. In contrast, $x_t^{\mathrm{OP}}$ represents the data obtained from OP at the same epoch. The six time series $\varepsilon_t^{x}$ capture all information related to unmodeled OP effects.
First, the propagator’s fidelity is validated against empirical observations or high-fidelity pseudo-observations through a multi-epoch error analysis across various coordinate frameworks. The errors that can be considered include distance, along-track, cross-track, and radial errors. Second, an Exploratory Data Analysis (EDA) is performed on the resulting error time series to quantify their impact on predictive accuracy and determine their suitability for the hybrid propagator. This analysis can consider multiple state representations, including Cartesian coordinates, orbital elements, and equinoctial or polar-nodal variables. The methodology begins by identifying discrepancies between the predicted (OP) and observed (O) states. Then, to isolate the contribution of each variable to the overall error, a substitution technique is employed: individual propagated components are replaced by their observed counterparts while holding all other variables constant. This sensitivity analysis is then extended to all 56 possible combinations of variables. Ranking the 64 cases by their maximum residual errors identifies the most influential variables, allowing an informed selection of optimal estimation techniques.
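The substitution technique can be sketched as a loop over subsets of the six variables. The sketch below is a hedged illustration, not the procedure actually coded for this study: the variable names, the dictionary-based state, and the user-supplied `distance_error` callable are all hypothetical. One reading consistent with the counts quoted above is that the 64 cases are the $2^6$ subsets of six variables (including the empty set, i.e., the unmodified propagator, and the full set), while 56 is the number of subsets with two to five variables.

```python
from itertools import combinations

# Hypothetical labels for the six orbital elements.
VARIABLES = ("a", "e", "i", "omega", "Omega", "M")


def all_subsets(names):
    """Every subset of `names`, from the empty set (plain OP) to the full set."""
    return [c for k in range(len(names) + 1) for c in combinations(names, k)]


def substitute(op_state, ref_state, subset):
    """Replace the components named in `subset` by their reference values."""
    return {k: (ref_state[k] if k in subset else v) for k, v in op_state.items()}


def rank_subsets(op_states, ref_states, distance_error):
    """Rank subsets by their maximum residual error over all epochs (smallest first).

    `distance_error(state, ref_state)` is a user-supplied callable (assumption);
    in practice it would convert both states to positions and return their distance.
    """
    scores = []
    for subset in all_subsets(VARIABLES):
        worst = max(distance_error(substitute(s, r, subset), r)
                    for s, r in zip(op_states, ref_states))
        scores.append((worst, subset))
    return sorted(scores)
```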
Orbital elements ( a , e , i , ω , Ω , M ) or Delaunay variables ( l , g , h , L , G , H ) provide an effective basis for initiating this process. These variable sets provide a geometric description of the error of the time series. The results can be readily translated to alternative sets of variables.
The EDA also comprises the following:
  • A graphical analysis of errors, conducted using box-and-whisker plots. In these plots, the line inside the box marks the median, and the lower and upper edges of the box represent the first and third quartiles ($Q_1$ and $Q_3$). The whiskers extend to the minimum and maximum values, excluding outliers, which are indicated by circles. The upper and lower limits are defined as $Q_3 + 1.5\,(Q_3 - Q_1)$ and $Q_1 - 1.5\,(Q_3 - Q_1)$, respectively. Values outside these limits are classified as outliers.
  • Sequence and error plots of $\varepsilon_t^{x}$, as well as autocorrelation functions (ACF) and periodograms. The ACF quantifies the linear relationship between an observation at time $t$ and observations at previous times. The periodogram is used to detect cyclical components in the dataset by representing the time series data as a sum of sinusoidal waves of different frequencies.
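As a minimal sketch of two of the EDA tools listed above, the following Python code computes the outlier limits of a box-and-whisker plot and the sample autocorrelation at a given lag. It uses only the standard library; the quartile convention (`method="inclusive"`) is an assumption, since plotting packages differ in how they interpolate quartiles, and the periodogram is not covered here.

```python
import statistics


def tukey_fences(values):
    """Return (lower, upper) limits: Q1 - 1.5 (Q3 - Q1) and Q3 + 1.5 (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values):
    """Values lying outside the limits, drawn as circles in the plots."""
    lower, upper = tukey_fences(values)
    return [v for v in values if v < lower or v > upper]


def acf(series, lag):
    """Sample autocorrelation between the series and itself shifted by `lag` samples."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom
```

For a strongly periodic error series, `acf` is close to its maximum at lags equal to the period and strongly negative at half the period, which is how a seasonal component of roughly one revolution would show up.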
After this analysis, it is necessary to investigate the most suitable predictive architecture based on the stochastic properties of the identified error series. This approach aims to identify AI-based models capable of capturing complex, non-linear patterns and long-term dependencies that traditional propagators may not detect within this orbital regime.
Finally, the selected predictive model is integrated with the propagator OP to construct a hybrid propagator HOP, which combines physical modeling with data-driven correction. This integration is expected to produce a hybrid propagator that enhances orbital prediction accuracy across various initial conditions.

3. Data Preprocessing

As discussed in Section 2.1, the selection of a hybrid propagator depends on the number of variables included in the predictive module. This section evaluates the best combination that improves the accuracy of the SGP4 propagator while maintaining computational efficiency. The hybrid propagator is specifically tailored to the orbital region of the Galileo constellation to minimize the distance error.

3.1. TLE Data

The first step in developing the hybrid version of SGP4 is to understand its perturbation model and the integration method over a time interval. To assess this, we study the distance error obtained when comparing SGP4 with pseudo-observations generated by the high-accuracy numerical orbit propagator AIDA (Accurate Integrator for Debris Analysis) [27]. The perturbation model used in AIDA includes Earth’s gravitational field (up to 50 × 50 ), solar radiation pressure, and a third-body point-mass force model.
On the other hand, SGP4 is founded on the analytical theories of artificial satellites developed by [1,3]. Initially, SGP4 modeled perturbations using only zonal gravitational terms up to J 4 and atmospheric drag, as described by [28]. With the increased prevalence of Molniya and geostationary orbits, deep-space modeling, specifically Simplified Deep-space Perturbation-4 (SDP4), was incorporated into SGP4 [25,26]. This integration included lunisolar and tesseral-harmonic resonance effects, as developed by [29,30].
The analysis begins with a set of 313 TLEs from a Galileo satellite (NORAD ID 40545). These TLEs provide an initial, comprehensive overview of the SGP4 behavior of this satellite over time. The data set is obtained from Space Track (https://www.space-track.org) and covers the period from 19 April 2015 to 28 December 2016. Figure 1 presents scatter diagrams for the considered TLEs in two-dimensional parameter planes: (e, a), (i, a), (e, ω), and (e, Ω). The semi-major axis values range from 29,598.8 to 29,954.7 km, with a peak near 29,600 km. Eccentricities span from 0.0000382 to 0.0016566, and inclination values are approximately 55°. The argument of perigee ranges from 1.0519° to 358.291°, with a greater concentration between 120° and 260°. The right ascension of the ascending node varies between 76.8346° and 94.9086°, while the mean anomaly ranges from 1.685° to 358.952°.

3.2. Exploratory Data Analysis

In this subsection, we analyze the impact that correcting the error of a specific variable, or of a combination of variables, may have on the accuracy of SGP4. The orbital variables are the first set to consider because they provide the geometric description of the differences between the numerical integrator and SGP4. The maximum, $Q_3$, median, $Q_1$, and minimum SGP4 distance errors are approximately 52.01, 17.35, 10.73, 6.29, and 1.31 km, respectively.
Initially, each series is considered separately. For instance, the semimajor axis, $a_t^{\mathrm{SGP4}}$, is substituted with its most accurate determination, $a_t^{\mathrm{AIDA}}$, while the remaining $x_t^{\mathrm{SGP4}}$ time series remain unchanged. Then, the distance error after 30 days of propagation is calculated to assess the variable’s effect on reducing it. Finally, we use the same process to check the distance error across the 64 possible combinations of variables. It is worth noting that the maximum error reduction is reached when all the variables are replaced.
There are 16 variable combinations that yield lower distance errors than SGP4 across all TLEs. These combinations are as follows: (a, e, i, Ω, ω, M), (a, e, i, ω, M), (a, e, Ω, ω, M), (a, i, Ω, ω, M), (e, i, Ω, ω, M), (a, e, ω, M), (a, i, ω, M), (a, Ω, ω, M), (e, i, ω, M), (e, Ω, ω, M), (i, Ω, ω, M), (e, ω, M), (a, ω, M), (i, ω, M), (Ω, ω, M), and (ω, M). As can be seen, all combinations include the argument of perigee and the mean anomaly. Figure 2 presents box-and-whisker plots for the best combinations of fewer than five variables that reduce distance errors in SGP4. For the combination (ω, M), the maximum, $Q_3$, median, $Q_1$, and minimum distance errors are approximately 3.6, 2.4, 2.1, 1.8, and 1.2 km, respectively. These values indicate the best performance achievable with this combination.
The analysis is further extended to polar-nodal variables ( r , θ , ν , R , Θ , N ) , which are expressed as functions of the orbital elements as follows:
$$r = \frac{a\,(1 - e^2)}{1 + e \cos f}, \quad \theta = \omega + f, \quad \nu = \Omega, \quad R = \frac{e\,G \sin f}{a\,(1 - e^2)}, \quad \Theta = \sqrt{\mu\,a\,(1 - e^2)}, \quad N = G \cos i,$$
where μ denotes the gravitational parameter, and f represents the true anomaly, which is related to the mean anomaly M by the Kepler equation. In this set of variables, there are 32 combinations that improve the distance error relative to SGP4 for all TLEs. These combinations include (r, θ, ν, R, Θ, N), (θ, ν, R, Θ, N), (r, θ, R, Θ, N), (r, θ, ν, Θ, N), (r, θ, ν, R, N), (r, θ, ν, R, Θ), (θ, R, Θ, N), (θ, ν, Θ, N), (θ, ν, R, N), (θ, ν, R, Θ), (r, θ, Θ, N), (r, θ, R, N), (r, θ, R, Θ), (r, θ, ν, N), (r, θ, ν, Θ), (r, θ, ν, R), (θ, Θ, N), (θ, R, N), (θ, R, Θ), (θ, ν, N), (θ, ν, Θ), (θ, ν, R), (r, θ, N), (r, θ, Θ), (r, θ, R), (r, θ, ν), (θ, N), (θ, Θ), (θ, R), (θ, ν), (r, θ), and (θ). Figure 3 presents a box-and-whisker plot of the most effective variable combinations for reducing SGP4 distance errors. Only combinations of no more than three variables that reduce the maximum distance error are considered. Notably, one of the most effective combinations consists solely of the variable (θ), although the optimal combination is (r, θ, ν). As can be seen, θ appears in all the best combinations. For (θ), the maximum, $Q_3$, median, $Q_1$, and minimum distance errors are approximately 2.4, 1.7, 1.5, 1.3, and 1 km, respectively. These results are slightly better than those obtained with the orbital-element combination (ω, M). Recall that the argument of latitude is defined as θ = ω + f. This result supports the development of a parsimonious model, defined as the simplest model with minimal assumptions and variables while maximizing explanatory power.
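The transformation from orbital elements to polar-nodal variables can be sketched as follows. This is an illustrative implementation under stated assumptions: angles are in radians, lengths in km, the value of μ is a commonly used figure for Earth and not taken from this work, G is taken as the Delaunay variable $G = \sqrt{\mu a (1 - e^2)}$ (so it coincides with Θ), and the Kepler equation is solved by a plain Newton iteration that is adequate for the near-circular orbits considered here.

```python
import math

MU_EARTH = 398600.4418  # km^3/s^2 (assumed value of the gravitational parameter)


def true_anomaly(M, e, tol=1e-12):
    """Solve Kepler's equation E - e sin E = M by Newton's method, then map E to f."""
    E = M  # starting guess, suitable for small eccentricities
    for _ in range(50):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(E / 2.0),
                            math.sqrt(1.0 - e) * math.cos(E / 2.0))


def polar_nodal(a, e, i, omega, Omega, M, mu=MU_EARTH):
    """Return (r, theta, nu, R, Theta, N) from the orbital elements."""
    f = true_anomaly(M, e)
    p = a * (1.0 - e ** 2)            # semilatus rectum
    G = math.sqrt(mu * p)             # angular momentum magnitude
    r = p / (1.0 + e * math.cos(f))   # radial distance
    theta = omega + f                 # argument of latitude (not wrapped to [0, 2*pi))
    nu = Omega                        # right ascension of the ascending node
    R = e * G * math.sin(f) / p       # radial velocity
    Theta = G
    N = G * math.cos(i)               # polar component of the angular momentum
    return r, theta, nu, R, Theta, N
```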

3.3. Error in the Argument of Latitude

This subsection examines the argument of latitude error and evaluates its significance relative to propagation models.
Figure 4 plots a representative sample of 50 time series of the error in the argument of latitude, $\varepsilon^{\theta}$, over a 15-day propagation span. As shown, these time series can be categorized into three groups based on their trend characteristics. The first two groups exhibit a trend component, which may increase or decrease over time, a seasonal component, and irregular data variance. In contrast, the third group lacks a discernible trend but displays pronounced seasonal components. The periods of the seasonal components, determined using autocorrelation functions (ACF) and periodograms, are approximately equal to the duration of a satellite revolution, about 14 h. Of the 313 $\varepsilon^{\theta}$ time series, 169 exhibit a positive trend, 118 a negative trend, and 26 show no clear trend.
Figure 5 shows the box-and-whisker plots of the SGP4 distance error and the SGP4 improvement due to the θ correction over a 30-day propagation span. The 313 $\varepsilon^{\theta}$ time series are classified according to their trend. Figure 5a shows the SGP4 distance error, whereas Figure 5b shows the optimal SGP4 distance error obtained by substituting the $\varepsilon^{\theta}_{\mathrm{SGP4}}$ time series with $\varepsilon^{\theta}_{\mathrm{AIDA}}$. As can be seen, the stronger the trend, the greater the reduction in error achieved by the hybrid model. By contrast, for time series with no clear trend, the potential for error reduction is lower, since the initial SGP4 error is already small. Finally, comparing the two plots shows the maximum improvement attainable from the θ correction, which is the target when training neural network models to predict the θ error.
To summarize, this analysis shows that the argument of latitude θ is an excellent candidate variable whose error evolution can be modeled, thus constituting the best compromise between accuracy and simplicity.

3.4. Predictive Technique: Artificial Neural Network

Artificial neural networks (ANNs) are used as a predictive technique to hybridize SGP4. ANNs can model residual dynamics not captured by the analytical formulation of SGP4. By learning temporal patterns in propagation errors from TLEs, ANNs can approximate the unmodeled effects and uncertainties of SGP4. This data-driven correction based on the influence of the ε θ time series enhances accuracy while maintaining the efficiency and interpretability of the analytical model.
The computational cost of the hybrid propagator grows with the time required by its predictive part. In this case, the neural network’s predictions result from a sequence of matrix multiplications and nonlinear transformations. For a given input vector, each layer computes a weighted sum by multiplying the input by a weight matrix and adding a bias vector. The result is then processed by an activation function, and the activated output serves as the input for the subsequent layer. This procedure continues through each layer until the final output is produced, representing the model’s prediction. It is worth noting that the network weights, biases, and architectural configuration are embedded in the HTLE, enabling the predictive correction to be applied directly during propagation.
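The layer-by-layer evaluation described above can be sketched in pure Python. This is an illustration only: the `layers` structure, a list of (weights, biases, activation name) tuples, is a hypothetical container and does not represent the actual HTLE layout, and the set of activation functions shown is an assumption.

```python
import math

# Assumed activation functions; the ones actually considered are listed in Table 1.
ACTIVATIONS = {
    "linear": lambda z: z,
    "relu": lambda z: max(0.0, z),
    "tanh": math.tanh,
}


def dense(x, weights, biases):
    """Weighted sum of one layer: weights[j][k] links input k to neuron j."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """Propagate the input vector through every layer in order."""
    for weights, biases, activation in layers:
        act = ACTIVATIONS[activation]
        x = [act(z) for z in dense(x, weights, biases)]
    return x
```

Because the forward pass reduces to a handful of small matrix-vector products, its cost is fixed by the architecture, which is why the architecture choice directly affects the efficiency of the hybrid propagator.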

4. Artificial Neural Network Model Architectures

The performance of the ANN depends, among other factors, on its architecture, which is defined by a set of hyperparameters: the batch size ($bs$), the number of hidden layers ($nhl$), the number of neurons in the first hidden layer ($nnhl$), the activation function ($af$), the optimizer ($o$), the loss function ($lf$), and the learning rate ($lr$). Table 1 summarizes the range of values considered for each hyperparameter. From the second hidden layer onward, if it exists, the number of neurons is fixed at half that of the preceding layer.
In the scenario considered in Table 1, there are 1536 ANN architectures with one hidden layer, 6144 with two hidden layers, and 24,576 with three hidden layers, for a total of 32,256 possible architectures. Each sample of the error series is a vector of dimension 169: the first 168 points feed the 168 neurons of the input layer, and the remaining value is the expected output. Together, these points correspond to approximately two satellite revolutions. The datasets for training, validation, and testing span approximately 7, 3, and 14 satellite revolutions, respectively.
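The construction of 169-point samples from an error time series can be sketched as a sliding window. This is a minimal illustration with a hypothetical function name; the stride of one sample between consecutive windows is an assumption, since the text does not state how the windows are spaced.

```python
def make_windows(series, n_inputs=168):
    """Split a series into (inputs, target) pairs of n_inputs + 1 consecutive points.

    Each input holds n_inputs values and the target is the value that follows them.
    A stride of one sample between windows is assumed.
    """
    inputs, targets = [], []
    for start in range(len(series) - n_inputs):
        inputs.append(series[start:start + n_inputs])
        targets.append(series[start + n_inputs])
    return inputs, targets
```

For example, a series of 200 points yields 32 samples, the first of which uses points 0 to 167 as input and point 168 as the expected output.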
An experimental strategy is implemented to reduce the number of architectures under consideration. The approach begins with selecting the loss function and the number of hidden layers. Once these parameters are established, the optimal values for the remaining hyperparameters are identified. For all experiments, the number of epochs is fixed at 500, and early stopping with a patience parameter of 60 is applied. Each experiment is repeated 10 times for each architecture. The nonparametric k-sample test is employed to validate the selection process by determining whether performance differences among candidate architectures are statistically significant. The experiments utilize the Python packages TensorFlow (v2.10.1), Matplotlib (v3.10.8), Scikit-learn (v1.7.2), Pandas (v2.3.3), NumPy (v1.23.5), and Statsmodels (v0.14.6), together with the standard-library modules itertools, subprocess, random, and json. The high-performance computing center of the University of La Rioja is used throughout the study. The optimization of hyperparameters is described in the following subsection.

4.1. 7D Hyperparameter Space

This section determines the most suitable loss function, the optimal number of hidden layers, and the learning rate, thereby reducing the hyperparameter space to four dimensions.

4.1.1. Loss Function

The initial experiment is designed to identify the most suitable loss function, thereby facilitating the subsequent search for the optimal ANN architecture while accounting for the remaining hyperparameters. Accordingly, the loss functions mse, mae, mape, and msle listed in Table 1 are evaluated. Beginning with the complete set of 32,256 architectures described previously, fixing the loss function reduces the hyperparameter search space to six dimensions. This approach evaluates 384 architectures with one hidden layer, 1536 architectures with two hidden layers, and 6144 architectures with three hidden layers, for a total of 8064 architectures per loss function.
To reduce the number of executions, probabilistic random sampling stratified by the number of hidden layers is employed, with a 5% margin of error and a 95% confidence level. This yields 366 architectures: 17 with one hidden layer, 70 with two hidden layers, and 279 with three hidden layers. These architectures are trained and tested using 6 of the 313 available $\varepsilon^{\theta}$ time series (2 with a positive trend, 2 with a negative trend, and 2 with no trend), generating a total of 2196 trained models (366 architectures × 6 time series) for each of the four loss functions. Figure 6a shows the six randomly selected $\varepsilon^{\theta}$ time series considered in this experiment. Each model uses 2 satellite revolutions as input and is trained, validated, and tested using 7, 3, and 14 satellite revolutions, respectively. Figure 6b presents the distance error (in km) between AIDA and SGP4 for the selected cases over 26 satellite revolutions. Time series with a positive trend are depicted in blue, those with a negative trend in red, and those without a discernible trend in green. The results show that both positive and negative trends in the argument-of-latitude error increase the SGP4 distance error, while time series lacking a clear trend have a comparatively limited influence on the resulting distance error.
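The sample size and its allocation across strata can be reproduced with a short calculation. The sketch below is an assumption about the procedure, not a description of the code used in this study: it applies the standard sample-size formula for a proportion with finite population correction, takes z = 1.96 for the 95% confidence level and p = 0.5 as the most conservative proportion, truncates the result, and allocates the sample proportionally to stratum size. Under these assumptions it returns the figures quoted above.

```python
import math


def finite_population_sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Sample size for a proportion, corrected for a finite population."""
    n0 = z ** 2 * p * (1.0 - p) / margin ** 2
    return n0 / (1.0 + (n0 - 1.0) / population)


def proportional_allocation(strata, n):
    """Distribute n samples across strata in proportion to their sizes."""
    total = sum(strata.values())
    return {name: round(n * size / total) for name, size in strata.items()}


# Architectures per loss function, keyed by number of hidden layers (from the text).
strata = {1: 384, 2: 1536, 3: 6144}
# Truncation (rather than rounding up) is an assumption chosen to match the text.
sample_size = math.floor(finite_population_sample_size(sum(strata.values())))
allocation = proportional_allocation(strata, sample_size)
```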
Table 2 presents the SGP4 distance error (in km) at 2 and 4 propagation days during the training period, as well as at 2, 4, 6, and 8 propagation days during the test period. These distance errors define the reference baseline used to evaluate the accuracy of the proposed hybrid orbital propagator (HOP), which uses an ANN to correct the error in the argument of latitude θ.
A total of 2196 models are trained for each loss function over the six $\varepsilon^{\theta}$ error time series, and the model with the lowest rmse during training and testing is selected. Table 3 indicates the number of models that improve the SGP4 error at 4 days of the training period, grouped by loss function. During training, mse and mape are the best-performing loss functions, with comparable rates of approximately 38% and 35%, respectively. During testing, mape achieves the highest performance, reaching nearly 40%, followed by mse, which decreases to 26%. We apply a non-parametric k-sample test to evaluate whether improvements in rmse are attributable to the choice of loss function.
To determine whether statistically significant differences exist among the evaluated loss functions (mae, mape, mse, and msle), and particularly between mape and mse, a chi-square test for proportions is conducted. This non-parametric test assesses whether the observed differences in performance are attributable to chance, by testing the null hypothesis that all loss functions produce equivalent results against the alternative hypothesis that their performance differs significantly. The results indicate statistically significant differences ($\chi^2 = 1010.91$ for training, $\chi^2 = 595.81$ for testing, both with $p < 0.001$).
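As an illustration of the chi-square test for proportions described above, the sketch below computes the Pearson statistic for a table of "improved" versus "not improved" counts across four loss functions. The counts are hypothetical and are not the values of Table 3. The p-value uses the closed-form upper tail of the chi-square distribution with 3 degrees of freedom, so only the standard library is needed.

```python
import math


def chi_square_proportions(successes, totals):
    """Pearson chi-square statistic for equality of k proportions (k x 2 table)."""
    pooled = sum(successes) / sum(totals)          # common proportion under H0
    stat = 0.0
    for s, n in zip(successes, totals):
        exp_s, exp_f = n * pooled, n * (1 - pooled)  # expected successes and failures
        stat += (s - exp_s) ** 2 / exp_s + ((n - s) - exp_f) ** 2 / exp_f
    return stat, len(successes) - 1                # statistic, degrees of freedom


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# HYPOTHETICAL counts of improving models per loss function (not Table 3 values).
losses = ["mae", "mape", "mse", "msle"]
improved = [200, 350, 380, 150]
trained = [1000, 1000, 1000, 1000]

stat, df = chi_square_proportions(improved, trained)
p_value = chi2_sf_df3(stat)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p_value:.3g}")
```

A large statistic with a small p-value rejects the hypothesis that all loss functions improve on SGP4 at the same rate. It does not by itself say which pair differs.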
In the following, the influence of these models when integrated into the hybrid propagator is evaluated. Figure 7 presents box-and-whisker plots of the distance error for the family of HSGP4 propagators that use forecasting models with the mse (in red) and mape (in blue) loss functions. The analysis covers the training period at 2 and 4 days and the testing period at 2, 4, 6, and 8 days. In these plots, the outliers were removed for clarity. Figure 7a displays the training period. Although the error distributions are similar for mse and mape, the third quartile for mape is lower than that for mse. The percentage of outliers is also similar: at 2 days, mse presents 11.11% of outliers against 10.99% for mape, whereas at 4 days the order reverses, with 9.10% and 9.49% for mse and mape, respectively. Figure 7b illustrates the testing period. The distance errors with the mape loss function are lower than those with mse throughout the test period. Additionally, mape exhibits less variability, with an outlier rate that is up to about 50% lower than that of the mse loss function. The percentages of outliers for mse at 2, 4, 6, and 8 days are 14.97%, 17.73%, 19.10%, and 20.14%, against 7.39%, 13.75%, 10.68%, and 11.02% for mape, respectively. Based on these results, mape is selected as the preferred loss function.
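The quartiles and outlier percentages discussed above can be reproduced from a list of distance errors with a few lines of code. The sketch below assumes the common Tukey convention (points beyond 1.5 times the interquartile range from the quartiles count as outliers) and the "inclusive" quartile method. The paper does not state which conventions were used, and the sample errors are hypothetical.

```python
import statistics


def box_stats(errors, whisker=1.5):
    """Quartiles, IQR, and outlier percentage under the Tukey fence convention."""
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    outliers = [e for e in errors if e < low_fence or e > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "outlier_pct": 100 * len(outliers) / len(errors),
    }


# HYPOTHETICAL distance errors in km, for illustration only.
sample_errors_km = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_stats(sample_errors_km)
print(stats)
```

Comparing `q3` and `outlier_pct` between two families of models mirrors the mse versus mape comparison made in the text.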

4.1.2. Optimum Number of Hidden Layers

Upon selecting the mape loss function, the number of candidate models decreases to 2196, comprising 102 with one hidden layer, 420 with two hidden layers, and 1674 with three hidden layers. Distance error serves as the primary metric to identify the best-performing architectures. After 4 days of propagation during the training period, the HSGP4 error is lower than the SGP4 error in 1405 of the 2196 cases (63.98%).
Table 4 presents, for each number of hidden layers, the number of models whose distance errors are lower than those of SGP4, together with the mean computational time for training each model. As shown, the proportion of models that improve increases with the number of hidden layers. In particular, the improvement percentage is 23.52% from one to two hidden layers and 8.22% from two to three hidden layers. A chi-square test indicates that these differences are statistically significant ($\chi^2 = 17.30$, $df = 2$, $p < 0.001$), thereby rejecting the hypothesis that improvement proportions are equal across all hidden layer configurations and demonstrating that model performance is influenced by the number of hidden layers. The average computation time per model also increases with the number of hidden layers.
This result confirms that increasing the number of hidden layers enhances model performance but also increases complexity and computational demand. Based on this result, an architecture with two hidden layers, which yields the largest percentage improvement with a relatively short training time, is selected as the optimal configuration.
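The significance level reported for the hidden-layer comparison can be checked by hand, because the upper tail of a chi-square distribution with 2 degrees of freedom has the closed form exp(-x/2). The short sketch below applies it to the reported statistic of 17.30. The function name is illustrative.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


p_value = chi2_sf_df2(17.30)
print(f"p = {p_value:.2e}")
```

The resulting p-value is of the order of 1e-4, in agreement with the reported p < 0.001.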

4.1.3. Optimal Learning Rate Value

At this point, mape has been established as the optimal loss function and two hidden layers as the optimal depth, reducing the number of possible architectures to 1536. However, among the hyperparameters that define such architectures, the optimizer is relevant and depends on another hyperparameter, the learning rate, a continuous parameter that complicates the search for an optimal value. In this analysis, three learning rate values are considered: the default value $1 \times 10^{-4}$, a higher value $1 \times 10^{-3}$, and a lower value $1 \times 10^{-5}$. Training a model for each of the 1536 architectures with all three learning rates is computationally intensive. Therefore, a stratified random sample of $n = 308$ architectures is drawn with uniform allocation, with sub-samples of size $n_i = 77$ for each optimizer, where $i \in \{\mathrm{rmsprop}, \mathrm{adadelta}, \mathrm{adam}, \mathrm{nadam}\}$. These randomly selected architectures are then trained, validated, and evaluated on the six previously selected time series, described in Figure 6, resulting in $308 \times 6 = 1848$ models. For each of them, the model with the smallest distance error among the three learning rates is selected for both the training set (to assess reproducibility) and the test set (to evaluate forecasting performance).
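The stratified sampling with uniform allocation described above can be sketched as follows. The hyperparameter grid is hypothetical: the optimizers and activation functions are those named in the text, but the batch sizes and neuron counts are chosen only so that the grid contains 1536 architectures. The function name and the seed are illustrative as well.

```python
import itertools
import random
from collections import Counter

# HYPOTHETICAL grid: values chosen only so that the product is 1536 architectures.
batch_sizes = [32, 64, 128, 256]
neurons = [4, 8, 16, 32, 64, 128]
optimizers = ["rmsprop", "adadelta", "adam", "nadam"]
activations = ["tanh", "elu", "relu", "linear"]

# Each architecture: (batch size, neurons, optimizer, activation 1, activation 2).
grid = list(itertools.product(batch_sizes, neurons, optimizers, activations, activations))
assert len(grid) == 1536


def stratified_uniform_sample(items, strata_key, per_stratum, seed=0):
    """Draw the same number of items, without replacement, from every stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in items:
        strata.setdefault(strata_key(item), []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample


# Stratify by optimizer (index 2) and draw 77 architectures from each stratum.
sample = stratified_uniform_sample(grid, strata_key=lambda a: a[2], per_stratum=77)
counts = Counter(a[2] for a in sample)
print(len(sample), dict(counts))
```

Uniform allocation gives every optimizer the same weight in the comparison, regardless of how the remaining hyperparameters are distributed within each stratum.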
Table 5 presents the proportion of models for which the HSGP4 distance error is lower than that of SGP4, classified by the learning rate at a 4-day propagation span in the training set and for an 8-day span in the test set.
The results demonstrate that, within the training set, higher learning rates are associated with lower performance. In contrast, in the test set, the two highest learning rates yield similar and superior performance. The chi-square test indicates that these differences are statistically significant ($\chi^2 = 259.52$, $df = 3$, $p < 0.05$), thereby rejecting the hypothesis that the proportion of models with lower error is independent of the learning rate (lr) and demonstrating that model performance is influenced by the learning rate. Since $\mathrm{lr} = 1 \times 10^{-4}$ achieves the highest performance in the test set (42.97%) and demonstrates consistency across both sets, it is identified as the optimal learning rate value.

4.2. 4D Hyper-Parameter Space

The mape loss function, two hidden layers, and a learning rate of $1 \times 10^{-4}$ are now fixed. This selection reduces the hyperparameter space to four dimensions (number of neurons, batch size, activation functions, and optimizer) and limits the number of possible architectures to 1536, which is still a substantial number of configurations. This section aims to identify the architectures with the best performance on the six time series described in Figure 6, followed by an analysis of their behavior across the 313 time series. Two satellite revolutions are again used as input, and 7, 3, and 14 revolutions are allocated to the training, validation, and test sets, respectively. The input layer has 169 neurons, and training is conducted for a maximum of 500 epochs with a patience parameter of 60. Each architecture is trained ten times, and the resulting predictive module, which provides the $\hat{\varepsilon}^{\theta}$ corrections, is incorporated into the hybrid HSGP4 propagator. The selected model is the one that minimizes the average distance error relative to the SGP4 error at three and seven satellite revolutions, corresponding to approximately 2 and 4 propagation days, respectively.
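The stopping rule mentioned above (at most 500 epochs, patience of 60) can be illustrated with a framework-independent sketch. It assumes the usual meaning of patience, namely that training stops once the validation loss has not improved for that many consecutive epochs. The section does not name the training framework, and the loss curve below is synthetic.

```python
def early_stopping(val_losses, max_epochs=500, patience=60):
    """Return (best_epoch, best_loss, last_epoch) for a sequence of validation losses."""
    best_loss, best_epoch, last_epoch = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        last_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch   # new best model, reset the wait
        elif epoch - best_epoch >= patience:
            break                                 # no improvement for `patience` epochs
    return best_epoch, best_loss, last_epoch


# SYNTHETIC validation curve: improves for 100 epochs, then stays at a worse level.
losses = [1 / (1 + e) for e in range(100)] + [0.05] * 400
result = early_stopping(losses)
print(result)
```

In this synthetic case the best epoch is 99 and training halts at epoch 159, well before the 500-epoch limit, which is the behavior that keeps the cost of training 1536 architectures manageable.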

4.2.1. First Reduction of Architectures

A total of 1536 architectures are trained on each of the six time series, and their test set predictions are combined with the corresponding SGP4 ephemerides. The distance error between the numerical propagator AIDA and HSGP4 is calculated at 2, 4, 6, and 8 days. For each time series, the top five models in which the HSGP4 distance error is lower than that of SGP4 are selected and ranked in order of increasing distance error after an 8-day propagation span. Table 6 presents the result. The first column indicates the ranking, while columns two to six specify the batch size, the number of neurons in the first hidden layer, the optimizer, and the activation functions for the first and second hidden layers. The final column shows the frequency with which each architecture appears.
The most frequently observed parameter combination consists of a batch size of 256; 16 neurons in the first layer; 8 and 4 neurons in the second and third layers, respectively; the adam optimizer; and (tanh, elu) as activation functions for the first and second hidden layers, respectively. This configuration is closely followed by the (relu, elu) and (linear, linear) combinations. Notably, only architecture 1 ranks among the top five for two of the six time series; each of the remaining architectures appears among the top five for only one time series.
Based on the 29 best-performing architectures presented in Table 6, we analyze the performance of the corresponding models—six per architecture, for a total of 174 models, with 58 models per trend category—and compare the resulting HSGP4 distance errors with the distance errors obtained using SGP4 without neural network correction. HSGP4 demonstrated improvements with respect to SGP4 distance error at 2, 4, 6, and 8 propagation days. Table 7 presents the number of models that improve SGP4 error, categorized by time series trend. Without considering the trend, the improvement after 8 days reaches 81% (141 models out of 174). For architectures applied to series with no defined trend, this value decreases, resulting in approximately 76% improvement. In contrast, series exhibiting a positive trend demonstrate an improvement of about 88% at 8 days, and consistent improvement is observed at shorter time horizons (2 and 4 days).
Figure 8 presents box-and-whisker plots of the HSGP4 distance error for the top five architectures in Table 6, trained with the six selected time series described in Figure 6. These results also show improvements in the HSGP4 distance error with respect to SGP4 after 2, 4, 6, and 8 days in the test set.
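The first reduction step described in this subsection (keep only the models that beat SGP4, rank them by 8-day distance error, take the top five per time series, and count how often each architecture appears) can be sketched as follows. The data structures, identifiers, and error values are hypothetical, and `k=2` is used only to keep the example small.

```python
from collections import Counter


def top_architectures(results, sgp4_error, k=5):
    """Rank improving architectures per series and count their appearances.

    results:    {series_id: {arch_id: HSGP4 distance error in km at 8 days}}
    sgp4_error: {series_id: SGP4 distance error in km at 8 days}
    """
    ranking, frequency = {}, Counter()
    for series_id, errors in results.items():
        # Keep only architectures whose HSGP4 error is lower than the SGP4 error.
        improving = {a: e for a, e in errors.items() if e < sgp4_error[series_id]}
        best = sorted(improving, key=improving.get)[:k]   # increasing error
        ranking[series_id] = best
        frequency.update(best)
    return ranking, frequency


# HYPOTHETICAL errors in km, for illustration only.
results = {
    "series_A": {"arch_1": 4.0, "arch_2": 9.0, "arch_3": 30.0},
    "series_B": {"arch_1": 6.0, "arch_2": 25.0, "arch_3": 5.0},
}
sgp4_error = {"series_A": 20.0, "series_B": 10.0}

ranking, frequency = top_architectures(results, sgp4_error, k=2)
print(ranking, dict(frequency))
```

With six series and five slots each, at most 30 entries are produced, so an architecture that appears twice leaves 29 distinct architectures, as in Table 6.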

4.2.2. Select the Best Three Architectures

The reduction of architectures continues. Next, each of the 29 architectures is trained on the 313 time series in this study, yielding a total of 9077 models.
After 8 days of propagation, 6810 models, embedded as a predictive module in HSGP4 propagators, exhibit lower distance error than the distance error produced by SGP4 for the same time series. This accounts for 75.02% of the total models generated. Figure 9 presents, for each architecture, the number of hybrid propagators that improve upon SGP4. Notably, 15 out of 29 architectures surpass the 75% improvement threshold. The architectures highlighted in red (11, 15, and 29) exhibit the highest proportion of series modeled with the same architecture for which the HSGP4 distance error is lower than that of SGP4.
In the following analysis, we examine the SGP4 distance error for the 313 TLEs, grouped by trend, and compare it with the results obtained using HSGP4 for the same TLEs, with the argument of latitude corrected. This variable is the one whose error is modeled using neural networks. The final model architectures considered are those ranked 11, 15, and 29; their configurations are described in Table 6.
Figure 10 illustrates the evolution of the SGP4 distance error over 2, 4, 6, and 8 days, grouped by positive, negative, and no-trend time series. A systematic increase in both median distance error and dispersion is observed across all categories. Time series with positive or negative trends exhibit larger errors and greater variability than no-trend cases. In contrast, this last group maintains consistently low and stable errors, indicating greater robustness.
Table 8 summarizes the number of cases, grouped by trend category and network architecture, in which the error produced by HSGP4 exceeds that of SGP4 for different propagation horizons. For time series exhibiting a positive trend, architectures 11 and 29 show a clear increase in the number of cases where HSGP4 performs worse as the propagation time increases, particularly at 8 days. Architecture 15, while initially showing fewer cases, also shows a gradual increase with longer predictive horizons.
In the negative trend category, architecture 11 consistently presents the highest number of cases across all time intervals, indicating a pronounced degradation of HSGP4 performance relative to SGP4 for this trend type. Architectures 15 and 29 display lower overall counts but still show a noticeable increase in error dominance as the propagation interval increases. For no trend time series, the number of cases remains relatively stable across all architectures and propagation times. This suggests that, in the absence of a clear trend, the relative performance of HSGP4 compared to SGP4 is less sensitive to the predictive horizon.
Overall, the results indicate that the likelihood of HSGP4 exhibiting larger errors than SGP4 increases with propagation time, particularly for time series with pronounced positive or negative trends. This behavior is architecture-dependent, with certain configurations being more sensitive to trend characteristics than others.
Figure 11, Figure 12 and Figure 13 display box-and-whisker plots of HSGP4 distance errors for 313 time series, categorized by trend and architecture. Only hybrid propagators that outperform SGP4 are included. Figure 11 presents distance errors for the 169 time series exhibiting a positive trend, Figure 12 provides results for the 118 time series with a negative trend, and Figure 13 illustrates the box-and-whisker plots for the remaining 26 time series with no clear trend. These plots represent models trained with architectures 11 (in red), 15 (in blue), and 29 (in green), whose hyperparameters are detailed in Table 6.
Figure 11 presents the results of the models based on architectures 11, 15, and 29 for the 169 time series with positive trends at 2-, 4-, 6-, and 8-day propagation. Models trained with architecture 15 reduce the distance error relative to SGP4 in 138 cases, corresponding to approximately 82%. Architectures 11 and 29 each achieve reductions in 133 cases, or about 79% of the 169 series. The errors associated with the three selected architectures are comparable during the initial 4 days of propagation. Over the subsequent 4 days, the error for architecture 15 increases at a slower rate than that for architectures 11 and 29. The 75th percentile error for architecture 15 is approximately 10.29 km, while architectures 11 and 29 yield errors of 14.30 km and 13.11 km, respectively. In the worst-case scenario, at 8 days, the maximum error for architecture 15 is 27.41 km, compared to 34.97 km and 35.62 km for architectures 11 and 29, respectively.
Figure 12 shows the results of the models based on architectures 11, 15, and 29 for the 118 time series with negative trends at 2-, 4-, 6-, and 8-day propagation. Architecture 11 performed best, improving on SGP4 in 88 out of 118 cases, or approximately 75%. The errors for the three selected architectures remained similar throughout all propagation days. The 75th percentile errors for architectures 11, 15, and 29 were approximately 11.82, 12.33, and 11.43 km, respectively. At 8 days, the maximum error was 20.57 km for architecture 15, 23.02 km for architecture 11, and 28.36 km for architecture 29.
Figure 13 presents the results for the 26 time series with no trend. This scenario yields the lowest performance for the three architectures: at 8 propagation days, only about 24% of models improve on the SGP4 distance error in the best case and 17% in the worst case.
The time series exhibiting the smallest distance error among the models generated with the three selected architectures is chosen, and the $\varepsilon_t^{\theta}$ forecasts are graphically displayed alongside their corresponding real values. Figure 14 presents the real time series in black, while the series predicted by models parameterized with architectures 11, 15, and 29 are shown in blue, green, and red, respectively. The inputs used for forecasting in both the training and test sets are omitted; these inputs correspond to two satellite revolutions.
The model trained with architecture 15 (green) generates predictions closer to the true data than the other two architectures, in both the training and test sets. In contrast, the predictions made with the models using architectures 11 and 29, represented in blue and red, respectively, show that they capture the overall trend but fail to accurately reflect the variations.
The HSGP4 configuration with architecture 15 produces a distance error of approximately 2.75 km, which is close to the optimal error of 2.13 km, obtained by substituting the $\varepsilon_{\mathrm{SGP4}}^{\theta}$ time series with $\varepsilon_{\mathrm{AIDA}}^{\theta}$. This architecture provides a significant improvement over the SGP4 error of 22.04 km, while the HSGP4 propagators with architectures 11 and 29 exhibit errors of 19.41 km and 12.51 km, respectively. Overall, architecture 15 can be considered a near-optimal solution for minimizing distance error relative to the numerical propagator within this time series.
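To give a sense of scale for the distance errors quoted above, the sketch below converts a distance error into the equivalent angular error in the argument of latitude. It is a simplification under stated assumptions: the orbit is treated as circular with a radius of 29,600 km (a nominal Galileo value), and the whole distance error is attributed to the along-track angle, which is not exactly true since the other elements also contribute. The function names are illustrative.

```python
import math


def along_track_distance_km(radius_km, dtheta_rad):
    """Chord length between two points on a circle separated by dtheta_rad."""
    return 2 * radius_km * math.sin(abs(dtheta_rad) / 2)


def angle_for_distance_rad(radius_km, distance_km):
    """Angular separation on a circle that produces the given chord length."""
    return 2 * math.asin(distance_km / (2 * radius_km))


RADIUS_KM = 29_600.0  # ASSUMPTION: nominal Galileo orbital radius, circular orbit

for label, err_km in [("SGP4", 22.04), ("HSGP4, architecture 15", 2.75)]:
    dtheta = angle_for_distance_rad(RADIUS_KM, err_km)
    print(f"{label}: {err_km} km ~ {math.degrees(dtheta):.4f} deg in argument of latitude")
```

Under these assumptions, errors of a few tens of kilometers correspond to angular errors of only a few hundredths of a degree, which illustrates why small residuals in the argument of latitude dominate the position error.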
Finally, the characteristics of the three selected architectures are presented in Table 9.

5. Analysis of the HSGP4 Based on the Architectures 11, 15, and 29

This section evaluates the performance of the three selected architectures on a new dataset of 1683 Two-Line Elements (TLEs) from the Galileo satellite with NORAD ID 38857, spanning 12 October 2012 to 13 March 2021. The semi-major axis values range from 29,600 to 29,640.1 km, with a concentration near 29,600.2 km. Eccentricities vary from 0.0000279 to 0.0012339, and inclination values are approximately 55°. The argument of perigee ranges from 156.471° to 274.624°, while the right ascension of the ascending node varies between 154.368° and 239.521°.
Time series $\varepsilon^{\theta}$ are generated from the TLEs and classified into positive, negative, and no-trend categories, with counts of 1447, 227, and 9, respectively. After classification, the analysis compares the distance errors obtained with the hybrid HSGP4 propagator against those produced by SGP4 over prediction horizons ranging from 2 to 8 days. The training, validation, and testing datasets are again 7, 3, and 14 satellite revolutions, respectively. Table 10 summarizes, by trend category and network architecture, the number of cases in which the error produced by HSGP4 exceeds that of SGP4 for all propagation horizons.
The results are consistent with those reported in Table 8, showing that propagation time and trend classification have a strong influence on performance. In the first dataset, positive trend series account for the majority of cases (169 out of 313 total). This behavior becomes more pronounced in the second dataset, where positive trend series account for the largest share and exhibit a nearly monotonic increase with propagation time across all architectures.
Negative trend series display a similar but less pronounced growth pattern, with counts remaining consistently lower than those in positive trend series in both experiments. In contrast, no trend series yields the fewest cases, with near-zero counts in the second dataset even at the longest predictive horizon.
Figure 15 shows the distance error between the numerical propagator and SGP4 over predictive horizons of 2, 4, 6, and 8 days for the test set, categorized into positive, negative, and no-trend cases. In all categories, the median distance error and its dispersion systematically increase, with errors exceeding 100 km at longer horizons. Figure 10 presents a comparable pattern; however, in the case of NORAD ID 40545, the distance error remains below 50 km across all predictive horizons.
Figure 16 presents box-and-whisker plots of the distance error for 1683 HSGP4 propagations with correction. Figure 16a represents time series with a positive trend, whereas Figure 16b,c represent negative and no-trend series, respectively. The predictive module for each HSGP4 propagator is derived from the three selected architectures: 11 (in red), 15 (in blue), and 29 (in green), with hyperparameters provided in Table 9. As can be seen, the three HSGP4 propagators outperform SGP4 across all propagation spans in the test set. Firstly, it can be observed that the HSGP4 distance error for the three architectures shows highly similar distributions, characterized by notable symmetry and closely aligned statistics ($Q_1$, median, $Q_3$, and upper whisker).
After 8 days of propagation, the HSGP4 distance error for the positive trend series is about 77 km lower than that of SGP4, although it remains approximately 30 km above the optimal value. For series with a negative trend, the improvement is about 10 km compared to SGP4, but the error is 17 km higher than the optimum. For series without a trend, the maximum errors of the three hybrid propagators do not show any improvement; instead, the distance error exceeds that of SGP4 by about 7 km.
Time series exhibiting positive or negative trends display patterns consistent with those observed for Galileo NORAD ID 40545, as shown in Figure 11 and Figure 12. For both satellites, the proportion of outliers remains stable as the dataset size increases, suggesting that extreme cases persist across varying sample sizes. At the 8-day propagation horizon, the interquartile range (IQR) exhibits distinct behaviors depending on the trend. For positive trends, the IQR varies by architecture, increasing by 23.5% and 9.4% for architectures 11 and 15, respectively, and decreasing by 6.4% for architecture 29. In contrast, negative trends consistently reduce the IQR across architectures by about 45%, indicating greater data concentration around the mean. Measures of central tendency also reveal distinct patterns based on trend type. For positive trends, medians remain similar between datasets, and the 75th percentiles are within comparable ranges, confirming stability in the error distribution. For series with negative trends, medians are systematically lower in the 1683 TLE dataset (approximately 40% reductions), and the 75th percentiles are also consistently lower. These results indicate a general shift in the distribution toward lower errors and improved predictive performance as data volume increases.
Finally, in the case of a series with no trend, a different behavior is observed when comparing both datasets. The interquartile range shows notable reductions (25–33%), medians are slightly lower in the expanded dataset, and the 75th percentiles are also lower, indicating greater distribution compaction and smaller absolute errors. However, this apparent stability is accompanied by a fundamental limitation: the absence of temporal structure prevents neural networks from capturing systematic error-evolution patterns, resulting in predictions with small absolute errors but without the ability to model the progressive improvement or deterioration observable in a series with a trend.

6. Conclusions

This work demonstrates that the effectiveness of a hybrid SGP4-based orbit propagation methodology is strongly driven by the choice of neural network architectures and their associated hyperparameters. Starting from a large search space of 32,256 candidate models, a statistically guided reduction strategy identified a compact and representative subset of 1536 architectures suitable for detailed evaluation. The results confirm that standard loss functions, such as mape, are not sufficient on their own to assess model quality in orbit prediction problems. Incorporating a problem-specific metric based on position error proved essential for distinguishing well-performing hybrid SGP4 models, although its use requires caution to avoid overfitting when applied exclusively to training data. Among the hyperparameters analyzed, a learning rate of $1 \times 10^{-4}$ consistently led to improved distance-error performance, establishing a reliable baseline for training hybrid models.
The EDA indicates that joint modeling of the l and g variables, that is, the mean anomaly and the argument of perigee, yields excellent results in the short term. Therefore, we choose to model a single variable that comprises both effects, namely the argument of latitude, which also yields excellent results while requiring only a single time series. Hence, from a dynamical perspective, the argument of latitude is identified as the most influential variable for improving SGP4 accuracy in Galileo-type orbits. Neural network architectures are shown to successfully model the evolution of its error time series, supporting their integration into a hybrid forecasting framework. In particular, positive trend series account for the majority of cases in which HSGP4 outperforms SGP4, and this advantage increases as the propagation horizon extends. Propagations from negative trend series show a comparable but less pronounced effect, while series without a clear trend consistently produce the smallest degradations. Although these limitations exist, the hybrid models typically reduce distance errors compared to SGP4, particularly for time series with distinct trends. In contrast, minimal or no improvement is observed for series without a discernible trend.
This study has revealed that the effectiveness of HSGP4 is closely tied to the exploitable structure within the error dynamics. To facilitate the application of this methodology to other orbital regimes, such as LEO or GEO, the following recommendations are provided for future research. It is fundamental to recognize that the success of a hybrid propagator is highly dependent on the baseline performance of the original analytical model and its interaction with the specific force model of the region. Researchers should prioritize a comprehensive EDA to characterize the residuals before training. Our findings indicate that the hierarchical reduction strategy, moving from a massive model space to a selected subset, is essential for finding a parsimonious solution. Furthermore, caution is advised when the base propagator (SGP4) already exhibits minimal distance errors or lacks a discernible trend. In such scenarios, the machine learning component may lack sufficient patterns to improve the ephemerides, and the hybrid model might even introduce slight degradations. Therefore, this hybrid approach is most effective when applied to orbital regions where unmodeled dynamics produce systematic and learnable error patterns.
In cases where SGP4 residuals lack a clear trend, the current ANN architecture suffers from a high noise-to-signal ratio, limiting its predictive performance. Future work will focus on enhancing performance in these regimes by exploring alternative state representations, a longer control interval, and the integration of external geophysical covariates (e.g., solar and geomagnetic indices) that may drive these small-scale residuals and model the slow-dynamics effects. Furthermore, investigating deep learning architectures better suited for weakly structured data, such as Recurrent Neural Networks (RNNs) or attention-based models, could provide the necessary sensitivity to improve orbital predictions even when the baseline analytical model is near its accuracy limit.
With this study, we have demonstrated a practical engineering application of HSGP4, centered on its operational scalability and interoperability with existing space surveillance systems. By utilizing the Hybrid Two-Line Element (HTLE) format, the proposed methodology can be seamlessly deployed within current orbital catalog infrastructures without requiring a fundamental redesign of database architectures. The HTLE encapsulates both the mean-motion parameters and the compressed neural network coefficients, enabling efficient and decentralized updates to the error forecast. This deployment scheme is especially valuable for real-time Space Situational Awareness (SSA) and Space Traffic Management (STM), providing a high-fidelity alternative to standard analytical propagation for large-scale constellations. It effectively bridges the gap between the speed of analytical models and the precision of numerical integration, offering a robust tool for initial conjunction assessment and long-term constellation maintenance.

Author Contributions

Conceptualization, J.F.S.-J.; methodology, J.F.S.-J., R.L., and I.P.; software, E.S., I.P., and J.F.S.-J.; validation, E.S., R.L., and J.F.S.-J.; formal analysis, J.F.S.-J., E.S., R.L., and I.P.; investigation, J.F.S.-J., E.S., R.L., and I.P.; resources, J.F.S.-J. and M.L.; data curation, E.S.; writing—original draft preparation, J.F.S.-J. and E.S.; writing—review and editing, J.F.S.-J., M.L., E.S., R.L., and I.P.; supervision, J.F.S.-J.; funding acquisition, J.F.S.-J. and M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been developed under Project PID2021-123219OB-I00, funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The TLE sets were freely obtained from Space-Track (https://www.space-track.org). The accuracy analysis and SGP ephemerides can be requested from the corresponding author under reasonable justification or generated using open-source software such as the astrodynamics library Orekit (v13.1.4).

Acknowledgments

We would like to thank the reviewers for their valuable comments and constructive suggestions, which helped to improve the quality of this manuscript. We also thank the high-performance computing center of the University of La Rioja (Beronia) for providing computational resources. During the preparation of this manuscript, the authors used Grammarly (v9.96.0) for the purposes of enhancing the grammar, spelling, and overall readability of this manuscript. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Brouwer, D. Solution of the problem of artificial satellite theory without drag. Astron. J. 1959, 64, 378–397.
  2. Deprit, A.; Rom, A. The main problem of artificial satellite theory for small and moderate eccentricities. Celest. Mech. 1970, 2, 166–206.
  3. Kozai, Y. Second-order solution of artificial satellite theory without air drag. Astron. J. 1962, 67, 446–461.
  4. Lyddane, R.H. Small eccentricities or inclinations in the Brouwer theory of the artificial satellite. Astron. J. 1963, 68, 555–558.
  5. San-Juan, J.F. ATESAT: Automatization of Theories and Ephemeris in the Artificial Satellite Problem; Technical Report CT/TI/MS/MN/94-250; Centre National d’Études Spatiales (CNES): Toulouse, France, 1994.
  6. Krylov, N.M.; Bogoliubov, N.N. Introduction to Non-linear Mechanics; Princeton University Press: Princeton, NJ, USA, 1943.
  7. Bogoliubov, N.N.; Mitropolsky, Y.A. Asymptotic Methods in the Theory of Non-linear Oscillations; Translated from the Second Revised Russian Edition; International Monographs on Advanced Mathematics and Physics; Hindustan Publishing Corp.: Delhi, India; Gordon and Breach Science Publishers: New York, NY, USA, 1961.
  8. Hori, G.i. Theory of general perturbations with unspecified canonical variables. Publ. Astron. Soc. Jpn. 1966, 18, 287–296.
  9. Deprit, A. Canonical transformations depending on a small parameter. Celest. Mech. 1969, 1, 12–30.
  10. Henrard, J. On a perturbation theory using Lie transforms. Celest. Mech. 1970, 3, 107–120.
  11. Kamel, A.A. Perturbation method in the theory of nonlinear oscillations. Celest. Mech. 1970, 3, 90–106.
  12. Mersman, W.A. A new algorithm for the Lie transformation. Celest. Mech. 1970, 3, 81–89.
  13. Mersman, W.A. Explicit recursive algorithms for the construction of equivalent canonical transformations. Celest. Mech. 1971, 3, 384–389.
  14. Cappellari, J.O.; Long, A.C.; Velez, C.E.; Fuchs, A.J. Goddard Trajectory Determination System (GTDS); Technical Report CSC/TR-89/6021; Goddard Space Flight Center: Greenbelt, MD, USA, 1989.
  15. Maisonobe, L.; Cefola, P.J.; Frouvelle, N.; Herbinière, S.; Laffont, F.X.; Lizy-Destrez, S.; Neidhart, T. Open governance of the OREKIT space flight dynamics library. In Proceedings of the 5th International Conference on Astrodynamics Tools and Techniques, ICATT 2012, Noordwijk, The Netherlands, 29 May–1 June 2012.
  16. Neelon, J.G., Jr.; Cefola, P.J.; Proulx, R.J. Current development of the Draper Semianalytical Satellite Theory standalone orbit propagator package. Adv. Astronaut. Sci. 1998, 97, 2037–2052.
  17. Setty, S.J.; Cefola, P.J.; Montenbruck, O.; Fiedler, H. Application of Semi-analytical Satellite Theory orbit propagator to orbit determination for space object catalog maintenance. Adv. Space Res. 2016, 57, 2218–2233.
  18. Lara, M.; San Juan, J.F.; Hautesserres, D. HEOSAT: A mean elements orbit propagator program for highly elliptical orbits. CEAS Space J. 2018, 10, 3–23.
  19. San-Juan, J.F.; San-Martín, M.; Ortigosa, D. Hybrid analytical-statistical models. Lect. Notes Comput. Sci. 2011, 6783, 450–462.
  20. San-Juan, J.F.; San-Martín, M.; Pérez, I.; López, R. Hybrid perturbation methods based on statistical time series models. Adv. Space Res. 2016, 57, 1641–1651.
  21. San-Juan, J.F.; Pérez, I.; San-Martín, M.; Vergara, E.P. Hybrid SGP4 orbit propagator. Acta Astronaut. 2017, 137, 254–260.
  22. Caldas, F.; Soares, C. Machine learning in orbit estimation: A survey. Acta Astronaut. 2024, 220, 97–107.
  23. Kazemi, S.; Azad, N.L.; Scott, A.; Oqab, H.B.; Dietrich, G.B. Orbit determination for space situational awareness: A survey. Acta Astronaut. 2024, 222, 272–295.
  24. Hoots, F.R.; Roehrich, R.L. Models for Propagation of the NORAD Element Sets; Spacetrack Report #3; U.S. Air Force Aerospace Defense Command: Colorado Springs, CO, USA, 1980.
  25. Hoots, F.R.; Schumacher, P.W., Jr.; Glover, R.A. History of analytical orbit modeling in the U.S. space surveillance system. J. Guid. Control. Dyn. 2004, 27, 174–185.
  26. Vallado, D.A.; Crawford, P.; Hujsak, R.; Kelso, T.S. Revisiting spacetrack report #3. In Proceedings of the 2006 AIAA/AAS Astrodynamics Specialist Conference and Exhibit, American Institute of Aeronautics and Astronautics, Keystone, CO, USA, 21–24 August 2006; Paper AIAA 2006-6753. Volume 3, pp. 1984–2071.
  27. Morselli, A.; Armellin, R.; Di Lizia, P.; Bernelli-Zazzera, F. A high order method for orbital conjunctions analysis: Sensitivity to initial uncertainties. Adv. Space Res. 2014, 53, 490–508.
  28. Lane, M.H.; Cranford, K.H. An Improved Analytical Drag Theory for the Artificial Satellite Problem. In Proceedings of the AIAA/AAS Astrodynamics Specialist Conference, Princeton, NJ, USA, 13–16 August 1969. Paper AIAA 69-925.
  29. Hujsak, R.S. A Restricted Four Body Solution for Resonating Satellites Without Drag; Spacetrack Report #1; U.S. Air Force Aerospace Defense Command: Colorado Springs, CO, USA, 1979.
  30. Bowman, B. NORAD Document; Spacetrack Report #1; U.S. Air Force Aerospace Defense Command: Colorado Springs, CO, USA, 1971.
Figure 1. Scatter diagrams for the Galileo TLEs in two-dimensional parameter planes: (a) $(e, a)$-plane. (b) $(i, a)$-plane. (c) $(e, \omega)$-plane. (d) $(e, \Omega)$-plane.
Figure 2. Box-and-whisker plots showing the distance errors (km) of the best combinations of orbital variables for a 30-day propagation span.
Figure 3. Box-and-whisker plots showing the distance errors (km) of the best combinations of polar-nodal variables for a 30-day propagation span.
Figure 4. Plot of a representative sample of 50 of the 313 $\varepsilon^{\theta}$ time series.
Figure 5. Box-and-whisker plots showing distance errors (km) classified according to their trend over a 30-day propagation span. (a) SGP4 distance error. (b) Optimal SGP4 distance error obtained by substituting the $\varepsilon^{\theta}_{\mathrm{SGP4}}$ time series with $\varepsilon^{\theta}_{\mathrm{AIDA}}$.
Figure 6. (a) Random series utilized for selecting neural network architectures. Orange dots indicate the input vector, red denotes the training set, green represents the validation set, and blue indicates the test set. (b) Distance error between AIDA and SGP4 (in km) for the same TLEs. Time series with a positive trend are shown in blue, a negative trend in red, and no discernible trend in green.
Figure 7. Box-and-whisker plots of the distance error of HSGP propagators using the mse (in red) and mape (in blue) loss functions at 2 and 4 days in the training period and at 2, 4, 6, and 8 days in the testing period.
Figure 8. Box-and-whisker plots of the HSGP4 distance error for the top 5 architectures trained with the six selected time series.
Figure 9. Number of models per architecture for which the HSGP4 distance error is lower than that of SGP4 after 8 days of propagation. The selected architectures are shown in red.
Figure 10. Box-and-whisker plots of the SGP4 distance error for the 169 positive-trend (PT, in red), 118 negative-trend (NT, in blue), and 26 no-trend (NoT, in green) time series.
Figure 11. Box-and-whisker plots of the HSGP4 distance error, using models trained with architectures 11 (in red), 15 (in blue), and 29 (in green) under a positive trend.
Figure 12. Box-and-whisker plots of the HSGP4 distance error, using models trained with architectures 11 (in red), 15 (in blue), and 29 (in green) under a negative trend.
Figure 13. Box-and-whisker plots of the HSGP4 distance error, using models trained with architectures 11 (in red), 15 (in blue), and 29 (in green) under no trend.
Figure 14. The real values are shown in black with dots. Predicted values are generated by models using architectures 11 (in blue), 15 (in green), and 29 (in red). (a) Training set. (b) Test set.
Figure 15. Box-and-whisker plots of the SGP4 distance error for the 1447 positive-trend (PT, in red), 227 negative-trend (NT, in blue), and 9 no-trend (NoT, in green) time series.
Figure 16. Box-and-whisker plots of the distance error between the numerical and HSGP4 propagators, using models trained with architectures 11 (in red), 15 (in blue), and 29 (in green). (a) Positive trend. (b) Negative trend. (c) No trend.
Table 1. Seven-dimensional hyperparameter space $(bs, nhl, nnhl, sf, o, lf, lr)$ with the values considered. The activation functions considered are linear, hyperbolic tangent (tanh), rectified linear unit (relu), and exponential linear unit (elu). The optimizers considered are root mean square prop (rmsprop), adaptive delta (adadelta), adaptive moment estimation (adam), and Nesterov-accelerated adaptive moment estimation (nadam). The loss functions considered are the mean squared error (mse), mean absolute error (mae), mean absolute percentage error (mape), and mean squared logarithmic error (msle).

| Hyperparameters | Range of Values |
|---|---|
| Batch size | 64, 128, 256, 512 |
| Hidden layers | 1, 2, 3 |
| Number of neurons in the first hidden layer | 8, 16, 32, 64, 128, 256 |
| Activation functions | linear, tanh, elu, relu |
| Optimizers | rmsprop, adadelta, adam, nadam |
| Loss function | mse, mae, mape, msle |
| Learning rate | 0.001, 0.0001, 0.00001 |

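The size of the search space in Table 1 can be checked by direct enumeration. The following is a minimal sketch, not the authors' code. It rests on one assumption that Table 1 does not state: each hidden layer is given its own activation function, and the learning rate is varied separately from the architecture. Under that reading, the number of candidates equals the 32,256 architectures quoted in the abstract. The constant and function names are hypothetical.

```python
from itertools import product

# Values copied from Table 1.
BATCH_SIZES = (64, 128, 256, 512)
HIDDEN_LAYERS = (1, 2, 3)
FIRST_LAYER_NEURONS = (8, 16, 32, 64, 128, 256)
ACTIVATIONS = ("linear", "tanh", "elu", "relu")
OPTIMIZERS = ("rmsprop", "adadelta", "adam", "nadam")
LOSSES = ("mse", "mae", "mape", "msle")
LEARNING_RATES = (1e-3, 1e-4, 1e-5)


def candidate_architectures():
    """Yield one dict per candidate architecture (learning rate excluded).

    Assumption (not stated in Table 1): every hidden layer has its own
    activation function, so a network with n hidden layers contributes
    len(ACTIVATIONS) ** n activation combinations.
    """
    for n_layers in HIDDEN_LAYERS:
        activation_choices = product(ACTIVATIONS, repeat=n_layers)
        for bs, nn, opt, loss, acts in product(
            BATCH_SIZES, FIRST_LAYER_NEURONS, OPTIMIZERS, LOSSES, activation_choices
        ):
            yield {
                "batch_size": bs,
                "hidden_layers": n_layers,
                "first_layer_neurons": nn,
                "optimizer": opt,
                "loss": loss,
                "activations": acts,
            }


n_architectures = sum(1 for _ in candidate_architectures())
print(n_architectures)                        # 32256
print(n_architectures * len(LEARNING_RATES))  # 96768 if the learning rate is included
```

Enumerating with a generator keeps memory use flat, which matters if each configuration is handed to a training job rather than merely counted.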
Table 2. SGP4 distance error (in km) at 2 and 4 propagation days during the training period, and at 2, 4, 6, and 8 propagation days during the test period.

| Trend | Train 2d | Train 4d | Test 2d | Test 4d | Test 6d | Test 8d |
|---|---|---|---|---|---|---|
| Positive | 7.012 | 10.885 | 20.641 | 23.548 | 26.651 | 29.515 |
| Positive | 7.576 | 12.566 | 17.081 | 18.793 | 23.321 | 27.189 |
| Negative | 6.870 | 9.338 | 15.685 | 21.860 | 26.885 | 30.712 |
| Negative | 7.529 | 10.945 | 14.322 | 15.787 | 17.056 | 18.930 |
| No trend | 2.473 | 3.366 | 4.229 | 4.619 | 4.619 | 4.619 |
| No trend | 3.066 | 5.187 | 6.250 | 6.250 | 6.250 | 6.250 |

Table 3. Number of models with the lowest rmse during training and testing, grouped by loss function.

| Loss function | Training | Testing |
|---|---|---|
| mae | 403 (18.35%) | 510 (23.22%) |
| mape | 780 (35.52%) | 880 (40.07%) |
| mse | 837 (38.12%) | 581 (26.46%) |
| msle | 176 (8.01%) | 225 (10.25%) |

Table 4. Number of models out of the total for each number of hidden layers where the HSGP4 distance error is lower than the SGP4 error after a 4-day propagation span in the training set. The last column shows the average training time for each model.

| Hidden Layers | Training | Time (min) |
|---|---|---|
| 1 | 36/102 (35.29%) | 2.83 |
| 2 | 247/420 (58.81%) | 3.18 |
| 3 | 1122/1674 (67.03%) | 3.60 |

Table 5. Distribution of models with HSGP4 distance error lower than that of SGP4, classified by the learning rate. The final column gives the average training time for each model.

| LR | Train | Test | Time (min) |
|---|---|---|---|
| $1 \times 10^{-5}$ | 817 (44.21%) | 290 (15.69%) | 4.92 |
| $1 \times 10^{-4}$ | 647 (35.01%) | 794 (42.97%) | 4.46 |
| $1 \times 10^{-3}$ | 384 (20.78%) | 764 (41.34%) | 4.44 |

Table 6. Configuration of architectures of the top five models for each time series that minimize the distance error with respect to SGP4 at an 8-day propagation span. Columns: batch size (bs), number of neurons in the first hidden layer (nnfl), optimizer (o), activation function in the first layer (af1), activation function in the second layer (af2), and frequency of architecture appearance as a best model (freq).

| id | bs | nnfl | o | af1 | af2 | freq |
|---|---|---|---|---|---|---|
| 1 | 512 | 8 | nadam | tanh | elu | 2/6 |
| 2 | 256 | 8 | adam | tanh | elu | 1/6 |
| 3 | 64 | 256 | adam | tanh | tanh | 1/6 |
| 4 | 64 | 16 | adadelta | relu | elu | 1/6 |
| 5 | 128 | 256 | nadam | relu | elu | 1/6 |
| 6 | 256 | 64 | adam | tanh | tanh | 1/6 |
| 7 | 128 | 32 | nadam | tanh | elu | 1/6 |
| 8 | 128 | 16 | adam | linear | linear | 1/6 |
| 9 | 512 | 128 | nadam | linear | relu | 1/6 |
| 10 | 128 | 64 | adam | elu | tanh | 1/6 |
| 11 | 64 | 16 | rmsprop | tanh | elu | 1/6 |
| 12 | 256 | 128 | adam | linear | linear | 1/6 |
| 13 | 128 | 128 | adam | tanh | tanh | 1/6 |
| 14 | 256 | 128 | nadam | linear | elu | 1/6 |
| 15 | 256 | 64 | nadam | linear | tanh | 1/6 |
| 16 | 256 | 16 | adam | elu | elu | 1/6 |
| 17 | 512 | 128 | adam | elu | elu | 1/6 |
| 18 | 256 | 16 | adam | linear | linear | 1/6 |
| 19 | 512 | 16 | adam | relu | elu | 1/6 |
| 20 | 256 | 16 | adam | tanh | elu | 1/6 |
| 21 | 64 | 128 | nadam | relu | elu | 1/6 |
| 22 | 256 | 16 | nadam | relu | tanh | 1/6 |
| 23 | 64 | 16 | nadam | linear | elu | 1/6 |
| 24 | 256 | 256 | nadam | linear | linear | 1/6 |
| 25 | 64 | 256 | adam | relu | linear | 1/6 |
| 26 | 256 | 32 | adam | elu | elu | 1/6 |
| 27 | 64 | 64 | adam | linear | tanh | 1/6 |
| 28 | 256 | 32 | adam | tanh | linear | 1/6 |
| 29 | 256 | 32 | nadam | linear | tanh | 1/6 |

Table 7. Number of methods reducing SGP4 error, grouped by time series trend.

| Trend | 2 Days | 4 Days | 6 Days | 8 Days |
|---|---|---|---|---|
| Positive | 58 | 58 | 54 | 51 |
| Negative | 54 | 54 | 53 | 46 |
| No trend | 56 | 51 | 50 | 44 |

Table 8. Number of cases for each architecture, grouped by trend category, where the error associated with HSGP4 is greater than that of SGP4 for the 313 TLEs of the Galileo satellite (NORAD ID 40545) at 2-, 4-, 6-, and 8-day propagation spans. The numbers of positive, negative, and no-trend time series are 169, 118, and 26, respectively.

| Trend | Architecture | 2 Days | 4 Days | 6 Days | 8 Days |
|---|---|---|---|---|---|
| Positive | 11 | 33 | 33 | 33 | 36 |
| Positive | 15 | 20 | 24 | 30 | 31 |
| Positive | 29 | 19 | 24 | 28 | 36 |
| Negative | 11 | 45 | 40 | 46 | 51 |
| Negative | 15 | 20 | 22 | 27 | 30 |
| Negative | 29 | 28 | 28 | 33 | 37 |
| No trend | 11 | 22 | 22 | 21 | 19 |
| No trend | 15 | 17 | 20 | 19 | 19 |
| No trend | 29 | 19 | 20 | 20 | 21 |

Table 9. Selected architectures. Each has two hidden layers, a learning rate of $1 \times 10^{-4}$, a linear activation function in the output layer, and mape as the loss function.

| Id | Batch size | Neurons in 1st hidden layer | Optimizer | ActFun1 | ActFun2 |
|---|---|---|---|---|---|
| 11 | 64 | 16 | rmsprop | tanh | elu |
| 15 | 256 | 64 | nadam | linear | tanh |
| 29 | 256 | 32 | nadam | linear | tanh |

Table 10. Number of cases for each architecture where the distance error associated with HSGP4 is greater than that of SGP4 for the Galileo satellite (NORAD ID 38857). The numbers of positive, negative, and no-trend time series are 1447, 227, and 9, respectively.

| Trend | Architecture | 2 days | 4 days | 6 days | 8 days |
|---|---|---|---|---|---|
| Positive | 11 | 15 | 40 | 60 | 90 |
| Positive | 15 | 11 | 32 | 79 | 129 |
| Positive | 29 | 14 | 32 | 68 | 138 |
| Negative | 11 | 19 | 27 | 34 | 41 |
| Negative | 15 | 12 | 38 | 51 | 64 |
| Negative | 29 | 16 | 38 | 53 | 65 |
| No trend | 11 | 0 | 1 | 2 | 5 |
| No trend | 15 | 0 | 2 | 3 | 4 |
| No trend | 29 | 0 | 0 | 1 | 4 |

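Because the three trend groups in Table 10 differ greatly in size (1447, 227, and 9 time series), the raw counts are easier to compare as fractions of each group. The sketch below does only that arithmetic. The counts are transcribed from Table 10 as we read it, and the dictionary and function names are hypothetical; nothing here is taken from the authors' software.

```python
# Group sizes and counts as read from Table 10 (Galileo satellite, NORAD ID 38857).
GROUP_SIZE = {"positive": 1447, "negative": 227, "no trend": 9}
SPANS_DAYS = (2, 4, 6, 8)

# Number of time series for which HSGP4 is worse than SGP4,
# keyed by (trend, architecture id), one value per propagation span.
WORSE_THAN_SGP4 = {
    ("positive", 11): (15, 40, 60, 90),
    ("positive", 15): (11, 32, 79, 129),
    ("positive", 29): (14, 32, 68, 138),
    ("negative", 11): (19, 27, 34, 41),
    ("negative", 15): (12, 38, 51, 64),
    ("negative", 29): (16, 38, 53, 65),
    ("no trend", 11): (0, 1, 2, 5),
    ("no trend", 15): (0, 2, 3, 4),
    ("no trend", 29): (0, 0, 1, 4),
}


def worse_rate(trend, architecture):
    """Return {span_in_days: fraction of the trend group where HSGP4 is worse}."""
    total = GROUP_SIZE[trend]
    counts = WORSE_THAN_SGP4[(trend, architecture)]
    return {days: count / total for days, count in zip(SPANS_DAYS, counts)}


for trend, architecture in WORSE_THAN_SGP4:
    rates = worse_rate(trend, architecture)
    summary = "  ".join(f"{d}d: {100 * r:5.1f}%" for d, r in rates.items())
    print(f"{trend:9s} arch {architecture:2d}  {summary}")
```

The no-trend group contains only nine series, so its percentages move in steps of about 11 points and should be read with that in mind.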
Share and Cite

MDPI and ACS Style

Segura, E.; López, R.; Pérez, I.; Lara, M.; San-Juan, J.F. A Hybrid Approach to Enhanced SGP4 for Galileo Constellations. Appl. Sci. 2026, 16, 2214. https://doi.org/10.3390/app16052214
