An Improved VMD-LSTM Model for Time-Varying GNSS Time Series Prediction with Temporally Correlated Noise

Abstract: GNSS time series prediction plays a significant role in monitoring crustal plate motion, landslide detection, and the maintenance of the global coordinate framework. Long short-term memory (LSTM) is a deep learning model that has been widely applied in the field of high-precision time series prediction and is often combined with Variational Mode Decomposition (VMD) to form the VMD-LSTM hybrid model. To further improve the prediction accuracy of the VMD-LSTM model, this paper proposes a dual variational mode decomposition long short-term memory (DVMD-LSTM) model to effectively handle noise in GNSS time series prediction. This model extracts fluctuation features from the residual terms obtained after VMD decomposition to reduce the prediction errors associated with residual terms in the VMD-LSTM model. Daily E, N, and U coordinate data recorded at multiple GNSS stations between 2000 and 2022 were used to validate the performance of the proposed DVMD-LSTM model. The experimental results demonstrate that, compared to the VMD-LSTM model, the DVMD-LSTM model achieves significant improvements in prediction performance across all stations: the average RMSE is reduced by 9.86%, the average MAE is reduced by 9.44%, and the average R2 is increased by 17.97%. Furthermore, the average accuracy of the optimal noise model for the predicted results is improved by 36.50%, and the average velocity accuracy of the predicted results is enhanced by 33.02%. These findings collectively attest to the superior predictive capability of the DVMD-LSTM model and demonstrate the reliability of its predictions.


Introduction
Over the past three decades, with the rapid development of satellite navigation technology, many GNSS continuously operating reference stations have been established worldwide.
These stations provide important data sources for crustal plate motion monitoring [1][2][3][4][5], landslide detection [6][7][8], the deformation monitoring of bridges and dams [9][10][11][12][13], and the maintenance of regional or global coordinate frameworks [14,15]. By analyzing the long-term GNSS observation time series obtained from these stations, it is possible to predict the variation of coordinates at continuous time points, thereby providing an important basis for determining motion trends. This has significant practical and theoretical value in geodesy and geodynamics research [16][17][18]. Time series prediction methods can be mainly categorized into two types: physical simulation and numerical simulation [19,20]. Traditional physical and numerical simulation methods rely on geophysical theories, linear terms, periodic terms, and gap information to construct models [21]. However, these models face challenges in capturing complex nonlinear data and require a manual selection of feature information and modeling parameters, leading to systematic biases and limitations [22]. In contrast, deep learning, as an emerging technology, can automatically extract information suited to the data's features by constructing deep network structures. Deep learning exhibits strong learning capabilities and has advantages in handling large-scale and high-dimensional data. It has been widely applied in fields such as image recognition [23][24][25], natural language processing [26][27][28], speech recognition [29][30][31], and time series prediction [32][33][34][35][36]. Li et al. (2022) comprehensively analyzed the application of image recognition to plant phenotyping by comparing various deep learning methods [24]. Otter et al. (2020) summarized and analyzed research on deep learning models in natural language processing and provided valuable suggestions for future work in this field [26]. Nassif et al. (2019) systematically studied speech recognition accuracy using convolutional, recurrent, and fully connected deep learning methods [31]. Masini et al. (2023) elaborated on the application of machine learning in economics and finance by analyzing different neural network and tree-based architectures for time series in the context of deep learning [36].
Long short-term memory (LSTM), as an excellent variant of recurrent neural networks (RNNs), overcomes the issues of gradient vanishing, gradient exploding, and insufficient long-term memory in RNNs [37][38][39]. Due to its significant advantages in long-range time series prediction, LSTM has been widely applied in time series prediction domains such as electricity load forecasting [40][41][42] and wind speed prediction [43][44][45]. In recent years, the application of the LSTM algorithm in the GNSS domain has also become increasingly widespread. Kim et al. (2019) improved the accuracy and stability of absolute positioning solutions in autonomous vehicle navigation using a multi-layer LSTM model [46]. Tao et al. (2021) utilized a CNN-LSTM approach to extract deep multipath features from GNSS coordinate sequences, reducing the impact of multipath effects on positioning accuracy [47]. Xie et al. (2019) accurately predicted landslide periodic components using the LSTM model to establish a landslide hazard warning system [48].
Variational Mode Decomposition (VMD) is a signal processing method based on the principle of variational inference. It decomposes signals into mode components (Intrinsic Mode Functions, IMFs) with different frequencies through an optimization process, effectively extracting the local time-frequency features of signals and enabling efficient signal decomposition and analysis [49][50][51]. Many researchers have combined VMD with LSTM to enhance the performance of LSTM across a range of fields [52][53][54][55]. Huang et al. (2022) applied the VMD-LSTM model to coal seam thickness prediction, confirming that the predicted results closely matched the coal seam information obtained from existing boreholes [56]. Zhang et al. (2022) applied the VMD-LSTM model in the field of sports artificial intelligence, demonstrating its broad application prospects [57]. Han et al. (2019) applied the VMD-LSTM model to wind power prediction, validating its high performance in multi-step and real-time predictions [58]. Xing et al. (2019) applied the VMD-LSTM model to predicting the dynamic displacements of landslides and verified its high prediction accuracy using the case of landslides in paddy fields in China [59].
The VMD-LSTM model has been widely adopted for time series prediction in various fields. However, most studies use VMD to decompose the original data, predict each Intrinsic Mode Function (IMF) and the residual term separately, and then combine the predicted results to obtain the final prediction. Although this method yields good results for each IMF, the fluctuation characteristics of the residual term are difficult to extract, leading to significant prediction errors in the model. Furthermore, existing research has mainly focused on the accuracy of the prediction results while neglecting the noise characteristics of the data itself [60][61][62]. Considering these factors, this paper proposes a dual VMD-LSTM (DVMD-LSTM) hybrid model that accounts for the characteristics of the noise. By performing VMD decomposition on the residual components obtained from the initial VMD decomposition, the proposed model effectively extracts the fluctuation features within the residuals, enabling the high-precision prediction of GNSS time series. By analyzing the RMSE, MAE, and R2 (coefficient of determination) of the predicted results in the E, N, and U directions across multiple stations, the applicability and robustness of the proposed method are evaluated. Additionally, the quality of the predicted results is assessed by incorporating noise models and velocity estimation.
The structure of this paper is as follows: Section 2 introduces the principles of VMD, the LSTM algorithm, and the accuracy evaluation metrics, and explains the principles and specific workflow of the DVMD-LSTM model in detail. Section 3 describes the GNSS station data, presents the data-preprocessing strategies, and analyzes the reasons for the improved accuracy of the DVMD-LSTM model. Section 4 focuses on the prediction results and accuracy of both the single LSTM model and the hybrid models; the optimal noise model and velocity under each prediction model are compared and analyzed to evaluate the performance of the DVMD-LSTM model using different accuracy assessment metrics. Finally, Section 5 provides the conclusions and analysis.

Variational Modal Decomposition (VMD)
Variational Mode Decomposition (VMD) is an adaptive and fully non-recursive method for solving mode variational and signal processing problems [63]. GNSS time series exhibit inherent non-stationarity. Using VMD to decompose the data effectively separates it into stationary signals, thereby extracting the fluctuation characteristics of the GNSS time series and providing a superior data foundation for model prediction. VMD iteratively solves a variational model to decompose the original time series into distinct modal components. The specific decomposition process is outlined as follows [64][65][66]:

(1) For each modal component $\mu_K(t)$, the corresponding analytic signal is computed using the Hilbert transform, which yields its one-sided spectrum:

$$\left[ \delta(t) + \frac{j}{\pi t} \right] * \mu_K(t)$$

where $j^2 = -1$ and $\delta$ is the Dirac distribution.

(2) By mixing each mode with an exponential term $e^{-j\omega_K t}$ tuned to its estimated center frequency $\omega_K$, the spectrum of each mode is modulated to its respective baseband:

$$\left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_K(t) \right] e^{-j\omega_K t}$$

(3) The bandwidth of each mode is estimated from the $H^1$ Gaussian smoothness of the demodulated signal, i.e., the squared $L^2$ norm of its gradient. This leads to the constrained variational problem:

$$\min_{\{\mu_K\},\{\omega_K\}} \left\{ \sum_{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_K(t) \right] e^{-j\omega_K t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{K} \mu_K = f$$

where $f$ represents the original signal, $\{\mu_K\}$ the decomposed mode functions, and $\{\omega_K\}$ the corresponding center frequency of each mode.

(4) On this basis, a quadratic penalty factor $\alpha$ and a Lagrange multiplier $\lambda(t)$ are introduced to transform this into an unconstrained variational problem. The augmented Lagrangian is:

$$L\left(\{\mu_K\},\{\omega_K\},\lambda\right) = \alpha \sum_{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * \mu_K(t) \right] e^{-j\omega_K t} \right\|_2^2 + \left\| f(t) - \sum_{K} \mu_K(t) \right\|_2^2 + \left\langle \lambda(t),\ f(t) - \sum_{K} \mu_K(t) \right\rangle$$

where $\alpha$ is the quadratic penalty factor and $\lambda(t)$ the Lagrange multiplier. The alternating direction method of multipliers (ADMM) is then employed to solve this unconstrained variational problem: by alternately updating $\mu_K^{n+1}$, $\omega_K^{n+1}$, and $\lambda^{n+1}$, the saddle point of the augmented Lagrangian, i.e., the optimal solution of the constrained variational model in step (3), is sought.
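The alternating ADMM updates described above can be sketched compactly in NumPy. The following is a minimal illustrative implementation only (no signal mirroring, linearly spaced initial center frequencies, simplified convergence test); the function and parameter names are ours, not the implementation used in the paper:

```python
import numpy as np

def vmd(signal, K=2, alpha=2000.0, tau=0.1, n_iter=500, tol=1e-7):
    """Minimal VMD sketch: ADMM updates of the mode spectra and center
    frequencies, carried out on the one-sided spectrum."""
    T = len(signal)
    half = T // 2 + 1
    f_hat = np.fft.fft(signal)[:half]           # one-sided spectrum of f
    freqs = np.arange(half) / T                 # normalized frequencies 0..0.5
    u_hat = np.zeros((K, half), dtype=complex)  # mode spectra mu_K
    omega = np.linspace(0.05, 0.45, K)          # initial center frequencies
    lam = np.zeros(half, dtype=complex)         # Lagrange multiplier lambda
    for _ in range(n_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Wiener-filter update of mode k around its center frequency
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            power = np.abs(u_hat[k]) ** 2
            omega[k] = (freqs @ power) / (power.sum() + 1e-12)  # center of gravity
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))           # dual ascent
        denom = max(float((np.abs(u_prev) ** 2).sum()), 1e-12)
        if float((np.abs(u_hat - u_prev) ** 2).sum()) / denom < tol:
            break
    # rebuild real modes via Hermitian symmetry of the full spectrum
    modes = np.empty((K, T))
    for k in range(K):
        full = np.zeros(T, dtype=complex)
        full[:half] = u_hat[k]
        full[half:] = np.conj(u_hat[k][1:T // 2][::-1])
        modes[k] = np.real(np.fft.ifft(full))
    return modes, omega
```

On a two-tone test signal this sketch recovers the two center frequencies and reconstructs the input from the sum of the modes.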

Long Short-Term Memory (LSTM)
LSTM is an improved type of recurrent neural network (RNN) that addresses the issue of long-term dependencies by utilizing memory cells, effectively mitigating the problems of vanishing and exploding gradients [67][68][69]. Compared to traditional neural networks, LSTM demonstrates strong advantages in handling long-term sequence prediction tasks and has been widely applied in areas such as time series forecasting and fault detection [70][71][72][73][74]. The LSTM architecture consists of input layers, hidden layers, and output layers, where each hidden layer employs input gates, forget gates, and output gates to store and access data, as shown in Figure 1.
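The gate mechanism of Figure 1 can be made concrete with a single-step sketch in NumPy; the stacked weight layout and helper names here are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W (4H x D), U (4H x H) and b (4H,) stack the
    input-gate, forget-gate, candidate and output-gate blocks."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i = sig(z[0:H])           # input gate: how much new information enters
    f = sig(z[H:2 * H])       # forget gate: how much old cell state is kept
    g = np.tanh(z[2 * H:3 * H])  # candidate cell state
    o = sig(z[3 * H:4 * H])   # output gate: how much of the cell is exposed
    c = f * c_prev + i * g    # memory cell update
    h = o * np.tanh(c)        # hidden state
    return h, c
```

Because the cell state is carried additively through `c = f * c_prev + i * g`, gradients can flow over long horizons without vanishing, which is what the gating architecture buys over a plain RNN.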

DVMD-LSTM Model
The VMD-LSTM model, as a classical hybrid deep learning model, has been widely applied in time series prediction tasks such as load forecasting and wind speed prediction, demonstrating remarkable predictive accuracy [75,76]. This model uses Variational Mode Decomposition (VMD) to decompose the original data into a set of Intrinsic Mode Functions (IMFs) and a residual component, denoted as "r". Each IMF and the residual component are then predicted individually, and their predictions are summed to obtain the model's final prediction. Notably, the IMFs, being stationary signals, can be predicted with higher accuracy individually, which effectively enhances the predictive performance of the VMD-LSTM model. The specific prediction process is shown on the left side of Figure 2; note that the residual term is not decomposed. Because the residual component remains unprocessed during prediction, it introduces errors that degrade the model's predictive accuracy. Considering that the residual terms obtained after the VMD decomposition of real-world data still exhibit certain fluctuation characteristics and non-white noise such as high-frequency noise [77,78], the proposed model further decomposes the residual terms using VMD and predicts the resulting mode components to mitigate the impact of incomplete VMD decomposition. The DVMD-LSTM model improves the overall prediction accuracy by replacing the prediction of the original residual term with that of the fused mode components, thereby reducing the influence of the residual term on the prediction accuracy. The specific workflow is illustrated in Figure 2.

The specific prediction process of the DVMD-LSTM model is as follows:

Step 1: Preprocess the GNSS time series data by removing outliers, performing interpolation, and applying other data-preprocessing techniques. Then, input the preprocessed data into the Variational Mode Decomposition (VMD) for decomposition.
Step 2: Further decompose the residual component "r1" obtained from the VMD into individual modal components and another residual "r2" through a second round of VMD.

Step 3: Add up the modal components obtained from the VMD decomposition of the residual component "r1" to form the fused Intrinsic Mode Function (Fuse-IMF). Use the Fuse-IMF as a feature for prediction in the LSTM model.
Step 4: Use the individual modal components obtained from the VMD decomposition of the original GNSS time series as features and input them separately into the LSTM model for prediction. Obtain K prediction results, where K is the number of modal components.
Step 5: Add the K prediction results obtained in Step 4 to the prediction result of the Fuse-IMF to obtain the final prediction result of the DVMD-LSTM model.
Step 6: Calculate the RMSE and MAE of the prediction results and use them to evaluate the performance of the model under different noise models.
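The data flow of Steps 1 to 5 can be sketched end to end. The sketch below uses stand-ins so it runs self-contained: an FFT band split in place of VMD and a persistence forecast in place of the per-component LSTM; all function names are ours and only the pipeline structure mirrors the paper:

```python
import numpy as np

def band_decompose(y, K, cut=0.25):
    """Stand-in for VMD (illustrative only): split the lowest `cut` fraction
    of the spectrum into K band components; the leftover is the residual."""
    Y = np.fft.rfft(y)
    edges = np.linspace(0, int(cut * len(Y)), K + 1).astype(int)
    imfs = []
    for a, b in zip(edges[:-1], edges[1:]):
        Z = np.zeros_like(Y)
        Z[a:b] = Y[a:b]
        imfs.append(np.fft.irfft(Z, n=len(y)))
    imfs = np.array(imfs)
    return imfs, y - imfs.sum(axis=0)          # components and residual r

def persistence_forecast(series, horizon):
    """Stand-in for the per-component LSTM predictor: repeat the last value."""
    return np.full(horizon, series[-1])

def dvmd_style_predict(y, horizon, K=4):
    imfs, r1 = band_decompose(y, K)                  # Step 1: first decomposition
    sub_imfs, _r2 = band_decompose(r1, K, cut=1.0)   # Step 2: decompose r1 again
    fuse_imf = sub_imfs.sum(axis=0)                  # Step 3: fused IMF from r1
    preds = [persistence_forecast(c, horizon) for c in imfs]  # Step 4: K forecasts
    preds.append(persistence_forecast(fuse_imf, horizon))     # Step 5: add Fuse-IMF
    return np.sum(preds, axis=0)                     # summed final prediction
```

The key structural point is visible in `dvmd_style_predict`: the residual of the first decomposition is not forecast raw but re-decomposed, fused, and only then predicted.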

Precision Evaluation Index
To evaluate the prediction accuracy and noise characteristics of the hybrid models, this study employs the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), and the coefficient of determination (R2) as evaluation metrics for model prediction accuracy [79,80]. Additionally, the Bayesian information criterion (BIC_tp) is used to determine the optimal noise model for the original GNSS time series and for the predicted time series under each model, in order to determine whether the prediction results account for colored noise [81][82][83]. The three evaluation metrics are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{1}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \overline{y}\right)^2} \tag{3}$$

In the above equations, $y_i$ represents the actual GNSS data values, $\overline{y}$ the mean of the actual GNSS data values, $\hat{y}_i$ the predicted results of each model, and $n$ the number of GNSS data points. Smaller values of RMSE and MAE indicate higher model prediction accuracy, while larger values indicate lower prediction accuracy. The coefficient of determination (R2) ranges between 0 and 1: when R2 is close to 1, the prediction model explains the variability of the dependent variable well; when R2 is close to 0, the explanatory power of the prediction model is weak.
To provide a visual assessment of the improvement achieved by the hybrid models on each evaluation metric, this study introduces the Improvement Ratio (I) to quantify the magnitude of improvement in each accuracy evaluation metric. By calculating the I value, the degree of improvement in accuracy achieved by the hybrid model can be accurately determined. The Improvement Ratio is calculated as follows:

$$I_{y,\hat{y}} = \frac{\left|y - \hat{y}\right|}{y} \times 100\% \tag{4}$$

In the above equation, $y$ and $\hat{y}$ denote the values of an accuracy evaluation metric, such as the RMSE: $y$ is the metric for the initial model's predictions, while $\hat{y}$ is the metric for the hybrid model's predictions. A larger value of $I_{y,\hat{y}}$ indicates a greater improvement in the evaluation metric achieved by the hybrid model, and vice versa.
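The metrics above translate directly into NumPy; a small sketch with our own helper names (the improvement helper is written for error metrics, where a reduction is an improvement):

```python
import numpy as np

def rmse(y, yhat):
    """Root Mean Square Error, Equation (1)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return np.sqrt(np.mean((y - yhat) ** 2))

def mae(y, yhat):
    """Mean Absolute Error, Equation (2)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return np.mean(np.abs(y - yhat))

def r2(y, yhat):
    """Coefficient of determination, Equation (3)."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def improvement(metric_before, metric_after):
    """Improvement Ratio I of Equation (4), in percent; positive when the
    hybrid model reduces an error metric such as RMSE or MAE."""
    return (metric_before - metric_after) / metric_before * 100.0
```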

Data Sources
In this work, the daily coordinate time series (E, N, and U) of eight GNSS stations from the Extended Solid Earth Science ESDR System (ES3) were selected for the experiment. The daily loosely constrained GNSS solutions from GAMIT and GIPSY were combined using the Quasi-Observation Combination Analysis (QOCA) software [62]. The information for each station is presented in Table 1, and the distribution of the stations is depicted in Figure 3; see Appendix A for details of the data fluctuations. To reduce the impact of missing data on noise model estimation and on the prediction results, the following principles were followed in station selection: (1) the selected station coordinate time series must contain data from 2000 to 2022 to ensure the consistency of the experiment and obtain reliable velocity parameter estimates; (2) within the 2000 to 2022 time range, the average missing-data rate of the selected stations should not exceed 5% to ensure the reliability of the prediction experiment; (3) to reduce the impact of inter-regional correlation on the repeatability of the velocity parameters and on noise modeling, the selected stations should be evenly distributed.

Data Preprocessing

For data preprocessing, this study employed the Hector software to remove outliers and detect step discontinuities in the raw data [84,85]. After identification, the step discontinuities were corrected using least squares fitting. The corrected data were then interpolated using the Regularized Expectation Maximization (RegEM) algorithm [86,87]. This method combines the Expectation Maximization (EM) algorithm with regularization techniques to maximize the likelihood function while accounting for model smoothness and noise reduction, and it handles the interpolation of missing data effectively [88,89]. Due to space limitations, only the interpolation results for the GBOS station, which has the highest missing-data rate, are compared for the E, N, and U components in Figure 4. As the figure shows, the RegEM method not only produces good interpolation results for scattered missing data but also maintains the trend of the sequence well across long continuous gaps. It thus overcomes the poor performance of linear interpolation at locations with continuous missing data and provides high-quality continuous time series for the subsequent experiments.
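RegEM itself is involved, but the idea of model-based gap filling, as opposed to linear interpolation, can be illustrated with a much simpler least-squares fit of a trend plus annual and semiannual harmonics, a common GNSS functional model. This is an illustrative stand-in of our own, not RegEM:

```python
import numpy as np

def harmonic_fill(t, y, observed):
    """Fill gaps by a least-squares fit of offset + trend + annual and
    semiannual harmonics, evaluated at the missing epochs.
    t: time in years; observed: boolean mask of available epochs."""
    A = np.column_stack([
        np.ones_like(t), t,                             # offset and linear trend
        np.sin(2 * np.pi * t), np.cos(2 * np.pi * t),   # annual signal
        np.sin(4 * np.pi * t), np.cos(4 * np.pi * t),   # semiannual signal
    ])
    coef, *_ = np.linalg.lstsq(A[observed], y[observed], rcond=None)
    filled = y.copy()
    filled[~observed] = A[~observed] @ coef
    return filled
```

Unlike linear interpolation, the fitted model carries the trend and seasonal cycle across a long continuous gap, which is the behavior the figure illustrates for RegEM.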

VMD Parameter Discussion
When performing data decomposition using VMD, the selection of an appropriate number of mode components K is crucial for achieving high-quality decomposition results. An excessively large K may lead to over-decomposition, while a small K may result in under-decomposition of the data. To determine the optimal K value for the E, N, and U time series of the different stations, this study compares the signal-to-noise ratio (SNR) of the decomposed data to evaluate the quality of the decomposition results. A higher SNR indicates cleaner signal decomposition and a better denoising effect.

Through extensive experiments, and based on empirical rules, this study restricts the K value to a range of 2 to 10 and selects the K value within this range that yields the highest SNR as the optimal K value for each time series [90,91]. The SNR is defined as:

$$\mathrm{SNR} = 10 \lg \frac{\sum_{i} f(i)^2}{\sum_{i} \left[ f(i) - g(i) \right]^2}$$

where $f(i)$ represents the original signal and $g(i)$ the reconstructed signal. The choice of the penalty factor α also affects the VMD decomposition results; considering that a penalty factor of approximately 1.5 times the length of the decomposed data is optimal [92], and to ensure experimental consistency, a penalty factor of 10,000 was set for all decompositions in this study. The selected K values for the three directions at each station are shown in Table 2.
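The SNR-based selection of K reduces to a few lines; a sketch with our own helper names, where `reconstruct` stands in for any K-mode decomposition front end such as VMD:

```python
import numpy as np

def snr_db(f, g):
    """Signal-to-noise ratio (dB) of a reconstruction g against the original f."""
    f, g = np.asarray(f, float), np.asarray(g, float)
    return 10.0 * np.log10(np.sum(f ** 2) / np.sum((f - g) ** 2))

def select_k(f, reconstruct, k_range=range(2, 11)):
    """Return the K in [2, 10] whose reconstruction has the highest SNR.
    `reconstruct(f, K)` is any callable returning the signal rebuilt from a
    K-mode decomposition."""
    return max(k_range, key=lambda K: snr_db(f, reconstruct(f, K)))
```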

DVMD-LSTM Prediction Results Analysis
To ensure experimental fairness and consistency, all deep learning models in this paper use the same dataset partition: a training set (2000.0 to 2011.9), a validation set (2012.0 to 2014.9), and a test set (2015.0 to 2022.9). The training set was used to train the model parameters and learn the data features. The validation set was used to fine-tune the models' hyperparameters and evaluate their performance. The test set was used for the final evaluation of each model's performance, to assess its effectiveness in practical applications. This partitioning scheme ensures that the models have sufficient training data to fully learn the data features. Additionally, by obtaining sufficient prediction results on the test set, the optimal noise model of the predictions could be evaluated.
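The chronological split can be expressed as boolean masks over the decimal-year epochs; a small sketch (helper name ours, cut points taken as the 2012.0 and 2015.0 boundaries implied by the text):

```python
import numpy as np

def split_by_year(t, y, train_end=2012.0, val_end=2015.0):
    """Chronological split of a (decimal-year, value) series into
    training, validation and test sets."""
    t, y = np.asarray(t), np.asarray(y)
    train = y[t < train_end]
    val = y[(t >= train_end) & (t < val_end)]
    test = y[t >= val_end]
    return train, val, test
```

A chronological split, rather than a random one, is essential here: shuffling daily coordinates would leak future information into training and inflate the apparent accuracy.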
To visually demonstrate the differences in the prediction results between the DVMD-LSTM model and the VMD-LSTM model, this study compares the predictions of the decomposed IMFs and residual terms under the two hybrid models. Due to space limitations, this paper only presents the prediction results of the IMFs and residual terms in the U direction at the SEDR station; see Figure 5 for details.
From Figure 5, it can be observed that both the VMD-LSTM and DVMD-LSTM models yield good prediction results for each IMF component. However, because the residual terms lack apparent regularity, the VMD-LSTM model struggles to capture their fluctuation characteristics effectively, resulting in lower prediction accuracy and, in turn, degrading the overall prediction performance of the VMD-LSTM model. To address this issue, the proposed DVMD-LSTM model performs a secondary VMD decomposition on the residual terms obtained after the first VMD decomposition, further extracting the fluctuation information within the residual terms and significantly improving the prediction accuracy. To investigate whether performing additional VMD decompositions can further enhance accuracy, the residual terms after the second decomposition were analyzed and found to lack noticeable fluctuation characteristics. When these results were incorporated into the model, no significant improvement was observed; some stations even exhibited a decrease in prediction accuracy. This indicates that increasing the number of decompositions of the residual terms does not necessarily enhance the prediction accuracy of the model. Therefore, in this study, the data after the secondary VMD decomposition were used as the feature input for the subsequent deep learning experiments.

DVMD-LSTM Model Prediction Results and Precision Analysis
To compare the improvement in predictive accuracy of the DVMD-LSTM and VMD-LSTM models over the LSTM model under different fluctuation amplitudes, this study conducted experiments using datasets from different stations in three directions. To better distinguish the prediction results, this study analyzed the prediction error R, defined as the difference between the true values and the predicted results. Due to space limitations, this section only presents the prediction results of the SEDR station in three directions for the different models, as shown in Figure 6. From Figure 6, it can be observed that, as the fluctuation amplitude of the original data increases, the prediction errors of the different models also increase to varying degrees, with the largest errors observed in the U direction. Compared to the LSTM model, the VMD-LSTM hybrid model better captures the fluctuation trends and amplitudes of the true values and exhibits smaller variations and extremes in the prediction error R. This indicates that, after VMD decomposition, the VMD-LSTM model captures the inherent fluctuation characteristics of the initial data more effectively, leading to more accurate predictions. The VMD-LSTM and DVMD-LSTM models exhibit similar prediction fluctuations and trends; however, the DVMD-LSTM model has smaller prediction errors R. This suggests that the DVMD-LSTM model not only retains the advantages of the VMD-LSTM model in predicting fluctuation trends and amplitudes but also achieves higher prediction accuracy.
To analyze the applicability and robustness of the DVMD-LSTM model, this study conducted predictions using the LSTM, VMD-LSTM, and DVMD-LSTM models in the E, N, and U directions for each GNSS station. The prediction accuracy and improvement achieved by each model are summarized in Table 3, where "I" represents the degree of accuracy improvement of the hybrid model over the single LSTM model under the different accuracy indexes. From the results in Table 3, it can be observed that the VMD-LSTM model outperforms the LSTM model, with average RMSE reductions of 19.77% in the E direction, 26.83% in the N direction, and 19.31% in the U direction. The VMD-LSTM model demonstrates average MAE reductions of 20.31%, 27.12%, and 19.48% in the E, N, and U directions, respectively, and average R² increases of 43.66%, 43.47%, and 44.54%. The experimental results indicate that the VMD-LSTM model significantly improves prediction accuracy compared to the standalone LSTM model. The gains in R² vary from station to station and are most prominent at stations where the LSTM model had lower R² values, suggesting that the VMD-LSTM model exhibits better explanatory power and produces predictions that more closely match the observed values.
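The accuracy indexes used throughout Table 3 can be computed as follows; this is a generic sketch with invented sample numbers, not the authors' evaluation code:

```python
import numpy as np

def rmse(y, yhat):
    # Root-mean-square error (same units as the series, here mm).
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    # Mean absolute error.
    return float(np.mean(np.abs(y - yhat)))

def r2(y, yhat):
    # Coefficient of determination.
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def improvement(baseline, hybrid):
    # "I": percentage reduction relative to the single-LSTM baseline;
    # for R², where larger is better, the sign is flipped instead.
    return 100.0 * (baseline - hybrid) / baseline

# Invented example: reducing RMSE from 4.0 mm to 3.0 mm gives I = 25%.
print(improvement(4.0, 3.0))  # 25.0
```

Averaging these per-station values over all stations in one direction yields the direction-wise percentages quoted above.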
Compared to the VMD-LSTM model, the DVMD-LSTM model demonstrates average RMSE reductions of 9.71% in the E direction, 8.84% in the N direction, and 11.02% in the U direction; average MAE reductions of 9.17%, 8.55%, and 10.61% in the E, N, and U directions, respectively; and average R² increases of 20.68%, 12.18%, and 21.03%. The overall average R² reaches 0.78, indicating a strong correlation between the DVMD-LSTM model's predictions and the original data, along with improved fitting performance. It can be concluded that the DVMD-LSTM model achieves a significant improvement in accuracy over the VMD-LSTM model, with particularly notable gains in R². The DVMD-LSTM model shows the greatest improvement in the U direction, suggesting that it performs better for time series with larger fluctuations. This is because, for such series, the residual terms obtained after VMD decomposition are larger and contain more fluctuation characteristics.
In summary, the DVMD-LSTM model preserves the advantages of the VMD-LSTM model in predicting fluctuation trends and frequencies while achieving higher prediction accuracy. The predictions conducted across the different directional components of the various stations further validate the superiority of the proposed model. These experimental findings confirm the model's applicability and robustness, demonstrating its potential for broad use in high-precision time series forecasting.

Comparison of Optimal Noise Models under Each Prediction Model
To further investigate whether the DVMD-LSTM model adequately considers the noise characteristics of different datasets during prediction, we note that scholars at home and abroad currently regard flicker noise + white noise (FN + WN), together with a small amount of random walk noise + flicker noise (RW + FN), as the optimal stochastic models for the noise characteristics of GPS coordinate time series [93][94][95][96][97]. In addition, some scholars have proposed that part of the noise in GPS coordinate time series can be represented by power law noise (PL) and the Gauss-Markov model (GGM) [98][99][100]. This paper takes GNSS reference stations with the same time span in North America as the research object. Four combined noise models, namely random walk noise + flicker noise + white noise (RW + FN + WN), flicker noise + white noise (FN + WN), power law noise + white noise (PL + WN), and Gauss-Markov + white noise (GGM + WN), were used to analyze the training and test set data of each station. Finally, eight stations sharing the same optimal noise model were selected as the experimental data, and the optimal noise model of each prediction model's results was calculated for each station. The specific results are shown in Table 4.
According to Table 4, the optimal noise models differ among stations, indicating inconsistent noise characteristics. The LSTM model exhibits significant differences between its prediction results and the optimal noise models of the original data, with an average accuracy of only 25% across the three directions, and its predominant optimal noise models are PL + WN and GGM + WN. This suggests that the LSTM model does not adequately consider the inherent noise characteristics of GNSS time series during prediction. In contrast, the VMD-LSTM model shows improved accuracy in capturing the optimal noise models, with an average accuracy of 42.67%. This indicates that VMD decomposition effectively captures the noise characteristics within the IMF components; however, the noise characteristics in the residual component r are not fully captured, resulting in relatively lower overall accuracy. The proposed DVMD-LSTM model therefore further extracts the noise characteristics in the residual component r by performing VMD decomposition once again, achieving an average accuracy of 79.17% in capturing the optimal noise models. In summary, the DVMD-LSTM model adequately considers the noise characteristics of the data during prediction by processing both the original data and the decomposed residual components.
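The noise-model accuracy quoted above is simply the fraction of series for which the model fitted to the predictions matches the optimal noise model of the original series. A hypothetical sketch for one direction (the station-by-station labels are invented, not the values behind Table 4):

```python
# Optimal noise model fitted to the original series vs. to one prediction
# model's output, for eight hypothetical stations in one direction.
original  = ["FN+WN", "FN+WN", "RW+FN+WN", "PL+WN",
             "FN+WN", "GGM+WN", "FN+WN", "PL+WN"]
predicted = ["FN+WN", "PL+WN", "RW+FN+WN", "PL+WN",
             "FN+WN", "GGM+WN", "FN+WN", "GGM+WN"]

# Accuracy = share of stations whose optimal noise model is reproduced.
matches = sum(o == p for o, p in zip(original, predicted))
accuracy = 100.0 * matches / len(original)
print(f"{matches}/{len(original)} stations match -> {accuracy:.1f}%")  # 6/8 stations match -> 75.0%
```

Averaging this fraction over the E, N, and U directions gives the per-model averages reported in the text.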

Velocity Estimation Impact Analysis
To further investigate the quality of the prediction results from each deep learning model, this study first used these models to predict the original data, and the optimal noise model and corresponding velocities were computed for each model's predictions. These velocities were then compared with those obtained by fitting the optimal noise model to the original data using the Hector software [84,85]. By calculating the absolute error between each model's predicted velocities and the original velocities at the different stations, the average absolute velocity error of each deep learning model could be obtained. Finally, comparing these average absolute errors allows the quality of each model's prediction results to be assessed. The velocities computed from the prediction results of each deep learning model under the optimal noise model at the different stations are shown in Table 5.
According to Table 5, the average absolute error between the velocities predicted by the LSTM model and those of the original data is 0.068 mm/year in the E direction, 0.093 mm/year in the N direction, and 0.078 mm/year in the U direction. For the VMD-LSTM model, the corresponding errors are 0.031 mm/year, 0.060 mm/year, and 0.060 mm/year, and for the DVMD-LSTM model they are 0.016 mm/year, 0.042 mm/year, and 0.047 mm/year. Compared to the LSTM model, the VMD-LSTM model shows an average improvement of 37.67% in velocity prediction accuracy, while the DVMD-LSTM model demonstrates an average improvement of 56.80%; compared with the VMD-LSTM model, the velocity prediction accuracy of the DVMD-LSTM model is improved by 33.02% on average. Both hybrid models thus improve velocity prediction accuracy over the LSTM model, with the DVMD-LSTM model showing the greater improvement, further demonstrating its outstanding predictive performance. In summary, this study evaluated the prediction models by analyzing their prediction accuracy, optimal noise models, and velocity results; the DVMD-LSTM model outperforms the others in all these aspects, highlighting its potential for widely applicable high-precision prediction of time series with diverse noise characteristics.
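The velocity comparison above reduces to a mean absolute error between fitted velocities. A sketch with invented numbers (in the paper, the velocities themselves come from Hector fits under the optimal noise model, not from this script):

```python
import numpy as np

# Invented station velocities (mm/year) under the optimal noise model.
v_original = np.array([1.20, -0.85, 2.10])
v_lstm     = np.array([1.28, -0.95, 2.02])  # hypothetical LSTM-based fits
v_dvmd     = np.array([1.22, -0.88, 2.07])  # hypothetical DVMD-LSTM-based fits

# Average absolute velocity error of each model against the original data.
mae_lstm = np.mean(np.abs(v_lstm - v_original))  # ~0.087 mm/year
mae_dvmd = np.mean(np.abs(v_dvmd - v_original))  # ~0.027 mm/year

# Relative improvement of DVMD-LSTM over LSTM, analogous to the
# percentages reported alongside Table 5.
gain = 100.0 * (mae_lstm - mae_dvmd) / mae_lstm  # ~69.2 %
```
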

Conclusions
Addressing the limitations of low prediction accuracy and inadequate consideration of noise characteristics in the VMD-LSTM model for time series forecasting, this paper proposes a high-precision GNSS time series prediction method based on DVMD and LSTM. The proposed method is comprehensively validated and tested on the daily time series data from eight North American regional GNSS stations, spanning the period from 2000 to 2022, in the E, N, and U directions. The experimental results demonstrate the following:

On this basis, a quadratic penalty factor α and a Lagrange multiplier λ(t) are introduced to transform the constrained problem into an unconstrained variational problem. The extended Lagrange expression is as follows:

$$
\begin{aligned}
L(\{u_k\},\{\omega_k\},\lambda) = {} & \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \\
& + \left\| f(t) - \sum_{k} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\; f(t) - \sum_{k} u_k(t) \right\rangle
\end{aligned} \tag{4}
$$

Figure 3. Distribution map of each GNSS station.

Figure 5. Prediction results of each IMF and residual term under different models after VMD decomposition in the U direction of the SEDR station (the black curves represent the original data and the IMF components and residual terms obtained from VMD decomposition; the red curves represent the prediction results of the IMF components using the DVMD-LSTM and VMD-LSTM models; the blue curve represents the prediction results of the residual term using the VMD-LSTM model; and the green curve represents the prediction results of the residual term using the DVMD-LSTM model).

Figure 6. Comparison of prediction results and prediction error R in three directions of the SEDR station under different models (sub-figures (a-c) show the prediction results of each model and sub-figures (d-f) compare the prediction error R of each model).

Figure A4. FOOT station data distribution.

Table 1. Information of each GNSS station.

Table 2. Results of K value selection in three directions at each site.

Table 3. Comparison of the prediction results of each GNSS station in the three directions of E, N, and U under different models (RMSE and MAE are given in mm).

Table 4. The optimal noise model of each station under different models in the three directions of E, N, and U.

Table 5. Velocity values obtained by each station under the optimal noise model.