Analysis of Noise and Velocity in GNSS EPN-Repro 2 Time Series

This study evaluates the EUREF Permanent Network (EPN) station position time series of approximately 200 GNSS stations subject to the Repro 2 reprocessing campaign in order to characterize the dominant types of noise and amplitude and their impact on estimated velocity values and associated uncertainties. The visual inspection on how different noise model represents the analysed data was done using the power spectral density of the residuals and the estimated noise model and it is coherent with the calculated Allan deviation (ADEV)-white and flicker noise. The velocities resulted from the dominant noise model are compared to the velocity obtained by using the Median Interannual Difference Adjusted for Skewness (MIDAS). The results show that only 3 stations present a dominant random walk noise model compared to flicker and powerlaw noise model for the horizontal and vertical components. We concluded that the velocities for the horizontal and vertical component show similar values in the case of MIDAS and maximum likelihood estimation (MLE), but we also found that the associated uncertainties from MIDAS are higher compared to the uncertainties from MLE. Additionally, we concluded that there is a spatial correlation in noise amplitude, and also regarding the differences in velocity uncertainties for the Up component.


Introduction
Earth's surface dynamics can be studied with unprecedented precision and accuracy with the help of Global Navigation Satellite System (GNSS) [1,2], but the system can also be used for various studies and different purposes throughout various scientific areas, such as the determination of the terrestrial reference frame [3], study of water vapour and precipitable water in the atmosphere [4][5][6][7][8][9], ionospheric studies [10][11][12], and many other potential applications based on the analysis of GNSS data. All these applications rely on high quality GNSS data, so obtaining these is possible due to efforts made in recent years by countries who are working to put in operation and/or maintain Continuously Operating Reference Station (CORS) networks consisting of GNSS stations.
We are able to derive the stations' velocity by analysing the GNSS coordinates time series from GNSS CORS networks, which can be used as an input to geophysical models [13,14] used for plate boundary dynamics, postglacial rebound, surface mass loadings, and global sea level change. It is worth noting that the velocity must be computed with a realistic uncertainty.
The determination of crustal velocity represents an important factor in understanding the crustal deformation in any region around the globe [2,15], but for a reliable GNSS station velocity determination we need large amounts of continuous data at a long time

Mathematical Models
The form of the power spectrum P x , that describes many types of geophysical data, whose behaviour in the time domain denoted by x(t) given by [32,33] is: where f is the spatial or temporal frequency, P 0 and f 0 are normalizing constants and k is the spectral index. Typically, the spectral index, k, lies within the range −3 to 1 [33]. The process within this range is subdivided into "fractional Brownian motion" with −3 < k < −1 and "fractional white noise" with −1 < k < 1 [34,35]. Within this stochastic model, special cases of the integer values occur. At k = 0 we are dealing with classical white noise, at k = −1 we are subject to flicker noise, and at k = −2 we have Brownian motion-the so-called "random walk". To refer to power law processes that differ from classical white noise, we will use the term-coloured noise. An important property of the white noise is that it is not correlated in time, whereas the coloured noise is correlated in time.
Taking the research done by [36] to be able to estimate the noise components and the parameters of the linear function, the likelihood l from a set of observation x has to be maximized. If we are assuming that the distribution is Gaussian, the likelihood is: where: • det represents the determinant of the matrix, • C represents the covariance matrix of the assumed noise in the data, • N is the number of epochs and, •v is the postfit residuals of the linear function using weighted least squares with the same covariance matrix C.
In terms of stability, the logarithm of the likelihood must be maximized or minimize the negative: since the maximum is unaffected by the monotonic transformations. The typical model is composed by an intercept, a linear trend (velocity), sinusoidal terms represented by annual and semi-annual signal, terms for offsets, and in the case of a large co-seismic event, a term to describe the post-seismic motion [37]. The covariance matrix C can correspond to different stochastic noise, such as white noise, power law, moving average, autoregressive, first-order Gauss Markov, band pass, and combinations between these different types of noises. If we assume that we are dealing with a white noise component and power law, then the matrix is: where a w and b k represents the white, respectively, the power law noise amplitudes, I is the n × n identity matrix and J k is the power law covariance matrix with the spectral index k.
Although the Equation (4) presents only two types of noise, the time series may contain more types of noise, and sometimes power law noise might not be one of them [23]. By fitting a straight line through a series of n points (x i ) taken at time moment t i , which represents the basic linear regression problem, the determination of rate uncertainties is given by: where ε x (t i ) represents the error term. Assuming that ε x (t i ) is subjected to linear combination of independent random variables and it is identically distributed, ∝ (t i ), and a sequence of temporally correlated random variables, β(t i ), will result in: The amplitude of white noise is represented by the scalar factor a, and b k =0 is the scale factor of coloured noise of spectral index k.
Assuming that the covariance matrix presents time-dependent positions and the station monuments are subjected to a random walk process, then the formal uncertainty of the estimated velocities is approximated by [19,20], 30]: where b represents the random walk noise amplitude, N the number of measurements and T is the total time span. Equation (7) expresses that in the presence of heavily correlated time series, velocity uncertainties are significantly influenced with respect to those from uncorrelated time series. Increasing the observation span by adding more (correlated) positions barely reduced the formal rate uncertainty. If the observation span is kept constant, then changing the sampling interval does not affect, at all, the estimated formal uncertainties. The Allan variance is defined for a finite dataset as follows [38]: where y i is the ith of M fractional frequency averaged over the sampling interval (τ). In terms of the phase data, it can be expressed as: where x i is the ith of the N = M + 1 phase values spaced by the measurement interval τ.
There are many reasons to understand the noise that is contained in GNSS time series, and this is important for geodetic and geophysical applications. One of the most important features is that by understanding the type of noise, we can estimate realistic site velocity and uncertainties.
A variant of the Theil-Sen median trend estimator is the Median Interannual Difference Adjusted for Skewness (MIDAS), and for this the standard version of the median of slopes will be: computed between all data pairs where i > j. For normally distributed data, Theil-Sen and least-squares trend estimates are statistically identical, but unlike least squares, Theil-Sen is resistant to undetected data problems [31]. Non-stationarity of the signal may be verified using time-frequency analysis. In this work we used the CWT (Continuous Wavelet Transform) method, which is commonly used in the analysis of geophysical phenomena, and provides results with equal variance in time and frequency. In the analysed signal, the amplitude and frequency can change, and higher harmonic components may be present due to varying environmental conditions in the vicinity of the GNSS antenna, e.g., changing of the measurement equipment at the station, or variations of a geodynamic nature. CWT allows for the examined signal to be presented in the form of a scalogram-a time-frequency spectrum which allows for the identification of both frequencies present in the signal and their location in time. CWT is defined as a convolution of the analysed signal s(t) with scaled and shifted wavelet function ψ(t) [39]: where a represents the scale, b denotes the shift, ψ represents the mother wavelet function and * (asterisk) denotes the complex conjugate. The normalisation factor 1/ √ a is used to allow direct comparability of the transformation for all scales. In our analysis, the Morlet wavelet was used to calculate the time-frequency spectrum: where, f denotes the centre frequency and f b represents the bandwidth.
The centre frequency was taken as f = 1.2 Hz, and the bandwidth parameter as f b = 2.0 for all the coordinate time series analysed. The values of these two parameters were selected empirically to set a proper balance between frequency and time resolution.

Results
The analysed data is from EUREF Permanent Network (EPN) Repro2 campaign. The aforementioned data was analysed by several analysis centres (AC), where the daily/weekly solutions of each AC were combined into one solution. The combined solutions were obtained using Bernese GPS software version 5.2., with the reference frame being IGb08.
The data was analysed in terms of type of noise, amplitude of noise, velocity, and their associated uncertainty. For the velocity and associated uncertainty, we have used two different approaches: (a) using maximum likelihood estimation (MLE) and (b) using the Median Interannual Difference Adjusted for Skewness (MIDAS). For the MLE we have used 3 different noise modes: (a) Powerlaw+White noise model-we will refer to it throughout the article by just simply calling it Powerlaw noise, (b) Flicker+White noise model-we will refer to it by Flicker noise, and (c) Random Walk+White noise model. By using the MLE, we obtained the log likelihood value, which is an indicator of the degree of coherence between the original data and a priori chosen noise model, thus giving us the most probable candidate for the best noise model. The noise model was chosen based on the highest log value-maximum log likelihood value (MLL), but also by using Akaike and Bayesian information criteria (AIC/BIC) [30,40,41]. The verification of the chosen noise models' behaviour was done by using the power spectra density from the generated residuals, after subtracting the least squares which estimated the linear motion [42] showed that the confidence level of rejecting another noise model with 95% confidence is δMLL > 2.8, [43] considered to be δMLL > 1.92, whereas [44] suggested that a value of δMLL between 2.9 and 3 for one degree of freedom difference at a 95% confidence level. In our analysis, we have considered a value higher than 3.0 for δMLL for accepting a specific noise model. The software that was used for noise analysis was Hector [30].
In the case of MLE estimation, the annual and semi-annual signal was taken into consideration due to its ability to significantly bias the velocity estimates.
Upon analysing the data using the Powerlaw+White noise model, the results revealed that the majority of stations had a spectral index close to 1 for all three components, which indicates that we are dealing with Flicker noise. This was further confirmed by the fact that when the noise model was imposed without indicating the spectral index, the log likelihood value in the case of the imposed flicker noise was higher than all other noise model values.

Noise Analysis
In this subsection, an analysis for the dominant type of noise was carried out for the entire EPN Repro-2 network. The validation was done by analysing the values of the log-likelihood values, Akaike and Bayesian information criterions (AIC and BIC) [30,40,41], but also using the power spectrum, which describes the goodness of fit of the model. For the log-likelihood, we chose the maximum values, whereas for AIC and BIC criteria we chose the lowest values, which indicates the best fit with the observations. From all the analysed GNSS stations, just three of them presented a higher value of the MLL for the randomwalk noise model compared to flicker noise model. These GNSS stations are presented in Figure 1. As presented in the figure above, only three stations (LEON, MOPS, and TUBI) experienced a dominant randomwalk noise model, whereas all the other stations are subjected to flicker noise model; the criteria used to emphasise the noise model was δMLL > 3. To validate those three stations, we have used the power spectra analysis, and we have fitted both noise models on the computed spectrum for the observations. The results are presented in Figure 2. Figure 2 illustrates the power spectra density for the stations LEON, MOPS, and TUBI, the ones that experienced a dominant random walk noise model. We can note that, on both horizontal components-East and North, the random walk noise model fits better the data than the flicker noise model. For the Up component, we can observe that the flicker noise model fits better the data compared to the random walk noise model.
For the Up component, there are also three stations that have δMLL > 3 in favour of the randomwalk noise mode when compared to the flicker noise model. These stations are in the Northern part of Europe-two of them are in Finland and one in Sweden, none of them being the stations mentioned in the earlier comparison. The results are presented in Figure 3. Figure 3 shows the three stations-KIR0, SODA, and VAAS, as having a dominant randomwalk noise model based on the δMLL. These stations were also analysed in terms of PSD to confirm the results based on log-likelihood value, AIC and BIC. The results are presented in Figure 4. Figure 4 reveals that these stations are better fitted by a random walk noise model, especially for the Up component, but we can observe that for the horizontal component-East and North, especially for stations KIR0 and SODA, there is an intrusive randomwalk noise component.

Noise Amplitude
We continued the analysis of the EPN-Repro-2 network to assess the amplitude of the noise for each station. The results for the East component are presented in Figure 5.   The amplitude of the noise for the North component is presented in Figure 6.  The amplitude of the noise for the North component is presented in Figure 6. Figure 6 illustrates that the average amplitude of noise for the North component is between 2.00 mm and 6.00 mm, with a maximum of 6.18 mm. The stations with amplitude higher than 6.00 mm are stations MORP that is in Cockle Park, Morpeth, United Kingdom, and SMLA station, located in Smila, Cherkasy Oblast, Ukraine. The stations that have an amplitude between 5.0 mm and 6.0 mm are BISK, BOLG, KIRU, MEDI, SVTL, TRO1, and UNTR.
The amplitude of the noise regarding the Up component is presented in Figure 7.
Remote Sens. 2021, 13, x FOR PEER REVIEW 11 of 23 Figure 6. Noise amplitude for the North component. Figure 6 illustrates that the average amplitude of noise for the North component is between 2.00 mm and 6.00 mm, with a maximum of 6.18 mm. The stations with amplitude higher than 6.00 mm are stations MORP that is in Cockle Park, Morpeth, United Kingdom, and SMLA station, located in Smila, Cherkasy Oblast, Ukraine. The stations that have an amplitude between 5.0 mm and 6.0 mm are BISK, BOLG, KIRU, MEDI, SVTL, TRO1, and UNTR.
The amplitude of the noise regarding the Up component is presented in Figure 7.   Figure 7 reveals that the noise amplitude for the Up component is between 6.00 mm and 24.00 mm, peaking at a value of 24.50 mm for stations MOPI, located in Modra, Slovakia, and TRF2 station from Gemeinde, Muggendorf, Austria. Both stations are located in Central Europe. The stations that have a noise amplitude between 20.00 mm and 24.00 mm are BOLG and NOA1, and SVTL.
ADEV for East, North, and Up components, respectively, for all of analysed stations are presented in Figure 8. The black dotted line and black dashed line represents the type of noise that is most suitable for ADEV results, which, in our case, are FPM (Flicker Phase Modulation) noise and WFM (White Frequency Modulation). All analysed stations' coordinates time series are the closest to the FPM model, for the all analysed averaging intervals (τ). The second closest noise type to the analysed data is WFM, however, only to a very small extent. Additionally, the charts presented in Figure 8 show a very high consistency of the analysed data with respect to each component (East, North, and Up). This is due to the high-quality EPN reprocessing, and also because of the location of all stations being on one continental plate.  ADEV for East, North, and Up components, respectively, for all of analysed stations are presented in Figure 8. The black dotted line and black dashed line represents the type of noise that is most suitable for ADEV results, which, in our case, are FPM (Flicker Phase Modulation) noise and WFM (White Frequency Modulation). All analysed stations' coordinates time series are the closest to the FPM model, for the all analysed averaging intervals (τ). The second closest noise type to the analysed data is WFM, however, only to a very small extent. Additionally, the charts presented in Figure 8 show a very high consistency of the analysed data with respect to each component (East, North, and Up). This is due to the high-quality EPN reprocessing, and also because of the location of all stations being on one continental plate. Multidimensional ADEV (MADEV) for each of the directions are shown in Figure 9. For a better interpretation, scales for horizontal coordinates, which are the same at~0.5 mm to 2.5 mm are used, which strictly corresponds with this coordinate's accuracy (for daily solutions, the value is 1-2 mm). The height scale is over-exaggerated (~1-10 mm) and it is also connected with Up coordinate accuracy (3-6 mm). MADEV confirms very high cohesion of the time series within a single coordinate component.

Velocity, Uncertainty and Stationarity Analysis
In the next stage of our analysis, velocity of the stations and their associated uncertainty are computed using MLE and MIDAS. The velocity is presented in Figure 10. Figure 10 illustrates the horizontal velocities and their associated uncertainty using MIDAS (results with red arrows) and MLE (results with blue arrows). Almost all of the stations present the same value and direction for the velocity, except for a few stations, which are presented in Table 1.  In Figure 11, the difference between MIDAS and MLE velocity for the Up component is presented. It can be observed that the highest differences in terms of velocity between MIDAS and MLE are for stations VALA, MLVL, QAQ1, SPRN, and MOPS, with values between 1.69~1.14 mm.
In Figure 12, the difference between velocity uncertainty between MIDAS and MLE for the Up component is presented.   The time-frequency analysis of GNSS station coordinate time series allows for the identification of the dominant periodic components and their changes during the analysed time span. This analysis was performed independently for each station and each coordinate. In the majority of cases, the annual component was dominant, but on a few occasions, a dominant semi-annual period was observed. Variations in the signal periodic components frequency and amplitude can be observed in the vast majority of cases. Figure 13 shows several examples of the time-frequency spectrum obtained from the continuous wavelet transform analysis showing non-stationarity of the coordinate changes. Although, in our research, we modelled seasonality with annual and semi-annual signals, other harmonics of yearly signal are evident in some cases, e.g., BACA N (4th harmonics in [2006][2007][2008][2009][2010][2011][2012]. There are also a number of stations where coordinates time series periodic components are deviating from annual or its harmonics e.g., over 250-day period at BADH N from 2010 and KURE N from 2011 ( Figure 13). Such effects not considered in the analysis may impact on results obtained. Presence of the semi-annual and other harmonics other than those corresponding to tropical or draconitic period, in coordinate time series, needs further analysis.

Discussion
If the trend and the associated periodic terms of the deterministic model are not accurate enough to characterize the time series, misfits in the deterministic model will cause an alteration of the stochastic model, thus impacting the velocity uncertainty.
During the analysis, the Common Model Error (CME) was not analysed although the extraction and elimination plays an important role in improving the accuracy of regional CORS networks such as EPN. The CME is a dominant non-deterministic error source in daily GNSS solutions which can cause error source in daily GNSS solutions [37,[45][46][47][48]. Studies have shown that Principal Component Analysis (PCA), Karhunen-Loeve expansion (KLE), and stacking filtering can be used to determine the CME. The elimination of CME can have an increasing trend on velocities and amplitudes and a decrease in uncertainty trend as demonstrated by [49]. Moreover, the filtering effect of the CME can be seen especially in the Up direction, better than in the horizontal component. More literature related to CME can be found in [50,51].
It can be seen in Figure 7 that the amplitude of noise for the Up component shows a spatial correlation. The Western part of continental Europe has amplitudes between 6.00 mm and 10.00 mm, lower compared to other areas of the continental part of Europe. In central and Eastern Europe, the amplitudes are starting to increase, resulting in values between 10.00 mm and 14.00 mm. The upper part of continental Europe experiences the highest amplitudes, with values between 14.00 mm and 22.00 mm. The results are in agreement with the findings of [52] although they computed the data in ITRF 2014.
Various sources of errors affect the uncertainty of the estimated velocity, which are presented in Table 2. It can be observed that the reference frame stability has the largest effect compared to others (up to 2 mm/year) as shown by [53], whereas the second largest influence is due to undetected/uncorrelated offsets [54] and unmodeled loading effects [57], which can reach up to 1 mm [47] showed that CME can cause an important effect of the precision of velocity estimates and the differences in velocity in regional networks between filtered and unfiltered time series ranges between 0.2 mm/year up to 0.4 mm/year. It can be noticed that choosing one noise model over another can generate an effect on velocity uncertainty up to 0.3 mm/year, due to the noise level in the position data and the length of the time series [56]. The findings of [58] comes the complete the data from Table 2, in which they show that also the choice of the computational strategy-Precise Point Positioning (PPP) or Double Differential (DD) has an impact on the velocities and their uncertainities.
Usage of the ADEV shows similar results-the dominant noise is FPM with WFM as the 2nd dominant one. Table 3 presents the summarized statistics of the uncertainties of the Flicker and Powerlaw noise in mm/year. The results between the two noise models for the horizontal components are in agreement at a mean value of 0.12 mm/year, excepting the maximum values. The information criteria-Akaike and Bayesian, which describes the goodness of fit of the model, were very close for both noise models, and also the spectral index determined for the Powerlaw noise model indicates values close to −1, which indicates Flicker noise model. For the Up component, a higher sensitivity to the chosen noise model can be observed, and the uncertainties present a mean difference of 0.700 mm/year, excepting the maximum values.
During the estimation process, breaks were added, which was a key method to obtain more realistic uncertainties, especially for stations affected by geodynamic issues, such as the stations located in Iceland. The breaks were added based on: (1) manual detection based on visual inspection of the time series, (2) based on log-sheet information, (3) based on "eq_file" that contains earthquakes, and other discontinuities which can be found in the distribution of Gamit-Globk [59].
The statistics of velocity uncertainty in mm/year are presented in Table 4. When comparing all quartiles, the horizontal differences between the two noise models can be considered negligible. When compared to MIDAS, the Powerlawis roughly 2 time smaller, and upon comparing to Flicker noise, it is 1.6 times smaller. However, the Up component, which is more sensitive to the chosen noise model, shows an average difference of 1.4 times between the noise model, and a difference of 2.4 times between MIDAS and the Powerlaw noise model.
There are 10 stations out of 200 that present a difference of more than 1.00 mm in Up velocity between MIDAS and MLE. Only a quarter of all the stations shows a lower velocity value when using MIDAS compared to MLE, whereas the MLE, in the case of the other 75% of the stations, presents a lower Up velocity.

Conclusions
The article presents an exhaustive overview of advance in analysing GNSS daily station position series over the past two decades, by choosing a network that is prone to higher levels of noise compared to the new GNSS time series integrated in ITRF 2014 reference frame. We investigated several aspects of GNSS time series analysis in terms of dominant type of noise, related amplitude, velocity, and associated uncertainty using state of the art processing techniques. Three types of noise models were used for evaluating the performance and the results, which show that in the case of power-law the index is around~1, which indicates flicker noise. The log-likelihood, AIC, BIC, and power spectrum comes to validate that the most favourable type of noise is either power-law + white, with a spectral index of~1, or a combination between flicker and white noise, except for three stations in the horizontal component, and for other three stations for the vertical component, that all show another type of dominant noise-randomwalk noise. Similar results were obtained after applying the Allan variance (flicker noise-dominant one, white noise-with smaller impact), which can be used as a control for the MLE method.
It is known that MIDAS removes outliers and recomputes the median to reduce biases and computes a robust and realistic estimate of trend uncertainty which gives its general nature. MIDAS has the potential to be used in broader applications in the domain of geodesy and geosciences. We found that MIDAS reflects more optimistic uncertainties, compared to MLE.
The vertical uncertainties present a higher sensitivity related to the chosen noise model, notably for shorter time series. For example, the PL+WN seems to have the tendency to overestimate the uncertainties. Some stations with data spans roughly of about 3-4 years show vertical uncertainties of 0.62-0.78 mm/year when using Pl+WN, whereas the flicker noise estimates shows only~0.35 mm/year.
The stations with many discontinuities present higher uncertainty values compared to the ones with continuous observations. The uncertainties reflect very well the length of the time series and also the degree of continuity in the time series.
The unidentified breaks that generate offsets in the time series could introduce random level shifts which have a nature of pseudo-random walk noise, which is also validated by [60].
The difference in uncertainty for the Up component shows spatial correlations, in which the stations situated in the Northern part of Europe have a difference between 0 and 0.2 mm, the central part of Europe shows differences between 0.2 mm and 0.5 mm, while the Southern part of Europe showing differences between 0.5 mm and 0.8 mm.