Accuracy Enhancement and Feature Extraction for GNSS Daily Time Series Using Adaptive CEEMD-Multi-PCA-Based Filter

: Global navigation satellite system (GNSS) positions include various useful signals and some unmodeled errors. In order to enhance the accuracy and extract the features of the GNSS daily time sequence, an improved method of complete ensemble empirical mode decomposition (CEEMD) and multi-PCA (MPCA) based on correlation coefﬁcients and block spatial ﬁltering was proposed. The results showed that the mean standard deviations of the raw residual time sequence were 1.09, 1.20 and 4.79 mm, while those of the newly proposed method were 0.15, 0.20 and 2.86 mm in north, east and up directions, respectively. The proposed method outperforms wavelet decomposition (WD)-PCA and empirical mode decomposition (EMD)-PCA in effectively eliminating low-and high-frequency noise, and is suitable for denoising nonlinear and nonstationary GNSS position sequences. Furthermore, feature extraction of the denoised GNSS daily time series was based on CEEMD, which is superior to WD and EMD. Results of noise analysis suggested that the noise components in the original and denoised GNSS time sequence are complex. The advantages of the proposed method are the following: (i) it fully exploits the merits of CEEMD and WD, where CEEMD is ﬁrst used to obtain the limited intrinsic modal functions (IMFs) and then to extract seasonal and trend features; (ii) it has good adaptive processing ability via WD for noise-dominant IMFs; and (iii) it fully considers the correlation between the different components of each station and the non-uniform behavior of common mode error on a spatial scale.


Introduction
In recent decades, owing to the rapid development of Global Navigation Satellite System (GNSS) technology and the continuous accumulation of observational data, a significant amount of data pertaining to crustal movement velocity fields and crustal deformation in plate boundary regions have been acquired.These data have been applied in geophysics, atmospheric science and dynamic displacement detection during natural disasters [1][2][3][4][5][6][7].However, the GNSS position time sequence includes not only crustal deformation signals, but also non-tectonic deformation signals and several systematic errors [8][9][10][11][12].Various error sources affect GNSS positioning accuracy.Researchers have proposed various techniques to calculate the parameters of the GNSS signal time series and the associated noise [13,14].Research findings indicate that random fluctuations of the GNSS position time series, typically termed as noise, can be represented by a combination of white noise (WN) and power-law (PL) processes [15].
Owing to the superposition of all types of "signals" and "noises" in GNSS coordinate time series, seasonal variations are apparent [16][17][18][19], which include geophysical signals and systematic errors from various sources.To accurately classify GNSS signals into crustal movement and other signals and achieve dependable long-term motion trends, seasonal signals must be estimated and must be obtained accurately [20].Moreover, a common mode error (CME) is evident between GNSS sites, i.e., the positional error generated by the common movement of the GNSS sites [9,10,21].Hence, the quality of GNSS observations is affected significantly by these errors, which may result in inaccurate interpretations or biased estimations of geophysical events.
To date, various spatiotemporal filtering methods have been proposed to reduce the CME at GNSS positions, including stacking filtering (STK) [9,10] and principal component analysis (PCA) [12,22].STK is easy to compute but is only applicable to small-scale, uniformly distributed GNSS networks.PCA can be applied to reduce data dimensions and extract data correlations; however, during denoising, the primary principal components cannot be obtained, and some signal information can be lost.For the GNSS daily coordinate time series, Dong et al. [11] performed PCA and the Karhunen-Loeve expansion to identify signals and unmodeled systematic errors.The results showed that one of the dominant error sources was spatiotemporally correlated with the CME in the GNSS long-term coordinate time series.However, PCA is effective only if the CME exhibits an almost uniform behavior, which is based on the fact that on a certain spatial scale, the CME exhibits a non-uniform behavior.Furthermore, PCA is mainly used to weaken the systematic colored noise in GNSS residual signals, whereas it cannot remove WN noise in the signals.
Although the methods discussed above can reduce the noise in GNSS positions to some degree, they do not perform any decomposition but rather process GNSS datasets in a single dimension, whereas GNSS data are multi-scale in nature; thus the signals cannot be decomposed into different time scales.As a new analytical method for time-frequency localization, the wavelet transform (WT) [23][24][25] can decompose the original signal, obtain the general situation of the signal on a large scale and obtain details regarding the signal on a small scale.It is widely used in signal analysis and noise elimination [26].Although WT performs well in smoothing noise, no accurate method exists for identifying the wavelet base and decomposition level.The pre-divided time and frequency characteristics of the WT hamper its capacity to adaptively decompose signals into various time frequencies based on the intrinsic features of signals; it is mainly used to eliminate high-frequency random noise, but with low efficiency in eliminating low-frequency colored noise.Based on the merits of WT and PCA, Bakshi [27] proposed a WT-based multi-scale PCA (MSPCA), which proved to be highly efficient in multivariate monitoring.However, when denoising GNSS positions, MSPCA considers only the correlations between the same position directions at different sites, whereas the spatial correlation between the components of the coordinates for all sites is disregarded.Singular spectrum analysis (SSA) [28] is effective for managing signals with small time-domain waveform fluctuations.Moreover, for signals with significant fluctuations, the appropriate noise platform parameters should be selected to significantly reduce noise.However, the selection rules for the noise platform parameters are not clearly defined [29].Jin et al. [30] used a method combining empirical mode decomposition (EMD) and SSA to estimate the nonlinear trend terms in the time series of actual sea level changes with high accuracy and consistency.Kong et al. [31] performed an SSA and a Fourier fast transform to analyze a long-term polar motion time sequence obtained via Doppler orbitography and radio positioning integrated by satellite observations.Many methods based on time-scale decomposition have been developed.Shen et al. [32] used variational mode decomposition (VMD) combined with a correlation coefficient to denoise the original sequence and extract seasonal, trend, and residual terms.Xu et al. [33] proposed a type of energy entropy mutual information as the objective function to improve VMD combined with a wavelet packet denoising algorithm, to address inadequate decomposition or to effectively denoise the time series.Montillet et al. [34] used the EMD algorithm to decompose the GPS coordinate time series into a series of sub-time series, and then proposed an algorithm to estimate the Hurst parameter for each sub-time series to calculate the statistics of WN.He et al. [12] performed PCA to a process based on spatial filtering to enhance the accuracy and reliability of GNSS position time series.Lu et al. [35] proposed an EMD-PCA combined method to reduce the multipath effect and high-frequency random noise.He et al. [36] developed a MATLAB-based denoising software named GNSS-TS-NRS, which can denoise signals using five methods, including EMD and ensemble EMD, and can effec-tively perform signal processing.Lee et al. [37] used EMD and PCA to detect climate change signals to weaken the effect of background noise and increase their ability to detect climate change.Zhang et al. [38] proposed a wavelet decomposition (WD)-PCA combined method to improve the vulnerability of PCA to high-frequency random noise in daily coordinate sequences and effectively improved positioning accuracy.Lu et al. [39] proposed a GPS time series prediction model based on complete ensemble empirical mode decomposition (CEEMD).Based on the model identification rules, each intrinsic modal function (IMF) and residual should be modeled separately, and the final predicted results of the GPS time series can be calculated.For GPS multipath error extraction, Long et al. [40] proposed a new algorithm based on the CEEMD-wavelet-SavGol model to address the mode-mixing effect of the EMD algorithm and the limitations of wavelet denoising.Li et al. [41] developed CEEMD-multi-PCA (CEEMD-MPCA) combined with a correlation coefficient to increase the accuracy and feature extraction of GNSS coseismic positions with a high sampling rate.The CEEMD-MPCA denoising method is not only suitable for the seismic feature extraction of GNSS seismic positions, but also eliminates WN and systematic errors while retaining earthquake waveforms with high performance by considering the correlation among the different components of each station.
To enhance the reliability of GNSS positioning and feature extraction, an adaptive method using CEEMD-MPCA based on correlation coefficients and block spatial filtering is proposed herein.Compared with previous studies, our study offers the following advantages: (i) the CEEMD method is first applied to decompose the GNSS time series into various IMFs based on the inherent time-scale features of the data; (ii) the proposed approach based on MPCA fully considers correlations among different components of each station and the non-uniform behavior of the CME on a spatial scale; and (iii) the CEEMD method is applied to the trend-and seasonal-feature extraction of the denoised GNSS time series.
The remainder of this paper is organized as follows.The GNSS daily time series and processing process are presented in Section 2. Section 3 describes the methodological procedures and principles of the analysis.In Section 4, noise analysis based on Hector software is presented.Finally, the conclusions are presented in Section 5.

GNSS Daily Time Series and Preprocessing Procedure
The International GNSS Service (IGS) is a high-precision international GNSS product (https://www.igs.org/about/,accessed on 8 March 2023).Currently, approximately 506 continuously operating GNSS sites are globally distributed and use data from these sites to publish satellite orbits, satellite clocks, zenith path delays, zenith tropospheric delays and other products.The IGS comprises three global data centers: the Crustal Dynamics Data Information System, the South Orange Performing Arts Center and the Institut Géographique National, which can provide pseudo-range and phase observations, broadcast ephemeris and meteorological data free of charge.The SOPAC uses "st_filter" software (http:// sopac-ftp.ucsd.edu/pub/timeseries/measures/,accessed on 8 March 2023) to combine constrained solutions from the JPL and SOPAC analysis centers, remove the errors caused by software and processing schemes and derive a consistent solution.The solution strategy adopted by the analysis center and the error model used are details that can be found on the SOPAC website (http://garner.ucsd.edu/pub/measuresESESES_products/ATBD/,accessed on 8 March 2023).
The SOPAC provides three types of data: raw, clean and filtered.Among them, "clean" daily time series data are generated by the removal of outliers, mean and coseismic and non-seismic jumps from the original data.The gaps for a few missing epochs are filled using linear interpolation.In theory, the daily time series of a "clean trend" location contains trends, seasonal signals and spatial correlation noise.
The accuracy of the newly proposed method was assessed using a GNSS daily coordinate time sequence in a 10-year span from 27 October 2010 to 27 October 2020, at 20 sites in the SOPAC network (Figure 1).The selected 20 stations were categorized into two blocks (10 GNSS stations in central California (Block 1) and 10 stations in southern California (Block 2)).This selection was based on the following considerations: First, He et al. [12] showed that in the GNSS region, block-space filtering can weaken seasonal variation, reduce regional effects and enhance the accuracy and reliability of GNSS data.Second, to obtain accurate trends from continuous time sequences, the time span must be at least 2.5 years [4,42].Finally, the selected IGS stations have a low data loss rate, as shown in Table 1, for the 20 stations; the average data loss rates for Blocks 1 and 2 were 0.98% and 1.81%, respectively (Table 1).The low data loss rate renders it possible to analyze the GNSS time series in both time and frequency domains.Furthermore, the 10-year observation period was sufficient to obtain accurate trends.
filled using linear interpolation.In theory, the daily time series of a "clean trend" location contains trends, seasonal signals and spatial correlation noise.
The accuracy of the newly proposed method was assessed using a GNSS daily coordinate time sequence in a 10-year span from 27 October 2010 to 27 October 2020, at 20 sites in the SOPAC network (Figure 1).The selected 20 stations were categorized into two blocks (10 GNSS stations in central California (Block 1) and 10 stations in southern California (Block 2)).This selection was based on the following considerations: First, He et al. [12] showed that in the GNSS region, block-space filtering can weaken seasonal variation, reduce regional effects and enhance the accuracy and reliability of GNSS data.Second, to obtain accurate trends from continuous time sequences, the time span must be at least 2.5 years [4,42].Finally, the selected IGS stations have a low data loss rate, as shown in Table 1, for the 20 stations; the average data loss rates for Blocks 1 and 2 were 0.98% and 1.81%, respectively (Table 1).The low data loss rate renders it possible to analyze the GNSS time series in both time and frequency domains.Furthermore, the 10-year observation period was sufficient to obtain accurate trends.The daily GNSS coordinate time sequences and other relevant parameters were obtained for each site.The GNSS time sequence at station P242 is shown in Figure 2. Site P242 exhibited a long-trend term of moving northeast (Figure 2a), and the mean root mean squares of the GNSS residue were 1.46, 1.64 and 4.67 mm in the north (N), east (E) and up (U) directions, respectively (Figure 2b), with clear seasonal variation and primarily annual trends, particularly in the north and up components.The daily GNSS coordinate time sequences and other relevant parameters were obtained for each site.The GNSS time sequence at station P242 is shown in Figure 2. Site P242 exhibited a long-trend term of moving northeast (Figure 2a), and the mean root mean squares of the GNSS residue were 1.46, 1.64 and 4.67 mm in the north (N), east (E) and up (U) directions, respectively (Figure 2b), with clear seasonal variation and primarily annual trends, particularly in the north and up components.

An Improved Denoising Method for GNSS Daily Time Series
The definitions of the CEEMD method, correlation coefficient between each IMF and raw signal are provided in Appendix A1,2.Based on the methods mentioned in the appendix, an improved method involving CEEMD-MPCA combined with the correlation coefficient and block spatial filtering was developed to enhance the accuracy of the GNSS daily coordinate time sequence and feature extraction.In this method, the selected GNSS network is classified into several blocks.The three-dimensional daily GNSS data matrix is converted into a two-dimensional matrix, which is used to decompose the GNSS daily coordinate time sequence into limited IMFs.The inflection point of the transitional IMF is determined using the correlation coefficient.Wavelet denoising is performed to filter the former noise-dominated mode, and the filtered signal is reconstructed using the remaining matrix.Finally, MPCA is performed to reconstruct and process the processed IMF to obtain the denoised time sequence.The details of the improved method are as follows: (i) The GNSS network is classified into several blocks based on the scale or range while considering the spatial distribution of the GNSS stations.(ii) The three-dimensional GNSS data matrix S(m, n, q) is converted into a two-dimensional GNSS data matrix S(n, u), u = m × q, where n, m and q represent the number of epochs, selected station and coordinate directions of each site, respectively.The three coordinate components in order are N, E and U. (iii) Each column in matrix S(n, u) is adaptively decomposed into K IMFs based on the frequency level using the CEEMD method.IMFs= {imf j,1 ,imf j,2 ,…,imf j,K } , The CCs between the raw residual time sequence and each IMF are calculated, and the corresponding matrix CCs={cc j,1 ,cc j,2 ,…,cc j,K } are obtained.Starting from cc j,1 , the IMF component imf j,h corresponds to the first extreme value cc j,h in the CCs, and the next IMF component imf j,h+1 is the transitional IMF.
(v) The transition IMF (noise-dominant) and noise IMFs are combined into a new matrix X D ={xd j,1 ,xd j,2 ,…,xd j,h+1 } , and each column of matrix X D is processed via wavelet

An Improved Denoising Method for GNSS Daily Time Series
The definitions of the CEEMD method, correlation coefficient between each IMF and raw signal are provided in Equations (A1) and (A2) in the Appendix A. Based on the methods mentioned in the appendix, an improved method involving CEEMD-MPCA combined with the correlation coefficient and block spatial filtering was developed to enhance the accuracy of the GNSS daily coordinate time sequence and feature extraction.In this method, the selected GNSS network is classified into several blocks.The three-dimensional daily GNSS data matrix is converted into a two-dimensional matrix, which is used to decompose the GNSS daily coordinate time sequence into limited IMFs.The inflection point of the transitional IMF is determined using the correlation coefficient.Wavelet denoising is performed to filter the former noise-dominated mode, and the filtered signal is reconstructed using the remaining matrix.Finally, MPCA is performed to reconstruct and process the processed IMF to obtain the denoised time sequence.The details of the improved method are as follows: (i) The GNSS network is classified into several blocks based on the scale or range while considering the spatial distribution of the GNSS stations.(ii) The three-dimensional GNSS data matrix S(m, n, q) is converted into a two-dimensional GNSS data matrix S(n, u), u = m × q, where n, m and q represent the number of epochs, selected station and coordinate directions of each site, respectively.The three coordinate components in order are N, E and U. (iii) Each column in matrix S(n, u) is adaptively decomposed into K IMFs based on the frequency level using the CEEMD method.I MFs = im f j,1 , im f j,2 , . . ., im f j,K , j = 1, 2, . . ., u.
(iv) The CCs between the raw residual time sequence and each IMF are calculated, and the corresponding matrix CCs = cc j,1 , cc j,2 , . . ., cc j,K are obtained.Starting from cc j,1 , the IMF component im f j,h corresponds to the first extreme value cc j,h in the CCs, and the next IMF component im f j,h+1 is the transitional IMF.
(v) The transition IMF (noise-dominant) and noise IMFs are combined into a new matrix , and each column of matrix X D is processed via wavelet denoising; subsequently, the new matrix X R = xr j,1 , xr j,2 , . . ., xr j,h+1 is obtained after denoising.(vi) The denoised matrix X R and monochromatic IMFs are reconstructed to obtain a new matrix X E = {xe 1 xe 2 , . . . ,xe n }.MPCA is applied to matrix X E and the value of cumulative contribution rate (CCR) of the first k components is calculated.(vii) If the value of CCR exceeds 85%, then the top k components are selected as the denoised signals, and a new denoised matrix X F = {x f 1 , x f 2 , . . . ,x f n } is reconstructed.

Feature Extraction of Seasonal and Trend Terms
The fundamental step in data analysis is to determine and remove nonlinear trend terms from the data [41].Based on EMD, the trend term is defined as follows: the trend for a dataset with a certain time range is a function that has, at the maximum, one extreme value or a monotone function with inherent determination [43].
One of the most evident features of the GNSS daily time sequence is the seasonal term, which is primarily caused by the migration of surface materials, including surface water, snow cover, vegetation changes, and tides [44,45].Many early studies showed that the seasonal terms of the sequence are mainly composed of semi-annual and annual cycle terms [46,47].
After performing denoising using the proposed method, the denoised matrix X F is used as the starting input matrix.Subsequently, CEEMD is applied to obtain the trend and seasonal terms.The corresponding feature extraction results are calculated as follows: where I SEA represents the seasonal term of the denoised data results obtained via CEEMD, I TRE represents the trend term extracted from the denoised data and RES represents the residuals.The structure of the proposed method is shown in Figure 3.

Feature Extraction of Seasonal and Trend Terms
The fundamental step in data analysis is to determine and remove nonlinear trend terms from the data [41].Based on EMD, the trend term is defined as follows: the trend for a dataset with a certain time range is a function that has, at the maximum, one extreme value or a monotone function with inherent determination [43].
One of the most evident features of the GNSS daily time sequence is the seasonal term, which is primarily caused by the migration of surface materials, including surface water, snow cover, vegetation changes, and tides [44,45].Many early studies showed that the seasonal terms of the sequence are mainly composed of semi-annual and annual cycle terms [46,47].
After performing denoising using the proposed method, the denoised matrix X F is used as the starting input matrix.Subsequently, CEEMD is applied to obtain the trend and seasonal terms.The corresponding feature extraction results are calculated as follows: where ISEA represents the seasonal term of the denoised data results obtained via CEEMD, ITRE represents the trend term extracted from the denoised data and RES represents the residuals.The structure of the proposed method is shown in Figure 3.

Results of GNSS Denoised Data
Herein, the N component of site P242 is presented as an example to assess the proposed method.The GNSS coordinate time series were first decomposed using CEEMD into different IMFs, and the decomposition results for the eight IMFs and one residue were obtained (Figure 4).IMF 8 shows a clear annual variation in the GNSS daily time series, whereas IMF 7 shows a strong seasonal cycle.

Results of GNSS Denoised Data
Herein, the N component of site P242 is presented as an example to assess the proposed method.The GNSS coordinate time series were first decomposed using CEEMD into different IMFs, and the decomposition results for the eight IMFs and one residue were obtained (Figure 4).IMF 8 shows a clear annual variation in the GNSS daily time series, whereas IMF 7 shows a strong seasonal cycle.Table 2 shows the CCs between the raw north GNSS time sequence of station P242 and each IMF decomposed by CEEMD.The transitional IMF was IMF5.Based on our method, IMF1-IMF4 are "noise-dominant" IMFs, IMF5 is a "transition" IMF and the remaining IMFs are "signal-dominant" IMFs.IMF1-IMF5, as "noise-dominant" models, were combined into a new matrix X D ={xd j,1 ,xd j,2 ,…,xd j,5 }, j = 1, 2, …, u.For every column of wavelet soft threshold denoising processing, a new matrix X R ={xr j,1 ,xr j,2 ,…,xr j,5 } was obtained.The denoised matrix and "signal-dominant" IMFs were reconstructed to obtain a new matrix X E .The MPCA method was used for matrix X E , and the top k components, where the cumulative contribution rate of the first k components exceeded 85%, were selected.The final denoised matrix X F was reconstructed.The raw and denoised GNSS coordinate time sequence at station P242 in the N direction and their corresponding residual time series are shown in Figure 5. Clear nonlinear Table 2 shows the CCs between the raw north GNSS time sequence of station P242 and each IMF decomposed by CEEMD.The transitional IMF was IMF5.Based on our method, IMF1-IMF4 are "noise-dominant" IMFs, IMF5 is a "transition" IMF and the remaining IMFs are "signal-dominant" IMFs.IMF1-IMF5, as "noise-dominant" models, were combined into a new matrix X D = xd j,1 , xd j,2 , . . ., xd j,5 , j = 1, 2, . . ., u.For every column of wavelet soft threshold denoising processing, a new matrix X R = xr j,1 , xr j,2 , . . ., xr j,5 was obtained.The denoised matrix and "signal-dominant" IMFs were reconstructed to obtain a new matrix X E .The MPCA method was used for matrix X E , and the top k components, where the cumulative contribution rate of the first k components exceeded 85%, were selected.The final denoised matrix X F was reconstructed.The raw and denoised GNSS coordinate time sequence at station P242 in the N direction and their corresponding residual time series are shown in Figure 5. Clear nonlinear variations and oscillations with time were exhibited in the raw coordinate time series, which indicates that the raw GNSS time series was affected by WN, and the colored noise was nonlinear and nonstationary (Figure 5a).The proposed method was smoother than WD-PCA and EMD-PCA, indicating that the proposed method not only effectively eliminated noise at high frequencies, but also reduced noise at low frequencies.Nonlinear oscillations and random fluctuations with high frequencies were observed in the raw residual time series (Figure 5b), indicating the presence of WN with high frequencies and systematic noise with low frequencies in the raw GNSS daily coordinate sequence.The proposed method can eliminate more noise at both high and low frequencies than WD-PCA and EMD-PCA.WD-PCA considers the advantages of wavelet denoising and PCA in removing noise at both high and low frequencies.Meanwhile, EMD-PCA considers the advantages of EMD and PCA in partially removing low-frequency noise.Furthermore, the proposed method combines the advantages of CEEMD-WD and MPCA, which consider the correlation between all directions of the station and eliminate various noises effectively.CEEMD is more accurate than the other methods because of its highly adaptive and data-driven ability to decompose GNSS datasets into IMFs of different frequencies.variations and oscillations with time were exhibited in the raw coordinate time series, which indicates that the raw GNSS time series was affected by WN, and the colored noise was nonlinear and nonstationary (Figure 5a).The proposed method was smoother than WD-PCA and EMD-PCA, indicating that the proposed method not only effectively eliminated noise at high frequencies, but also reduced noise at low frequencies.Nonlinear oscillations and random fluctuations with high frequencies were observed in the raw residual time series (Figure 5b), indicating the presence of WN with high frequencies and systematic noise with low frequencies in the raw GNSS daily coordinate sequence.The proposed method can eliminate more noise at both high and low frequencies than WD-PCA and EMD-PCA.WD-PCA considers the advantages of wavelet denoising and PCA in removing noise at both high and low frequencies.Meanwhile, EMD-PCA considers the advantages of EMD and PCA in partially removing low-frequency noise.Furthermore, the proposed method combines the advantages of CEEMD-WD and MPCA, which consider the correlation between all directions of the station and eliminate various noises effectively.CEEMD is more accurate than the other methods because of its highly adaptive and data-driven ability to decompose GNSS datasets into IMFs of different frequencies.To analyze the denoising effect more comprehensively using the proposed method, the raw and denoised GNSS coordinate time sequences in the U direction for Blocks 1 (Figure 6) and 2 (Figure A1) are shown.The raw GNSS daily time sequence contains centimeter-level fluctuations, namely low-frequency colored noise and clear random oscillations with high frequencies.The denoised coordinate time series was much smoother than the original time series, indicating that the denoising effect is better.For example, the time series of station P242 shows apparent annual cycles, which may be due to changes in the surface environment composed of sea level, continental water and atmospheric load changes, thus resulting in periodic changes in crustal deformation [44,48].To analyze the denoising effect more comprehensively using the proposed method, the raw and denoised GNSS coordinate time sequences in the U direction for Blocks 1 (Figure 6) and 2 (Figure A1) are shown.The raw GNSS daily time sequence contains centimeter-level fluctuations, namely low-frequency colored noise and clear random oscillations with high frequencies.The denoised coordinate time series was much smoother than the original time series, indicating that the denoising effect is better.For example, the time series of station P242 shows apparent annual cycles, which may be due to changes in the surface environment composed of sea level, continental water and atmospheric load changes, thus resulting in periodic changes in crustal deformation [44,48].To quantitatively describe the improvement in the accuracy of the GNSS daily positions, the standard deviations (STDs) of the GNSS residual time sequences of various methods at each station are shown in Figure 7. Regardless of the block, the STDs of the original data were the highest.The EMD-PCA method exhibited the least improvement in terms of accuracy, whereas the proposed method exhibited the most significant improvement.To quantitatively describe the improvement in the accuracy of the GNSS daily positions, the standard deviations (STDs) of the GNSS residual time sequences of various methods at each station are shown in Figure 7. Regardless of the block, the STDs of the original data were the highest.The EMD-PCA method exhibited the least improvement in terms of accuracy, whereas the proposed method exhibited the most significant improvement.
For Block 1, the horizontal components of the proposed method were significantly better than the up component of the other methods.For the N component, the STDs of the original residual data, WD-PCA, EMD-PCA and the proposed method were [0.86-2.02],[0.20-1.71],[0.41-1.98]and [0.17-0.58],respectively.For the E component, the STDs of the original residual time series, WD-PCA, EMD-PCA and the proposed method were [1.09-1.66],[0.34-0.93],[0.62-1.46]and [0.16-0.38],respectively.The proposed approach exhibited a significantly higher accuracy than the other approaches in the horizontal direction because the coordinate time sequences of each station in the horizontal direction indicated a stronger correlation, and MPCA can theoretically fully utilize the correlation between different coordinate components to remove noise (Figure 7).In the up direction, the accuracies achieved by the three methods for stations P216, P232, P233, P235 and P242 were similar.At the remaining stations, the proposed method exhibited a slightly higher accuracy than the other methods.The STD ranges of the original residual time series, WD-PCA, EMD-PCA and C-MPCA data were [4.50-13.02],[2.29-12.36],[2.45-12.71]and [2.35-12.46],respectively, because the correlation between the U components of each station was weak.Considering the correlation between different coordinate components, the filtering effect in the U component was not significant.For Block 1, the horizontal components of the proposed method were significantly better than the up component of the other methods.For the N component, the STDs of the original residual data, WD-PCA, EMD-PCA and the proposed method were [0.86-2.02],[0.20-1.71],[0.41-1.98]and [0.17-0.58],respectively.For the E component, the STDs of the original residual time series, WD-PCA, EMD-PCA and the proposed method were [1.09-1.66],[0.34-0.93],[0.62-1.46]and [0.16-0.38],respectively.The proposed approach exhibited a significantly higher accuracy than the other approaches in the horizontal direction because the coordinate time sequences of each station in the horizontal direction indicated a stronger correlation, and MPCA can theoretically fully utilize the correlation between different coordinate components to remove noise (Figure 7).In the up direction, the accuracies achieved by the three methods for stations P216, P232, P233, P235 and P242 were similar.At the remaining stations, the proposed method exhibited a slightly higher accuracy than the other methods.The STD ranges of the original residual time series, WD-PCA, EMD-PCA and C-MPCA data were [4.50-13.02],[2.29-12.36],[2.45-12.71]and [2.35-12.46],respectively, because the correlation between the U components of each station was weak.Considering the correlation between different coordinate components, the filtering effect in the U component was not significant.
For Block 2, the proposed approach showed higher accuracy in the northern regions than the other approaches, except for the MIDA and TBLP stations.For the N component, the STDs of the original residual data, WD-PCA, EMD-PCA and the proposed method were [0.83-1.46],[0.15-0.33],[0.25-0.64]and [0.04-0.29],respectively.The median error of the proposed method exceeded that of the WD-PCA method, which was mainly due to the inferior decomposition effect of CEEMD at this station, which resulted in the low filtering accuracy of these two stations.In the east direction, except for TBLP station, the accuracy of the proposed method exceeded that of the other methods.For the E component, the STD ranges of data obtained by the original residual time series, WD-PCA, EMD-PCA and the proposed method were  For Block 2, the proposed approach showed higher accuracy in the northern regions than the other approaches, except for the MIDA and TBLP stations.For the N component, the STDs of the original residual data, WD-PCA, EMD-PCA and the proposed method were [0.83-1.46],[0.15-0.33],[0.25-0.64]and [0.04-0.29],respectively.The median error of the proposed method exceeded that of the WD-PCA method, which was mainly due to the inferior decomposition effect of CEEMD at this station, which resulted in the low filtering accuracy of these two stations.In the east direction, except for TBLP station, the accuracy of the proposed method exceeded that of the other methods.For the E component, the STD ranges of data obtained by the original residual time series, WD-PCA, EMD-PCA and the proposed method were [1.00-1.64],[0.20-0.36],[0.37-0.95]and [0.15-0.26],respectively.For the U component, the STD ranges of the raw residual data, WD-PCA, EMD-PCA and C-MPCA were [4.24-6.26],[2.54-4.73],[2.92-5.51]and [2.11-4.40],respectively.For the LAND and MASW stations, the WD-PCA method performed slightly better than the EMD-PCA and the proposed method, whereas for the other stations, the proposed method performed better than the other methods.This is because the correlation between the U components of each station was weak.Considering the correlation between different coordinate components, the filtering effect of the upper components was not significant.
Table 3 presents the mean STD in each direction for the overall, Block 1 and Block 2 regions.For the overall denoising, the average STDs of the original residual data, WD-PCA and EMD-PCA in the N, E and U directions were (1.12, 1.28, 5.25) mm, (0.45, 0.53, 3.70) mm and (0.56, 0.68, 3.91) mm, respectively.However, the average STDs of the proposed approach were 0.31, 0.30, 3.46 mm in the N, E and U directions, respectively.Furthermore, after separating the GNSS network into two blocks by determining each block's scale or range and considering the spatial distribution of the GNSS stations, for Block 1 region in the N, E and U directions, the average STDs of the raw residual data, WD-PCA and EMD-PCA were (1.15, 1.36, 5.70) mm, (0.57, 0.51, 4.32) mm and (0.76, 0.87, 4.59) mm, respectively.However, the average STDs of the proposed approach were 0.27, 0.29 and 4.03 mm, respectively.For Block 2 in the N, E and U directions, the average STDs of the original residual data, WD-PCA and EMD-PCA were (1.09, 1.20, 4.79) mm, (0.28, 0.28, 3. 19) mm and (0.50, 0.57, 3.58) mm, respectively.However, the average STDs of the proposed approach were 0.15, 0.20 and 2.86 mm, respectively.The original residual time series exhibited the highest STDs, followed by EMD-PCA, whereas the STDs shown by WD-PCA were lower because WD can accurately separate high-frequency WN and irregular periods.By contrast, the proposed method exhibited the lowest mean STD.This is because the CEEMD method can accurately decompose the GNSS coordinate time series, whereas our method takes the correlations between the different coordinate directions for each station in consideration.

Feature Extraction of Seasonal and Trend Items
The CEEMD method was subsequently applied to decompose the denoised GNSS coordinate time sequences into numerous IMFs to obtain the trend and seasonal terms, and each IMF was arranged based on frequency, as described by Wu et al. [43].The decomposed results of the denoised GNSS data via CEEMD in the N direction at station P242 are presented and compared with the properties of WD and EMD (Figure A2).d1-d6, d7-d8 and a8, which were extracted using WD, denote the noise, seasonal and trend terms, respectively (Figure A2a).IMF1-IMF3, IMF4-IMF5 and Res, which were extracted using EMD, denote the noise, seasonal and trend terms, respectively (Figure A2b).IMF1-IMF7, IMF8-IMF9, and Res extracted via CEEMD denote the noise, seasonal and trend terms, respectively (Figure A2c).
WD and EMD can be used to perform feature extraction on a time sequence.However, for WD, the amplitudes of the noise part are larger, which indicates that noise exerts a more significant effect on feature extraction.The EMD exhibits a modal overlap.To demonstrate the feature extraction ability of the CEEMD method, the similarity of the slope of the trend term and the Pearson correlation coefficients (PCCs) of the seasonal term were used for the denoised GNSS coordinate time series.
(1) Comparison results between trend terms.WD, EMD and CEEMD were applied to trend extraction for Blocks 1 and 2. The slopes of the original sequences and the trend terms were used to verify the superiority of the trend extraction.The similarity of the slope between the raw data and trend terms was calculated as follows: where Similarity represents the similarity between trend terms, Ms i represents the slope of the ith trend term and Os i represents the slope of the ith original data.
The similarity in the slope between the trend term and the denoised GNSS time series is presented in Table 4.The closer the trend term is to the slope value of the raw GNSS data, the more reliable is the extracted result.Based on Table 4, CEEMD is more stable and reliable than EMD and WD.The accuracy of CEEMD is 99.99%, which was higher than that of WD (99.96%) and EMD (97.60%), demonstrating that CEEMD is superior to WD and EMD in terms of trend-term extraction.(2) Comparison results between seasonal terms.
The performances of WD, EMD and CEEMD in extracting seasonal items were compared, and the PCC was used to establish the superiority of CEEMD (Table 5).For Block 1, the average PCCs of the seasonal terms extracted via WD, EMD and CEEMD were 0.10, 0.18 and 0.35, respectively, whereas for Block 2, they were 0.12, 0.26 and 0.37, respectively (Table 5).The average PCCs obtained by CEEMD in Block 1 were 250.0% and 94.4% higher than those obtained by WD and EMD, respectively, whereas those in Block 2 were higher by 208.3% and 43.2%, respectively.These results demonstrate that CEEMD is more accurate than EMD and WD for the extraction of seasonal items.The seasonal terms extracted via CEEMD and the other parameters in Blocks 1 and 2 are shown in Figure 8a

Noise Analysis
Typical methods for the noise characteristic analysis of GNSS data include maximum likelihood estimation (MLE) and power spectrum analysis.For noise analysis in a time sequence, the nature and intensity of noise in the GNSS data are determined in the frequency and time domains, respectively.Bos et al. [13] developed a technique that substantially increases the efficiency of the MLE method, which has since been improved [49].Additionally, more advanced and computationally expensive methods, such as Markov

Noise Analysis
Typical methods for the noise characteristic analysis of GNSS data include maximum likelihood estimation (MLE) and power spectrum analysis.For noise analysis in a time sequence, the nature and intensity of noise in the GNSS data are determined in the frequency and time domains, respectively.Bos et al. [13] developed a technique that substantially increases the efficiency of the MLE method, which has since been improved [49].Additionally, more advanced and computationally expensive methods, such as Markov chain Monte Carlo methods [14,50] and wavelet analysis [51], have been proposed to obtain unbiased probability distributions for noise in the time series.The definition of the related information criterion (IC) is provided in Appendix A.3.
He et al., objectively selected the best noise model by investigating various criteria such as the AIC and BIC [52].The performance of the models investigated was quantified by analyzing 500 batches of simulated time series with lengths of 8.2, 16.4 and 24.6 years and known noise characteristics.He et al. processed a time series before jointly estimating stochastic and functional models using a maximum log-likelihood estimator via Hector software [13,53].To prevent the over-selection of noise models with numerous parameters, the optimal noise model was selected based on the log-likelihood value and information criteria (i.e., AIC, BIC and BIC_tp).He et al., applied these information criteria to Monte Carlo simulations of synthetic time series and obtained highly reliable results [54].In this study, the Hector software package was employed to estimate the parameters of various combinations of noise models, which were then used to analyze the noise in raw and denoised GNSS residual sequences at the selected GNSS sites in the two blocks.The AIC, BIC and BIC_tp were applied to describe the feasibility of each noise model.WN, WN plus PL noise (WN + PL), WN plus flicker noise (WN + FN), WN plus random walk noise (WN + RW), WN + FN + RW, generalized Gaussian-Markov noise (GGM) and WN plus generalized Gaussian-Markov noise (WN + GGM) were considered in the noise analysis.
The percentages of each noise model with the lowest AIC, BIC and BIC_tp values in the raw and denoised residual sequences of the selected GNSS stations (N, E and U) in Blocks 1 and 2 are shown in Table 6.The values of the AIC, BIC and BIC_tp were mostly consistent, thus demonstrating the reliability of the AIC, BIC and BIC_tp criteria used to determine the best noise model.Additionally, the optimal noise model for the raw GNSS residual sequence in the selected Blocks 1 and 2 regions can be regarded as a combination of the WN + FN + RW model with percentages of 46.7% and 46.7%, respectively, which is based on the BIC_tp criterion for the selection of the best noise model.Furthermore, the percentage of the WN + FN models was 26.7% in the Block 1 region, whereas that of the WN + PL model was 23.3% in the Block 2 region.
For the WD-PCA method for most cases in the selected Blocks 1 and 2 regions, a combination of GGM + WN can be regarded as the optimal noise model, followed by the GGM model.For the EMD-PCA method for most cases in the selected Blocks 1 and 2 regions, a combination of WN + FN + RW could be regarded as the optimal noise model, whereas the GCM model in Block 1 and WN + FN in Block 2 can be regarded as the secondmost optimal model.For the proposed method, a combination of WN + GGM model can be regarded as the optimal noise model for most cases in the selected Blocks 1 and 2 networks, whereas the GGM model in Block 1 and GGM + WN model in Block 2 can be regarded as the second-best model.One can conclude that the noise components in the raw and denoised GNSS sequences are complex and that applying only one model to all the GNSS positions in the selected GNSS region is unreasonable (Table 6).The power spectra of the raw and denoised GNSS residual sequences in the three directions at station P242 are shown in Figure A3.No significant period was removed by CEEMD, while WD-PCA and EMD-PCA eliminated high-and low-frequency noise, respectively; however, the clearance level was limited.The proposed method performed better than WD-PCA and EMD-PCA in eliminating low-and high-frequency noise, respectively.For high-frequency noise in the right side, both WD-PCA and the proposed method exhibited a favorable denoising effect, indicating that the two approaches can fully utilize the merits of the WD method for eliminating high-frequency noise.For the low-frequency noise in the intermediate section, the proposed method was significantly better than the other two methods for all three components.This indicates that the proposed method fully exploits the merits of CEEMD and WD, where CEEMD is first used to obtain various IMFs as well as because of its good adaptive processing ability by WD for noise-dominant IMFs.For the lower-frequency noise in the left side, the proposed method performed better than the other two methods, particularly for the horizontal components.This indicates that the proposed method fully considers the correlation between the different components of each station and the non-uniform behavior of the CME on a spatial scale.

Conclusions
To enhance the accuracy of GNSS positioning and feature extraction, an adaptive method using CEEMD and MPCA based on correlation coefficients and block spatial filtering was proposed.The proposed approach fully exploits the merits of CEEMD and WD to remove high-and low-frequency noise.Furthermore, the proposed method fully considers the correlation between the different components of each station and the nonuniform behavior of the CME on a spatial scale to eliminate low-frequency noise.Finally, the seasonal-and trend-term features of the denoised GNSS sequences were extracted using the CEEMD method.The properties of the proposed approach were estimated using a GNSS daily time series from two blocks in California, USA.The following four conclusions were drawn.
(1) Compared with other typically applied approaches, the newly proposed approaches were more accurate in denoising the GNSS daily time series.In the Block 1 region in the N, E and U directions, the mean STDs of the raw residual data, WD-PCA and EMD-PCA were (1.15, 1.36, 5.70) mm, (0.57, 0.51, 4.32) mm and (0.76, 0.87, 4.59) mm, respectively, whereas those of the proposed approach were 0.27, 0.29 and 4.03 mm, respectively.In Block 2 in the N, E and U directions, the mean STDs of the raw residual data, WD-PCA and EMD-PCA were (1.09, 1.20, 4.79) mm, (0.28, 0.28, 3.19) mm and (0.50, 0.57, 3.58) mm, respectively, whereas those of the proposed approach were 0.15, 0.20 and 2.86 mm, respectively.The performance of the proposed approach across the two regions was consistent, indicating that it can effectively decrease noise at lowand high-frequency in GNSS daily sequences.The proposed method is suitable for denoising nonlinear and nonstationary GNSS position sequences.(2) WD, EMD and CEEMD were used to extract the features from the GNSS daily data.
The results of the trend-feature extraction showed that the accuracy of CEEMD was 99.99% for trend-term feature extraction in the denoised GNSS daily sequences, which was higher than the 99.96% and 97.60% accuracy levels exhibited by WD for EMD, respectively.Compared with WD and EMD, CEEMD was more reliable and stable in Blocks 1 and 2. The results of seasonal-feature extraction demonstrated that the average PCC extracted via CEEMD was 0.36, which was 63.6% and 27.3% higher than those of WD and EMD, respectively.These results showed that the seasonal terms obtained by CEEMD were consistent with the residual time series, and that CEEMD is more reliable than WD and EMD.(3) For the raw GNSS data for most cases in Blocks 1 and 2, the WN + FN + RW model was the optimal noise model.For WD-PCA and EMD-PCA for most cases in Blocks 1 and 2, the WN + GGM and WN + FN + RW models were the best noise models.For the proposed method, the WN + GGM model was the best noise model in Blocks 1 and 2. The second-best model was the GGM model in Block 1 and the GGM + WN model in Block 2. These results were obtained possibly because the components of the noise in the raw and denoised GNSS data were complex, and that applying only one model to all the GNSS positions in the selected GNSS region is unreasonable.(4) Results of spectral analysis suggest that the proposed method is better than the other two methods for all three components.The proposed method is suitable for denoising nonlinear and nonstationary GNSS data.The advantage of the proposed method is that it fully exploits the merits of CEEMD and WD.CEEMD was first used to obtain various IMFs and then to obtain noise-dominant IMFs owing to its good adaptive processing ability.Finally, it fully considers the correlation between the different components of each station and the non-uniform behavior of the CME on a spatial scale.
PEER REVIEW 20 of 23

Figure 2 .
Figure 2. (a) Raw coordinate time series of P242 station, and (b) the corresponding residual time series.

Figure 2 .
Figure 2. (a) Raw coordinate time series of P242 station, and (b) the corresponding residual time series.

Figure 3 .
Figure 3. Flowchart of the proposed method.Figure 3. Flowchart of the proposed method.

Figure 3 .
Figure 3. Flowchart of the proposed method.Figure 3. Flowchart of the proposed method.

Figure 4 .
Figure 4. Decomposed results based on the CEEMD method in N direction of GNSS station P242.

Figure 4 .
Figure 4. Decomposed results based on the CEEMD method in N direction of GNSS station P242.

Figure 5 .
Figure 5. (a) Raw and denoised GNSS coordinate time series in N direction of site P242, and (b) their corresponding residual time series.(Black lines: raw time series; green lines: resuls via WD-PCA; blue lines: resuls via EMD-PCA; red lines: resuls via the propsed method).

Figure 5 .
Figure 5. (a) Raw and denoised GNSS coordinate time series in N direction of site P242, and (b) their corresponding residual time series.(Black lines: raw time series; green lines: resuls via WD-PCA; blue lines: resuls via EMD-PCA; red lines: resuls via the propsed method).

Figure 6 .
Figure 6.Effect of the proposed denoising method in U direction for Block 1. Grey lines denote original GNSS daily time series; red lines denote denoised GNSS daily time series; and black Ishapes indicate errors corresponding to GNSS positions.

Figure 6 .
Figure 6.Effect of the proposed denoising method in U direction for Block 1. Grey lines denote original GNSS daily time series; red lines denote denoised GNSS daily time series; and black I-shapes indicate errors corresponding to GNSS positions.

Figure 7 .
Figure 7. Standard deviations of raw and denoised GNSS residual data in three directions for Blocks 1 and 2.

Figure 7 .
Figure 7. Standard deviations of raw and denoised GNSS residual data in three directions for Blocks 1 and 2.

Figure 8 .
Figure 8. Extraction effect of GNSS seasonal term (yellow lines) in N direction based on CEEMD for Block 1 (a), and Block 2 (b).Grey lines denote de-trended GNSS sequence, and black I-shapes indicate errors corresponding to GNSS sequence.

Figure 8 .
Figure 8. Extraction effect of GNSS seasonal term (yellow lines) in N direction based on CEEMD for Block 1 (a), and Block 2 (b).Grey lines denote de-trended GNSS sequence, and black I-shapes indicate errors corresponding to GNSS sequence.

Figure A2 .
Figure A2.Decomposed results of the denoised GNSS data in N component at the P242 station based on WD (a), EMD (b), and CEEMD (c), respectively.

Figure A2 .
Figure A2.Decomposed results of the denoised GNSS data in N component at the P242 station based on WD (a), EMD (b), and CEEMD (c), respectively.

Figure A2 . 23 Figure A3 .
Figure A2.Decomposed results of the denoised GNSS data in N component at the P242 station based on WD (a), EMD (b), and CEEMD (c), respectively.

Figure A3 .
Figure A3.Power spectral density of station 'P242' for North (a), East (b), and Up (c) directions, expressed in dB of mm 2 /cpy.

Table 1 . Data loss rates for Blocks 1 and 2. Block 1 Data Loss Rate (%) Block 2 Data Loss Rate (%)
Remote Sens. 2023, 15, x FOR PEER REVIEW 6 of 23 denoising; subsequently, the new matrix X R ={xr j,1 ,xr j,2 ,…,xr j,h+1 } is obtained after denoising.(vi) The denoised matrix X R and monochromatic IMFs are reconstructed to obtain a new matrix X E ={xe 1 ,xe 2 ,…,xe n }.MPCA is applied to matrix X E and the value of cumulative contribution rate (CCR) of the first k components is calculated.(vii) If the value of CCR exceeds 85%, then the top k components are selected as the denoised signals, and a new denoised matrix X F ={xf 1 , xf 2 ,…, xf n } is reconstructed.

Table 2 .
CCs between each component location in site P242 based on the position time series and its corresponding IMFs.

Table 2 .
CCs between each component location in site P242 based on the position time series and its corresponding IMFs.

Table 3 .
Average standard deviations of raw and denoised residual data (mm).

Table 4 .
Similarity between denoised GNSS time series and trend term obtained via WD, EMD and CEEMD, separately.

Table 5 .
Similarity of seasonal terms based on WD, EMD and CEEMD.

Table 6 .
Percentage of each noise model with the lowest values of the Akaike information criterion (AIC), Bayesian information criterion (BIC) and BIC_tp of the raw and denoised GNSS residual sequences at Blocks 1 and 2 (the value in bold indicates the optimal noise model (%)).