Truncated Singular Value Decomposition Regularization for Estimating Terrestrial Water Storage Changes Using GPS: A Case Study over Taiwan

: It is a typical ill-conditioned problem to invert GPS-measured loading deformations for terrestrial water storage (TWS) changes. While previous studies commonly applied the 2nd-order Tikhonov regularization, we demonstrate the truncated singular value decomposition (TSVD) regularization can also be applied to solve the inversion problem. Given the fact that a regularized estimate is always biased, it is valuable to obtain estimates with di ﬀ erent methods for better assessing the uncertainty in the solution. We also show the general cross validation (GCV) can be applied to select the truncation term for the TSVD regularization, producing a solution that minimizes predictive mean-square errors. Analyzing decade-long GPS position time series over Taiwan, we apply the TSVD regularization to estimate mean annual TWS variations for Taiwan. Our results show that the TSVD estimates can su ﬃ ciently ﬁt the GPS-measured annual displacements, resulting in randomly distributed displacement residuals with a zero mean and small standard deviation (around 0.1 cm). On the island-wide scale, the GPS-inferred annual TWS variation is consistent with the general seasonal cycle of precipitations. However, on smaller spatial scales, we observe signiﬁcant di ﬀ erences between the TWS changes estimated by GPS and simulated by GLDAS land surface models in terms of spatiotemporal pattern and magnitude. Based on the results, we discuss some challenges in the characterization of TWS variations using GPS observations over Taiwan.


Introduction
As a fundamental Earth system component, terrestrial water storage (TWS) represents the sum of all water on and below land subsurface, including surface water, soil moisture, snow/ice, canopy water, and groundwater. It is of great interest to characterize TWS variations for enhanced understanding of water cycles and their roles in the evolving Earth system [1]. The Global Positioning System (GPS) technique has successfully been used to measure the crustal deformation associated with various geophysical processes [2,3], including Earth's elastic response to surface mass loadings (e.g., [4,5]). Besides other in situ and remotely-sensed TWS data, GPS-measured elastic deformation on ground surface has been considered as an independent data type for estimating TWS variations, providing

Truncated Singular Value Decomposition for TWS Estimation
The GPS-measured elastic loading deformation can be modeled as where = [d , d , ⋯ , d ] is the observation vector containing elastic displacements (e.g., in vertical direction) measured at GPS stations over the region of interest; = [x , x , ⋯ , x ] is the unknown parameter vector consisting of m TWS anomalies parameterized over the region of interest. In this study, the TWS anomalies are parameterized over a 0.2 ∘ 0.2 ∘ regular grid covering the entire Taiwan island; is a n m design matrix, with the i-th column vector containing the displacements at the locations of GPS stations in response to a unit mass anomaly given by the parameter x ; = [e , e , ⋯ , e ] is the vector of measurement noise.
The estimation of the TWS anomaly parameter is an ill-conditioned problem, and a regularization is needed in order to obtain a stabilized solution. While the Tikhonov regularization has been widely adopted for the TWS inversion, we apply a classic regularization method based on the SVD of the ill-conditioned design matrix. As a powerful computational tool, the SVD has been

Truncated Singular Value Decomposition for TWS Estimation
The GPS-measured elastic loading deformation can be modeled as where d = [d 1 , d 2 , · · · , d n ] T is the observation vector containing elastic displacements (e.g., in vertical direction) measured at n GPS stations over the region of interest; x = [x 1 , x 2 , · · · , x m ] T is the unknown parameter vector consisting of m TWS anomalies parameterized over the region of interest. In this study, the TWS anomalies are parameterized over a 0.2 • × 0.2 • regular grid covering the entire Taiwan island; A is a n × m design matrix, with the i-th column vector containing the displacements at the locations of GPS stations in response to a unit mass anomaly given by the parameter x i ; e = [e 1 , e 2 , · · · , e n ] T is the vector of measurement noise. The estimation of the TWS anomaly parameter x is an ill-conditioned problem, and a regularization is needed in order to obtain a stabilized solution. While the Tikhonov regularization has been widely adopted for the TWS inversion, we apply a classic regularization method based on the SVD of the ill-conditioned design matrix. As a powerful computational tool, the SVD has been widely applied to solve ill-conditioned and rank-deficient systems but hasn't been studied for the inversion of GPS-measured hydrological loading deformations. In this methodology, the original ill-conditioned design matrix is replaced with its low-rank approximation, which is formed after applying the SVD of the design matrix and neglecting the components associated with small singular values [17]. Replacing the design matrix with its approximation formulates a new well-conditioned inversion problem, whose minimum-norm solution uniquely exists and is referred as the truncated singular value decomposition (TSVD) solution of the problem.
Specifically, the ill-conditioned design matrix A can be decomposed as A = U·Σ·V T , where the n × m matrix U = [u 1 , u 2 , · · · , u m ] and the m × m matrix V = [v 1 , v 2 , · · · , v m ] consist of orthonormal vector sets {u i } and {v i }, respectively, which are called singular vectors of A; Σ = diag(σ 1 , σ 2 , · · · , σ m ) is a m × m diagonal matrix whose diagonal entries consist of singular values σ i of A. Conventionally, singular values are arranged in decreasing order. Figure 2 shows the singular values of the design matrix A of the Taiwan TWS inversion problem. The decay rate of the singular values measures how ill-posed a particular problem is. The problem is called mildly ill-posed if the decay of singular value σ K with the rank K has a power-law form of K −α with α ≤ 1, moderately ill-posed if α > 1, and severely ill-posed if the decay has an exponential form of e −αK [18]. Thus, α can be used to measure the degree of ill-posedness of a problem. The TWS inversion problem over Taiwan is a mildly ill-posed problem, as the decay can be fitted by the power-law function of σ K = 0.028 × K −0.741 ( Figure 2). Independent of actual data, the singular values (or the design matrix) only depend on the spatial distributions of GPS stations and TWS parameters. The SVD thus can be utilized in project planning for optimizing the condition number of an inverse problem.
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 21 widely applied to solve ill-conditioned and rank-deficient systems but hasn't been studied for the inversion of GPS-measured hydrological loading deformations. In this methodology, the original ill-conditioned design matrix is replaced with its low-rank approximation, which is formed after applying the SVD of the design matrix and neglecting the components associated with small singular values [17]. Replacing the design matrix with its approximation formulates a new well-conditioned inversion problem, whose minimum-norm solution uniquely exists and is referred as the truncated singular value decomposition (TSVD) solution of the problem.  Figure 2 shows the singular values of the design matrix A of the Taiwan TWS inversion problem. The decay rate of the singular values measures how ill-posed a particular problem is. The problem is called mildly ill-posed if the decay of singular value with the rank K has a power-law form of with ≤ 1, moderately ill-posed if 1, and severely ill-posed if the decay has an exponential form of [18]. Thus, can be used to measure the degree of ill-posedness of a problem. The TWS inversion problem over Taiwan is a mildly ill-posed problem, as the decay can be fitted by the power-law function of = 0.028 . (Figure 2). Independent of actual data, the singular values (or the design matrix) only depend on the spatial distributions of GPS stations and TWS parameters. The SVD thus can be utilized in project planning for optimizing the condition number of an inverse problem.
which satisfies all the four Moor-Penrose conditions [19]. Then, the pseudoinverse solution of the inverse problem can be obtained as The SVD can be used to accurately compute the Moore-Penrose pseudoinverse of the matrix A as Remote Sens. 2020, 12, 3861 5 of 21 which satisfies all the four Moor-Penrose conditions [19]. Then, the pseudoinverse solution of the inverse problem can be obtained asx It can be shown that the model predictiond based on the pseudoinverse solution is the projection of the observation d onto the range space of A, aŝ Thus, the pseudoinverse solution minimizes the L2-norm of data misfits, d −d 2 , and is the least-squares solution of the problem. A favorable property of the pseudoinverse solution is that it always exists for any design matrix A, and represents the least-squares solution of the linear system [20].
From Equation (3), we can see that the pseudoinverse solution can be considered as a linear expansion in the orthonormal vectors {v i }, with the expansion coefficients determined by the structure of the design matrix (i.e., singular values and corresponding vectors {u i }) and the observations. As shown in Figure 3, the singular vectors v i associated with large singular values have broad-scale features primarily over regions with dense GPS station coverage. For example, the patterns of the first three singular vectors are dominated by large-scale features over western, southwestern and northern Taiwan, where GPS stations are most densely installed. The fourth and fifth singular vectors also shows features along the east coast. Although not so dense as over the northern and western regions, the coverage of GPS stations along the east coast is denser than over the central mountain area. Thus, the singular vectors associated with large singular values can model any large-scale signals over the most densely measured regions. They should be less vulnerable to individual data outliers potentially exist within the cluster of stations. As the singular value decreases, the associated singular vectors become increasingly dominated by local and high-frequency features, and thus are more appropriate to model local signals that would be detected by more sparsely scattered stations. Meanwhile, they are more sensitive to individual data noise. The last several singular vectors in Figure 3 have local features primarily over areas with sparse or no observations. vulnerable to individual data outliers potentially exist within the cluster of stations. As the singular value decreases, the associated singular vectors become increasingly dominated by local and high-frequency features, and thus are more appropriate to model local signals that would be detected by more sparsely scattered stations. Meanwhile, they are more sensitive to individual data noise. The last several singular vectors in Figure 3 have local features primarily over areas with sparse or no observations.  As can be seen from Equation (3), the i-th singular vector is scaled by the factor of σ −1 i ·u T i ·d in the pseudoinverse solution. As shown in Figure 2, for the Taiwan TWS inversion problem, there exists many small singular values σ i , which will severely amplify the effect of observation errors in d by the factor of σ −1 i if the corresponding singular vectors are included in the pseudoinverse solution, making the solution practically unusable. Additionally, the solution can vary significantly with even a small perturbation in the observation errors. The solution thus needs to be stabilized. The most common approach to regularize the problem is to truncate the summation series in Equation (3) to only the first K terms, so that the effects of small singular values are eliminated. The solution after the truncation is called the TSVD solution, and can be written aŝ Neglecting the terms after K can be considered as replacing the small singular values σ K+1 , . . . , σ m of A with zeros, which turns the originally ill-conditioned matrix A of rank m into a rank-deficient matrix A K of rank K < m. A K can be regarded as an approximation to the original matrix A, and the TSVD solution is the minimum-norm least-squares solution to the problem of d = A K ·x + e.

Generalized Cross Validation for Regularization Parameter Choice
Just as the selection of the regularization parameter in the Tikhonov regularization, the choice of the truncation term K determines the quality of a TSVD solution. The previous studies applying the 2nd-order Tikhonov regularization usually chose the regularization parameter empirically so that the solution would represent a satisfactory compromise between its smoothness and data fits. Although the empirical choice can produce useful solutions, we investigate if the choice of the truncation term can be determined based on any quantitative criteria. As a traditional method for choosing the regularization parameter, the generalized cross validation (GCV) has been applied to a range of geophysical inversion problems. The GCV method is based on a predictive mean-square error criterion. In other words, the optimal choice of the regularization parameter (i.e., the truncation term K for the TSVD solution) minimizes a measure of prediction errors at stations that would be left out from the inversion. Specifically, if the i-th datum d i is left out of the observation vector d, we can obtain a regularized solution x [i] , where the superscript [i] indicates the solution is obtained without the i-th datum. In an ideal case, the truncation term K should be chosen such that the model perfectly predicts the missing datum, i.e., i is the model prediction for the missing i-th datum. Generalizing such 'leave-one-out' criterion, the GCV seeks the regularization parameter (i.e., the truncation term K) that minimizes the predictive mean-square error for all the n data, i.e., It can be shown that the GCV function g(K) can be evaluated as wherex K is the TSVD solution truncated to the term K as given by Equation (5), and "Tr" denotes the trace of a matrix.

Simulation
In order to assess the TSVD solution, as well as the performance of GCV for choosing the truncation term, we conducted a simulation study based on the Catchment Land Surface Model (CLSM) in the Global Land Data Assimilation System Version 2 (GLDAS-2) [21], which is a global hydrological model integrated with space-and ground-based observations to generate land surface states (including soil moisture, precipitation, ground water and terrestrial water storage, heat and radiation fluxes, runoff, etc.) and fluctuations. We used the GLDAS-2.2 daily CLSM with a 0.25 • x 0.25 • spatial resolution, which is forced with the meteorological data sets from the European Center for Medium-Range Weather Forcasts (ECMWF), and assimilates the TWS anomalies inferred by the Gravity Recovery and Climate Experiment (GRACE) satellite mission [22]. We first produced 12 monthly mean TWS fields by averaging the daily GLDAS-2.2/CLSM TWS fields of each month in 2010. Then, we calculated monthly TWS anomalies as the differences between the monthly TWS fields and the mean TWS of the year. Figure 4A shows the TWS anomalies of September 2010 for example. There is no special reason for choosing September 2010, except its TWS anomalies show a higher degree of spatial variations. The monthly TWS anomalies are used to simulate monthly GPS surface displacement measurements. Specifically, for each month, we forward calculate vertical displacements at locations of 269 GPS stations, which are selected for the actual data analysis (see discussions in Section 3.2), based on an elastic-loading Green's function derived by assuming a non-gravitating, homogeneous and semi-infinite cartesian half-space [23]. The average Young's modulus and Poisson's ratio over Taiwan from the CRUST 1.0 model [24] are used in the forward modeling. Figure 4B shows the forward calculated vertical displacement of September 2010. We also add Gaussian white noise to the surface displacement in order to assess the impact of observation noise on the inversion. The standard deviation of the Gaussian white noise is determined by a given signal-to-noise ratio (SNR), which is defined as the ratio of the root-mean-square (RMS) of displacement signals at all stations to the standard deviation of the noise. Figure 4C,D shows the measurement noise and final measurements, respectively, simulated for September 2010 with SNR = 2. modeling. Figure 4B shows the forward calculated vertical displacement of September 2010. We also add Gaussian white noise to the surface displacement in order to assess the impact of observation noise on the inversion. The standard deviation of the Gaussian white noise is determined by a given signal-to-noise ratio (SNR), which is defined as the ratio of the root-mean-square (RMS) of displacement signals at all stations to the standard deviation of the noise. Figure 4C,D shows the measurement noise and final measurements, respectively, simulated for September 2010 with SNR = 2. Similar to the Tikhonov regularized estimate, the TSVD solution reflects a compromise between the bias and variance. The trade-off can be adjusted by changing the truncation term K. Truncating at a higher term produces the regularized estimate with smaller data misfits, but more complex spatial features, which could be dominated by the effect of observation errors. Figure 5 (the first row) compares the TSVD estimates truncated at different terms using the simulated measurements for September 2010. The estimate errors reflect the superposition of biases (the third row in Figure 5) and random errors due to measurement noise (the fourth row in Figure 5). As the truncation term K increases, the estimate shows greater spatial complexities and better fit to data. Additionally, the Similar to the Tikhonov regularized estimate, the TSVD solution reflects a compromise between the bias and variance. The trade-off can be adjusted by changing the truncation term K. Truncating at a higher term produces the regularized estimate with smaller data misfits, but more complex spatial features, which could be dominated by the effect of observation errors. Figure 5 (the first row) compares the TSVD estimates truncated at different terms using the simulated measurements for September 2010. The estimate errors reflect the superposition of biases (the third row in Figure 5) and random errors due to measurement noise (the fourth row in Figure 5). As the truncation term K increases, the estimate shows greater spatial complexities and better fit to data. Additionally, the bias of the estimate becomes smaller. However, the high-frequency features in the estimate increasingly reflect the data noise. For example, as shown in Figure 5 (the second and fourth rows), the TSVD estimates are dominated by true TWS signals when K ≤ 5. The RMS of random errors are always smaller than 0.7 cm, although the RMS of the biases can be larger than 1.9 cm. The signals over regions densely covered by GPS stations, such as northern and southwestern Taiwan, are largely recovered. Biases exist mainly over areas of sparse station coverage, particularly the very south of the island. As K increases from 5 to 41, the RMS of the true signals in the solution increases by only 8.5%. The noise-free solution of K = 41 (the 2nd row in Figure 5) recovers more small-scale signals, such as the negative TWS anomalies over the southern tip of the island. Meanwhile, the effect of the observation noise (the fourth row in Figure 5) is dramatically amplified by 11.3 times. For K = 41, the RMS of the noise-only solution reaches 8.6 cm, about 1.3 times larger than that of the noise-free solution (i.e., 6.4 cm). As shown in Figure 6, while the RMS of signals in the estimate remains relative stable after the first several K's, the effect of noise keeps amplifying rapidly as K increases. After K = 30, the noise starts to dominate the solutions, making them practically unusable. In practice, the curves shown in Figure 6 take different shapes depending on many factors such as the magnitude and distribution of signals, data noise, parameterization and regularization methods. They are also unknown as we usually have very limited information about the true signal. Hence, choosing the truncation term K represents a critical yet challenging step in the TSVD regularization.  The simulation study allows us to assess the performance of a given parameter-choice method, namely the generalized cross validation (GCV) method. We first define an optimal estimate, which is the TSVD estimate that minimizes the sum of the squares of estimate errors. Since the GLDAS model is considered as the true signal in the simulation, the estimate error can be calculated as the difference between the regularized estimate and the true signal. The estimate error reflects the superposition of the regularization error, which is the bias induced by the regularization (i.e., the truncation of the singular vector series), and the perturbation error, which is induced by the random measurement noise [25]. The optimal estimate is obtained by comparing the estimate errors of a series of TSVD estimates obtained by successively increasing the truncation term . The performance of GCV can be assessed by comparing the GCV-determined (denoted as ) with that of the optimal estimate (denoted as ). For each month of 2010, we apply the GCV to determine the truncation term K of the TSVD estimate. Figure 7 compares the optimal estimates (the second row) and the GCV-selected estimates (the third row) for all months of 2010. Although the spatial pattern and magnitude of the true TWS anomalies (the first row in Figure 7), as well as the simulated random measurement noise, vary from month to month, the GCV method is always able to select close to that of the optimal estimate (the fifth row in Figure 7). Depending on the characteristics of the signal and the simulated data noise, the selected can be either larger or smaller than the for different months, but they are close enough so that the two corresponding estimates are similar to each other (Figure 7). The RMS of the differences between the GCV-selected and the optimal estimates range between 0 cm to 2.6 cm over the year, which are always smaller than the RMS of the corresponding optimal solutions. The differences between the two estimates (the fourth row in Figure 7) reflect different degrees of regularization error and perturbation error, which depend on the amplitude and spatial distribution of the true signal and measurement noise, respectively. Therefore, the differences vary greatly from month to month in the simulation.
We also test whether or not changing magnitude of measurement noise would affect the performance the GCV method. We repeat the above analysis by simulating measurement errors with SNR = 1.5 and SNR = 3, respectively. For both cases, the GCV method always produces close-to-optimal estimates for all months. The RMS of the difference between the optimal and GCV-selected estimates ranges between 0 cm and 1.9 cm for SNR = 1.5, and between 0 cm and 0.9 cm when SNR = 3.
In practice, as we usually have limited knowledge about the true signal and data noise, it is impossible to quantify the difference between the GCV-selected and the optimal estimates. However, our simulation shows that, for the given inversion problem over Taiwan, the The simulation study allows us to assess the performance of a given parameter-choice method, namely the generalized cross validation (GCV) method. We first define an optimal estimate, which is the TSVD estimate that minimizes the sum of the squares of estimate errors. Since the GLDAS model is considered as the true signal in the simulation, the estimate error can be calculated as the difference between the regularized estimate and the true signal. The estimate error reflects the superposition of the regularization error, which is the bias induced by the regularization (i.e., the truncation of the singular vector series), and the perturbation error, which is induced by the random measurement noise [25]. The optimal estimate is obtained by comparing the estimate errors of a series of TSVD estimates obtained by successively increasing the truncation term K. The performance of GCV can be assessed by comparing the GCV-determined K (denoted as K GCV ) with that of the optimal estimate (denoted as K OPT ). For each month of 2010, we apply the GCV to determine the truncation term K of the TSVD estimate. Figure 7 compares the optimal estimates (the second row) and the GCV-selected estimates (the third row) for all months of 2010. Although the spatial pattern and magnitude of the true TWS anomalies (the first row in Figure 7), as well as the simulated random measurement noise, vary from month to month, the GCV method is always able to select K OPT close to that of the optimal estimate (the fifth row in Figure 7). Depending on the characteristics of the signal and the simulated data noise, the selected K GCV can be either larger or smaller than the K OPT for different months, but they are close enough so that the two corresponding estimates are similar to each other (Figure 7). The RMS of the differences between the GCV-selected and the optimal estimates range between 0 cm to 2.6 cm over the year, which are always smaller than the RMS of the corresponding optimal solutions. The differences between the two estimates (the fourth row in Figure 7) reflect different degrees of regularization error and perturbation error, which depend on the amplitude and spatial distribution of the true signal and measurement noise, respectively. Therefore, the differences vary greatly from month to month in the simulation. GCV-selected TSVD solution should be close to the optimal solution, when the measurement errors are independent and identically distributed Gaussian white noise with modest magnitudes. It is possible for GCV to select a larger truncation term for certain months (e.g., October 2010), which can be considered as a plausible upper limit for the choice of K. Second row, the optimal TSVD estimates for the monthly TWS anomalies, which minimize the RMS of estimate errors when compared with the true signals of the corresponding month; Third row, the TSVD estimates with the truncation term K selected by the GCV method; Fourth row, the difference between the optimal and GCV-selected estimates. Fifth row, the logarithms of GCV functions ℊ( ), in unit of cm 2 , used to select the truncation term . For each month, the is selected such that the GCV function takes its minimal value. The corresponding to the optimal estimate is also indicated on the curves.

Real Data Analysis
The observation from the Taiwan GPS network spanning 2000-2014 are processed by the GAMIT/GLOBK software package, as described by [15]. Tropospheric delays are modeled by using the Global Mapping Function [26], and ocean tidal loadings are modeled based on the FES2004 ocean tide model [27]. The parameters and covariance matrices obtained from the daily GAMIT processing are combined using GLOBK to produce daily position time series in the ITRF2008 reference frame [28]. For the estimation of TWS variations, the atmospheric loading (ATML) and non-tidal ocean loading (NTOL) effects need to be removed from GPS measured displacements. It Second row, the optimal TSVD estimates for the monthly TWS anomalies, which minimize the RMS of estimate errors when compared with the true signals of the corresponding month; Third row, the TSVD estimates with the truncation term K selected by the GCV method; Fourth row, the difference between the optimal and GCV-selected estimates. Fifth row, the logarithms of GCV functions }(K), in unit of cm 2 , used to select the truncation term K GCV . For each month, the K GCV is selected such that the GCV function takes its minimal value. The K OPT corresponding to the optimal estimate is also indicated on the curves. We also test whether or not changing magnitude of measurement noise would affect the performance the GCV method. We repeat the above analysis by simulating measurement errors with SNR = 1.5 and SNR = 3, respectively. For both cases, the GCV method always produces close-to-optimal estimates for all months. The RMS of the difference between the optimal and GCV-selected estimates ranges between 0 cm and 1.9 cm for SNR = 1.5, and between 0 cm and 0.9 cm when SNR = 3.
In practice, as we usually have limited knowledge about the true signal and data noise, it is impossible to quantify the difference between the GCV-selected and the optimal estimates. However, our simulation shows that, for the given inversion problem over Taiwan, the GCV-selected TSVD solution should be close to the optimal solution, when the measurement errors are independent and identically distributed Gaussian white noise with modest magnitudes. It is possible for GCV to select a larger truncation term for certain months (e.g., October 2010), which can be considered as a plausible upper limit for the choice of K.

Real Data Analysis
The observation from the Taiwan GPS network spanning 2000-2014 are processed by the GAMIT/GLOBK software package, as described by [15]. Tropospheric delays are modeled by using the Global Mapping Function [26], and ocean tidal loadings are modeled based on the FES2004 ocean tide model [27]. The parameters and covariance matrices obtained from the daily GAMIT processing are combined using GLOBK to produce daily position time series in the ITRF2008 reference frame [28]. For the estimation of TWS variations, the atmospheric loading (ATML) and non-tidal ocean loading (NTOL) effects need to be removed from GPS measured displacements. It has been demonstrated that applying ATML corrections to GPS daily position solutions is equivalent to applying the correction at the observation level in the GPS processing [29]. We use the 0.25 • × 0.25 • atmospheric pressure data over Taiwan from GLDAS-2/CLSM [22], and forward calculate the atmospheric loading deformation using the same elastic-loading Green's function as described above. Based on the forward modeling, atmospheric loading leads to approximately 1-2.5 mm annual variations in the vertical displacement over Taiwan. The non-tidal ocean loading effect is modeled based on ocean bottom pressure data from the ECCO data-assimilating model kf080 [30,31]. The modeled non-tidal ocean loading displacements have the magnitude of 0.5-2 mm over Taiwan.
After corrected for the ATML and NTOL effects, the vertical position time series from total 365 continuous GPS stations are fitted using a linear model consisting of a constant term, a linear rate, annual and semi-annual sinusoidal functions. The poseismic deformation of the 1999 Mw 7.6 Chi-Chi earthquake is corrected by using a GPS-constrained model that accounts for both short-and long-term poseismic displacements [32]. The coseismic and poseismic displacements associated with other earthquakes during the observation period are modeled by using a Heaviside step function plus an exponential decay function. The offset and decay time are fitted together with the other parameters in a least-squares estimation, which assumes the measurement errors to be Gaussian white noise. Figure 8 shows the daily vertical position time series and the fitted models for selected stations. Significant annual variations can be seen from these time series.
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 21 has been demonstrated that applying ATML corrections to GPS daily position solutions is equivalent to applying the correction at the observation level in the GPS processing [29]. We use the 0.25° × 0.25° atmospheric pressure data over Taiwan from GLDAS-2/CLSM [22], and forward calculate the atmospheric loading deformation using the same elastic-loading Green's function as described above. Based on the forward modeling, atmospheric loading leads to approximately 1-2.5 mm annual variations in the vertical displacement over Taiwan. The non-tidal ocean loading effect is modeled based on ocean bottom pressure data from the ECCO data-assimilating model kf080 [30,31]. The modeled non-tidal ocean loading displacements have the magnitude of 0.5-2 mm over Taiwan.
After corrected for the ATML and NTOL effects, the vertical position time series from total 365 continuous GPS stations are fitted using a linear model consisting of a constant term, a linear rate, annual and semi-annual sinusoidal functions. The poseismic deformation of the 1999 Mw 7.6 Chi-Chi earthquake is corrected by using a GPS-constrained model that accounts for both short-and long-term poseismic displacements [32]. The coseismic and poseismic displacements associated with other earthquakes during the observation period are modeled by using a Heaviside step function plus an exponential decay function. The offset and decay time are fitted together with the other parameters in a least-squares estimation, which assumes the measurement errors to be Gaussian white noise. Figure 8 shows the daily vertical position time series and the fitted models for selected stations. Significant annual variations can be seen from these time series. Over Taiwan, the major source of hydrological loads is the rainfall during the annual monsoon (May to June) and typhoon (July to October) seasons. On average, 90% of the annual precipitation occurs during the wet season (May to October), while the other 10% occurs during the dry season (November to April) [33]. During the dry season, the southern part along the west coast remains completely dry, and northern and eastern Taiwan receive moderate occasional rains [34]. Figure 9 shows the phases of annual variations fitted for all GPS stations over Taiwan. It can be seen that most stations show the maximum subsidence around July and August, consistent with the seasonal the Choshui River Alluvian Fan, Chianan Plain, Pingtung Plain, Huadong Valley, and Yilan Plain, where the surface displacements may be largely associated with the compaction and expansion of underlying aquifer systems [35,36]. We thus remove the stations whose annual displacements are out-of-phase with the average rainfall patterns. In addition, stations with data length less than 4.5 years are excluded from the analysis for a reliable estimation of annual variation. A total of 269 GPS stations were kept for the inversion. Based on the best-fit amplitudes and phases, we predict the mean annual variations in vertical displacement for the stations. Specifically, for each station, monthly vertical displacements are predicted based on where ( ) is the mean vertical displacement of the i-th month at station s, and is the mean epoch (in 'day of year') of the month; and are the best-fit annual amplitude and phase for the station. Figure 10 shows the predicted mean displacements for all the 12 months. In general, stations across the entire island uplift during December-April, and subside during June-November, in spite of important differences in magnitude. Consistent with the amplitudes shown in Figure 9, stations Over Taiwan, the major source of hydrological loads is the rainfall during the annual monsoon (May to June) and typhoon (July to October) seasons. On average, 90% of the annual precipitation occurs during the wet season (May to October), while the other 10% occurs during the dry season (November to April) [33]. During the dry season, the southern part along the west coast remains completely dry, and northern and eastern Taiwan receive moderate occasional rains [34]. Figure 9 shows the phases of annual variations fitted for all GPS stations over Taiwan. It can be seen that most stations show the maximum subsidence around July and August, consistent with the seasonal rainfall variations. There also exists many stations showing the maximum subsidence out of the wet season. As shown in Figure 9B, most of these stations are located over the plains and basins, such as the Choshui River Alluvian Fan, Chianan Plain, Pingtung Plain, Huadong Valley, and Yilan Plain, where the surface displacements may be largely associated with the compaction and expansion of underlying aquifer systems [35,36]. We thus remove the stations whose annual displacements are out-of-phase with the average rainfall patterns. In addition, stations with data length less than 4.5 years are excluded from the analysis for a reliable estimation of annual variation. A total of 269 GPS stations were kept for the inversion.
Based on the best-fit amplitudes and phases, we predict the mean annual variations in vertical displacement for the stations. Specifically, for each station, monthly vertical displacements are predicted based on d s (t i ) = A s · cos 2π 365 where d s (t i ) is the mean vertical displacement of the i-th month at station s, and t i is the mean epoch (in 'day of year') of the month; A s and ϕ s are the best-fit annual amplitude and phase for the station. Figure 10 shows the predicted mean displacements for all the 12 months. In general, stations across the entire island uplift during December-April, and subside during June-November, in spite of important differences in magnitude. Consistent with the amplitudes shown in Figure 9, stations of large annual changes are found mainly over western and southwestern regions, as well as along the eastern coast. The TSVD solutions for monthly TWS changes are then obtained by inverting the modeled displacements in Figure 10, with the truncation term selected by GCV.
Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 21 of large annual changes are found mainly over western and southwestern regions, as well as along the eastern coast. The TSVD solutions for monthly TWS changes are then obtained by inverting the modeled displacements in Figure 10, with the truncation term selected by GCV.  Figure 11 shows the TSVD estimates of mean annual TWS changes by inverting GPS observations over 2000-2014. The spatiotemporal patterns of GPS-estimated TWS changes are consistent with those of monthly displacement predictions ( Figure 10). The goodness-of-fit can be assessed by checking the residuals shown in Figure 12. For all months, the residuals have zero means and standard deviations of about 0.1 cm, suggesting good fits for the TSVD estimates. On the island-wide scale, the GPS-inferred TWS reaches the annual maximum and minimum around August and February, respectively, consistent with the timing of the seasonal wet-dry cycle over the island in general. Figure 11 also compares the GPS-derived TWS changes with those from the GLDAS-2.2/CLSM, which models the sum of soil water, snow water equivalent, canopy water and shallow groundwater [22]. We also include two other GLDAS land surface models (i.e., GLDAS-2.0/CLSM and GLDAS-2.1/Noah) for comparison in order to understand the degree of uncertainties in GLDAS models over Taiwan. While GLDAS-2.0/CLSM is entirely forced with meteorological input data, GLDAS-2.2/CLSM also assimilates GRACE mascon solutions [22]. CLSM does simulate shallow ground water, but Noah does not. Thus, the TWS of the Noah model is the sum of soil moisture, snow water equivalent and plant canopy surface water. As shown in Figure 11, TWS changes of all three models have smaller magnitudes when compared with the GPS estimates.
Comparing the spatiotemporal patterns of the GPS-derived and model-predicted TWS changes (Figure 11), we find some degree of similarities, but also significant discrepancies. Strongly affected by the East Asian monsoons, the precipitation over Taiwan shows significant seasonal variabilities. The precipitation distribution also shows large spatial variabilities due to the presence of Central Mountain Range (CMR), which dominates the island's terrain by extending from the north to the south of the island [37]. During the winter months (December through March), the monsoon flow from the northeast bring in light rains along the northern and northeastern windward coasts. Meanwhile, dry conditions prevail over the leeside of the CMR and the western plains. The GLDAS-2.2/CLSM TWS changes during this period appear consistent with the winter precipitation  Figure 11 shows the TSVD estimates of mean annual TWS changes by inverting GPS observations over 2000-2014. The spatiotemporal patterns of GPS-estimated TWS changes are consistent with those of monthly displacement predictions ( Figure 10). The goodness-of-fit can be assessed by checking the residuals shown in Figure 12. For all months, the residuals have zero means and standard deviations of about 0.1 cm, suggesting good fits for the TSVD estimates. On the island-wide scale, the GPS-inferred TWS reaches the annual maximum and minimum around August and February, respectively, consistent with the timing of the seasonal wet-dry cycle over the island in general. Figure 11 also compares the GPS-derived TWS changes with those from the GLDAS-2.2/CLSM, which models the sum of soil water, snow water equivalent, canopy water and shallow groundwater [22]. We also include two other GLDAS land surface models (i.e., GLDAS-2.0/CLSM and GLDAS-2.1/Noah) for comparison in order to understand the degree of uncertainties in GLDAS models over Taiwan. While GLDAS-2.0/CLSM is entirely forced with meteorological input data, GLDAS-2.2/CLSM also assimilates GRACE mascon solutions [22]. CLSM does simulate shallow ground water, but Noah does not. Thus, the TWS of the Noah model is the sum of soil moisture, snow water equivalent and plant canopy surface water. As shown in Figure 11, TWS changes of all three models have smaller magnitudes when compared with the GPS estimates.
Comparing the spatiotemporal patterns of the GPS-derived and model-predicted TWS changes (Figure 11), we find some degree of similarities, but also significant discrepancies. Strongly affected by the East Asian monsoons, the precipitation over Taiwan shows significant seasonal variabilities. The precipitation distribution also shows large spatial variabilities due to the presence of Central Mountain Range (CMR), which dominates the island's terrain by extending from the north to the south of the island [37]. During the winter months (December through March), the monsoon flow from the northeast bring in light rains along the northern and northeastern windward coasts. Meanwhile, dry conditions prevail over the leeside of the CMR and the western plains. The GLDAS-2.2/CLSM TWS changes during this period appear consistent with the winter precipitation pattern. The GPS-derived TWS changes, however, shows negative anomalies over the entire island, including the northern and eastern areas, where the GLDAS-2.2/CLSM model predicts positive TWS anomalies. The differences between the GLDAS-2.0/CLSM and GLDAS-2.2/CLSM are significant, as well. The GLDAS-2.0/CLSM does not predict positive anomalies over the northern and northeastern areas either. Over the northern and eastern areas, the GLDAS-2.0/CLSM shows dry signals during the winter months, more consistent with the GPS estimate, although the amplitudes of the GLDAS-2.0/CLSM over the northern area are much smaller than those of the GPS estimates.

Discussion
It is not uncommon for Taiwan to face severe droughts, and the risks of future droughts have increased with climate changes [38]. Taiwan's water resource availability critically depends on annual rainfalls, which supply the river water diverted for human usage, and recharge reservoirs and groundwater storage. Despite receiving the abundant annual precipitation 2.5 times higher than the world's average, most of the water from precipitation either evaporates or directly discharge into the sea due to its steep terrain. Additionally, the rainfall pattern over Taiwan shows complex variabilities in both space and time [37,39]. It is thus of interest to characterize TWS variations for the water source allocation and management of Taiwan.
In spite of large-scale similarities between the GPS-estimated TWS changes and GLDAS models, significant discrepancies exist on smaller spatiotemporal scales, not only between the GPS estimate and GLDAS models, but also among different GLDAS land surface models. As shown in Figure 11, the largest annual TWS variations of the GLDAS-2.2/CLSM are mainly over the western regions, particularly over the coastal basins and plains, such as the Taichung basin, Choshui River alluvial fan, Chianan plain, and Pingtung plain. The annual TWS peaks around September after the summer rain season. GPS also infers large annual changes over southwestern Taiwan, but they are more localized around 23° N when compared to those inferred from the GLDAS models. The western plains contain abundant aquifer systems. Broad-scale aquifer-system compactions and land deformations due to excessive groundwater pumping have been reported [36,40,41]. As many stations over the western plains were excluded from our analysis, the GPS estimate may lack the sensitivity to potential large TWS variations over the region.
Along the Huadong Valley of the eastern Taiwan, many GPS stations coherently show large annual vertical displacements, with amplitudes comparable to, or even greater than those of the southwestern stations, which are over the region of the largest annual TWS variations modeled by the GLDAS-2.2/CLSM. As the maximum subsidence and uplift measured by these stations occur Starting from March (to mid-May), the airflow from the southwest gradually strengthens and starts to bring warm and moist over the island [37]. The western slopes of the CMR start to receive precipitations, while the eastern and southeastern areas remain wet as the springtime cold fronts pass through the island. Nevertheless, the rainfall accumulation of these months remains relatively low for most of the island. During this period, both the GPS and GLDAS models infer island-wide TWS increases.
During the Meiyu season (from mid-May to mid-June), widespread precipitation falls over Taiwan due to formation of the east-west extended weather front between the continental cold air and Pacific warm air brings. The rainfall over western slopes of the CMR is also enhanced by effects of topography and local winds [37]. During this period, although both the GPS estimate and GLDAS-2.2/CLSM show overall TWS increases, they show opposite spatial patterns. While GPS-estimated TWS increases largely occur over the eastern Taiwan, particularly along the Huadong Valley, those modeled by GLDAS-2.2/CLSM are primarily over the western plains, with the eastern coastal area remaining dry. Furthermore, the GLDAS-2.0/CLSM and GLDAS-2.1/Noah show more uniform TWS increases over both the eastern and western areas, inconsistent with either the GPS estimate or GLDAS-2.2/CLSM model.
During late summer months (July and August), the precipitation usually reaches annual maxima for most of the island [37]. The western plain and western slopes of the CMR usually show a peak in rainfall in early August, as a result of the joint influence of the summer monsoon flow from the southwest and tropical storms from the west Pacific. The southeastern and eastern slopes of the CMR also receive abundant precipitation as the tropic storms pass Taiwan from the west Pacific. Meanwhile, the northern and northeastern coasts receive relatively smaller amount of precipitation during a year. Consistent with the general precipitation pattern, the GPS estimate and GLDAS models show increased TWS over central and western Taiwan. Over northern and eastern areas, different from the GLDAS-2.2/CLSM model which show TWS decreases, both the GPS estimate and GLDAS-2.0/CLSM show TWS increases. The GPS-estimated largest TWS increases locate along the central east coast, rather than over the southwestern region where both the GLDAS-2.0/CLSM and 2.2/CLSM show the most significant TWS increases. Besides, the GPS-estimated TWS also show larger increases over the northern area.
Starting from September, the northeasterly monsoon flow gradually replaces the southwesterly flow, and transits to westerly flow later. As a result, the windward northern and northeastern coasts remain rainy and usually show a rainfall maximum of the annual cycle in autumn, while the southern and western parts of the island gradually become dry [37]. As shown in Figure 11, from September onwards, both the GPS estimate and GLDAS-2.2/CLSM show continuous TWS decreases. Their major differences still exist over the northern and eastern areas. Unlike the GPS-estimated TWS which decreases from the annual maximum, the GLDAS-2.2/CLSM TWS increase from the annual minimum.

Discussion
It is not uncommon for Taiwan to face severe droughts, and the risks of future droughts have increased with climate changes [38]. Taiwan's water resource availability critically depends on annual rainfalls, which supply the river water diverted for human usage, and recharge reservoirs and groundwater storage. Despite receiving the abundant annual precipitation 2.5 times higher than the world's average, most of the water from precipitation either evaporates or directly discharge into the sea due to its steep terrain. Additionally, the rainfall pattern over Taiwan shows complex variabilities in both space and time [37,39]. It is thus of interest to characterize TWS variations for the water source allocation and management of Taiwan.
In spite of large-scale similarities between the GPS-estimated TWS changes and GLDAS models, significant discrepancies exist on smaller spatiotemporal scales, not only between the GPS estimate and GLDAS models, but also among different GLDAS land surface models. As shown in Figure 11, the largest annual TWS variations of the GLDAS-2.2/CLSM are mainly over the western regions, particularly over the coastal basins and plains, such as the Taichung basin, Choshui River alluvial fan, Chianan plain, and Pingtung plain. The annual TWS peaks around September after the summer rain season. GPS also infers large annual changes over southwestern Taiwan, but they are more localized around 23 • N when compared to those inferred from the GLDAS models. The western plains contain abundant aquifer systems. Broad-scale aquifer-system compactions and land deformations due to excessive groundwater pumping have been reported [36,40,41]. As many stations over the western plains were excluded from our analysis, the GPS estimate may lack the sensitivity to potential large TWS variations over the region.
Along the Huadong Valley of the eastern Taiwan, many GPS stations coherently show large annual vertical displacements, with amplitudes comparable to, or even greater than those of the southwestern stations, which are over the region of the largest annual TWS variations modeled by the GLDAS-2.2/CLSM. As the maximum subsidence and uplift measured by these stations occur around February and August, respectively, consistent with the general annual precipitation cycle, we choose to include them for the TWS inversion. Consequently, the GPS-derived TWS exhibits significant annual variations along the east coast with annual maxima in August. The GLDAS-2.2/CLSM model, in contrast, only shows moderate changes along the east coast, secondary to the dominant annual variations over the western regions. Formed by river channel and alluvial fan deposits, the Huadong Rift Valley has a huge thick groundwater layer. The discrepancies between the GPS-estimated and GLDAS-modeled TWS changes could arise from the fact that the GLDAS-2.2/CLSM does not model deep groundwater [42]. Accurate modeling of groundwater, particularly at regional scales, requires detailed geological, hydrogeological and topographical information [43]. To compare the GPS estimate with ground water models over this region, it seems critical to first assess the uncertainty in the modeled groundwater, particularly at the regional scale. It is also possible that vertical displacements of the GPS stations are associated with annual groundwater withdrawal and recharge [44], rather than purely the elastic response to the TWS loading. Understanding the causes of these large annual vertical displacements needs detailed investigation of the hydrogeological property around the stations.
In addition, disagreements between the GPS estimate and GLDAS-2.2/CLSM are found over the northern and northeastern regions. The GPS-inferred and GLDAS-modeled annual TWS variations have the maximum phase difference of 5-6 months. Specifically, the GPS-inferred TWS reaches its annual maximum value around August, in contrast to February as modeled by the GLDAS-2.2/CLSM. On the other hand, the GLDAS-2.0/CLSM and GLDAS-2.1/Noah show more consistent phase with the GPS estimate over these areas, with the phase differences reduced to 1-2 months. Compared with the GLDAS-2.0/CLSM and GLDAS-2.1/Noah, the GLDAS-2.2/CLSM assimilates GRACE satellite observation, whose effective spatial resolution is about 350 km [45]. With GRACE data assimilated, the improved groundwater simulation by GLDAS-2.2/CLSM has been demonstrated by the evaluation using in situ groundwater well data over several regions around the world [22], such as North America, Brazil, and Africa. The impact over Taiwan is not clear yet. Given the relative small area of Taiwan (about 36,000 km 2 ), GRACE data lack sufficient spatial resolution for constraining TWS variations over Taiwan. The GRACE systematic stripe errors and leakage errors could induce large uncertainties in the GRACE-inferred TWS changes on small spatial scales [46]. It is necessary to understand the difference between GLDAS models with and without GRACE assimilation, particularly over Taiwan, before they could be used to evaluate the GPS estimates.
Over northern Taiwan, the GPS stations measure annual displacements of diverse amplitudes and phases, even if some stations are very close to each other ( Figure 9). It may indicate that many GPS stations are sensitive to local non-loading processes. The Taipei Basin and Yilan plain in the northern Taiwan, for example, contain abundant aquifers. Taipei Basin had subsidence issues due to excessive groundwater pumping [47]. Although the subsidence gradually slowed down after the prohibition of groundwater pumping in 1970s, the ground surface deforms as a result of the aquifer compaction and expansion with groundwater level changes. Yilan plain has the similar land deformation problem mainly due to the soil compaction induced by the groundwater over pumping [35].
As shown in Figure 1, the Taiwan GPS network has a highly non-uniform spatial distribution. A majority of the stations are over western plains and along eastern coasts. As discussed above, land deformations associated with groundwater withdrawal and recharge have been reported over these regions. When the annual displacements from these stations are used for the TWS estimation, the deformations associated with the seasonal groundwater withdrawal and recharge should be carefully investigated. Thus, assessing the GPS-inferred TWS change over Taiwan requires a joint analysis of regional precipitations, river flux, hydro-geological properties, groundwater recharge and consumption.

Conclusions
In this study, we introduce the TSVD regularization for the TWS estimation using GPS-measured surface displacements, and demonstrate it can effectively model TWS changes, particularly over the areas under dense GPS station coverage. The estimate is thus less vulnerable to potential observation outliers. We also show the GCV can be applied to select the truncation term for the TSVD regularization, producing a solution that minimizes predictive mean-square errors.
We apply the TSVD regularization to estimate mean annual TWS variations over Taiwan using decade-long GPS vertical position time series. We show that the TSVD estimate is able to sufficiently fit the GPS-measured annual displacements, resulting in randomly distributed displacement residuals with a zero mean and a standard deviation of 0.1 cm. Comparing the GPS-inferred TWS variations with GLDAS models, we find some similar spatiotemporal patterns, particularly over the southwestern Taiwan. Meanwhile, significant discrepancies are observed over the northern and northeastern areas. The differences may be explained by the fact that surface water and deep groundwater components are not simulated in GLDAS models. They may also reflect some challenges in utilizing GPS measurements to quantify TWS changes for Taiwan. The island of Taiwan has a relatively small area, but its TWS is highly variable in both space and time. The GPS network over Taiwan shows non-uniform distribution, with a majority of stations over coastal plains containing abundant aquifer systems. It is critical to carefully discriminate loading signals from those associated with other non-loading processes within the displacement measurements. In addition, given the small area of Taiwan, evaluating the GPS-estimated TWS variation may require a fine-scale TWS model based on high-resolution meteorological, hydrological, hydro-geological and topography data.