Improved Hydrological Loading Models in South America: Analysis of GPS Displacements Using M-SSA

: Environmental loading, in particular from continental water storage changes, induces geodetic station displacements up to several centimeters for the vertical components. We investigate surface deformation due to loading processes in South America using a set of 247 permanent GPS (Global Positioning System) stations for the 2003–2016 period and compare them to loading estimates from global circulation models. Unfortunately, some of the hydrological components, and in particular surface waters, may be missing in hydrological models. This is especially an issue in South America where almost half of the seasonal water storage variations are due to surface water changes, e.g., rivers and ﬂoodplains. We derive river storage variations by rerouting runoffs of global hydrology models, allowing a better agreement with the mass variations observed from GRACE (Gravity Recovery and Climate Experiment) mission. We extract coherent seasonal GPS displacements using Multichannel Singular Spectrum Analysis (M-SSA) and show that modeling the river storage induced loading effects signiﬁcantly improve the agreement between observed vertical and horizontal displacements and loading models. Such an agreement has been markedly achieved in the Amazon basin. Whilst the initial models only explained half of the amplitude of GPS, the new ones compensate for these gaps and remain consistent with GRACE.


Introduction
Earth's crustal deformations, such as tectonic plate motions, volcanism, and landslides, arise at a range of scales in space and time ranging from local to continental scale and from seconds to millions of years, respectively. In addition, detectable changes in the shape of the solid Earth and its gravity field arise due to the varying mass distribution of the surface fluids (oceans, atmosphere, groundwater, soil moisture, lakes, snow and ice) [1]. The loading signals appear mainly seasonally but there are also higher frequency signals including ocean tidal loading (primarily semidiurnal and diurnal). Furthermore, a secular signal (Glacial Isostatic Adjustment) is induced by the Earth's response to past fluctuations of ice sheets, comprising large, measurable changes in sea level, 3D crustal motion, the gravity field, and Earth's rotation [2]. Beside global climate changes, loads exhibit regional variations over all time scales and the different physical processes contributing to loading effects are not globally uniform.
Analysis of loading signal provides a better understanding of the interaction between the solid Earth and the mass redistributions at its surface linked to natural climate variability and human activities. Better understanding of these interactions is essential for applications such as how coasts react to relative sea level change and storms as well as natural hazard mitigation. The spatial and temporal pattern analysis of the loading signal helps to separate and quantify the different sources and frequencies from daily to decadal time scales and to infer hydrologic local/regional signals linked to recent climate change. Nevertheless, climate driven changes in water storage can have non-negligible contribution to decadal sea level changes [3]. Moreover, accurate crustal deformation modelling is required to measure sea level rise, which is not spatially uniform and is a complex function of different parameters including local land movement [4,5].
Space geodesy provides unprecedented coverage of the Earth's surface and can contribute to monitoring the interactions between climate change and Earth's time-dependent shape and gravity field. Precise geodetic techniques enable ground-and space-based observations that are critical to a wide array of scientific disciplines, including seismology, geodynamics, and meteorology. The Global Navigation Satellite System (GNSS), and in particular the Global Positioning System (GPS), have become essential tools for the mm-to cm-level positioning requirements within geosciences. GPS has already been used for measuring 3-D surface loading deformations of different natures: tidal and nontidal ocean, atmosphere and hydrology [6][7][8][9]. GPS's high temporal resolution was also used for studying transient loading signals induced by storm surge [10]. Nontidal loading effects are generally not taken into account during GNSS data processing and analysis for geophysical studies requiring high precision, although these can have non-negligible impact on geodynamical interpretation [11][12][13]. Discriminating the seasonal signals from other sources of deformation in geodetic measurements is essential to isolate tectonic signals and to monitor spatiotemporal variations in continental water storage. Most of the vertical displacement variability can be explained by atmospheric, oceanic and hydrological loading effects for periods ranging from a few days to decades [14].
Gravity satellite missions such as the Gravity Recovery and Climate Experiment (GRACE) can also observe the movement of environmental fluids since a change in surface or underground water storage implies a mass change and corresponding gravity field change [15]. Therefore, the total water mass change, both in liquid and ice forms, can be estimated by measuring the gravity change. Satellite gravimetry missions as GRACE showed great success for climate research studies and other geophysical and geodetic applications. Since its launch in 2002, followed by GRACE Follow-On in 2018, the GRACE mission has mapped the time-variable and mean gravity field of the Earth at typically monthly temporal resolution. It provides estimates of seasonal, yearly, and long-term hydrological, cryospheric and oceanic mass variations at spatial scales of a few hundred km and greater, to accuracies approaching 1 cm water thickness. The results can be used to monitor water storage and improve hydrological models [16]. They also help to determine the role of continental water variability in global mean sea level change, to better understand the water transfer between the land and atmosphere (precipitation and evaporation) at regional scales [17], and to climate change impact [18].
GPS and GRACE provide very long time series of observations (20 years or more), that are extremely complementary in terms of spatial and temporal resolutions, for monitoring loading effects and enhancing global understanding of the mass transport at the Earth's surface. The time series observed by these two techniques are useful to study hydrological loading signals and to infer change in terrestrial water storage. Combining GPS and GRACE observations allows discriminating regional and local signals. For instance, recent studies demonstrated that dense GPS networks could provide quantitative estimates of water storage variations [19] or give insights of drought events [20][21][22]. Separating GPS observations into regional and local surface mass loading contributions can provide an independent measure of total water storage at the watershed scale [23]. Studies also evidenced that fast water storage changes contribute to daily GPS height time series and that a GRACE-assimilating hydrological model would provide a promising option for removing hydrology-induced vertical deformation from GPS time series [24]. Others try to improve the understanding of nontectonic stress and forces that modulate seismicity in California from loading signal analysis [25].
Loading effects can explain large parts of the annual signal in position time series. Nevertheless, the errors in the geophysical models are still significant and discrepancies exist between models and observations. For instance, in most global hydrology models, only vertical fluxes are modelled and they are not realistic in frozen areas (e.g., snow underestimation). Seasonal variations shown by GRACE and GNSS may significantly differ from the different global hydrological models, which may be partly due to the noninclusion of river runoff and water reservoirs in the models. Moreover, mismodelled subdaily periodic signals can induce station-dependent GNSS systematic errors affecting the vertical rate, for instance, used in sea level studies. Contrary to ocean tides and their corresponding loading effects, hydrology is less known, allowing large error sources. It is particularly fundamental to model rapid loading effects that could be aliased into the solutions masking some of the long period contribution of climate change. River storage is important at low latitudes such as evidenced in the case of the Mekong [26], Amazon [27], or Niger [28]. Strong signals with high amplitudes can be observed along the major river channels and river storage models using relocation procedure significantly improve the hydrological loading estimates [29].
Having the world's largest continental seasonal water variations, the Amazon area has a very strong hydrological signal (both spatially and in amplitude) and exhibits interannual anomalies linked to extreme event of drought and floods. This area is of great interest concerning the Earth's hydrological system understanding, including the exchange of the water between the river, rainforest, ocean and atmosphere. Seasonal patterns of water circulation and inundation duration exhibit strong interannual variations in Amazonian floodplains [30]. This area is therefore well suited for hydrological loading studies and South America is equipped with a large number of long-term running GPS stations, although these are sparsely distributed.
Previous studies analyzed the loading signal in South America using space geodesy time series. It has been demonstrated that GRACE signal is consistent with in situ water level and discharge data [31] and that GRACE and GPS give complementary estimates of vertical displacement in the Amazon basin [32]. The observed displacement includes seasonal, interannual and secular variations in mass loading. Despite the GRACE hydrological signal being dominated by seasonal and long-term variability, recent changes in terrestrial water storage in South American river basins have been evidenced [33]. In particular it concerns drought in 2010 in the Amazon area, in 2012-2014 in the Sao Paulo region [34] or the 2015-2016 severe drought, as well as extreme floods in 2011-2012 and 2013-2014 [32]. GPS time series analysis is also useful to monitor hydrological hazard, as it was performed on the vertical loading signal induced by drought in Brazil [35]. Some studies also used geodetic observation to infer the terrestrial water storage and evidenced some extreme anomalies [36]. Most of these studies focus on the vertical component.
In general, there is a good agreement between GPS and GRACE observation of the hydrological signal. Nevertheless, some studies exhibited significant discrepancies between GPS observations and the loading deformation models. For instance, at NAUS (Manaus, Brazil), GRACE and GPS are close in phase and amplitude, but hydrological models underestimate the vertical displacement amplitude [34,36]. From 20 to 70% of the GPS vertical signal can be explained by hydrological loading, but about 50% of the GPS signal remained unexplained by the models, while GRACE systematically agrees better with the GPS time series [37,38]. The mean annual amplitude of the vertical loading inferred from GRACE reached 85% of the GPS signal, whereas it is only 46-48% for the hydrological models [35]. Discrepancies also exist in the decadal trend between GRACE and global models over river basins. For instance, in Amazon, GRACE shows a large increasing trend whereas most models estimate decreasing trends [39]. They highlighted potential areas of future model development, particularly simulated water storage. A better insight on land-surface processes is requested to improve terrestrial water storage and regional climate models over South America [40]. All these studies confirm the importance of considering more complex modelling able to represent the main hydrological processes. In this context, GPS, GRACE, and more accurate hydrological models are essential to improve our understanding of the water cycle and impact of climate change over this area. One way we investigate in this study is to include surface waters (rivers) that are usually absent in the most hydrological models. Indeed, it is usually assumed that only soil moisture, snow/ice cover and canopy storage describe the continental water storage.
In this study, we assess new hydrological models that we have produced to include river runoff in the three larger river basins of South America: the Amazon, Orinoco and Parana. For this, we use the M-SSA (Multichannel Singular Spectrum Analysis) approach, a data-adaptive, multivariate and nonparametric method that simultaneously exploits the spatial and temporal correlations of time series to extract common modes. This allows consideration of the full 3D signal at the same time, whereas most of previous studies in this area focused on the vertical displacement. The SSA (Singular Spectrum Analysis) has already been used to analyze loading signals [40][41][42] for the vertical component and the effectiveness of the M-SSA method has been demonstrated for GRACE data analysis [43]. After generating the signal induced by the rivers, we added this contribution to the MERRAland [44] and GLDAS/Noah [45] models to generate the full hydrological loading signal. We considered these two models since they employ different forcing data and snow models that vary considerably. We apply M-SSA to compare GPS time series to the modelled time series computed using classical hydrological models, our new models including the river contribution and the loading signal inferred from GRACE observations. In Section 2, we present the atmospheric, oceanic and hydrological general circulation models used for the loading computations, as well as the river modelling (Section 2.1); we describe also the GPS time series used to assess these loading models (Section 2.2). We explain the M-SSA methodology and the Taylor diagram approach in Section 2.3. Our analysis results are shown and described in Section 3. Section 4 is devoted to the discussion and concluding remarks.

Materials and Methods
In this Section, we first present the loading models used for the comparison with the GPS observations, with a particular emphasis on the modeling of the river storage, as the MERRA-land [44] and GLDAS/Noah [45] models do not include the surface water component. We then describe the GPS solution used in this study. Finally, we explain the M-SSA technique used to extract the spatially and temporally coherent signals.

General Circulation Models
We computed surface deformation due to atmospheric and induced oceanic loads using surface pressure from the operational ECMWF (European Centre for Medium-Range Weather Forecasts) with the TUGO-m [46] barotropic ocean model forced by ECMWF pressure and winds (ATMMO). Hydrological loading effects were computed using total land water storage from GLDAS/Noah [45] or MERRA-land [44]. We also used the v2.4 version of the global iterated GRACE mascon solutions (update of [47]). More details regarding the loading computations can be found in [14,48].

River Modelling
As the two hydrology models used here, e.g. GLDAS/Noah [45] and MERRA-land [44], do not take into account the surface water storage (lakes, rivers, etc.), we developed a similar approach to [27] to model the surface water storage for the Amazon, Orinoco and Parana Rivers. The water balance equation in hydrology models is equal to: where s(t), p(t), e(t) and q(t) are respectively the total water storage (soil moisture, snow, canopy water), the total precipitation (rain and snow fall), the evapotranspiration and the total runoff (surface and subsurface). The river water storage was estimated by rerouting the runoff on the TRIP [49,50] river model at 0.5 • resolution, and computed using the continuity equation [7,9]: where h(t), u and L are respectively the river equivalent water height, the flow velocity and the distance to the downstream cell. The terms u L h(t) and ∑ i u i L i h i (t) are respectively the downstream flow and the flow from all upstream river cells.

Flow Velocity
The only unknown variable in the previous equation is the flow velocity u. We chose to adjust it individually for each river basin by fitting the equivalent water height h(t) from the model to water elevation from ENVISAT radar altimetry [51] and surface extent from MODIS (Moderate Resolution Imaging Spectroradiometer) [52] NDVI (Normalized Difference Vegetation Index) MOD13Q1 products [53]. Our river model was then validated by comparing the modeled discharge to observations from the Global Runoff Data Center (GRDC, https://www.bafg.de/GRDC/ accessed on 1 March 2021) and comparing total water storage (river and hydrology model) to GRACE mass changes for each individual basin. Figure 1 shows the TRIP routing scheme for the three major river basins in South America, as well as the localization of the 182 virtual stations where radar altimetry from ENVISAT satellite covering the 2002-2010 period are available (89 for the Amazon, 33 for the Orinoco and 62 for the Parana) which were used to estimate the mean river flow and the 14 discharge stations (ten for the Amazon, two for the Orinoco and two for the Parana) used for validation. We used altimetry measurements from the ENVISAT satellite, covering the 2002-2010 period. As each 0.5 • TRIP cell is not fully covered by water, h(t) in Equation (2) is an equiv-alent water height which cannot be directly compared to the radar altimetry measurements. We need to determine the variable surface water extent of each TRIP cell corresponding to the altimetric datasets. We adopt the methodology developed in [54] to derive lake water extent from 16-day MODIS/NDVI imagery [53]. We define water when NDVI values are lower than 0.1 and NIR (near infrared) reflectance is lower than 0.2.
The measured equivalent water height derived from radar altimetry and MODIS measurements over each 0.5 • cell is defined as: where h ALTI (t), h 0 , S MODIS and S tot are respectively the altimetric water height measurements, a reference altitude, the surface water extent over the TRIP cell and its total surface. For each hydrology model, we ran the river model using the continuity equation for different velocities. We then correlated the modeled h(t) (Equation (2)) to the measured h Measured (t) (Equation (3)) equivalent water heights and chose the optimal constant flow velocity u for each individual water basin. We found u equal to 30 cm/s for both GLDAS/Noah and MERRA-land for the Amazon River, 64 cm/s for both hydrology models for the Parana River. For the Orinoco River, the velocity was equal to 27 cm/s for GLDAS/Noah and 23 cm/s for MERRA-land. It is important to notice that the optimal flow velocity for the Amazon River was the same as in [27], in spite of different optimization methodology, as they correlated the total water storage of the entire basin with GRACE observed mass variations.

Validation of the River Models
We validated our river models by comparing them to independent datasets: (1) mean annual river discharge gauges from GRDC, and (2) total mass variations over each individual basin with GRACE estimates. Figure 2 shows the comparison between the mean annual modeled discharge for a selection of six gauges in the Amazon, Orinoco and Parana River basins and gauge measurements from the GRDC. Although our river modeling is rather simple, there is a significant good agreement between our optimal river model and the discharge data, both in terms of amplitude and phase. A perfect match between the modeled and the measured discharge cannot be expected, as real discharge can vary significantly along the river whereas our model has a low resolution (0.5 • ) and a constant flow velocity. Some of the gauges (for example Balsa Santa Maria on the Parana River) show strong interannual variability, represented by the error bars on Figure 2, which may not be computed on the same time period than our model. Improving the discharge modeling would require the use of Manning's equation and the full knowledge of the geometry of each river (cross section area, hydraulic radius and channel slope), variable in time (flooding seasons) and space. This is far beyond the scope of this paper, as we only want to model the surface water storage with sufficient accuracy to compute the induced loading effects. Our second validation technique was the comparison between the total mass storage, e.g., the sum of the land-water storage from the hydrology model and the corresponding river storage, to GRACE estimates. Figure 3 shows the mean seasonal storage variations for each individual river basin compared to the GRACE iterated global mascons (v02.4, update of [47]) for the period 2003-2016. As also shown by [27], a simple rerouting of the runoff from the hydrology model, added to their modeled land water storage, allows a good agreement with the modeled storage from the GRACE mission at seasonal timescales. GLDAS/Noah and MERRA-land hydrology models only explain half of the seasonal mass variations deduced from GRACE, in particular for the Amazon and Orinoco river basins. When we added the river storage, filtered at GRACE resolution, the agreement between the models and GRACE significantly improved spatially and temporally. As the surface water component was not negligible, we chose to improve our hydrological loading estimates, and the comparison with geodetic observations, by adding the surface displacements due to the river loading to the loading due to the soil-moisture, snow and eventually canopy water of the GLDAS/Noah or MERRA-land models, using the same methodology [5,6]. This modelled total hydrological loading signal could also be compared to the loading estimates using GRACE derived equivalent water height. We also computed the atmospheric and induced oceanic loading effects using ECMWF with TUGOm (ATMMO) dynamic ocean response to pressure and winds. Adding this contribution to the hydrological loading should explain most of the observed displacements.

GNSS Time Series
In this paper, we used the GPS daily Precise Point Positioning (PPP) 3D coordinate time series of continuous stations processed by the Nevada Geodetic Laboratory (NGL) [55]. These solutions were generated using GipsyX (version 1.0) [56] with the Jet Propulsion Laboratory (JPL) final Repro3.0 products, IERS 2010 Conventions [57] for solid Earth tides, pole tides and for Earth Orientation Parameters as well as FES2004 ocean tide model [58]. No nontidal loading effect was applied in the GPS data processing. The daily position estimates are expressed in the IGS14 reference frame [59].
We selected 247 permanent stations in South America ( Figure 4). In order to mitigate the effect of interpolation as much as possible, the selection of GPS sites was performed adopting the following criteria: time span not shorter than 5 years, common observation period with the model data sets no shorter than 2.5 years, the gap ratio could not be more than one third of the time series. The observation period spans from January 2003 to February 2016 and 60% of GPS time series were longer than 6 years (148 sites). The formal error of the selected stations was lower than 1 mm and 3 mm for the horizontal and vertical components, respectively. We corrected the time series offset discontinuities related to equipment changes or earthquakes using the Hector software [60] with the potential steps epoch database provided by the NGL. Then, we adjusted and removed a linear trend of the GPS time series and outliers on all the three components at each station. We defined outliers as data points that exceed two times the sigma value. Before signal analysis, we filled the gap in GPS time series using an autoregressive (AR) model.

Multichannel Singular Spectrum Analysis (M-SSA)
For extracting time-variable seasonal signals from both daily GPS 3D position time series and environmental loading models mentioned above, we used Multichannel Singular Spectrum Analysis (M-SSA). This advanced analysis method of time series exploits the fact that geophysical data are both spatially and temporally correlated. The method thus allows common modes of spatial and temporal variability to be extracted as empirical basis functions. Hence, the method is particularly suitable to unravel oscillations and trends embedded in noisy time series without any a priori assumption of their amplitudes or slopes, even when stationarity is not ensured.
Following the first theoretical works of the 1980-90s, the pioneer method known as Singular Spectrum Analysis (SSA) has been successively utilized with paleoclimatic time series [61,62] and global surface air temperature time series [63]. A comprehensive review of suitable methods, including SSA and M-SSA, for extracting useful information from geophysical time series may be found in [64]. Since then, the applications of SSA and M-SSA to time series from geophysics and space geodesy have not stopped developing. The value of SSA for extracting and predicting time-variable seasonal signals from weekly GPS position time series is illustrated in [65,66]. A systematic use of statistical indicators (e.g., Akaike Information Criterion [67]) and tests (Fisher-Snedecor test) is proposed in e.g., [68] so as to adjust SSA setting parameters for extracting periodicities of GPS ellipsoidal height time series. A systematic study of different analysis methods dedicated to GPS position time series presented in [69], demonstrated that SSA is as efficient as Kalman filtering regarding the percentage of the variance it can model. M-SSA is required when several time series are to be analyzed together [70]. The method is successfully applied in [71] for detecting transient-deformation signals in GNSS position time series. Other types of data can be processed by M-SSA as carried out in [43] with GRACE satellite data. Not less than 13 years of GRACE-estimated geopotential spherical harmonics coefficients provided by five processing centers are jointly analyzed therein, thus efficiently filtering undesired north-south stripes. Recent work has extended the use of M-SSA to the study of environmental loading models to be compared to space geodesy data coming notably from GNSS and GRACE [42,72]. These studies have undoubtedly shown that environmental loading models generally underestimate the amplitudes of seasonal signals observed by space geodesy, which motivated our own study.
The following is an overview of the mathematical formulation from which we implemented M-SSA. We drew heavily on seminal papers-e.g., [61,62]-where greater mathematical detail is given. The formulation is broadly inspired by the one proposed by [64].
Let there be a set of L time series and N the total number of data points evenly sampled at the sampling period T s . The duration of times series is thus (N − 1)T s . The case L = 1 can be treated by means of traditional single channel SSA. To be consistent with our own analysis, we henceforth focus on the case L = 3, assuming that the time series {X 1 (k), X 2 (k), X 3 (k), k = 1, 2, . . . , N} represent respectively the centered variations of East, North and Up position vector components at a given station. The starting point of M-SSA consists of determining the covariances of M time-delayed replicas of the original time series gathered in a fully augmented trajectory matrix D which is expressed as follows: where trajectory matrices X l , l = 1, 2, 3 may be expressed as: The integer M can be interpreted as the length of the window on which original time series are embedded. Its value must be carefully chosen so as to allow seasonal signals sought to be extracted whilst keeping safe statistical confidence. This issue can be tackled in a theoretical (e.g., [73]) or practical way as proposed in [42]. We adopted the latter in this study carrying out numerical experiments with synthetical time series. A value of M = 730, corresponding to a 2-year window with daily data, has proven to be well suited for resolving semiannual and annual time-varying signals. Equations (5) and (6) are then used to express the grand lag-covariance matrix C X given by: where D T denotes the transpose of matrix D.
Diagonalizing the 3M × 3M symmetric matrix C X yields 3M eigenvectors {E r , r = 1, 2, . . . , 3M}, also designated as Empirical Orthogonal Functions (EOFs), each associated with 3M eigenvalues λ r such that for all r ∈ {1, 2, . . . , 3M}: Each eigenvector E r consists of three consecutive M-long segments of elements denoted now as e r l (j) for l = 1, 2, 3 and j = 1, 2, . . . , M. The 3M single-channel time series A r (k) obtained by projecting the X l onto the EOFs represent the Principal Components (PCs); they can be therefore expressed as: where k varies from 1 to N − M + 1. At that stage, the so-called decomposition phase is over. The next phase consists in selecting a given number of EOFs in order to reconstruct the part of the original time series associated with these EOFs. Let R ⊂ {1, 2, . . . , 3M} be the subset of EOF indices on which the reconstruction is based. The Reconstructed Components (RCs) can then be obtained by convolving the corresponding PCs with the EOFs. Hence, the rth RC (r ∈ R) at time k (k ∈ {1, 2, . . . , N − M + 1}) for the lth position vector component (l = 1, 2, 3) is given by: where the normalization factor M k and the lower and upper bounds L k and U k respectively are expressed by: Once RCs determined, the periods of those that turn out to be periodic functions of time have to be accurately estimated. For this, we used the modified periodogram jointly developed by Lomb [74] and Scargle [75,76].

Taylor Diagram
The Taylor diagram [77] provides an easy-to-read graphical representation of key statistical values related to time series. Examining this diagram permits to assess how well matched are two times series in terms of their correlation, their root-mean-square difference, and their respective standard deviations. In this study, for each site, we systematically used the Taylor diagram to quantify the degree of similarity between one time series computed from observations and several time series computed from displacement models.
If X m (k) and X o (k), k = 1, 2, . . . , N, denote respectively one of the model-derived time series and the observation-derived times series and X m , X o their respective mean values, then the four key statistical values displayed in the Taylor diagram are: (1) the standard deviation σ m of X m (k), (2) (4) the Centered RMS difference (CRMS) D between X m (k) and X o (k) defined as: The four values (σ m , σ o , R, D) are simply related to each other by the following equation: The correlation coefficient reaches its maximum value of 1 when, for all k ∈ {1, 2, . . . , N}, X m (k) − X m = a X o (k) − X o , where a is a positive constant. Although perfectly correlated, the respective variations of the two time series nevertheless remain of different amplitudes unless a = 1. In that case, the corresponding RMS difference satisfies D = |σ m − σ o |, equal to 0 when σ m = σ o . Hence, in the general case, D is all the closer to 0 as both i) the correlation coefficient between the two time series is close to 1 and ii) the respective deviations of the two time series from their mean values are similar. To help the reader properly understand the results presented in Section 3, Figure 5 shows a synthetic example of Taylor diagram. In this figure, the point representing the observation time series X o (k) is plotted along the abscissa axis proportionally to its standard deviation σ o . The radial distance from the origin to the point representing the model-derived time series X m (k) is proportional to its standard deviation σ m , and the azimuthal position gives the correlation coefficient R between the two times series. The Centered RMS difference is directly proportional to the distance between the points representing X o (k) and X m (k), respectively. It is plotted along the abscissa axis using its standard deviation (i.e., σ o = 4.2 mm). The point corresponding to the model-derived time series X m (k) is represented by a bullet (•). The radial distance from the origin to this point is proportional to the standard deviation of X m (k) (i.e., σ m = 5.3 mm). The labeled circles measure the distance from the star and thus the Centered RMS difference between the two time series (i.e., D = 2.6 mm). The radial lines are labeled by the correlation coefficient between the two time series (i.e., R = 0.87).

Results
We applied the proposed analysis technique to compare the 3D positioning GPS cleaned time series and the deformation time series computed with the different models presented in Section 2 for all the 247 selected sites and the time span from 2003 to 2016. For each site, we extracted the annual and semiannual signals for each 3D time series with the M-SSA. We applied the Taylor diagram method to the horizontal and vertical components of each time series using GPS as the reference. We computed the following comparison indicators: standard deviation of each solution, correlation coefficient between GPS and each model, and the centered RMS difference between GPS and the model time series (CRMS). In this section, the initial loading models are referred as GLDAS/Noah and MERRA-land and the new ones after addition of the river loading contribution are labelled GLDAS/Noah + river and MERRA-land + river. For these models, ATMMO indicates the model we used for the atmospheric and oceanic parts of the loading signal to compute the full deformation time series. We present the different results obtained concerning the annual signal for the initial loading models and the new models adding the river contribution, as well as the loading deformation time series inferred from GRACE mascons (v02.4, update of [47]) for comparison. We do not present and analyze the semiannual term in this paper. For each parameter and indicator, we give the results overall the network considered (247 stations) and for the Amazon (25 stations) and Parana (40 stations) basins. Unfortunately, our selection criteria for GPS time series did not allow us to keep any stations in the Orinoco basin.

Annual Signal Amplitude
The first component (RC1) extracted from the M-SSA gives an estimate of the annual amplitude at each site for East, North, and Up components, respectively. The amount of the annual signal in the GPS times series can be estimated by calculating the percentage of RC1 (%RC1) defined as the ratio between the standard deviation of RC1 and the sum of the standard deviations for all the RCs. For all considered sites, the %RC1 for the vertical component ranges from 0% to 100% with a mean value at 47.1% and a standard deviation at 29.2%. For the horizontal components, the %RC1 varies in the same range, but the mean value of the North component (49.2%) is significantly different from the one of the East component (28.5%). The same comparisons carried out for the stations located in the Amazon basin and the Parana basin indicate a significant increase of the %RC1 mean values of both for the Up component at 63.8% and 59.5%, respectively. Similarly, an increase of %RC1 mean values can be observed for the North and East components, which reach the values of 52.1% and 43.4% respectively, for the Amazon and 58.6% and 31.1% respectively, for the Parana. These results suggest that the portion of the annual signal in the GPS time series is in the order of 60% for the stations located in the river basins and in the order of slightly less than 50% for the North and East components in the same zones.  [34,36,37].
To go further in the improved hydrological model assessment, we need to consider other comparison estimators that can be provided by the Taylor diagrams. Figures 8 and 9 illustrate the Taylor diagrams obtained for POVE and SAGA stations. For a given component at a given site, considering such a diagram gives clear indication of the agreement between the different models and GPS. Each color corresponds to a solution: GPS in black, MERRA-land + river model in orange, initial MERRA-land model in green, GLDAS/Noah + river model in red, initial GLDAS/Noah model in purple, and GRACE observations in brown. The improvement induced by the addition of the river contribution is clearly evidenced comparing the green to the orange and the purple to the red solutions. On the left part of the figure, the closer the points are to the reference point (black circle), the better the agreement between the model and the observations. The table gives in the first column the correlation coefficient (in %), in the second column the standard deviation (in mm), and in the last column the CRMS (in mm) with respect to the GPS for each solution in the same order as the color legend. The standard deviation for GPS (in mm) is given on the table legend. The histogram indicates the annual amplitude (in mm) for each solution, where the horizontal dashed line corresponds to the amplitude value for GPS. Considering these two examples, the lack of initial hydrological loading models and the huge improvement of the new models are clearly highlighted both in horizontal and in vertical. Nevertheless, we can still notice some discrepancies between the new models and GPS.      We compute the model improvement averaged for each river basin. Figures 10 and 11 are the Taylor diagrams for the Amazon and Parana basins, respectively. Considering the Up component for both basins which is the highest in terms of GPS amplitude (8.2 mm and 4.7 mm for the Amazon basin and the Parana basin, respectively), we can observe a significant improvement in the models including rivers for the Amazon basin (CRMS < 3 mm with a correlation maintained between 70% and 80% using GLDAS/Noah + river and MERRA-land + river). This improvement is more moderate for the Up component of the Parana basin where the initial models already give rather good results regarding the CRMS (<2.4 mm) and the correlation (>70% for MERRA-land). Looking now at the horizontal, East and North components for both basins, we have to bear in mind the very low amplitudes of the corresponding GPS signals (<1 mm for the East and <2 mm for the North), at the limits of detectability. However, the results indicate that the models with rivers perform better for the horizontal components of the Amazon basin (CRMS roughly divided by two) than for the Parana basin (CRMS and correlation almost unchanged). A slight asymmetry can also be observed in the Amazon basin where the new models seem to overestimate the East amplitude component (0.9 mm for GLDAS/Noah + river, 0.8 mm for MERRA-land, 0.7 mm for GPS), and underestimate the North amplitude (1.1 mm for GLDAS/Noah + river, 1.0 mm for MERRA-land, 1.3 mm for GPS). This could be explained by the West to East direction of the Amazon's runoff, which as the major river of the Amazon basin could have overcontributed to the models. It should be noted that the horizontal and vertical components derived from GRACE are generally consistent with those of the new models, regarding notably the amplitudes for the Amazon and the Parana basins e.g., 7.1 mm and 2.8 mm for GRACE Up component in the Amazon and Parana basins, respectively to be compared to 8.8 mm and 3.6 mm for GLDAS/Noah + river and 7.9 mm and 2.8 mm for MERRA-land, except for the East component of Parana. Such a consistency gives confidence in the reliability of the two models. The main point which deserves to be highlighted from our analysis, is perfectly illustrated by the behavior of the Up component in the Amazon basin. Whilst the initial models only explained nearly half of the amplitude of GPS (8.2 mm for GPS while 4.0 mm for GLDAS/Noah and 3.4 mm for MERRA-land), the new ones compensate for these gaps (8.8 mm for GLDAS/Noah + river, 7.9 mm for MERRA land) and remain consistent with GRACE (amplitude at 7.1 mm). These findings demonstrate conclusively the efficiency of the model with rivers in such basins as the Amazon.

Correlation Coefficient between the Models and GPS
The correlation coefficient between the annual signal extracted from the GPS time series and from the different loading model time series are computed for each site and each component. The higher the correlation value is, the higher the agreement between the considered model time series is in agreement with the GPS one. A phase shift between two time series tends to decrease their correlation regardless of their amplitudes (see Section 2, Section 2.4). This indicator can thus be interpreted as a marker of phase shift. Figure 12 illustrates the correlation coefficients for GLDAS/Noah and GLDAS/Noah + river models. The darker the color, the higher agreement between the two time series. The spatial analysis of these correlation coefficients gives an indication of the spatial distribution of the impact of the new models in relation to the different river basins. We can clearly notice the major improvement for the vertical component in the Amazon basin.
We compute the number of sites in each case for different classes of correlation coefficient values: lower than 70%, between 70% and 80%, between 80% and 90%, and greater than 90%. Table 1 gives the distribution of the corresponding values for each component and each dataset. For the whole station set, the correlation coefficient values range from 0% to 100% for all components. The improvement between the initial (labelled nR in the table) and the new models (referred as wR in the table) can be highlighted for instance by the reduction of the percentage of sites with a correlation coefficient lower than 70%. Regarding the values in Table 1, the improvement is not obvious considering this value over the entire network. Nevertheless, if we focus on the number of stations with a correlation coefficient greater than 80% and even greater than 90%, then the improvement is highlighted for the Up and the East components. For instance, in the case of GLDAS/Noah when we add the river contribution, for all the considered sites the percentage of stations with a correlation larger than 80% increases of about 1.2% for the East, decreases of about 7% for the North, and increases of about 4% for the Up component. The corresponding values for MERRA-land are slightly different: increasing of about 2% in East, decreasing of about 11% in North, and increasing of about 2% in vertical. Compared to GRACE, the largest discrepancies with the new models are experienced in the North component. The proportions of correlation coefficient higher than 80% with the new models are larger than GRACE values of about 6-12% in Up, 19-26% in North, and 9-11% in East. If we focus on Amazon basin, the proportion of stations with a vertical correlation coefficient larger than 80% changes from 44% to 60% and from 56% to 64% for GLDAS/Noah and MERRA-land, respectively. For the horizontal components, the introduction of the river contribution slightly increases this proportion for the North component and decreases in the East component, the worst case being with MERRA-land. For Parana basin, the improvement is less obvious. Compared to GRACE the number of sites with a correlation higher than 80% differs from 12-20% in East, 0-4% in North, and 8-12% in Up in the Amazon basin. The impact of the new models is slightly different since the number of sites decreases by 5-10% in East, increases by 23-30% in North, and decreases by 2-22% in vertical. Thus, as expected the impact of the addition of the river is not uniform over the different river basins.  Table 1. Distribution (in %) of the correlation coefficient between GPS and the different models for different subsets of stations: all stations (247), the Amazon basin (25 stations), and the Parana basin (40 stations), designated by the acronyms "All", "AB", "PB", respectively. The loading models without the river are provided in the "nR" columns, the ones including the rivers are in the "wR" columns.

Model Contribution to Explain GPS Annual Signal
To estimate the contribution of the GPS annual signal explained by the different models, we computed the ratio of the standard deviation STD(model)/STD(GPS) for each model. Figure 13 illustrates the results obtained for GLDAS/Noah and GLDAS/Noah + river for all the three positioning components. As indicated previously, on this figure the length of the arrows corresponds to the annual amplitude. The color scale corresponds to the standard deviation ratio. The closer the model is to GPS, the closer this value is to 100%. Lighter colors correspond to better agreement between the model and GPS. Red colors indicate an overestimation of the model larger than 150%. For the North component, the oversize arrow corresponds to MAPA (Macapá, Brazil, amplitude of 6.1 mm for GLDAS/Noah + river). Looking at the annual amplitudes, the model improvement for each positioning component can be observed comparing the lengths of the arrows of the left map (initial model) to the ones of the right map (new model). We can first notice that the addition of the river contribution produces larger amplitudes for the vertical component of stations localized in the river basins. Larger amplitudes are also extracted for the new models for the horizontal components. In addition, the spatial distribution of the standard deviation ratio highlights the overall contribution of the river in the GPS signal and the importance of including them in loading models. It is particularly obvious on the vertical and North components. Nevertheless, the new model overestimates the annual signal at some stations for the vertical and North components, and for a lot of stations for the East component. These results concerning the horizontal components should be considered with respect to the horizontal accuracy of the GPS time series used in this study (1 mm).  Table 2. Distribution (in %) of the STD ratio between the different models and GPS for different subsets of stations: all stations (247), Amazon basin (25 stations), and Parana (40 stations), designated by the acronyms "All", "AB", "PB", respectively. The loading models without the river are provided in the "nR" columns; the ones including the rivers are in the "wR" columns.  Table 2 gives the standard deviation ratio distribution for all the sites, and for Amazon and Parana basins. We computed the number of stations for different ratio classes: lower than 50%, between 50 and 85%, between 85% and 115%, between 115% and 150%, and larger than 150%. Values higher than 100% indicate an overestimation of the annual signal by the model with respect to GPS annual term. The standard deviation ratio values demonstrate the improvement brought by the new models for the full 3D deformation. For instance, considering the number of sites in the class between 85% and 115% before and after addition of the river contribution, the differences are clear. For the vertical component, the number of stations in this class increases by 11% for the entire network, 36% for Amazon and 23% for Parana basin. The corresponding values with MERRA-land are slightly different with 7%, 28%, and 5%, respectively. For the North components, the values increase by less than 2% for the entire network, 8% for the Amazon basin, and 2.5% for the Parana basin with GLDAS/Noah, and by −0.5%, 8%, and 2.5% with MERRA-land. For the East component, with GLDAS/Noah the number of sites in this class increases by 7% for the entire network, 12% in the Amazon basin and 10% in the Parana basin. The corresponding values using MERRA-land are 4.8%, 0% and 7.5%. In the case of GRACE, the number of sites having a standard deviation between 85% and 115% are 44%, 12% and 8% in the Amazon basin for Up, North, and East components, respectively. These values are smaller in the case of the Parana basin with 17.5%, 5%, and 2.5%, respectively. This indicates that the agreement between GPS and GRACE in terms of standard deviation of the annual signal is better in the Amazon basin. These relatively lower agreements are probably due to the differences between GPS and GRACE in terms of spatial resolution and sensitivity to the loading effect, since GPS is sensitive to both regional and local effects.

Model
These results evidence the localized importance of the river contribution in the river basins considering both the new models and the GPS and GRACE observations. Nevertheless, we can notice the slightly increasing proportion of sites where the new model overestimates the annual signal. Effectively, for instance, considering the proportion of sites in the class with a standard deviation ratio between 115% and 150%, it concerns an addition of about 7% with GLDAS/Noah and 4% with MERRA-land of sites for the entire network for the Up component. The corresponding values are about 3% for both models for the North component, and 8-9% for the East component. Focusing on the Amazon basin, with GLDAS/Noah, we notice an increase of 12%, 4% and 8%, for the Up, North, and East components, respectively. With MERRA-land, the differences are larger with corresponding values of 20%, 4%, and 16%. For comparison, in the Amazon basin, the GRACE solution shows 8%, 4%, and 11% of sites in this class for Up, North, and East components, respectively. Considering the stations with the larger overestimation, the ones in the upper class of standard deviation ratio, their number increases most in the Amazon basin for the East component for both models (more 24%). In the Amazon basin, GRACE exhibits a very small number of sites with such an overestimation for the vertical component, but the models largely overestimate the horizontal components for 16-20% of the sites in this basin. Again, this is most likely due to the spatial resolution difference between GRACE and GPS.

Centered RMS (CRMS) between the Models and GPS
The Taylor diagram analysis gives another useful agreement estimator between the GPS and model time series, the centered RMS (CRMS). Figure 14 shows the map of the CRMS for the annual signal extracted by the M-SSA for all the considered sites with GLDAS/Noah initial and new models for the East, North, and Up components. The smaller the CRMS value is, the closer the model and the reference are. This estimator clearly evidences the improvement of the new models in the river basins, mainly in the Up and North components.  Table 3 provides the distribution (in %) of the number of sites in different classes of the CRMS values. These classes are different for the horizontal and vertical components to take into account the relative difference in terms of accuracy and amplitude. For the horizontal components, the classes are: lower than 1 mm, between 1 and 2 mm, between 2 and 3 mm, and larger than 3 mm. For the vertical component, the classes are: lower than 3 mm, between 3 and 6 mm, between 6 and 9 mm, larger than 9 mm. For the vertical component over all the sites, the addition of the river contribution slightly increases the number of sites with a CRMS lower than 3 mm (from about 85% to 93% with GLDAS/Noah and from about 84% to 91% with GLDAS/Noah). The new values are close to the GRACE one (about 90%). For the horizontal components, the differences of number of stations with a CRMS lower than 1 mm are quite small (improvement for about 6% in North and 1% in East), but again the values are closer to GRACE ones. Focusing on the Amazon and Parana river basins, we can see that the addition of the river contribution has an effect on the East component only with MERRA-land. For the North component, the impact is quite different with a larger impact on the sites located in the Amazon basin (more 24%) than for Parana basin sites (more 8%). This contrast is even more obvious considering the vertical component, with an increase of 36-40% in the Amazon and only 8% in the Parana. Each time the models correctly explain the GPS observations, the CRMS falls below the significance level (GPS accuracy). Table 3. Distribution of the CRMS (in %) for the different models for different subsets of stations: all the stations (247), Amazon basin (25 stations), and Parana (40 stations), designated by the acronyms "All", "AB", "PB", respectively. The models without the rivers are provided in the "nR" columns, the ones including the river are in the "wR" columns. The horizontal and vertical classes are different to take into account the relative difference in terms of accuracy and signal amplitude.

Discussion
We have demonstrated that surface water is a significant contributor to seasonal mass variations in South America [27] which cannot be neglected when comparing GPS observations to hydrological loading estimates. Using a simple river routing scheme [49] forced by the runoffs from global hydrological models, we have shown that it is possible to determine the seasonal surface water changes with sufficient accuracy for loading estimates or comparing to time-variable gravity measurements from the GRACE and GRACE-FO missions. Almost half of the seasonal hydrological mass variations are stored in the river and the floodplains for the Amazon, Orinoco and Parana river basins.
We analyzed a set of almost 250 permanent GPS stations in South America over a 13-year period using the M-SSA technique. This tool allows to extract the common signal in the observed vertical and horizontal displacements, filtering out any local perturbation. Adding the river storage, we show and quantify the higher correlation between GPS observations and loading estimates, at the station level, but also at the river basin and continental scales. The improvement is particularly clear considering the CRMS (Centered Root Mean Square) values. Extracting this common mode is particularly interesting when comparing GPS measurements to loading models over such a large area. We show in our study higher correlation between the total loading estimates (atmosphere + ocean + hydrology, even without taking into account the river storage) than [14] over all equatorial regions. Indeed, [14] analyzed each individual vertical time series independently, then computed average over 10 • latitude bands, and showed a minimum reduction of variance after correction of loading effects at low latitudes. Their analysis was then more sensitive to local deformation or mismodelling, unlike ours.
For all three components, the GLDAS/Noah model [45] including the river storage variations is in better agreement with GPS observations than MERRA-land [44] and also GRACE mascons [47]. The lower spatial resolution of GRACE solutions may explain some of the lower correlation with GPS displacements, compared to the hydrological models. The agreement between GPS and loading estimates are slightly lower for the Parana river basin than for the Amazon river basin; this is most likely due to smaller variations (on average a standard deviation of 3.6 mm for the vertical compared to 6.4 mm), leading to an increase sensitivity to the GPS noise. For stations in close vicinity of the rivers, compared to the river routing resolution (here 0.5 degree), such as Manaus (NAUS), a higher resolution of the river model may be required to avoid loading overestimates [29].

Conclusions
Further investigation will help to understand why new models exhibit overestimations of the annual signal at some sites. This study could also be extended to other areas with dense permanent GPS networks, such as Europe [78] or Northern America. In spite of smaller hydrological variations, loading effects cannot explain entirely GPS observations [14,78]. Adding river storage to the hydrological models using similar procedure should help us increase the agreement between loading estimates and deformation observations. Recent advances in remote sensing for hydrology should be considered to improve the river contribution modelling [79]. At higher frequencies, monitoring the deformations due to river floods using nearby GPS stations requires better river modelling using Manning's equation and the knowledge of the river geometry (cross section area, hydraulic radius and channel slope).  Data Availability Statement: GPS timeseries were obtained at NGL (http://geodesy.unr.edu/ accessed on 1 March 2021). Hector software is available at (http://segal.ubi.pt/hector/ accessed on 1 March 2021). Maps were obtained using GMT software (https://www.generic-mapping-tools.org/ accessed on 1 March 2021). We thank Hydroweb (data available at http://hydroweb.theia-land.fr/ accessed on 1 March 2021) research groups for providing water level data used in this study. Most of the loading estimates are available at the EOST loading service (http://loading.u-strasbg.fr accessed on 1 March 2021).