Water Balance Standardization Approach for Reconstructing Runoff Using GPS at the Basin Upstream

: While in-situ estuarine dischargehas been correlated and reconstructed well with localized remotely-sensed data and hydraulic variables since the 1990s, its correlation and reconstruction using averaged GPS-inferred water storage from satellite gravimetry (i.e., GRACE) at the basin upstream based on the water balance standardization (WBS) approach remains unexplored. This study aims to illustrate the WBS approach for reconstructing monthly estuarine discharge (in the form of runoff ( R )) at Mekong River Delta, by correlating the averaged GPS-inferred water storage from GRACE of the upstream Mekong Basin with the in-situ R at the Mekong River Delta estuary. The resulting R based on GPS-inferred water storage is comparable to that inferred from GRACE, regardless of in-situ stations within Mekong River Delta being used for the R reconstruction. The resulting R from the WBS approach with GPS water storage converted by GRACE mascon solution attains the lowest normalized root-mean-square error of 0.066, and the highest Pearson correlation coefficient of 0.974 and Nash-Sutcliffe efficiency of 0.950. Regardless of using either GPS-inferred or GRACE-inferred water storage, the WBS approach shows an increase of 1%–4% in accuracy when compared to those reconstructed from remotely-sensed water balance variables. An external assessment also exhibits similar accuracies when examining the R estimated at another station location. By comparing the reconstructed and estimated R s between the entrance and the estuary mouth, a relative error of 1%–4% is found, which accounts for the remaining effect of tidal backwater on the estimated R . Additional errors might be caused by the accumulated errors from the proposed approach, the unknown signals in the remotely-sensed water balance variables, and the variable time shift across different years between the Mekong Basin at the upstream and the estuary at the downstream.


Introduction
River freshwater discharge, being expressed in the form of runoff (R) near estuary mouths, is a significant water balance variable of a river basin [1,2]. It is essential to capture floods and droughts in river deltas, and to prepare for potential economic losses [3][4][5][6]. However, insufficient funding for facility operation [7] has resulted in a decreasing number of in-situ gauges around the world [8]. Therefore, an approach for estimating river freshwater discharge in an ungauged basin is sought. variables, the inferred R from the water balance equation could be expected to outperform the correlative relationship between each individual water balance variable and in-situ R, let alone the inferred R from water balance standardization (WBS). This WBS approach and its corresponding results are the objectives of this paper.
Mekong Basin, being our study region with a catchment area of ~795 000 km 2 , is the most important river basin in Southeast Asia [34] (Figure 1). Freshwater transports from the Northeastern Tibetan Plateau to Southeast Asian countries and the Mekong River Delta, where it is affected by both freshwater discharge and ocean tidal backwater seasonally [35,36] in addition to the R adjustment due to the Tonle Sap Lake prior to transporting to the open ocean [37,38]. Along the main stream, Lancang River, located in Yunnan, China, is a significant upstream portion of the Mekong Basin that is climatically driven by an Indian monsoon [39]. Any changes of hydrological conditions upstream would pose an adverse effect on human beings, particularly involving agricultural and economic losses downstream [40]. This reason calls for research studies using upstream remotely-sensed observations for downstream river freshwater discharge estimation [41].
On the other hand, the Mekong Basin is substantially modified by human activities. Dam construction and operation at the upstream have been issues since the 1990s [42], which alters the R at the downstream in different seasons [43]. Subsequently, water security becomes a significant issue among different Southeast Asian nations, especially for the Mekong River Delta where it is important for the fish supply [44] and water security [45] of Southeast Asia. However, the influence of dams is not large enough as to alter the flow consistency [46]. Previous studies also indicated that the influence on the annual change of R of the Mekong River Delta is insignificant [42]. The accumulated effect of dam operations in the estuary area is almost systematic for a specified month every year [47]. Therefore, the subtraction and standardization process should be able to mitigate the accumulated bias. Previously mentioned reasons justify a potential use of the WBS approach in the upstream to reconstruct the downstream R time series located in the estuary.
This study explores the applicability of the GPS VCD at the upstream Mekong Basin employing the WBS approach for reconstructing the R time series of a gauge station at the Mekong River Delta on a monthly temporal scale. Based on the reconstructed relationship, the R time series estimated at another location in the river delta are then compared with an independent in-situ R for an external assessment. Other remotely-sensed hydrometeorological variables (RSHMVs) from TRMM and MODIS are used as baselines for direct comparison to show the applicability of the GPS VCD.

Time Series of In-Situ Estuarine Discharge Gauges
The in-situ estuarine discharge data are requested at http://www.mrcmekong.org. To be consistent with GPS time span, the estuarine discharge time span was extracted for a period between 2012 and 2014. Given the above geographic description, the in-situ estuarine discharge gauge stations should be chosen to be representative for the whole Mekong Basin, while minimizing the backwater effect due to ocean tides and the total discharge adjustment effect attributed to Tonle Sap Lake [38].
Tan Chau and Chau Doc stations, ~220 km away from the estuary mouth and located at the Mekong River Delta entrance, were chosen for reconstructing R in this study ( Figure 1). Can Tho and My Thuan stations, which are the nearest stations before the coastlines, were also employed for assessing the effect due to backwater on the estimated R time series. To reduce the backwater effect due to short-period (i.e., half-daily period and daily period) ocean tides, the discharge time series of Tan Chau and Chau Doc stations (hereafter called the TC-CD station) were summed up. The same process was conducted for Can Tho and My Thuan stations (hereafter called CT-MT station).
To convert daily estuarine discharge (m 3 per sec) into R (mm per month), the daily estuarine discharge data were added up each month, which was followed by dividing the catchment area of the basin. No matter which pair of stations (either CT-MT station or TC-CD station), both resulting time series shared the same fluctuation pattern, despite differences in R peaks and troughs ( Figure  2).

Remotely-Sensed Water Balance Variables
Except for the indirectly determined runoff R, other water balance variables (P, ET, and S) can be determined, respectively, from TRMM, MODIS, and GRACE, as mentioned above. To be consistent with the GPS time span, these water balance variables from 2012 to 2014 were used. TRMM measured P (hereinafter called TRMM-P) with a global coverage bounded between 50° N to 50° S [48]. We used monthly gridded P data (TRMM 3B43 version 7) with a 0.25°×0.25° spatial resolution available at https://disc.gsfc.nasa.gov/datasets/TRMM_3B43_V7/summary. These TRMM data were generated from calibration using all precipitation gauges around the world [49].
MODIS measured and inferred a wide variety of environmental variables, including (hereinafter called MODIS-ET). We used the gridded ET data (MOD16A2) that covers an area from 80° N to 60° S with a 0.5°×0.5° spatial resolution. This dataset, calculated by an improved algorithm using Penman-Monteith equation [50], is made available by the Numerical Terradynamic Simulation Group in the University of Montana (http://www.ntsg.umt.edu/project/modis/). TRMM-P and MODIS-ET are generalized as remotely-sensed hydrometeorological variables (RSHMV) in this study.
GRACE measured time-variable gravity, and, thereby, inferred global S (hereinafter called GRACE-S). The Center for Space Research (CSR) Release-06 (RL06) and its RL06-mascon solution were employed because the above two solutions were developed by two different data pre-processing techniques for their consistency to be validated from the resulting R reconstruction. Because the monthly mascon solution was developed via Tikhonov regularization on regular spatial grids [51], the monthly RL06-mascon solution can directly be used (http://www2.csr.utexas.edu/grace/RL06_mascons.html). However, the monthly RL06 solution are expanded to 60° (equivalent to a spatial resolution of 3°) in the form of spherical harmonic coefficients representing the mass changes (http://www2.csr.utexas.edu/grace/RL06.html). Therefore, post-processing steps are required for the monthly RL06 solution prior to their use. These steps include adding degree-1 term with measured geocenter time series generated from Satellite Laser Ranging [52], and replacing C20 term in the spherical harmonic coefficient to improve the second zonal coefficient, respectively [53]. De-striping and 350-km radius Gaussian filtering (hereinafter called RL06-G350) are then applied to attenuate spatially correlated errors [54]. The monthly S time series (in terms of Equivalent Water Height at a regular grid) are calculated using Equation (14) in Reference [55], which is divided by the water density.

Data Processing for GPS-Determined VCD and its Conversion into Water Storage (S)
We used GAMIT version 10.4 [56] to preprocess 33 GPS stations' raw observations in the upstream Mekong Basin to determine daily VCD time series from 2012 to 2014, provided by Crustal Movement Observation Network of China. We employed a network solution that is stochastically constrained to 24 IGS stations surrounding China (i.e., posing 5-cm standard error in 3D positioning) in which the IGS stations are in ITRF2008 coordinate reference frame.
Standard procedures were applied during the GPS pre-processing steps. For instance, the antenna offsets were corrected by the IGS provided antenna correction data files and the non-tidal atmospheric loading was removed by using the MIT correction data files. The orbits were constrained to the final precise ephemeris of IGS. The ionospheric delay was corrected up to the third-order term by choosing options in GAMIT. The tropospheric delay was corrected by the combination of the Vienna mapping function 1 and the global pressure and temperature model [57]. Earth Orientation Parameters were set to a priori values, and the solid Earth tide and pole tide were corrected, according to the International Earth and Rotation Service (IERS) Bulletin B standard [58]. The ocean tide loading was removed by choosing the FES2004 model option in GAMIT. Lastly, the daily GPS VCD time series, being a time series of relative height positions, were determined by subtracting the height from its average.
During the post-processing steps, the non-tidal ocean loading in the GPS VCD time series was corrected externally by the modeled non-tidal ocean loading displacement from a Global Geophysical Fluid Center (http://geophy.uni.lu/). Gross errors exceeding twice the standard deviation were removed. To suppress the signal aliasing and draconitic errors (e.g., ~351 days [59,60]) in the seasonal signal, we applied a spectral filtering in the frequency domain via Fast Fourier Transform technique. Apparently, the first peak (i.e., 1 cycle per year) was recovered ( Figure 3). We also observed that the peaks and troughs of the GPS time series were reduced after inversely transforming the filtered GPS spectra back into their respective time series. To be consistent with other monthly data, the daily GPS VCD time series were averaged to form the VCD on a monthly scale.
In principle, the time-varying VCD of each GPS station location is due to all neighboring S contributions [61]. Therefore, the gridded GRACE-S data near the single GPS station location should be acquired to conduct a spatially-weighted averaging process with respect to that single GPS station location. In this study, we set a search area within 3° for a number of i gridded GRACE-S data with respect to each GPS station location g. This spatially-weighted averaging for N number of gridded GRACE-S at the GPS station g, denoted as g S , is achieved by the following equation.
serves as a weighting factor according to each distance i d , with respect to the station g, with a spatial scale D set to 3° due to the previously mentioned spatial resolution of GRACE data and allowance for including the hydrological loading effect farther than 3°. In essence, GPS VCD is related to S elastically in a linear fashion [62], which can be expressed as Therefore, by a simple linear fitting with a slope α , and an offset β at the same time epoch t, the parameters α and β can be determined via least-squares solution in order to convert VCD into S. Note that α should be a negative value because S loads (unloads) the crust that yields the VCD downward (upward).
Only averaged GRACE-S fitted at the GPS station g with a Pearson correlation coefficient (i.e., Equation (7)) higher than 0.8 were used to calculate the mean α and β (Table 1)

Reconstruction Based on Correlation and Water Balance Standardization
Each of the previously mentioned time series of remotely-sensed water balance variables within a preset square bounding Yunnan Province were averaged and smoothed before the correlation analysis. Employing the traditional practice in remote sensing, the R reconstruction is conducted by directly correlating the remotely-sensed water balance variables with the in-situ R via a simple linear model (i.e., an offset c and a slope d), which is expressed as: where t y and t x are, respectively, the in-situ R and individual remotely-sensed water balance variable at month t. Note that a forward two-month shift for TRMM-P and MODIS-ET data was applied because this procedure yielded the highest correlation, which is attributable to hysteretic properties of the hydrological process [21]. The parameters, c and d, are empirically determined via a least-squares solution in which the determined parameters are then employed to reconstruct R using individual time series of the remotely-sensed water balance variables. Figure 4 visualizes all the time series, including GPS-derived S (hereinafter called GPS-S) from CSR RL06 with 350-km Gaussian filtering and its mascon solution from 2012 to 2014. Similar temporal fluctuations with the in-situ R are observed. However, variable time shift across different years is detected for the upstream averaged GRACE-S when compared to the in-situ R (Figure 4). We speculate that the climatic variability across different geographic zones of the entire Mekong Basin causes the spatial differences in the water storage every year [63,64]. The water storage from GRACE RL06 solution displays a slightly smaller amplitude than that of mascon solution. This is because 350-km radius Gaussian filtering was applied to RL06 solution that attenuated regional signals, while the time-variable regularization matrix was employed to solve for the mascon solution via Tikhonov regularization, which would not attenuate the regional signals [51]. The GPS-S exhibits a slower downward trend against the in-situ R from January to April 2014. This might be because GPS is a ground-observed technique that is more sensitive to local events or changes in water storage [26]. Overall, all individual time series of the remotely-sensed water balance variables, including GPS-S with 350-km Gaussian filtering and its mascon solution, exhibit similar temporal patterns allowing the direct use of Equation (4) for the R reconstruction, referred to as "reconstructed Rs." For the correlation procedures of the WBS approach in this study, the in-situ R are standardized and correlated with the WBS of R. The standardization of the in-situ R, s , , is calculated as where , is the in-situ R, ( ) is the median of the in-situ R, and is the standard error of the in-situ R for month k and year j. Equation (5) is also applied to the WBS of R, where R is derived by subtracting ET and S from P as when calculating corresponding standardization. Note that ∆ , is the difference between month k+1 and k of year j of GRACE-S (or GPS-S). Equation (4) was further employed to empirically determine the corresponding c and d between the standardization of the in-situ R and the R from the WBS. This procedure allows the determination of the reconstructed Rs based on the WBS approach.
Note that the reconstructed Rs that are directly used to compare against the utilized in-situ R time series using performance indicators illustrated in the next sub-section refers to the internal performance. The subsequent usage of the determined parameters for the Rs estimated at another locations with independent in-situ R time series in the river delta refers to estimated Rs. The estimated Rs are then compared against the independent in-situ R time series that refers to external performance. Both performance assessments examine the feasibility of our presented methodology.

Performance Indicators
The reconstructed and estimated Rs are assessed based on the utilized and independent in-situ R, respectively, by using the following three performance indicators.
Pearson correlation coefficient (PCC), being a measure between two variables ranging from negative one to positive one, is defined by the equation below.
Normalized root-mean-square error (NRMSE), being a measure of RMSE normalized by the maximum ranges of observations, is a relative accuracy indicator defined by the formula below.
The Nash-Sutcliffe efficiency (NSE) model coefficient [65], being an indicator to evaluate the efficiency gain of the estimated R against the in-situ R time series, ranges from negative infinity to one. The closer the NSE value to one, the better the efficiency of the estimated R. It is defined by the equation below.

Results and Evaluation
In this section, we examined internal and external performances of the reconstructed and estimated Rs generated from TRMM-P, MODIS-ET, GRACE-S, GPS-S, and the WBS approach using GRACE-S and GPS-S. Note that the CT-MT station is 120-150 km closer to the estuary mouth than the TC-CD station. The combined internal and external performance assessments of both station pairs could help quantify the remaining portion of a systematic ocean tidal backwater effect on both the reconstructed and estimated Rs because the time series of R of each station pair have been summed up to mitigate the ocean tidal effect.
The reconstructed Rs from the TRMM-P, MODIS-ET, GRACE-S, and GPS-S display similar temporal pattern of in-situ R, despite slight discrepancies observed in peaks and troughs ( Figure 5). The reconstructed R from TRMM-P appears to be the best because the in-situ R should be highly dependent on precipitation, as displayed in the performance indicators (Table 2).
Compared with the above reconstructed Rs from the TRMM-P, MODIS-ET, GRACE-S, and GPS-S, the reconstructed Rs based on the WBS approach using the GRACE-S and GPS-S achieve better results ( Figure 6). In particular, the discrepancy in peaks and troughs are reduced regardless of reconstruction based on GRACE-S and GPS-S ( Figure 6). The WBS approach is based on the principle of water balance (i.e., Equation (6)). The systematic errors in the remotely-sensed water balance variables cancel each other through subtraction among themselves, let alone in the standardization in Equation (5). Hence, the reconstructed R from the WBS approach yields the best performance among all reconstructions in the study region. In general, all the estimated Rs are slightly less accurate than the reconstructed Rs (Table 2 and Table 3), whereas the relative ranking of their performances also remains the same, no matter whether reconstructed at either CT-MT or TC-CD station. By examining the differences between the estimated Rs at CT-MT and TC-CD stations in terms of the performance indicators shown in Table 2 and Table 3, the usage of TC-CD station accounts for a 1%-4% decrease in the relative error (in terms of NRMSE) when compared to that at the CT-MT station. This should be a percentage of the remaining ocean tidal backwater effect at the CT-MT station in the estuary because the summation process of the R time series (as mentioned in Section 2) has already reduced the ocean tidal effect. In addition, the WBS approach yields a 1%-4% decrease in a relative error (in terms of NRMSE) for the CT-MT station, while yielding an average of a 1.65% decrease for the TC-CD station when compared to the remotely-sensed water balance variables. This represents a substantial improvement because the peaks and troughs are better captured in which accurate peaks and troughs are particularly essential for a comprehensive economic loss assessment during flooding and drought events. The R estimated from the GPS-S using the WBS approach results in the highest PCC (i.e., 0.959) and NSE (i.e., 0.907), and the lowest NRMSE (i.e., 0.089). Overall, the proposed WBS approach using the upstream averaged GPS-S is empirically shown to be competitive among the estimated Rs from the remotely-sensed water balance variables in the study region.
Nonetheless, remaining errors might still exist. These include the deficiency in our methodology, unknown signals in the remotely-sensed water balance variables, and variable time shift across different years between the Mekong Basin upstream and the estuary downstream. Moreover, in-situ discharge time series in the river delta is needed for reconstruction. The chosen GPS stations should be better located on the bedrock surface so that the seasonal elastic deformations can be clearly detected. The previously mentioned considerations represent limitations of this study.

Conclusions
We have investigated the applicability of the water balance standardization (WBS) approach based on using the GPS-inferred water storage (GPS-S) from satellite gravimetry (i.e., GRACE) averaged from the upstream of Mekong Basin for reconstructing the runoff (R) on a monthly scale at the Mekong River Delta estuary mouth. We acquired that the reconstructed R based on the WBS approach using GPS-S reaches at least a PCC of 0.97 and an NSE of 0.93 at both the Can Tho-My Thuan station as well as the Tan Chau-Chau Doc station pairs, which are comparable to those obtained based on the gravimetrically-inferred water storage (GRACE-S). This finding indicates that the GPS-S could be considered as an alternative to the GRACE-S within the water balance context under the WBS approach, even though this method still relies on GRACE-S. In contrast to those R reconstructed from the remotely-sensed water balance variables, the WBS approach shows a 1%-4% increase in the accuracy.
We conducted the external assessment of the estimated R using an independent in-situ R. We found out that the estimated and reconstructed Rs presented similar accuracy. By comparing the reconstructed and estimated Rs at the Can Tho-My Thuan and Tan Chau-Chau Doc stations, we found out that the remaining ocean tidal backwater effect on the estimated Rs accounted for a relative error of 1%-4%, even though the in-situ time series have been summed up to mitigate the ocean tidal backwater effect.
The R reconstructed based on the WBS approach from the GPS-S attains the lowest NRMSE value (i.e., < 9%) in which the remaining errors are the limitations of this approach. This might include the internal error in our methodology, the unknown signals in the remotely-sensed water balance variables, and the variable time shift across different years between the Mekong Basin upstream and its estuary downstream.
The proposed WBS approach using upstream-averaged GPS-S could be further expected to estimate the discharge at a sub-basin scale in addition to achieving at a higher temporal scale since the GPS VCD also offer relatively good precision at a daily temporal scale.
Author Contributions: H.S.F. designed the research, gathered data, conducted experiment, obtained funding, supervised the formal analysis, and wrote the manuscript, including review and editing. L.Z. conducted the entire formal data analysis. Y.L. performed GPS data post-processing and visualization. Z.M. conducted GRACE data post-processing and visualization. R.T. and F.Z. gave valuable advice, review, and editing. All authors have read and agreed to the published version of the manuscript.