Water Budget Closure in the Upper Chao Phraya River Basin, Thailand Using Multisource Data

: Accurate quantiﬁcation of the terrestrial water cycle relies on combinations of multisource datasets. This analysis uses data from remotely sensed, in-situ, and reanalysis records to quantify the terrestrial water budget/balance and component uncertainties in the upper Chao Phraya River Basin from May 2002 to April 2020. Three closure techniques are applied to merge independent records of water budget components, creating up to 72 probabilistic realizations of the monthly water budget for the upper Chao Phraya River Basin. An artiﬁcial neural network (ANN) model is used to gap-ﬁll data in and between GRACE and GRACE-FO-based terrestrial water storage anomalies. The ANN model performed well with r ≥ 0.95, NRMSE = 0.24 − 0.37, and NSE ≥ 0.89 during the calibration and validation phases. The cumulative residual error in the water budget ensemble mean accounts for ~15% of the ensemble mean for both the precipitation and evapotranspiration. An increasing trend of 0.03 mm month − 1 in the residual errors may be partially attributable to increases in human activity and the relative redistribution of biases among other water budget variables. All three closure techniques show similar directions of constraints (i.e., wet or dry bias) in water budget variables with slightly different magnitudes. Our quantiﬁcation of water budget residual errors may help benchmark regional hydroclimate models for understanding the past, present, and future status of water budget components and effectively manage regional water resources, especially during hydroclimate extremes.


Introduction
The terrestrial water cycle governs water and food security, hydrologic extremes, and ecosystem health [1]. Continually increasing human activities are altering the global and regional terrestrial water balance directly (e.g., water abstraction and infrastructure development) and indirectly (e.g., deforestation and increasing atmospheric greenhouse gases alter hydroclimate) [2,3]. Amid this integrated complexity between the natural water cycle and human activities, it is imperative to quantify water cycle components and their governing factors towards effective management, efficient water allocation, sustainable and strategic planning, and policymaking, especially for socioeconomically sensitive hydrologic systems [4][5][6][7]. Basin-scale assessments of the terrestrial water budget quantify hydrologic dynamics on various temporal scales (e.g., annual, seasonal, monthly) and uncertainties for each water budget component. These assessments provide knowledge of the mean state and variability of the water budget, which is fundamental to understanding the regional climate system and characterizing memories, pathways, and feedbacks between key energy, water, and biogeochemical cycles [1,7].
Although ground-based observations (i.e., in-situ data) of water budget components are often considered true, long-term assessments solely based on these data remain challenging due to inherent limitations regarding their spatiotemporal heterogeneity, sampling errors, high costs of installation and maintenance, and intermittent data gaps, especially in data-scarce (ungauged) or data-limited regions [3,[7][8][9]. To overcome this, multisource strategies, i.e., combinations of ancillary datasets from remote sensing observations, reanalysis models, and offline land surface models, have proven crucial to monitoring water storage and fluxes in a changing world [1,9,10]. Although multisource data can provide spatially continuous estimates of the terrestrial water budget, they have several drawbacks. For instance, remote sensing data is subject to sensor and signal processing uncertainties, and models typically lack the fully dynamic coupling of human impacts on the water budget and propagate uncertainties associated with metrological forcing data and parameterizations [1,2,9,11]. These error sources in quantifying constituent water budget components result in a water budget imbalance, i.e., residual error, which does not satisfy the physical mass balance constraint [1].
Further, multisource strategies may provide conflicting information if proper attributions of closure errors are not considered. Several deterministic (e.g., weighted average corresponding to error variance; [5,7]) and probabilistic [2] approaches have computed estimates of water budget components from multisource datasets while carefully accounting for uncertainties that are used to force water budget closure. Since each dataset contains unique information, we adhere to the ensemble approach proposed by Abolafia-Rosenzweig et al. [2] to probabilistically quantify the terrestrial water budget while enforcing closure on each unique realization. Enforcing water budget closure across multiple unique realizations from unique combinations of datasets (up to 72 realizations) reduces the risk of unknown bias cancellation (i.e., inaccurate datasets providing minimal water budget residuals).
River basin management, the backbone of many economies, relies on understanding historical and projected fluctuations of water budget components: precipitation (P), evapotranspiration (ET), streamflow (Q), and change in terrestrial water storage (∆S) [12,13]. Thus, long-term records of basin-scale water budgets and associated uncertainties for each constituent component are of paramount importance. We provide a water budget assessment over the upper Chao Phraya River Basin (CPRB), Thailand. The CPRB, historically known as "Thailand's rice bowl", inhabits about 40% of the country's population and contributes to about 2/3rd of the country's GDP, thus affecting the nation's overall socioeconomic development and water and food security [4]. The CPRB is vulnerable to hydrological extremes and has suffered flood and drought events in the past (e.g., 2004,2007,2011,2015), which cost Thailand more than one billion US dollars and severely disrupted industrial production of global supply chains in surrounding countries [4]. Moreover, the basin has an increased potential for severe drought, as revealed by the recently developed Drought Potential Index (DPI) [4]. Further, the CPRB has experienced up to a 73% increase in water withdrawal from 33.1 km 3 in 1990 to 57.3 km 3 in 2007 against an annual aquifer recharge of 41.9 km 3 (5-6% of the total precipitation) [12], which is projected to decrease in the near future due to natural and anthropogenic perturbations (e.g., climate change and increasing water demands) [14]. Notwithstanding the importance of understanding hydrologic dynamics in the CPRB, previous global (e.g., [1,2,5,7,15]) or regional (e.g., humid to arid regions in the South Central United State [16], nine major US river basins [17], Mississippi River basin [9], 16 large drainage basins in Canada [18], Mackenzie River basin [19], Upper Paraguay River Basin [8], Xingu basin [20], Rufiji basin [21], Amazon basin [22]) terrestrial water budget assessments either do not consider the CPRB or put forward the bulk estimate of water budget components of the larger region. Furthermore, limited studies have applied closure techniques to attain the physical constraint of mass balance in their respective study regions. Thus, our study advances the state of hydrologic science through a novel water budget assessment of a valuable but understudied and vulnerable region. The comprehensive understanding of CPRB's hydrologic dynamics and uncertainties/biases in constituent water budget components can inform water resources management, particularly for irrigation planning (water withdrawals and allocation strategies), flood and drought mitigation, drainage system design, and groundwater recharge estimation in the region [4,18].
This study is motivated to fill the aforementioned gaps in previous research by addressing the following questions over CPRB, (1) how much do state-of-the-art multisource data vary among themselves, and how well does an ensemble mean represent the regional water balance? (2) what is the direction and magnitude of relative biases in various water budget components that contribute to the imbalance in water budget closure? (3) what is the seasonal and annual variability of corrected components? We answer these questions by reconciling various hydrologic data products through the application of three water budget closure techniques proposed by Abolafia-Rosenzweig et al. [2]. We further discuss the long-term implications of corrected water cycle components in the context of CPRB water management as a pathway for future research.

Materials and Methods
Since the Nakhon Sawan gauge station ( Figure 1) is the most downstream station where the flow records are not contaminated by the factors such as tidal oscillations, upstream diversion or intake, the target study area is limited to the upper CPRB in this study. All other reanalysis and remote sensing data were selected and retrieved as per the spatiotemporal availability of individual datasets. Stream gauging station at Nakhon Sawan (C2 station) is shown as the orange-filled circle. Various mechanisms (west monsoon from the Andaman sea during May to October, south winds from the Gulf of Thailand during February to April, northeast monsoon during November to January, and tropical cyclones originating from the South China Sea) governing the climate of the basin are also shown.

Study Area
The upper CPRB is an agrarian river basin located in northern Thailand (Figure 1 inset) consisting of four sub-basins: the Ping, Wang, Yom, and Nan River basins, with a total catchment area of~102,600 km 2 (∼20% of the country's land area). The basin's climate is primarily governed by alterations between northeast monsoons, south-west monsoons, southern winds, and intermittent tropical cyclones. The upper CPRB receives an annual rainfall of 1100 mm and has a mean annual runoff of about 26 km 3 , which is 12.2% of the country's annual runoff [12]. The basin's water cycle is affected by human interventions such as reservoir management (water storage and release from Bhumibol and Sirikit reservoirs with capacities of 13.5 and 9.5 billion m 3 , respectively) and groundwater withdrawals [4]. More information on various salient features of the CPRB can be found in Abhishek et al. [4], and hence not repeated here. The research framework illustrating different datasets used, methods employed, and analyses carried out in this study is summarized in Figure 2 below. Schematic showing the multisource data used, research flow adopted, methods employed, and major analyses conducted in the current study. The 72 combinations were obtained by a permutation of the available products (4(P) × 3(ET) × 1(Q) × 6(∆S)) = 72). Bias and conventional errors were assessed as the standard deviation from the ensemble mean (for P, ET and ∆S) or by the predefined values based on the literature (for Q), followed by the application of three water budget constraining techniques.

Water Budget Closure
Considering the landmass as a one-dimensional column and neglecting the lateral fluxes, the terrestrial water budget at the basin scale for a given month can be written as, where r is the residual error. ∆S is calculated as the central difference of terrestrial water storage anomalies (TWSA) (i.e., ∆S t = (TWSA t+1 − TWSA t−1 )/2) of the monthly TWSA from the Gravity Recovery and Climate Experiment (GRACE) and GRACE-Follow on (GRACE-FO). Similar to previous studies [7,17,23], VIC-derived ∆S is found to agree more with this ∆S than that calculated using forward or backward difference. Further details of processing GRACE data are provided in Section 2.4. To be consistent with the other data products, TWSA is first linearly interpolated to daily values before performing monthly ∆S calculations [5]. Since multisource datasets contain unique information, probabilistically generated combinations (4(P) × 3(ET) × 1(Q) × 6(∆S) = 72 in total) of unique datasets are used following [2]. The mathematical representation of enforcing the water balance closure is notated in Equation (2): where r is the water budget closure error after correcting all the individual components against their respective errors (ε). Initial and enforced water budget, Equations (1) and (2), respectively, can be rewritten in the form as, where x is the corrected variable (P , ET , Q , and ∆S ).

Enforcing Water Budget Closure
We apply three available closure techniques, namely Constrained Kalman Filter (CKF) [1], Multiple Collocation (MCL) [24], and Proportional Redistribution (PR) [2], to enforce water budget closure in upper CPRB. Since the true value of any variable (P, ET, and ∆S) is unknown, the ensemble means across all datasets for respective variables are assumed to represent true values [7]. For Q, since the gage type and stage-discharge relationship were not available, runoff volume and associated uncertainties are assumed to be inversely proportional, as reported in an observational study [25]. We used a time series of uncertainty linearly varying from 2.3% to 28.8%, corresponding from highest to the lowest flow (Q).

Constrained Kalman Filter (CKF)
Assuming that errors in all water budget components follow zero-mean Gaussian distributions and are non-correlated among various estimates, the error covariance for an estimate can be represented as, where x and x t are the state estimates, and its true values, respectively. Water budget closure constraints (i.e., ε) that assume non-correlated errors [5] take the form, Equation (5) was further solved for ε xx and Kalman gain in accordance with Pan and Wood [26] and Pan et al. [1].

Multiple Collocation (MCL)
Assuming all the N (=4,3,1, and 6 for P, ET, Q, and ∆S, respectively) sets of estimates (with n monthly values in each estimate; n = 216 in our study) of a variable are unbiased, the distance between any two estimates (x i and x j ), measured as root mean squared (RMS) distance can be written as, Considering the errors of all estimates are uncorrelated (i.e., two errors are orthogonal), the Pythagorean Theorem in Hilbert space becomes, where, subscript t denotes the true value of a water budget component which, in the current study, is assumed equal to the ensemble mean for P, ET, ∆S. The Pythagorean constraints for any system (e.g., N = 3) can be written by Equation (8) in vector form [24], where, for example, d 2 23 is the mean squared distance between estimates 2 and 3. Moreover, since there is no exact solution for over-constrained (N > 3; ∆S in the current study) systems, MCL uses a least-squares solution to minimize Pythagorean distances of the constraints to reach the best 'compromise' [24]. Water budget closure error is then redistributed to the various components proportional to their relative distance (not the magnitude like in PR; explained below) from the true value (ensemble mean).

Proportional Redistribution (PR)
PR is the simplest method assuming that the relative errors for any given water budget component at a given time step are proportional to its relative magnitude [2], where i corresponds to the ith element, and other parameters have values as explained earlier.

Data Used
Datasets of water budget components are from reanalysis, remote sensing, and observed in-situ records, spanning May 2002 to April 2020 (Tables 1 and 2). All water budget components used in this analysis are basin averaged monthly time series represented as equivalent water depth (mm). Data products showing favorably similar inter-annual and seasonal dynamics are used for calculating the coefficient of variation (CV = standard deviation/ensemble mean).

Precipitation and Evapotranspiration
Precipitation (rainfall + snowfall) and evapotranspiration (evaporation + transpiration) are the largest incoming and outgoing water fluxes in any hydrologic system, respectively, and play significant roles in the distribution and availability of water resources in a region [27]. Traditional in-situ point-based monitoring of both P and ET is costly, restricted to local scales, and prone to large uncertainties, especially in the data-limited regions like the upper CPRB (with a meager rain gauge density of 0.5 rain gauges per 1000 km 2 ). Therefore, to get the spatially distributed continuous time series, we have retrieved data from multiple sources. A detailed list of various gridded datasets used for precipitation and evapotranspiration, their sources, and spatiotemporal resolutions are shown in Table 1.

Terrestrial Water Storage
Various datasets for terrestrial water storage (TWS) and their salient features are listed in Table 2. Spherical harmonics (SH) coefficients, used in TWSA products, are processed for noise reduction at high frequencies (low wavelength) using low-pass filtering (e.g., 300 km Gaussian filtering, truncation at the maximum order and degree 60, de-striping), which would have inevitably led to signal errors [28][29][30]. These errors consist of bias (leakage-out; signal loss within specific study area due to the basin function/filtered averaging kernel) and leakage (leakage-in; contamination/gain in the target signal from the surrounding region) errors [28,31]. To restore true signals, a number of model-dependent (additive [32], scaling [33], and multiplicative [31]; all of which primarily use outputs from the hydrologi-cal models) or model-independent (i.e., data-driven [34]) approaches have been employed in globally distributed river basins. Since the data-driven approach outperforms other methods in the upper CPRB, we employ the same for restoring GRACE signals. Overall, an inter-comparison of filtered GRACE TWSA and ∆S shows strong agreement between all products. The variance (scatter RMS varying from 12.5 to 19.5 mm) among different GRACE products range within error bounds of the GRACE data with no significant biases, which is consistent with previous studies [2,30]. Since there is minimal signal loss attributed to the regularization and post-fit residual analysis, no signal restoration procedures are required for GRACE Mascon (mass concentration) solutions [35]. We assume no bias between GRACE and GRACE-FO data following recent studies that reported negligible intermission biases over the Central United States, Middle East, Europe, Australia [36], and ice caps and glaciers [37,38]. Derived by the multi-channel microwave and IR observations from satellites, followed by rescaling based on gauge observations, summing (and applying a factor of three) 3-hourly valid retrievals in a grid cell.
Huffman et al. [40] Intercalibrates and merges the satellite microwave precipitation estimates with microwave-calibrated IR satellite estimates and gauge data using the quasi-Lagrangian time interpolation.
CHIRPS-2.0 0.05 • × 0.05 • Daily Funk et al. [41] Incorporates satellite data from NASA and NOAA, and in-situ station data followed by the removal of systematic bias based on IR CCD observations. Jointly uses the atmospheric general circulation model, atmospheric assimilation system (including modern hyperspectral radiance and microwave observations), an interactive aerosol scheme, and the observed precipitation at the land surface. A harmonization and quality control of the individual input solution level is performed, followed by application of variance component estimation. The resulting COST-G combined gravity fields are validated by assessing their signal and noise content in the spectral and spatial domain.

CNES GRGS Spherical
Harmonics RL05 GSFC monthly regularization matrices are determined by analyzing the geographical binning of the inter-satellite range-acceleration pre-fit residuals. The 1-arc-degree equal-area values have been placed on an equal angle 0.5 • × 0.5 • grid. Land values are determined with a least-squares estimator that conserves mass over each region.

Artificial Neural Network
GRACE, operational from April 2002 to June 2017, and GRACE-FO, operational from June 2018 onwards, monitor Earth's gravity at an unprecedented spatiotemporal resolution [57]. GRACE products have been revolutionary in understanding terrestrial water storage dynamics, sea-level change, melting of polar ice, and climate change [10,58]. GRACE products have frequent gaps, especially post 2011, and an 11-month gap between GRACE and GRACE-FO datasets (see footnote of Table 3 for detailed information of data gaps). We developed a multilayer perceptron artificial neural network (ANN) model for filling data gaps in and between GRACE and GRACE-FO TWSA time series. The ANN, a nonparametric modeling technique, seeks to learn the functional mapping (f) between a set of predictors or states (x) and the target or output variable (y), when there is a lack of conventional physics-based models. Mathematically, ANN can be represented as below [4,59] y = f(x) + ε where ε is the process noise. The ANN model consists of three components: an input layer (predictors), a hidden layer, and an output layer. A combination of monthly time series of various hydroclimatic (temperature, precipitation, evapotranspiration) and water storage components (soil moisture storage, surface runoff, TWS from GLDAS) were tested as predictors during model calibration, and the best performing set was further used for model validation and prediction. The hidden layer of the ANN is comprised of hidden neurons whose numbers were optimized with the coupled genetic algorithm and finalized as six, corresponding to the minimum root mean square error and the maximum Nash-Sutcliffe efficiency during model calibration. Each neuron in the hidden layer is the weighted sum of the predictors, where, a k is hidden neuron (k = 1, 2, . . . 5), w ki is the weight associated with the input layer i (= 1, 2, . . . P) and neuron k, and w ki is the bias term. The weighted sum is then utilized to calculate the output variable (i.e., TWSA) via a transfer function as below, where, z k is the output, and Ψ is the transfer function. Building upon a previous study [4], we utilized the ANN model for gap-filling between GRACE and GRACE-FO and other data gaps due to battery management. TWSA time series from May 2002 to June 2017 (GRACE) and June 2018 to April 2020 (GRACE-FO) were used for calibration and validation, respectively. Model performance during both calibration and validation phases is assessed using various statistical parameters, including the Pearson correlation coefficient (r), normalized root mean square error (NRMSE), and Nash-Sutcliffe efficiency (NSE).

Variability among Precipitation and Evapotranspiration Data Products
Precipitation: Although all precipitation products have similar dynamics in both monthly and seasonal time series (Figure 3a,b) and are highly correlated (r ≥ 0.9 except for PERSIANN-CDR), there are substantial deviations in magnitude (as high as 150 mm in July). Despite the adjustments with GPCP data [60], PERSIANN-CDR, in general, underestimates (average~110 mm per month, equivalent to~35%, during the rainy season) precipitation relative to the ensemble mean; meanwhile, other products fluctuate between overestimation or underestimation. PERSIANN-CDR tends to underestimate heavy precipitation (≥20 mm d −1 ) events and therefore results in the dampened monthly and seasonal cycles, and hence was excluded from further analysis. Deviations among data products, even within the same category (satellite-related, reanalysis, gauge-based), depends on a multitude of factors such as location, topography, climate (e.g., arid and semi-arid regions possess higher discrepancies among various products than in humid regions; [61]), physical inadequacies (e.g., including few cloud-related parameters, or few gauge observations) of various retrieval algorithms (e.g., in PERSIANN-CDR case), interpolation processes, and bias adjustments [62]. Incorporation of few but overlapping gauge data in sparsely gauged basins, like the upper CPRB, can also result in a smaller spread among various precipitation products, possibly due to limited uncertainties in gridding and undercatch correction procedures [1]. Consistent with the tropical climate regime, GPM, TRMM, CHIRPS, and GPCP show precipitation patterns with reasonably consistent distributions (maximum seasonal variation of 86 mm in July) and hence were considered for further analysis (Figure 3). The average rainy season CV is 10.5% (maximum of 15.7% in July) and non-rainy season CV is 32.4% (May to October); the latter is due to the near-zero precipitation values from all four products. The first precipitation peak in May is from the onset of the western monsoon season (May-October) and tropical rainfalls from the South China Sea. The second peak in August-September (variable among products) corresponds with monsoon rainfall and tropical storms in the basin, and the dry season rainfall is attributable to southern winds and the northeastern monsoon ( Figure 1). Evapotranspiration: ET products reveal complex monthly and seasonal time series over the basin due to high spatiotemporal variability in soil moisture, meteorological conditions, and other phenological factors [7,63]. Despite similar dynamics in annual and seasonal cycles, the three ET products possess a widespread (i.e., less agreement) relative to the precipitation products (Figure 3), which is consistent with previous water budget assessments [1,2,8]. Seasonal dynamics of all three ET products have more stable behavior than P (Figure 3c,d), attributable to the humid climate and intense irrigation (by surface water and groundwater in both wet and dry seasons), which reduces ET's dependence on soil moister availability [64]. GLEAM underestimates ET, whereas GLDAS provides overestimates, especially during the dry period. Underestimation by GLEAM is likely attributable to inadequate representation of the evaporation from water bodies in the region and underestimation of canopy interception, vegetation optical depth, and the water extent in the rainforests. Overestimation by MERRA in the tropical CPRB is consistent with the Zhang et al. [7] Climate Data Record (CDR). Average dry and wet seasonal ET variations are comparable with the CV of 13.4% and 10.8%, respectively.

Observed Runoff
The mean monthly runoff recorded at the Nakhon Sawan C2 station ranges from

Generating Continuous TWSA Time Series
Since all six TWSA time series from various data processing centers utilize common GRACE and GRACE-FO data, they agree favorably in terms of amplitude and dynamics for both the monthly and seasonal time series (Figure 5a,b) Figure A1) which can be ascribed to (a) small uncertainties in GRACE data, primarily from the short-wavelength signals and subsequent filtering processes which tends to be smaller for large areas, (b) ample predictor availability from ancillary data sources, and (c) the embedded genetic algorithm avoids underfitting or overfitting in the ANN model. An example for the modeled and observed time series is shown in Figure A1 to highlight the ANN model's performance during calibration and validation phases. The current ANN model is used for data gap filling between GRACE and GRACE-FO TWSA time series (11 values) and for filling intermittent data gaps (23 values) occurring approximately every six months starting from 2011; the latter of which has been filled by linear interpolation of the two or more bounding values in previous studies. This linear interpolation may induce uncertainties by (a) underestimating the actual (positive/negative peak) TWS if the data gap happens to be in the peak of the wet or dry season, or (b) overestimation or underestimation in case of the high short-term fluctuations in the TWS. Furthermore, these instances of overestimation or underestimation might lead to inappropriate inferences in the river basins like CPRB, which is highly vulnerable to floods and droughts. Therefore, we have attempted to quantify various water budget components as accurately as possible.
A number of floods (e.g., 2011) and droughts (e.g., 2015, 2016) experienced in the basin are evident in the TWSA time series [4]. Subsequently derived ∆S agrees well with a minimal scatter of <20% (Figure 5c,d). Some minor discrepancies among TWSA products can be attributed to different processing algorithms and correction and filtering methods used by various data centers. However, all variations are well within the error ranges of the GRACE data with no significant biases, which is consistent with the previous studies [2,30].  Table 3) is the output from the ANN model (details can be found in Figure A1). The shaded region represents the months of missing values (a total of 34 values). Since the value of ∆S is almost zero in October, leading to a very high CV value (>100%), the exact value is not shown in (d) for simplicity.

Raw Ensembles of the Water Budget and Residual Error
Seasonal variations of water budget components (shown in Figure 6)-excluding estimates from model products (e.g., GLDAS) which neglect anthropogenic activities-possess embedded effects of both natural and human-induced climate variability (e.g., reservoirs management, groundwater abstraction, irrigation, etc.) and has strong seasonality. Generally, P controls the other water cycle components during the rainy season (May to October) since the anthropogenic activities are minimal during this period, with an opposite situation in the dry season (November to April) ( Figure 6). P attains a minimum of~7-10 mm (December-February) and a maximum of 267.73 mm (September). ET has a minimum of 50 mm (February-March) and continually increases from May to October from the onset of the monsoon season with a maximum of 123.24 mm (October). ∆S is primarily driven by the net precipitation (P-ET) and shows a minimum of −105 mm in December and a maximum of 103 mm during June to August. Maxima and minima of all the water budget components have the lag behavior varying between one to three months from each other. The solid colored lines represent the ensemble mean of either the data products used in this study (for P, ET, Q, ∆S) or the mean of residuals from 72 combinations (for residual error). The darker shaded regions represent the range of the values of the respective variable (for P, ET, Q, ∆S) and the error range corresponding to the different combinations of the data products (for residual error) used in the water budget. The lighter shaded regions (for P, ET, ∆S) represent the 95% confidence interval.
The mean monthly residual error averaged across all combinations of water budgets is 14.37 mm month −1 for the basin and is consistent with the previously reported tendency for positive residuals in lower latitude basins [2]. The mean monthly error accounts for about 15% (consistent with Rodell et al. [5]) and 16% of the ensemble mean of monthly P and ET, respectively. An increasing trend of 0.03 mm month −1 in residual errors may partly be attributed to increased human interventions. Specifically, the CPRB has hosted a 10% increase in the area equipped for irrigation from 5830 × 10 3 ha to 6415 × 10 3 ha from 2002 to 2020 in Thailand; [65], which is expected to drive increasingly negative biases in model-based ET due to missing irrigation information. Although overall variations of the residual error in the basin are minimal (mean of the % variation = −0.02%), the residual error is more frequently positive than negative and generally possesses relatively high fluctuations from April to November ( Figure 6). The tendency of the residual error to be positive can be mainly explained by either wet biases in precipitation or by dry biases in the ET and ∆S (or a combination of these factors).
Since ground-truthing for most of the water budget variables is unavailable in datalimited/data-scarce river basins such as the upper CPRB, a set of data products having large errors in different directions (e.g., underestimated P, overestimated ET, Q, and ∆S in case of positive residuals) may erroneously lead to minimal or even near-zero residuals in the water balance resulting in misleading interpretation and implications. Therefore, three water balance closure techniques of varying mathematical complexity are applied to attain the physical water balance constraint for each unique realization of the water budget. Since an independent and uncorrelated a priori bias in various datasets cannot be known, we assume corrected values as 'true'; note that the development of more robust and advanced techniques may lead to more realistic results.

Comparison of Three Water Budget Closure Techniques
All three closure techniques perform similarly across water balance components without presenting unidirectional characteristics (Figure 7). The proportional redistribution (PR) method generally provides the lowest and highest closure constraints (Equation (9)) for P, ET, and Q in case of wet and dry biases, respectively (Figure 7). ∆S closure constraints are persistently positive during the dry season. P closure constraints are typically negative (indicating a wet bias in P), while those for ET, Q and ∆S have opposite signs during respective time periods. Although closure constraints are minimal during October-November in P, other components (Q, ET, and ∆S) have dry biases attributed to the negative residual during these months. Since Q is observed in-situ, it generally possesses low constraint values except for the month of October (corresponding to the highest Q), particularly from the PR approach. Despite the mixed behavior of closure constraints in seasonal time series, the mean annual variability of various components reveals wet biases in P and Q and dry biases in ET and ∆S (Figures 8 and A2). The absolute magnitude of these biases follows the order of variability as ∆S > P > ET > Q. One notable inference is that negative (or positive) closure constraints for P do not necessarily imply positive (or negative) constraints in the remaining components on annual scales. All components may have similar constraints (same sign/direction: positive or negative) but with different magnitudes, as revealed in the annual time series (e.g., the year 2012; Figure 8). Wet bias (overestimation) of P or dry bias (underestimation) of ET, especially in the rainy season, may lead to misinterpretation of regional flood and drought potential and projections. Similarly, accurate assessments of Q and ∆S may inform dam reservoir operations (water release amount and timing) and agricultural water management (e.g., partitioning and dependence on surface water and groundwater extraction).

Long-Term Variations and Trends in Corrected Water Budget Components
We calculate long-term annual means of various water storage components (P, ET, Q) from 2003 to 2019 and compared them with a previous study [66] focusing on global estimates for the same period and other global studies. Long term means of P, ET, and Q for the upper CPRB are 1355 mm yr −1 (global mean 811 mm yr −1 , [66]), 1086 mm yr −1 (1.8 to 3.3 times larger than global estimates based on various methods [5,66]), and 210 mm yr −1 (global mean of 366 mm yr −1 [66]), respectively. Uncertainties of all the water budget components are decreased after applying closure constraints (shaded areas in Figure 8). Corresponding uncertainties for corrected P (σ P ), Q (σ Q ), ET (σ ET ), and ∆S (σ ∆S ), calculated as the standard deviation across data products, are 10.4, 4.7, 9.12, and 2.6 mm month −1 , respectively. Intercomparison of the annual trends in raw and corrected components reveal that P has the highest change (raw vs corrected trends: −1.85 to −4.67 mm yr −1 ) followed by ET (−1.94 to −1.09 mm yr −1 ) and Q (−4.38 to −4.25 mm yr −1 ). Corrected long-term linear trends reveal a 1.0% decrease in P, 8.6% increase in ET, and 13% decrease in Q (+3%, +10%, and −6% change in P, ET, Q, respectively, globally [66]). ET/Q has an increasingly significant trend, which suggests that P is increasingly partitioned in ET rather than Q (i.e., decreasing runoff efficiency). Increases in ET can be associated with the joint impact of ENSO events with increasing temperature and irrigation in the region [4]. This result agrees with the premise of intensification of the water cycle in a warming climate [66]. Although the majority of deforestation in CPRB occurred in the late 20th century (especially in the 1970s and 1980s), ongoing dynamic and heterogeneous land use land cover changes in the basin may partially influence ET trends. The observed increase in ET/Q indicates a potential overreliance on groundwater in the future if the current trend persists. Further human influences of irrigation and reservoir management may be implicated in our water budget closure assessment. For example, corrected ET is larger than raw ET. This computed underestimate is consistent with expected biases from ET products that do not fully account for irrigation.
Additionally, the long-term average of ∆S is persistently positive, even in some drought years (e.g., 2015) ( Figure 8). This may be partially attributable to dam reservoir operations in the region that maintains water in CPRB or to the relative redistribution of biases among other water budget variables (P, ET, and Q); the latter of which can simply be a mathematical artifact and should be investigated further. These observationally-based signals of human interventions with the water cycle are important to consider for risk mitigation associated with extreme hydrologic events and agricultural water management in the basin. We acknowledge uncertainties and limitations associated with retrieval algorithms, models' physical structures, systematic biases, and limited data records implicit in data products used herein. Thus, future water budget assessments over CPRB can benefit from the use of higher spatial and temporal resolution data with longer records followed by downscaling/reconstruction (e.g., [67]). Continuously increasing density of rain gauges, ET flux towers, and groundwater monitoring will further assist in understanding the complex interactions of human-induced changes and climate variability on regional hydrology.

Conclusions
In this study, we have quantified relative biases in water budget components to better understand regional earth system processes and water dynamics in the upper Chao Phraya River Basin from May 2002 to April 2020. The basin is highly vulnerable to hydroclimatic extremes, plays a key role in the policymaking in Thailand and subsequently in neighboring countries, and possesses a complex existing water governance framework with a multitude of institutions (as many as 31 ministerial departments, among other national and autonomous agencies; [12,68]). The major findings of our study are summarized below,

•
Various data products, which are based on remote sensing, reanalysis, and in-situ observations, have similar seasonal and annual dynamics, and the variations across products can be attributed to the different retrieval algorithms, model inadequacies, and other biases. • The ANN model has the potential to fill data gaps between the GRACE and GRACE-FO TWS time series (11 months missing values) and for the sporadically missing values due to battery management practices (a total of 23 missing values during the study period); the latter of which is otherwise filled by using the linear interpolation of the bounding values and thus may underestimate the TWS especially during the peak of the dry or wet season. • Generally, P tends to have wet bias while ET, Q, and ∆S have dry biases. Seasonal time series of the closure constraints of various water budget components reveal a mixed behavior, while the absolute mean annual variability follows the order as ∆S > P > ET > Q. Interestingly, a negative (or positive) closure constraint in P does not necessarily imply a positive (or negative) constraint in the remaining components on an annual scale. Knowledge of wet or dry biases of individual components of the water cycle has potential implications for water resources, climate, and agriculture in the basin. For example, correction of an overestimation (underestimation) implicit to P (ET) in the rainy season may avert inaccurate flood (drought) projections.

•
The magnitude and sign of closure constraints on various water budget components using the three closure techniques and the resultant partitioning of the water cycle when combined with long term trends in various hydrological and water storage components in our companion study [4] can be effectively used to inform water management especially for mitigation of the adverse effects of drought and floods and for water availability in the basin.
• Although the currently employed water balance closure techniques utilize the unique information from the available scatter of various data products for any given component, the derived combination, though physically consistent, may not be most accurate. This could happen because of similar biases (e.g., dry bias in both P and ET), biases with opposite directions, correlation among various components arising from geophysical variabilities, or simply due to the mathematical artifact [69]. However, the results may serve as the basis for the starting point of getting insights into water availability and other hydrological applications for guiding potential users.
Our assessment provides insights into relative uncertainties of observed and modeled water budget components for the study period and may be utilized for enabling efficient water management and related policymaking. Results may be useful for further segregating the impacts of natural and anthropogenic factors on hindcasted or forecasted water budget components and subsequent potential adaptation measures. The improved understanding of regional water cycle dynamics is of utmost importance and has potential in benchmarking regional hydroclimate models for better understanding, quantification, and predicting the distribution and variability of the terrestrial water cycle and ultimately leading to reduced water budget imbalance.