Improving the Resolution of GRACE Data for Spatio-Temporal Groundwater Storage Assessment

: Groundwater has a signiﬁcant contribution to water storage and is considered to be one of the sources for agricultural irrigation; industrial; and domestic water use. The Gravity Recovery and Climate Experiment (GRACE) satellite provides a unique opportunity to evaluate terrestrial water storage (TWS) and groundwater storage (GWS) at a large spatial scale. However; the coarse resolution of GRACE limits its ability to investigate the water storage change at a small scale. It is; therefore; needed to improve the resolution of GRACE data at a spatial scale applicable for regional-level studies. In this study; a machine-learning-based downscaling random forest model (RFM) and artiﬁcial neural network (ANN) model were developed to downscale GRACE data (TWS and GWS) from 1 ◦ to a higher resolution (0.25 ◦ ). The spatial maps of downscaled TWS and GWS were generated over the Indus basin irrigation system (IBIS). Variations in TWS of GRACE in combination with geospatial variables; including digital elevation model (DEM), slope; aspect; and hydrological variables; including soil moisture; evapotranspiration; rainfall; surface runoff; canopy water; and temperature; were used. The geospatial and hydrological variables could potentially contribute to; or correlate with; GRACE TWS. The RFM outperformed the ANN model and results show Pearson correlation coefﬁcient (R) (0.97), root mean square error (RMSE) (11.83 mm), mean absolute error (MAE) (7.71 mm), and Nash–Sutcliffe efﬁciency (NSE) (0.94) while comparing with the training dataset from 2003 to 2016. These results indicate the suitability of RFM to downscale GRACE data at a regional scale. The downscaled GWS data were analyzed; and we observed that the region has lost GWS of about − 9.54 ± 1.27 km 3 at the rate of − 0.68 ± 0.09 km 3 /year from 2003 to 2016. The validation results showed that R between downscaled GWS and observational wells GWS are 0.67 and 0.77 at seasonal and annual scales with a conﬁdence level of 95%, respectively. It can; therefore; be concluded that the RFM has the potential to downscale GRACE data at a spatial scale suitable to predict GWS at regional scales. and GWS datasets to GRACE-derived TWS and GWS indicates the best performance at a monthly scale. The comparison of GRACE-derived GWS and downscaled GWS with observational wells GWS is in good agreement (R = 0.70, 0.67) at the seasonal scale. The statistical comparison of all the datasets shows low RMSE. These results provide the necessary conﬁdence to downscale GRACE products for the estimation of groundwater monitoring in the study region.


Introduction
Groundwater has become the main source of irrigation and domestic water supplies. It is being abstracted at an alarming rate due to its easy access and wider distribution. Recently, groundwater abstraction and consumption have been increased due to an increase in population and, consequently, stress on agriculture. The level of groundwater is continuously declining due to over-abstraction as well as deteriorating water quality. The situation demands continuous groundwater monitoring and management at a regional scale [1][2][3][4]. Traditionally, a network of observational wells is used to monitor groundwater behavior at a regional scale. However, establishing such networks is not only costly but also tiresome to collect data. The spatial distribution of such a network is uneven, which hinders the capture of variability in groundwater [5]. Therefore, the study of changes in groundwater storage at a large scale is normally constrained because observational wells data have less density and limitation of the spatial extent [6].
To this end, the deficiency is fulfilled with the usage of data from an earth gravity satellite, i.e., GRACE. It provides the opportunity for direct measurement of spatial changes in water storage with high precision and monthly resolutions [7][8][9][10][11] at a large-scale basin with reasonable accuracy [12][13][14]. Using the GRACE data, GWS can be obtained continuously at regular intervals as compared to monitoring groundwater with a network of piezometers, which is helpful to overcome the problems of scarcity and uneven distribution of piezometers in the study area. Scanlon et al. [13] observed regional water storage changes using GRACE data with an error of less than 1 cm for an area of more than 400,000 km 2 . Richey et al. [14] estimated global groundwater storage variation and found severe abstractions in various regions over the world, including the IBIS from 2003 to 2013. Tang et al. [15] applied the Budyko model to reconstruct annual GWS using GRACE and other multiple input variables and observed the depletion of GWS in Punjab province, Pakistan, from 1980 to 2015. Iqbal et al. [16] reported significant groundwater storage loss in this province from 2003 to 2010 using GRACE data. Although GRACE is being used widely for measuring GWS changes, its lower spatial resolution is a constraint for using this product at a small scale [13,[17][18][19]. Therefore, it is essential to downscale GRACE data at a reasonable resolution that could be useful for estimating GWS changes at regional scales.
Two main approaches of spatial downscaling available are dynamic downscaling and statistical downscaling [17,20,21]. Dynamic downscaling is used to develop a numerical model at a higher resolution, which can be applied at smaller scales in local areas. While statistical downscaling approaches are applied using observations taken over several years to develop the relationship between data from large-scale and data from small-scale observations. Different from the dynamic downscaling approach, which requires complex data from multiple sources, the statistical downscaling approach is flexible to develop modeling factors at a higher resolution. For example, multivariate statistical regression uses fewer auxiliary data and is less time-consuming; hence, it significantly improves the spatial resolution of hydrological variables. A statistical-based downscaling of TWS using a water balance equation to improve the spatial resolution of TWS was proposed by Ning et al. [22]. They observed that evapotranspiration, surface runoff, rainfall, and other hydrological variables have a comprehensive effect on TWS. Yin et al. [23] downscaled GWS using high-resolution evapotranspiration (ET) data and concluded that the spatial resolution of GWS can be improved effectively for the case study of the North China plain.
The machine-learning techniques are excellent, and they can be applied effectively to downscale hydrological variables [24][25][26]. For example, Seyoum et al. [27] estimated the glacial aquifer GWS based on downscaling using a machine-learning approach and observed that the resolution of GWS could be improved at a spatial scale despite the limitations in input data uncertainties. Miro et al. [19] used an artificial neural network model to conduct downscaling for the central basin of California and found that the downscaling model could accurately simulate the variations in GWS at high resolution. Rahaman et al. [28] used the random forest algorithm to downscale GRACE-derived GWS of 1 • at a high resolution (0.25 • ) for the Northern High Plains aquifer, and the NSE reached Remote Sens. 2021, 13, 3513 3 of 27 0.58~0.84. Sun. [29] utilized GRACE data and the artificial neural network model to predict the variations in GWS in the United States. The authors concluded that the model outputs showed some limitations in spatial prediction but changes in the time series of GWS could be predicted well. Yang et al. [30] reconstructed TWS using linear regression and a machinelearning approach using GRACE and GLDAS products in Northwest China from 1948 to 2002. The results show that canopy water, soil moisture, and snow water equivalent are closely related to TWS.
Keeping in view the limitations identified by the researchers, this study proposed a two-step machine-learning-based downscaling model. At first, TWS downscaling was conducted and, then, isolating other vertical water components of TWS to acquire downscaled GWS output. The TWS is a complex hydrological variable that has other vertical water storage components, e.g., soil moisture, canopy water, surface runoff, snow water equivalent, and groundwater [31,32].
Since the random forest model is data-driven, the quality and nature of the data used as inputs are of critical importance. However, a previous study by Chen et al. [33] that employed a random forest model to predict TWS and GWS utilized hydrologic variables that are combinations of evapotranspiration, rainfall, soil moisture, surface runoff, canopy water, and snow water equivalent and just investigated the model accuracy. Here, we extend this work. The main contribution and novelty of this study are:

•
We used temperature from a previous study by Zhang et al. [34] and other key local geospatial datasets (elevation, slope, and aspect) to increase the accuracy of the model along with GRACE data that are shown in the literature to be significant predictors of TWS [34,35].

•
We assessed the trend and estimated the loss of downscaled GWS based on RFM.

•
We explored the correlation of input variables with the GRACE TWS data.
In this case study, nine modeling variables were selected, including rainfall (RF), evapotranspiration (ET), soil moisture (SM), surface runoff (Qs), canopy water (CW), temperature, slope, aspect, and digital elevation model (DEM) and analyzed using a random forest machine-learning model to downscale GRACE-derived TWS spatially in space-time. However, a random forest model has been used to downscale GRACE data by very few researchers.
To date, however, there are no downscaled products that exist for IBIS, although these products are very important for local water resources management. Considering the lack of high-precision and high-resolution GWS products, our research concentrated mainly to construct a higher-accuracy model and, for the first time, produces a continuous yearly change in TWS (including GWS) over the IBIS using the machine-learning method. The objectives of this study are threefold:

Study Area
The area selected for this study is the Pakistani part of the Indus basin irrigation system (IBIS) (Figure 1). The IBIS encompasses approximately 201,072 km 2 area. The river flows comprise glacier melt, snowmelt, rainfall, and runoff. Outside the polar regions, the upper Indus River basin contains the greatest area of perennial glacial ice in the world (220,00 km 2 ). The glaciers serve as natural storage reservoirs that provide perennial supplies to the Indus River and some of its tributaries [36]. The Indus River system forms a link between two large natural reservoirs, the snow and glaciers in the mountains and the groundwater contained by the alluvium in the Indus plains of the Sindh and Punjab provinces of Pakistan [37].

Data Sources and Processing
There are three databases used in this study: (i) TWS data from the GRACE satellite; (ii) soil moisture, surface runoff, canopy water, temperature, and evapotranspiration data from the global land surface assimilation system (GLDAS) NOAH model; iii) rainfall data from TRMM 3B43. The downloaded gridded GRACE data have a spatial resolution of 1°, and the other datasets have a spatial resolution of 0.25°. The data were observed from January 2003 to December 2016, with monthly temporal resolution. Table 1 summarizes the data used in this study.

GRACE TWS
The gridded GRACE satellite data (RL-05, level-3) are processed and provided by three institutions Center for Space Research (CSR, the University of Texas), Jet Propulsion Laboratory (JPL, Pasadena, CA, USA), and GeoForschungsZentrum Potsdam (GFZ, Potsdam, Germany), respectively [48,49]. It can be downloaded from here (https://grace.jpl.nasa.gov/data/get-data/monthly-mass-grids-land/) which was accessed on 13 September 2019. These three gridded GRACE products have a spatial resolution of 1° × 1° We used the average data of these three products. These products have undergone pre-processing, such as de-striping filter, Gaussian smoothing, and glacier isostatic adjustment (GIA) [33]. By averaging these TWS products, the accuracy can be improved With the rapid increase in agricultural development, population growth, and industry, water resources scarcity has become one of the significant aspects in the study area. Recently, the region has over-abstracted groundwater to meet the requirements of residential uses and agricultural production [16]. Thus, studying GWS has practical significance in the IBIS. The arid to semi-arid climate exists in the major part of the IBIS and the secondary source of water is rainfall. The seasonal rainfall from November to April during the dry Rabi season is only 30% of the total water available. May to October is the rainy Kharif season [38] providing 70% of available water.
Crop intensity and crop water requirement have increased over the past decades at the system scale in the IBIS, and have not been satisfied by the combination of surface water withdrawals and rainfall [39]. Unreliability and inadequacy of surface water supply have forced farmers to abstract groundwater resources. The quantity of groundwater resources pumping varying from 52 to 61 km 3 /y was reported [40,41]. The declining groundwater level is experienced in most of the north-eastern regions of Punjab where fresh groundwater is abundant [42]. Mainly, groundwater depletion in eastern Punjab is a hotspot, where the transient effect of pumping extensive groundwater across the Indo Pak borders increases the likelihood of water table depletion [43,44]. The quality of groundwater is marginal to brackish in 60% of the IBIS aquifer [45][46][47].

Data Sources and Processing
There are three databases used in this study: (i) TWS data from the GRACE satellite; (ii) soil moisture, surface runoff, canopy water, temperature, and evapotranspiration data from the global land surface assimilation system (GLDAS) NOAH model; (iii) rainfall data from TRMM 3B43. The downloaded gridded GRACE data have a spatial resolution of 1 • , and the other datasets have a spatial resolution of 0.25 • . The data were observed from January 2003 to December 2016, with monthly temporal resolution. Table 1 summarizes the data used in this study. Table 1. Summary of the data used.

Variables
Sources and Processing Units

GRACE TWS
The gridded GRACE satellite data (RL-05, level-3) are processed and provided by three institutions Center for Space Research (CSR, the University of Texas), Jet Propulsion Laboratory (JPL, Pasadena, CA, USA), and GeoForschungsZentrum Potsdam (GFZ, Potsdam, Germany), respectively [48,49]. It can be downloaded from here (https://grace.jpl. nasa.gov/data/get-data/monthly-mass-grids-land/) which was accessed on 13 September 2019. These three gridded GRACE products have a spatial resolution of 1 • × 1 • We used the average data of these three products. These products have undergone pre-processing, such as de-striping filter, Gaussian smoothing, and glacier isostatic adjustment (GIA) [33]. By averaging these TWS products, the accuracy can be improved [48,50]. Linear interpolation was used to fill all missing monthly data [51]. To restore the original signal that was lost during data processing, a multiplicative scaling factor was applied to the monthly mass gridded data provided by the GRACE satellite [27]. More information about scaling factors can be found in Landerer and Swenson [48].

GLDAS Data
The GLDAS simulates satellite and ground-based observations [52] using four land surface models, i.e., common land model (CLM), Mosaic, Noah, and variable infiltration capacity (VIC). Four land surface models provided by GLDAS-1 exists to simulate global soil moisture, surface runoff, canopy water, evapotranspiration, and other hydrological variables. The data are available from 3 h to monthly while spatial resolution is 1 • ×

TRMM
The Tropical Rainfall Measuring Mission (TRMM) is a joint JAXA and NASA mission to monitor and study tropical rainfall and climate. The validation of the TRMM product has been carried out throughout the world [54][55][56][57][58]. The spatial resolution was aggregated to 1 • from 0.25 • to match with GRACE 1 • data. The monthly TRMM 3B43 Version 7 (http://disc.sci.gsfc.nasa.gov/precipitation/) rainfall data for the period from 2003 to 2016 were used which was accessed on 05 October 2019.

Ground-Based Measurement
In Pakistan, groundwater levels are recorded 2 times a year from 3368 observation wells, by the Punjab Irrigation and Drainage Authority (PIDA). The data can be accessed from the PIDA website (https://irrigation.punjab.gov.pk/) which was accessed on 23 December 2020. PIDA collects groundwater data in terms of depth to water table (DTW) on seasonal bases (i) before monsoon (June) and (ii) after the monsoon (October). Certain raw observation well data had serious quality problems with irregular jumps, missing data, and outliers, which were eliminated during the data pre-processing. A subset of 398 observation wells, without having a single missing observation, is used for validation from June 2003 to June 2014 over the upper part of IBIS. The observation well data provide good estimates about the fluctuations in the groundwater table but they cannot be compared directly with GRACE-derived GWS [59]. As a result, DTW levels were converted into water level variations by subtracting the DTW from average depth to bedrock.
where GWL C is groundwater level changes, DTW is depth to the water table, and DTB is depth to bedrock (average DTB for selected observational wells = 400 m [16]). The groundwater level anomalies at the seasonal scale were estimated by using the seasonal long-term mean from June 2003 to June 2014.
where GW L A is groundwater level anomalies, and GW L lm is a long-term mean of groundwater level changes. The seasonal groundwater storage changes were estimated by multiplying average specific yield with groundwater anomalies [60].
where GWS A is groundwater storage anomalies, and S Y is average specific yield. Specific yield is the quantity of water that is actually available for groundwater pumping when sediments or rocks are drained due to the lowering of the water table. The specific yield (S Y ) in the upper part of the IBIS ranges from 0.01 to 0.4 and is between 0.07 and 0.25 in most areas [61]. The average value of S Y , which is 0.14, was utilized to estimate ground-based GWS A [61]. The 398 observational wells location data covering the upper part of the study area ( Figure 2) were analyzed and utilized to validate downscaled GWS from June 2003 to June 2014.

Groundwater Storage Estimation Using the Water Balance Equation
The water flow in and out of the system can be described by using the water balance equation (WBE). WBE estimation is difficult and complex because water balance components have uncertainties of measurements [62]. An unprecedented opportunity was provided by the GRACE satellite to capture the variability of TWS. Generally, the TWS data derived from GRACE consist of vertically integrated water storages components of the aquifer, which includes soil moisture, surface runoff, canopy water storage, and groundwater (Equation (6)) [1,3,13,28].
The WBE estimated GWS is the downscaled GWS, which is calculated using the terrestrial water balance method. An aspect of change in TWS, the other vertical water storage components that are soil moisture storage (ΔSMS), surface runoff (ΔQs), snow water equivalent (Δ ), canopy water storage (ΔCWS), and ΔGWS, observed from GLDAS, were subtracted yielding ΔGWS [28], as shown in Equation (7).
The topography of the study region is plain with warm climate and snow is uncommon. Therefore, Δ excluded in the estimation of ΔGWS. Here, the GLDAS data (SMS, Qs, and CWS) were also processed similarly to the GRACE anomaly. The SMS/Qs/CWS anomaly at time t was estimated using Equation (8).  Figure 2a shows the IBIS watershed area overlaid with the downscaled grid cell (0.25 • × 0.25 • ). Accordingly, the area from the intersection of the grid cell with the IBIS watershed area was calculated and was related to the amount of weight given to the IBIS watershed area to mask the gridded variables. Equations (4) and (5) were used to assign weights to estimate the variable values (e.g., TWS and GWS) for the IBIS watershed area.

Data Masking for the Study Area
where W i is the weight of the grid i, a i is the area of the IBIS watershed in grid i within the downscaled 0.25 • × 0.25 • grid, b i is the area of the grid cell i, V N is the spatially averaged value of one variable at a certain time (i.e., month N here) for the entire IBIS watershed, v i is the value of the data for grid i, A is the total area of the watershed, and n is the number of the covered grids.

Groundwater Storage Estimation Using the Water Balance Equation
The water flow in and out of the system can be described by using the water balance equation (WBE). WBE estimation is difficult and complex because water balance components have uncertainties of measurements [62]. An unprecedented opportunity was provided by the GRACE satellite to capture the variability of TWS. Generally, the TWS data derived from GRACE consist of vertically integrated water storages components of the aquifer, which includes soil moisture, surface runoff, canopy water storage, and groundwater (Equation (6) The WBE estimated GWS is the downscaled GWS, which is calculated using the terrestrial water balance method. An aspect of change in TWS, the other vertical water storage components that are soil moisture storage (∆SMS), surface runoff (∆Qs), snow water equivalent (∆SWE), canopy water storage (∆CWS), and ∆GWS, observed from GLDAS, were subtracted yielding ∆GWS [28], as shown in Equation (7).
The topography of the study region is plain with warm climate and snow is uncommon. Therefore, ∆SWE excluded in the estimation of ∆GWS. Here, the GLDAS data (SMS, Qs, and CWS) were also processed similarly to the GRACE anomaly. The SMS/Qs/CWS anomaly at time t was estimated using Equation (8).
where G represents SMS/Qs/CWS, and G 2004-2009(avg) represents the average G from 2004 to 2009.

Mann-Kendall Trend Test
The statistic (S) of the Mann-Kendall test proposed by [63,64] is computed with the following equations given below Equation (9): where x i and x j = values of i and j (j > i) in time series, sgn(x j − x i ) is the sign function, N = number of data points, and Equation (10).
where t i = number of ties of extent i and m = number of tied groups. A dataset containing similar values is called a tied group. If tie values are not present between the observations, then Equation (12).
The standard normal test statistic Z S is computed as Equation (13): The Z S has positive and negative values, which represent the increasing and decreasing trends, respectively. The trend analysis was performed at a 5% significance level and the null hypothesis of no trend was rejected, if |Z S | > 1.96.

Artificial Neural Network (ANN)
The ANN approach develops non-linear and empirical relations between a set of "input" and response "target" variables. A series of linked units or neurons is associated with ANN that is aimed to replicate the functions of neurons in human or animal brains. They pass data between one another, a structure that enables ANNs to be trained and learn. In this study, the ANN approach utilized is called a multilayer perceptron (MLP). It consists of units known as perceptrons. It has one or more inputs, an activation function, and an output. The MLP model is developed with the combination of perceptrons in structured layers. The perceptrons are independent of each other in a given layer, but each links to the rest of the perceptrons in the following layer. Each layer consists of a set of neurons and is trained with an algorithm of back-propagation. It is one of the most extensively utilized algorithms for supervised training of multilayered neural networks [65][66][67]. It works by approximating the non-linear relation between the input and the response by varying the weight values internally. The weights with minimum error functions are then considered to be a solution to the learning problem. A comprehensive discussion on the training algorithm and neural network construction can be found in [68][69][70][71], respectively.
In this study, the ANNs are composed of one input layer, two hidden layers, and one output layer. The number of hidden neurons in our study ranged from 100 to 150. The number of epochs is the number of times the entire training data are used to update the weights. In other words, it is the number of times that the backpropagation algorithm works through the entire training dataset. This study used 100 epochs. The final model evaluation was carried out using evaluation metrics between observed and predicted values on training data.

Random Forest Model
The random forest model (RFM) is a supervised learning algorithm and was first proposed by Breiman [72] that deals with multiple classification and regression decision trees (CART) combinations. It generates homogeneous subsets of supplementary independent variables randomly at each step, which would result in a variety of trees called random forest after doing this many times [73]. RFM is not sensitive to collinearity. RFM "spreads" the variable importance across all the variables [74]. These related variables can be evaluated for similar variable importance [75]. In the aspect of modeling, it is very valuable for a nonlinear and complex problem. Generally, it is difficult to decide which variable to remove when two (or more) variables correlate with each other [76]. It utilizes the advantages of using average outcomes of each combination. Initially, it necessitates several numbers of independent variables and regression trees produced from two-thirds of the bootstrapped sample of train data individually. The remaining one-third of observations is known as out of bag (OOB).
During the classification, the number of subsets was selected taking the square root independent variables, while taking one-third of the total independent variables during regression analysis in developing the RFM. The following steps were implemented for modeling: at first, the original input data were used to train the model. The model was trained up with multiple independent variables and a response variable (dependent) using coarse resolution data. The trained RFM was tested with the independent variables to assess the model accuracy. A variable importance measure predictive (VIMP) test was performed to check the predictive power of each variable. Greater values of the VIMP test show the higher predictive power of each variable. The predicted RFM results were compared with the response variable (dependent) to compute residuals at coarse resolution. Afterward, the model was applied to the independent variables at a higher resolution (25 • × 25 • ). Finally, the high-resolution TWS output was estimated after residual correction. The hyperparameters of the RFM in this study were set to the defaults using the "randomForest" package of the R language [77].
The RFM is being widely utilized in remote sensing and hydrology due to its benefits in classification and regression problems [24]. The RFM has some exclusive advantages for a regression problem, which are: (1) thousands of variables can be simultaneously given to the model and it does not easy to overfitting; (2) it can run efficiently on large datasets with high accuracy; (3) the importance and interaction of variable can be detected.
We used k-fold cross validation to evaluate the robustness of the RFM. k-fold folds the dataset and takes various (random in default) portions of it to train the model, and then, tests it on the remaining instances, which is a pretty good way to check how good the model works for different train and test samples. The convention is to use 10-fold and average their results, which is more reliable.

Downscaling Model Design
RFM and ANN model were used to downscale TWS from 1 • to 0.25 • . To develop both models, a statistical relationship was established between TWS, geospatial variables (elevation, slope, and aspect), and hydrological variables (soil moisture, rainfall, surface runoff, evapotranspiration, canopy water, and temperature) at coarse resolution (1 • ). After that, the selected model with higher correlation of coefficient was applied to the independent variables at a higher resolution (0.25 • ) to acquire TWS. The downscaling process of TWS from 2003 to 2016 is explained in Figure 3. ter that, the selected model with higher correlation of coefficient was applied to the independent variables at a higher resolution (0.25°) to acquire TWS. The downscaling process of TWS from 2003 to 2016 is explained in Figure 3. (1) At first, the nine input variables from January 2003 to December 2016 are aggregated to 1° by pixel averaging. Afterward, developed the RFM between TWS and nine hydrological and geospatial variables at a spatial resolution of 1°. (2) Secondly, the GRACE-derived TWS data deducted the predicted TWS of the RFM simulation in step (1) to calculate the residuals at a spatial resolution of 1°. (3) Thirdly, the developed RFM is applied to the hydrological and geospatial variables at a spatial resolution of 0.25° to attain the estimated 0.25° TWS of the RFM. Subsequently, the residual correction was performed at 0.25° by adding residuals to the estimated TWS at 0.25° to attain the downscaled TWS data with a spatial resolution of 0.25°. This residual correction procedure consists of three steps: the re-aggregation of the fine scale predictors (0.25°) to the original TWS resolution (1°), the calculation

Evaluation Methods of Model Performance
To evaluate the downscaled TWS results, four different statistical metrics were used, including R, NSE, RMSE, and MAE. Equations (14)-(17) show the mathematical representations of R, RMSE, MAE, and NSE, respectively. For R and NSE, the values closer to 1 indicate the perfect model, while for RMSE and MAE, the values close to 0 indicate the perfect model.
where X i and Y i indicate two independent datasets with the mean value of X and Y. The input TWS is represented by X i , and the predicted value of RFM is represented by Y i . While, N represents the total number of samples.

Correlation Test Statistics of Model Input Variables
Before training RFM for downscaling, Pearson's correlation coefficient tests were performed between all variables and were significant if p < 0.05. The relation between variables can help to determine which independent variables strongly correlate with TWS ( Figure 4). This is an independent test that assists in determining the significance of variables in the trained RFM. Furthermore, it provides a secondary advantage in finding other interactions between independent variables.
In decreasing order, RF, SMS, Qs, and CWS strongly correlated with GRACE TWS as shown in Figure 4. This was expected because TWS is controlled by these variables in the study region. SMS is a more spatially heterogeneous variable of how much water may be recharging the area. The weaker correlation with temperature and stronger with TWS suggests that soil moisture influence on TWS is due to replenishing in post-monsoon rather than irrigation in summer months [78]. A weak correlation between TWS and rainfall may be due to a time lag in aquifer response to rainfall [50]. The study area has high rainfall in the monsoon season. It shows a strong positive correlation with TWS. The temperature is the only independent variable that has an insignificant correlation with TWS (R = 0.01, p = 0.29). An increase in temperature will probably decrease TWS values [79]. Therefore, temperature and TWS do not show a statistical correlation. Despite this, the temperature is still included as an independent variable.

Sensitivity Analysis of RFM
The model performance mainly depends on the selection of input variables that correlate with GRACE data strongly. The soil data, rainfall, DEM, and temperature [19,80], surface runoff [22], and evapotranspiration [23] were used to downscale the GRACE dataset. However, in this study, different independent variables, including rainfall, SMS, Qs, temperature, CWS, ET, slope, aspect, and elevation were used for developing the model (RFM). Two hypotheses were assumed, i.e., GRACE data have a non-linear relation to other independent variables, and interactions remain consistent in each spatial direction.
A variable importance measures predictive (VIMP) test was performed to check the predictability of independent variables by the model. It calculates the change for each variable in prediction error before and after permutation. VIMP values close to zero shows that the variable has less influence or is non-existent in prediction. The predictive importance of the variables for downscaling is shown in Figure 5. Figure 5 shows that rainfall (RF) was the most crucial variable in developing RFM. The slope was found to be the least important to predict TWS. Since all the variables have values above zero. Therefore, all the independent variables were considered skillful in the downscaling of TWS.

Sensitivity Analysis of RFM
The model performance mainly depends on the selection of input variables that correlate with GRACE data strongly. The soil data, rainfall, DEM, and temperature [19,80], surface runoff [22], and evapotranspiration [23] were used to downscale the GRACE dataset. However, in this study, different independent variables, including rainfall, SMS, Qs, temperature, CWS, ET, slope, aspect, and elevation were used for developing the model (RFM). Two hypotheses were assumed, i.e., GRACE data have a non-linear relation to other independent variables, and interactions remain consistent in each spatial direction.
A variable importance measures predictive (VIMP) test was performed to check the predictability of independent variables by the model. It calculates the change for each variable in prediction error before and after permutation. VIMP values close to zero shows that the variable has less influence or is non-existent in prediction. The predictive importance of the variables for downscaling is shown in Figure 5. Figure 5 shows that rainfall (RF) was the most crucial variable in developing RFM. The slope was found to be the least important to predict TWS. Since all the variables have values above zero. Therefore, all the independent variables were considered skillful in the downscaling of TWS. The VIMP of model-independent variables is consistent to some extent with the correlations to TWS observed in Figure 4. Though the order is slightly different, two variables (rainfall and SMS) strongly correlate to TWS, which are also the most influencing variables in training the RFM to predict TWS. Particularly, the temperature is also influential as soil moisture ( Figure 5) in predicting TWS despite the lack of significant correlation ( Figure  4). The slope has minimal influence as a predictor because of the nearly homogenous presence of surface area within the study area (Figure 1d).

Accuracy Analysis of RFM
The original input data were used to train and test the model. To predict the TWS using the above-mentioned nine-input independent variables, two machine-learning models were developed. A comparison between RFM and ANN model predicted TWS and GRACE-derived TWS is provided in Figure 6. The x-axis shows the GRACE-derived TWS, and the y-axis shows the predicted TWS. In Figure 6, the RFM has R (0.97), NSE (0.94), MAE (7.71 mm), and the lowest RMSE of 11.83 mm, respectively. While, the ANN model has R (0.92), NSE (0.85), MAE (12.30 mm), and the highest RMSE of 18.47 mm. From the above results, it can be seen that downscaling RFM is the best, which can meet the requirements of variable prediction. Incorporating the geospatial variables and temperature data, the RFM shows higher accuracy than the previous studies [22,33]. It can be concluded that the RFM can accurately predict changes in TWS at a high resolution after introducing the residual correction. Therefore, this study only analyzes the output of the RFM in the following discussion. The VIMP of model-independent variables is consistent to some extent with the correlations to TWS observed in Figure 4. Though the order is slightly different, two variables (rainfall and SMS) strongly correlate to TWS, which are also the most influencing variables in training the RFM to predict TWS. Particularly, the temperature is also influential as soil moisture ( Figure 5) in predicting TWS despite the lack of significant correlation ( Figure 4). The slope has minimal influence as a predictor because of the nearly homogenous presence of surface area within the study area (Figure 1d).

Accuracy Analysis of RFM
The original input data were used to train and test the model. To predict the TWS using the above-mentioned nine-input independent variables, two machine-learning models were developed. A comparison between RFM and ANN model predicted TWS and GRACEderived TWS is provided in Figure 6. The x-axis shows the GRACE-derived TWS, and the y-axis shows the predicted TWS. In Figure 6, the RFM has R (0.97), NSE (0.94), MAE (7.71 mm), and the lowest RMSE of 11.83 mm, respectively. While, the ANN model has R (0.92), NSE (0.85), MAE (12.30 mm), and the highest RMSE of 18.47 mm. From the above results, it can be seen that downscaling RFM is the best, which can meet the requirements of variable prediction. Incorporating the geospatial variables and temperature data, the RFM shows higher accuracy than the previous studies [22,33]. It can be concluded that the RFM can accurately predict changes in TWS at a high resolution after introducing the residual correction. Therefore, this study only analyzes the output of the RFM in the following discussion.
The performance of downscaled TWS and GWS against GRACE-derived TWS and GWS and observational wells GWS datasets are presented using Taylor's diagram based on standard deviation, R, and RMSE (Figure 7). The correlation coefficient values ranged from 0.67 to 0.99 show good performance of RFM. The proximity of downscaled TWS and GWS datasets to GRACE-derived TWS and GWS indicates the best performance at a monthly scale. The comparison of GRACE-derived GWS and downscaled GWS with observational wells GWS is in good agreement (R = 0.70, 0.67) at the seasonal scale. The statistical comparison of all the datasets shows low RMSE. These results provide the necessary confidence to downscale GRACE products for the estimation of groundwater monitoring in the study region.

Spatio-Temporal Variation Characteristics Analysis of TWS
To analyze the consistency of downscaled TWS and GWS on a spatial scale based on RFM, TWS and GWS were compared before and after downscaling. The variations in TWS and GWS were observed in more detail at a higher resolution. Figure 8  In Figure 8, the TWS increased near the middle portion, while it decreased in the top left side of the area. While the GWS decreased all over the area except the southwest part of the area. The variations in TWS after downscaling show that the spatial changes can be observed more effectively at a high spatial resolution. Similarly, the variations in GWS after downscaling are observed consistently.
The variability in the trends of GRACE-derived TWS and downscaled TWS is almost the same, showing a declining pattern in the study area. Figure 9 shows variations in the time series of TWS before and after downscaling at low (1 • ) and high (0.25 • ) resolution as well as CSR-M TWS, respectively. The variations in amplitude and value of TWS before and after downscaling are consistent (R = 0.99) in the study area. The uncertainty associated with the calculated trend values were calculated from the differences in trend values extracted from the three solutions (CSR, JPL, and GFZ) [82,83]. The GRACE-derived TWS and downscaled TWS is decreasing at the same rate −3.25 ± 0.45 mm/year at a regional scale.
The spatio-temporal maps showing variations in TWS from 2003 to 2016 at a higher resolution on an annual basis are shown in Figure 10 to further demonstrate spatial variability of downscaled RFM results. The IBIS received massive and localized floods due to monsoon rainfall during 2010-2015 in different regions of IBIS. The high amplitude of TWS between August and September during these flooded years can be observed in Figure 9 with high and small peaks. The overall yearly change in TWS at a spatial scale during the flooded year was mostly observed in the lower part of IBIS.

Spatio-Temporal Variation Characteristics Analysis of TWS
To analyze the consistency of downscaled TWS and GWS on a spatial scale based on RFM, TWS and GWS were compared before and after downscaling. The variations in TWS and GWS were observed in more detail at a higher resolution. Figure 8  In Figure 8, the TWS increased near the middle portion, while it decreased in the top left side of the area. While the GWS decreased all over the area except the southwest part of the area. The variations in TWS after downscaling show that the spatial changes can be observed more effectively at a high spatial resolution. Similarly, the variations in GWS after downscaling are observed consistently.

Spatio-Temporal Variation Characteristics Analysis of GWS
The variations in the trend of GRACE-derived GWS and downscaled GWS showing a declining pattern consistently in the IBIS. Figure 11 shows the variability in the time series of GWS before and after downscaling at low (1 • ) and high (0.25 • ) resolution, respectively. The variability of trend in GWS before and after downscaling is consistent, showing R (0.99) in the IBIS (Figure 11). The GRACE-derived GWS and downscaled GWS is decreasing at the same rate −3.39 ± 0.45 mm/year at the regional scale. Table 2 shows the trend variability of downscaled GWS within the study area at a confidence level of 5%.  The variability in the trends of GRACE-derived TWS and downscaled TWS is almost the same, showing a declining pattern in the study area. Figure 9 shows variations in the time series of TWS before and after downscaling at low (1°) and high (0.25°) resolution as The spatio-temporal maps showing variations in TWS from 2003 to 2016 at a higher resolution on an annual basis are shown in Figure 10 to further demonstrate spatial variability of downscaled RFM results. The IBIS received massive and localized floods due to monsoon rainfall during 2010-2015 in different regions of IBIS. The high amplitude of TWS between August and September during these flooded years can be observed in Figure 9 with high and small peaks. The overall yearly change in TWS at a spatial scale during the flooded year was mostly observed in the lower part of IBIS.   Figure 11 shows the variability in the time  The yearly spatial downscaled GWS maps were generated at a high resolution from 2003-2016 within the IBIS (Figure 12). The overall spatial distribution of downscaled GWS was similar to that of the TWS, that is, in decreasing trend. Moreover, the sub-grid heterogeneity was captured by downscaled GWS effectively, which is the effect of a local-scale hydro-climatological and geospatial characteristics on GWS. Higher groundwater depletion at the spatial scale was observed comparatively at the end of 2016 in the IBIS ( Figure  12). The yearly GWS represents the changes in groundwater abstraction and recharges impacted by climatic variations and anthropogenic activities. The change in the GWS is observed significantly in IBIS from 2003 to 2016 ( Figure 12).  The yearly spatial downscaled GWS maps were generated at a high resolution from 2003-2016 within the IBIS (Figure 12). The overall spatial distribution of downscaled GWS was similar to that of the TWS, that is, in decreasing trend. Moreover, the sub-grid heterogeneity was captured by downscaled GWS effectively, which is the effect of a localscale hydro-climatological and geospatial characteristics on GWS. Higher groundwater depletion at the spatial scale was observed comparatively at the end of 2016 in the IBIS ( Figure 12). The yearly GWS represents the changes in groundwater abstraction and recharges impacted by climatic variations and anthropogenic activities. The change in the GWS is observed significantly in IBIS from 2003 to 2016 ( Figure 12).
The supply of canal water was not adequate to fulfill the crop water requirements. This shortage of water was covered through groundwater irrigation. The overexploitation of groundwater was observed in the province of Punjab, which is the upper part of the IBIS as can be seen in the above figure. Iqbal et al. [16] estimated the total loss of about 11.83 km 3 /year groundwater storage in the upper Indus plain of Punjab province of Pakistan from 2003 to 2010. Tang et al. [15] observed that the groundwater depletion rate is −0.3 ± 0.2 cm/year from 1980 to 2015 in the Punjab province of Pakistan. Irrigation in paddy fields with insufficient canal water supply may be the cause of groundwater abstraction. It is a normal practice to use groundwater in combination with surface water in the Punjab province, Pakistan. The abstraction of groundwater was fragmented because of the poor quality of groundwater in the Sindh province. Nevertheless, percolation from canals and fields replenishes groundwater resulted in the use of groundwater in combination to surface water in the north portion of Sindh [84]. The negative change in GWS (from 2009-2016) is caused by the over-abstraction of groundwater. The spatial patterns of GWS showed that the upper part of the IBIS is under severe groundwater depletion. Whereas, groundwater depletion varies from moderate to severe in the lower part of the IBIS. The spatial variations in GWS can be utilized to find the regions with excessive groundwater depletion for developing groundwater management strategies. Cheema et al. [43] also observed the highest gross groundwater abstraction in the middle and upper parts of the irrigated areas of the Indus basin for the year 2007. The groundwater resources have a relatively good quality of groundwater in these areas [85]. They are present in the Punjab province, Pakistan, and the Indian states of Haryana and Punjab.
The overexploitation of groundwater was also observed in the Indian states of Haryana and Punjab [43]. An assessment of groundwater abstractions showed that the Indian states' (i.e., Haryana, Punjab, and Rajasthan) loss of groundwater was about 109 km 3 from 2002 to 2008 with the water table declining by 17.7 km 3 /year [1]. While Tiwari et al. [86] observed a loss of about 54 ± 9 km 3 /year of groundwater storage in northern India. The overexploitation of groundwater at this alarming rate could potentially alter the transboundary groundwater flow between Pakistan and India [87]. Further study of the IBIS using high-resolution models could be complemented to help in understanding the losing or gaining GWS in certain regions. The supply of canal water was not adequate to fulfill the crop water requirements. This shortage of water was covered through groundwater irrigation. The overexploitation of groundwater was observed in the province of Punjab, which is the upper part of the IBIS as can be seen in the above figure. Iqbal et al. [16] estimated the total loss of about 11.83 km 3 /year groundwater storage in the upper Indus plain of Punjab province of Pakistan from 2003 to 2010. Tang et al. [15] observed that the groundwater depletion rate is −0.3 ± 0.2 cm/year from 1980 to 2015 in the Punjab province of Pakistan. Irrigation in paddy fields with insufficient canal water supply may be the cause of groundwater abstraction. It is a normal practice to use groundwater in combination with surface water in the Punjab province, Pakistan. The abstraction of groundwater was fragmented because of the poor quality of groundwater in the Sindh province. Nevertheless, percolation from canals and fields replenishes groundwater resulted in the use of groundwater in combination to surface water in the north portion of Sindh [84]. The negative change in GWS (from 2009-2016) is caused by the over-abstraction of groundwater. The spatial patterns of GWS showed that the upper part of the IBIS is under severe groundwater depletion. Whereas, groundwater depletion varies from moderate to severe in the lower part of the IBIS. The spatial variations in GWS can be utilized to find the regions with excessive groundwater

Temporal Validation of Groundwater Storage
The non-uniform distribution of the specific yield is due to heterogeneity in aquifer layers; therefore, GWS estimation becomes more complex. The downscaled GWS was validated with 398 observational wells GWS temporally to assess the performance of downscaling. The temporal variation in GWS was illustrated by overlaying the observational wells on the downscaled GWS. The pixels having observation wells were only selected for validation analysis (Figure 2). Figure 13  downscaled GWS is significantly well correlated with the observational wells GWS, as the RFM downscaled GWS replicates the observed GWS with a high correlation. Therefore, RFM not only improves the spatial resolution of data but also ensures the accuracy of data.

Discussion
Generally, hydrological products at a higher resolution are a hot topic. In this study, GRACE TWS and GWS were downscaled from 1 • to 0.25 • by using RFM, which provided an extra research opportunity regarding the data of GRACE. The overall outcomes of RFM downscaled data showed the potential of implement in this type of study. The downscaled GWS was validated with only 398 observational wells data because of the limitation of groundwater level data. In addition, the network of observational wells was not distributed in the entire study area, which causes less influence of heterogeneity and also validates the results less reliable. However, if there were more groundwater level data, this might cause maximum reliability. In the IBIS, extreme spatio-temporal variability in evapotranspiration (ET) due to a large amount of water exploration for agricultural production can make the GLDAS NOAH model estimated ET prone to errors, because GLDAS is an off-satellite model and cannot replicate these effects. Therefore, the uncertainties associated with monthly ET input data may also lead to biases in the statistical model's estimations.
Nevertheless, the suitability of the downscaling TWS and GWS based on RFM has been provided in this study, which could be utilized in other areas. The applicability of data scarcity regions with limited hydrological variables can be improved using satellite data, for example, groundwater observational wells data with the irregular temporal and spatial distribution. Although, the data from different sources also significantly contribute to uncertainties. For example, there is a presence of an associated uncertainty with the GRACE and input data, which can trigger some propagation of potential errors in cumulative input data [27]. Here, three GRACE satellite spherical harmonic products (CSR, JPL, GFZ) were employed to minimize the uncertainty. These products were multiplied with the scaling coefficients to restore the signals removed by Gaussian, de-striping, and filters (degree 60) of land grids, but there might be uncertainties in the hydrological model at the interannual level. In subsequent research, the researchers can directly estimate the grid at 0.25 • × 0.25 • using GRACE spherical harmonic coefficients and compare the results of multiple downscaling machine-learning models. At the same time, unavoidable systematic uncertainty in the training of machine-learning models still remains. This study compared the results of two machine-learning models (RFM, ANN). The results showed that the RFM is the best. It shows that RFM is not easily overfitted, and the noise immunity ability is better than ANN. The RFM mitigated these errors because it can handle errors associated with input data.
The possible advantage of downscaling of RFM is that multiple variables can be input simultaneously, which also increases the model's flexibility. In the model, input variables are independent of each other at the same time, and different variable factors might be preferred for different regions. The regional water management policymaker primarily focuses on GWS at a small scale of the region as compared to the GRACE-derived GWS at a large scale. This coarse resolution problem can be solved effectively by using RFM. It provides more comprehensive information for water decision makers.
The proposed models could be further modified by improving the following aspects in the forthcoming studies. For example, (1) the average of each zone exhibits that tropical rainfall in the tropics directs the TWS fluctuations; however, the evapotranspiration is very much effective in the middle of the latitudes [88]. Hence, considering the latitude of the region, the remote sensing datasets with high resolution can be chosen and utilized for future experimental analysis. The local economic effects and demographic environments must be considered accurately for water reserves, (2) the variables, for example, NDVI, soil types, and many other conditioning factors, can be involved for studying the differential salient features of the current study area, (3) the products with highly remote sensing and fine resolution can be selected for future research goals at a small regional scale to accomplish the provisions of water managers at the regional level.

Comparison of Current and Previously Related Studies
The validation analysis part is constrained due to the data scarcity of observational wells. The independent variables utilized in RFM also had uncertainties. Moreover, GRACE has some missing months data, which were calculated by interpolation, and this process causes errors during estimation of GWS. An artificial neural network (ANN) approach, which is very much precise through hydrological variables (rainfall and soil moisture), could be utilized in future researches. Furthermore, since GRACE launched in April 2002, it has had latencies from 2 to 6 months causing real-time GWS evaluation at a small scale.
In recent times, downscaling of GRACE data was conducted by Gemitzi and Lakshmi [80] and Miro et al. [19] using the ANN model. Both models performed well with calibration. Though, the first model [80] overestimated the in situ data in the validation part. The other model [19] observed an error of almost 1 m in some regions as compared to the observed GWS data. Meanwhile, Seyoum and Milewski [50] performed an ANN model to predict variables by considering the lag effect. The data of this model poorly exhibit the Pearson's correlation coefficients by depicting more variations as compared to observed GWS. The ANN model was developed by [89] to downscale GRACE data. The Pearson's correlation coefficient of the model between predicted and the water balanced derived GWS was 0.64 along with RMSE of 22.50 mm.
Except this, the common statistical method [22] presented relatively better results. The currently proposed RFM output exhibits RMSE of 11.83 mm, and the statistical metrics also strengthen the RFM acceptance, since these metrics indicate a high correlation of coefficient (R) and NSE values as compared to the ANN model ( Figure 6) and common statistical method [22]. After improving the resolution of GRACE data at a spatial scale based on the RFM, it efficiently permits the evaluation of TWS and GWS in more details at a small regional scale over the globe.

Conclusions
The application of GRACE data provides a new research approach to explore variations in GWS at a large scale. This study used two machine-learning models to conduct downscaling and considers the impact of multiple geospatial and hydrological independent variables on TWS. The results showed that RFM has higher accuracy than the ANN model. Therefore, the downscaling was conducted based on the RFM. The TWS was downscaled from 1 • to 0.25 • successfully and estimated the GWS at high resolution. VIMP illustrates that soil moisture and RF have a relatively higher influence on the downscaling process of RFM. ET and Qs were less sensitive in the downscaling process, even though they are important components of the TWS cycle. The downscaling RFM could predict the variations in water storage at a regional scale relative to long time series very well. The RFM prediction results are ideal in the IBIS. The downscaling model error is very small, and variations in the trend of water storage are consistent before and after downscaling.
The sub-grid heterogeneity in spatial variations was captured effectively after downscaling of TWS and GWS, which cannot be comprehended at low resolution.
The observed groundwater level data compared with the downscaled and GRACEderived GWS data showing the correlation coefficients of 0.67 and 0.77 on the seasonal and annual scale, where it was 0.70 and 0.79 before downscaling. Therefore, RFM can not only improve the spatial resolution but also ensure the accuracy of the data. The quality of downscaled GWS is that it captures the sub-grid heterogeneity effectively and is consistent with the GRACE-derived GWS in the study area on a temporal scale. Therefore, the areas with less density of in situ data can be monitored more accurately by improving the spatial resolution of GWS data. The recently developed RFM allows for the assessment variations in GWS on a spatial scale due to climatic and anthropogenic activities in more detail. It has been observed that RFM can not only increase the spatial resolution of the data effectively but also ensure the accuracy of the data. This study mainly focuses on the downscaling method, and some limitations are present, including the influences of NDVI, as the study area is highly irrigated, anthropogenic activities in the variations of groundwater storages, and the lack of observational wells data in validation section in the IBIS, which will be further investigated and fully evaluated in the near future.