Quantifying Changes in Groundwater Storage and Response to Hydroclimatic Extremes in a Coastal Aquifer Using Remote Sensing and Ground-Based Measurements: The Texas Gulf Coast Aquifer

: With the increasing vulnerability of groundwater resources, especially in coastal regions, there is a growing need to monitor changes in groundwater storage (GWS). Estimations of GWS have been conducted extensively at regional to global scales using GRACE and GRACE-FO observations. The major goal of this study was to evaluate the applicability of uninterrupted monthly GRACE-derived terrestrial water storage (TWS GRACE ) records in facilitating detection of long- and short-term hydroclimatic events affecting the GWS in a coastal area. The TWS GRACE data gap was ﬁlled with reconstructed values from multi-linear regression (MLR) and artiﬁcial neural network (ANN) models and used to estimate changes in GWS in the Texas coastal region (Gulf Coast and Carrizo–Wilcox Aquifers) between 2002 and 2019. The reconstructed TWS GRACE , along with soil moisture storage (SMS) from land surface models (LSMs), and surface water storage (SWS) were used to estimate the GRACE-derived GWS (GWS GRACE ), validated against the GWS estimated from groundwater level observations (GWS well ) and extreme hydroclimatic event records. The results of this study show: (1) Good agreement between the predicted TWS GRACE data gaps from the MLR and ANN models with high accuracy of predictions; (2) good agreement between the GWS GRACE and GWS well records (CC = 0.56, p -value < 0.01) for the 2011–2019 period for which continuous GWL well data exists, thus validating the approach and increasing conﬁdence in using the reconstructed TWS GRACE data to monitor coastal GWS; (3) a signiﬁcant decline in the coastal GWS GRACE , at a rate of 0.35 ± 0.078 km 3 · yr − 1 ( p -value < 0.01), for the 2002–2019 period; and (4) the reliable applicability of GWS GRACE records in detecting multi-year drought and wet periods with good accuracy: Two drought periods were identiﬁed between 2005–2006 and 2010–2015, with signiﬁcant respective depletion rates of − 8.9 ± 0.95 km 3 · yr − 1 and − 2.67 ± 0.44 km 3 · yr − 1 and one wet period between 2007 and 2010 with a signiﬁcant increasing rate of 2.6 ± 0.63 km 3 · yr − 1 . Thus, this study provides a reliable approach to examine the long- and short-term trends in GWS in response to changing climate conditions with signiﬁcant implications for water management practices and improved decision-making capabilities.


Introduction
Groundwater is the main source of freshwater for almost half of the world's population; it provides a major resource for irrigation and plays a key role in ecosystem health [1,2]. Yet, groundwater resources are under extreme threat due to rapid depletion [3][4][5][6], a serious global issue affecting groundwater resource sustainability and ecosystem health. Global groundwater storage (GWS) decreased by 4500 km 3 (~42 km 3 ·yr −1 ) between 1900 and 2008, with a significant acceleration in the depletion rate since 1950. The depletion rate more than tripled (~145 km 3 ·yr −1 ) between 2000 and 2008 [7]. During the same period, in the United States (U.S.) GWS decreased by 1000 km 3 (~9.3 km 3 ·yr −1 ) and the rate doubled after 2000 (~24 km 3 ·yr −1 ) [8].
Groundwater is often the only source of freshwater in coastal areas. This source, however, is vulnerable to climate and anthropogenic variabilities. Coastal regions are among the most densely populated areas that place an increasing demand on coastal aquifers [9][10][11] associated with ever-growing agricultural irrigation and industrial needs. Seawater intrusion, for example, could result from excessive pumping [3,[12][13][14], storm surge [14][15][16], and/or sea level rise [17,18]. Coastal aquifers are more vulnerable to depletion due to excessive groundwater extraction when compared to sea level rise [19]. Where groundwater extraction is high, coastal areas experience seawater intrusion, also exacerbated by the sea level rise, and coastal land subsidence [17,20]. Storms may lead to an increase in water table levels and flooding because of either high surge or larger volumes of precipitations [15].
Historically, groundwater monitoring relied mainly on field observations of groundwater levels (GWL) from monitoring wells. Estimations of GWS from field observations have shown significant seasonal and annual variations in areas where observations are relatively dense [21,22]. However, in many parts of the world spatial and temporal coverage of groundwater level measurements are sparse [23,24]. While the U.S. has a relatively dense well observation network, more monitoring sites are required for improved understanding of the GWS spatial and temporal patterns in relation to groundwater extraction [25]. In addition, monitoring GWS using in situ measurements is a relatively expensive and labor-intensive process [26]. When in situ measurements are not adequate for temporal and spatial coverage, satellite observations play a significant role in GWS monitoring (e.g., recharge, discharge, and storage changes) [6,[27][28][29]. Data from the Gravity Recovery and Climate Experiment (GRACE; 2002-2017) and its successor GRACE-Follow On (GRACE-FO; 2018-present) missions [30,31] have been shown to overcome the abovementioned limitations.
GRACE, a joint U.S.-German satellite mission, was launched, in March 2002, to monitor the spatial and temporal variations in the Earth's gravity field [30]. These variations have been converted to changes in terrestrial water storage (TWS), after removing atmospheric mass variations and ocean tides. Storage components of the TWS are soil moisture, surface water, snow/ice, and groundwater [32]. GRACE-derived TWS (TWS GRACE ) together with land surface models (LSM) and ground-based measurements (e.g., GWL) have been widely used to monitor GWS around the world, including the Mississippi River basin [33], the Central Valley of California [12], northwestern India [34], Africa [6,[35][36][37][38], Asia [39], and northern China [40,41]. These studies also have shown a good agreement between GRACE-derived GWS (GWS GRACE ) and GWL-derived GWS (GWS well ) in different geologic and hydrogeologic settings [27,28,40,42].
After the GRACE mission ended in June 2017, the next generation of GRACE, GRACE-FO mission, launched in May 2018 [31]. As with its predecessor, the GRACE-FO observations have been successfully applied in TWS GRACE estimations. However, there are temporal gaps between the GRACE and GRACE-FO missions ( [43].
Effective planning and management of groundwater resources is largely dependent on the analysis of TWS and GWS temporal changes. However, the precision of short-and longterm spectral and statistical analysis of these timeseries depends on their continuity [43][44][45]. Temporal data gaps in TWS could increase the error and uncertainty in the analysis of seasonal and annual cycles as well as the short-and long-term trends [43,46,47]. Therefore, bridging gaps in TWS GRACE data helps enhance performance of TWS and GWS monitoring and the outcomes that inform and improve groundwater management practices.
Several studies have used outputs of LSMs and global hydrologic models to fill gaps in TWS GRACE [48][49][50]. However, these models cannot fully simulate influences from anthropogenic factors that affect the TWS GRACE such as deep groundwater extraction and cropland irrigation [51]. Machine learning-based models have been widely used to predict hydrological variables [51][52][53][54][55][56]. Recently, reconstruction of TWS GRACE data gaps on regional to global scales has been conducted using simple regression and machine learning techniques [43,[57][58][59]. These techniques have also been used to fill gaps in, and downscale, TWS GRACE records [60][61][62][63]. However, none of the previous studies have used a gap filled TWS GRACE record to examine TWS and GWS temporal variability over the Texas Gulf Coast region ( Figure 1). Therefore, bridging gaps in TWSGRACE data helps enhance performance of TWS and GWS monitoring and the outcomes that inform and improve groundwater management practices. Several studies have used outputs of LSMs and global hydrologic models to fill gaps in TWSGRACE [48][49][50]. However, these models cannot fully simulate influences from anthropogenic factors that affect the TWSGRACE such as deep groundwater extraction and cropland irrigation [51]. Machine learning-based models have been widely used to predict hydrological variables [51][52][53][54][55][56]. Recently, reconstruction of TWSGRACE data gaps on regional to global scales has been conducted using simple regression and machine learning techniques [43,[57][58][59]. These techniques have also been used to fill gaps in, and downscale, TWSGRACE records [60][61][62][63]. However, none of the previous studies have used a gap filled TWSGRACE record to examine TWS and GWS temporal variability over the Texas Gulf Coast region ( Figure 1). In this study, a complete GRACE and GEACE-FO record (2002-2019) is used to characterize the temporal variability in TWSGRACE and GWSGRACE over the Texas Gulf Coast region ( Figure 1). Specifically, the TWSGRACE was first reconstructed using multi-linear regression (MLR) and artificial neural network (ANN) techniques to fill the data gaps within GRACE mission and between GRACE and GRACE-FO missions. Second, the complete TWSGRACE data along with soil moisture storage (SMS) from LSMs, and surface water storage (SWS) were used to derive the GWSGRACE timeseries. The GWSwel was also estimated using GWLs and used to validate the GWSGRACE estimates. Our short-term and long-term trend analysis of complete TWSGRACE and GWSGRACE datasets provided a quantitative means to identify the short and multi-year drought and wet periods with significant accuracy, thus enabling enhanced decision-making processes of a broad set of water planning groups and stakeholders. In this study, a complete GRACE and GEACE-FO record (2002-2019) is used to characterize the temporal variability in TWS GRACE and GWS GRACE over the Texas Gulf Coast region ( Figure 1). Specifically, the TWS GRACE was first reconstructed using multilinear regression (MLR) and artificial neural network (ANN) techniques to fill the data gaps within GRACE mission and between GRACE and GRACE-FO missions. Second, the complete TWS GRACE data along with soil moisture storage (SMS) from LSMs, and surface water storage (SWS) were used to derive the GWS GRACE timeseries. The GWS wel was also estimated using GWLs and used to validate the GWS GRACE estimates. Our short-term and long-term trend analysis of complete TWS GRACE and GWS GRACE datasets provided a quantitative means to identify the short and multi-year drought and wet periods with significant accuracy, thus enabling enhanced decision-making processes of a broad set of water planning groups and stakeholders.
The GCA encompasses approximately 110,000 km 2 and is composed of multiple aquifer units comprised of a series of discontinuous sand, silt, clay, and gravel beds of Miocene to Holocene age [66,67]. Groundwater in GCA is mostly unconfined and semiconfined. Freshwater saturated thickness ranges from 213 m (to the south) to 396 m (to the north) with an average of about 305 m. The hydraulic conductivity of the aquifer is about 0.3 m·day −1 in the southern area and as much as 2.1 m·day −1 in the northeast [68]. Because of smaller saturated depth and lower hydraulic conductivity, aquifer transmissivity is lower in the south and maximum in the northeast. The specific yield ranges from 0.05 to 0.005, which is relatively low compared to the typical specific yields of sedimentary formations in unconfined aquifers [69].
The CWA, encompassing approximately 95,000 km 2 , extends from the Louisiana border to Mexico to the northwest of GCA. The CWA consists of sand locally interbedded with gravel, silt, clay, and lignite with average saturated freshwater thickness of about 204 m. The aquifer is unconfined in the outcrop area and confined down deep towards the coast where it is overlaid by low-permeability layers. The transmissivity and hydraulic conductivity of the CWA are highly variable spatially, with a geometric mean of 28 m 2 ·d −1 and 1.8 m·d −1 , respectively [70].
The study area encompasses a significant climate gradient from east to west, with an average annual precipitation over 140 cm, in the east, to less than 76 cm, in southwest Texas. The rate of annual gross lake evaporation increases from 127 cm, in the east, to more than 165 cm, in southwest Texas [71]. The mean annual potential evapotranspiration (ET) exceeds precipitation by 2-5 times. Thornthwaite's water-balance system [72] classified the climate of this region as moist subhumid in the east and subtropical semiarid in the southwest.

Data
Multiple datasets from different sources were used to fill gaps in the TWS GRACE record, extract and validate GWS GRACE timeseries, and characterize temporal variabilities in TWS GRACE and GWS GRACE data over the study area ( Figure 1). Below is a detailed description of datasets used in this study. The GRACE mission provides a scale factor for the SH and JPL-M solutions. This scale factor was used to rescale the individual TWS GRACE data to minimize the signal attenuation during post-processing [73,76]. The gridded TWS GRACE GRACE products were averaged over the study area to generate the TWS GRACE timeseries. The average TWS GRACE timeseries was calculated by taking an average of all five TWS GRACE SH and mascon solutions. This average tends to minimize the noise in the individual TWS GRACE solutions within the available scatter of the solutions [77]. The gaps in the average TWS GRACE timeseries (missing months and the gap between GRACE and GRACE-FO missions) were filled with the reconstructed TWS from the ANN and MLR models (refer to the "Methods" section).
Terrestrial Water Storage from Land Surface Model (TWS LSM ) The TWS data was also derived from the North American Land Data Assimilation System (NLDAS)-NOAH model (TWS LSM ). The TWS LSM from NLDAS-NOAH and Global Land Data Assimilation System (GLDAS)-NOAH in the study region looks similar; therefore, we only used the TWS LSM record derived from NLDAS-NOAH. This estimate was used as one of the MLR and ANN model's inputs. In this model, the TWS LSM is represented as the sum of soil moisture storage (SMS), plant canopy water storage, and accumulated snow. The TWS LSM was provided at a spatial resolution of 0.125 • × 0.125 • . The NLDAS model details are explained in Section 2.2.2 below.

Soil Moisture Storage (SMS)
The SMS data were extracted from the Global Land Data Assimilation System (GLDAS) and NLDAS. These two LSMs were developed jointly by the National Aeronautics and Space Administration [14] and the National Oceanic and Atmospheric Administration (NOAA). GLDAS and NLDAS simulate satellite and ground-based measurements to produce an optimal field of land surface statistics using advanced land surface modeling and data assimilation techniques [78,79]. Both models provide SMS data for different layer structures extending from the ground surface up to 3 m depth. The NLDAS derived SMS has a higher spatial resolution (i.e., 0.125 • ) than GLADS (i.e., 1 • ). Timeseries of SMS were extracted from the NOAH [80], Mosaic [81], and variable infiltration capacity (VIC) [82] model versions.
The gridded SMS data were averaged over the study area to generate SMS timeseries from each of the six model versions. Like TWS GRACE , the SMS anomaly was calculated by removing the 2004-2009 temporal mean. The average of all six SMS products and their respective monthly standard deviations were used to estimate the monthly SMS timeseries and associated errors, respectively [33,83].

Surface Water Storage (SWS)
The SWS data extracted from the Texas Water Development Board (TWDB) (available at https://waterdatafortexas.org/reservoirs/statewide; accessed on 10 March 2020), implemented herein, provide daily lake and reservoir levels within Texas. To calculate SWS timeseries, first, the available lake and reservoir levels (total: 29) within the study region were compiled into individual monthly timeseries (i.e., for each lake/reservoir). Then, the 2004-2009 mean was removed from each timeseries. Finally, all storage anomalies were added and divided by the study area to derive the average SWS timeseries. The monthly uncertainty was estimated using a conservative approach, which is 10% of the monthly SWS [83,84].

Groundwater Level Observations (GWL well ) and Groundwater Extraction Rates
Groundwater level data from the U.S. Geological Survey (USGS) and the TWDB (available at https://waterdatafortexas.org/groundwater; accessed on 15 March 2020) were used in this study to derive the GWL well . For this purpose, water level data from wells screened only in the unconfined or semi-confined portion of the GCA and CWA were selected as representative of the local water table [33]. Well hourly GWL well data was aggregated into monthly timeseries. Since most of the GWL well are available starting 2011, the temporal mean of the 2012-2016 period was removed from each timeseries. To ensure consistency for comparison purposes, the same time mean was removed from TWS GRACE , SMS, and SWS data when compared with GWL well . The annual groundwater pumping data by county, available up to 2018, was extracted from the TWDB (available at http://www.twdb.texas.gov/waterplanning/waterusesurvey/ historical-pumpage.asp; accessed on 20 March 2020).

Precipitation and Temperature
Precipitation and temperature data were used as one of the MLR and ANN model inputs. Precipitation data was derived from the Integrated Multi-satellite Retrievals for Global Precipitation Measurement Mission (IMERG) data products (available at https:// disc.gsfc.nasa.gov/datasets/GPM_3IMERGM_06/; accessed on 1 April 2020). The IMERG provides globally half-hourly, daily, and monthly precipitation products, on a 0.1 • × 0.1 • grid scale [85] and is available beginning with 2000 until present.
The monthly 2-m air temperature data from the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA5-Land) project was used in this study. The ERA5-Land data, available from 1950 to the present (at https://cds.climate.copernicus.eu/; accessed on 1 April 2020), are produced by the replay of the land component of the ECMWF ERA5 climate reanalysis, forced by meteorological fields from ERA5. The ERA5-Land global data have a temporal and spatial resolution of 1 h and 0.1 • × 0.1 • , respectively [86].

Methods
Three main steps were utilized in this study ( Figure 2). First, we filled the TWS GRACE data gap using predictions from the ANN and MLR models. Secondly, the complete TWS GRACE , SWS, and SMS timeseries were used to extract GWS GRACE timeseries over the study area. The third step included validation of the generated GWS GRACE timeseries using GWS well data. The temporal variability in TWS GRACE and GWS GRACE was characterized over the Texas Gulf Coast region using short-and long-term trend analyses. aggregated into monthly timeseries. Since most of the GWLwell are available starting 2011, the temporal mean of the 2012-2016 period was removed from each timeseries. To ensure consistency for comparison purposes, the same time mean was removed from TWSGRACE, SMS, and SWS data when compared with GWLwell.

Precipitation and Temperature
Precipitation and temperature data were used as one of the MLR and ANN model inputs. Precipitation data was derived from the Integrated Multi-satellite Retrievals for Global Precipitation Measurement Mission (IMERG) data products (available at https://disc.gsfc.nasa.gov/datasets/GPM_3IMERGM_06/; accessed on 1 April 2020). The IMERG provides globally half-hourly, daily, and monthly precipitation products, on a 0.1° × 0.1° grid scale [85] and is available beginning with 2000 until present.
The monthly 2-m air temperature data from the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA5-Land) project was used in this study. The ERA5-Land data, available from 1950 to the present (at https://cds.climate.copernicus.eu/; accessed on 1 April 2020), are produced by the replay of the land component of the ECMWF ERA5 climate reanalysis, forced by meteorological fields from ERA5. The ERA5-Land global data have a temporal and spatial resolution of 1 h and 0.1° × 0.1°, respectively [86].

Methods
Three main steps were utilized in this study ( Figure 2). First, we filled the TWSGRACE data gap using predictions from the ANN and MLR models. Secondly, the complete TWS-GRACE, SWS, and SMS timeseries were used to extract GWSGRACE timeseries over the study area. The third step included validation of the generated GWSGRACE timeseries using GWSwell data. The temporal variability in TWSGRACE and GWSGRACE was characterized over the Texas Gulf Coast region using short-and long-term trend analyses.

Terrestrial Water Storage Reconstruction
The predicted values from the ANN and MLR models were used to fill in the gaps in the average TWS GRACE timeseries using the following reasoning. The MLR model performs best when there is a linear relationship between target and input variables. This model has Remote Sens. 2022, 14, 612 7 of 21 been used widely in hydrology and meteorology [43,57,87,88] because of its simplicity. The MLR model can be explained by the following equation: where Y is the target variable; x 1 , x 2 , x 3 , and x n are the input variables; a 0 is a constant; and a 1 , a 2 , a 3 , and a n are the regression coefficients. The ANN model development is inspired by the function of biological neural networks. Such systems are trained to perform by considering examples, generally without task-specific programming [89]. The ANN model consists of input, hidden, and output layers. The number of hidden layers depends on the relationship between target and input variables. ANNs have been widely used in hydroclimatic predictions because they can handle large, non-linear, and complex datasets [43]. In this study, the number of hidden layers in the ANN model was selected by a trial-and-error method. The model was run for hidden layers from 1 to 8. The number of layers at which the model performed best (i.e., 5) was selected.
Both MLR and ANN models require target and input variables. The average TW GRACE from five different solutions was used as a target variable while the precipitation, temperature, and TWS LSM were selected as input variables. The input variables were normalized to ensure that all variables receive equal consideration using Equation (2): wherex i is normalized value for x i and x min and x max are minimum and maximum value for the timeseries, respectively. The ANN and MLR models were trained, tested, and validated using randomly selected data sets. The data were randomly divided into training (70%), testing (15%), and validation (~15%), with the missing months in the GRACE data and the gap between the two used for prediction purposes. To ensure the model stability, identify the associated uncertainty, and enhance model generalization, 100 simulations were run for the ANN model. The median and the standard deviation of these simulations were used as the ANN model output and as a measure of model uncertainty, respectively.
The performance of both ANN and MLR models was tested using a number of statistical criteria such as: The root mean square error (RMSE), the normalized RMSE (NRMSE), the correlation coefficient (CC), and the Nash-Sutcliffe efficiency coefficient (Equations (3)-(6)). RMSE was used to measure the global fitness of a predictive model and the optimal value is 0. The NRMSE was calculated as the normalized RMSE by the standard deviation of actual values. The cross-correlation between actual and predicted values was measured by the CC. The NSE is a normalized statistical coefficient that measures the relative magnitude of residual variance to the variance of actual/measured data [90]. The NSE value ranged from −∞ to 1 with an optimal value of 1. These statistical measures were calculated as follows: where y and x represent the predicted and actual timeseries; y and x represent the average of y and x; n is the number of data used; and σ is the standard deviation of actual timeseries.

Groundwater Storage Estimation
GWS GRACE timeseries were estimated, over the investigated area, using the complete TW GRACE record, SMS, and SWS timeseries (since the study area gets negligible amounts of snow, the snowpack component was ignored in the process of groundwater estimation) using Equation (7): The errors in GWS GRACE were estimated using the following equation: where δ GWS , δ GRACE , δ SMS , and δ SWS are the errors associated with GWS GRACE , TW GRACE , SMS, and SWS, respectively. The spatial trends in TWS GRACE , SMS, SWS, and GWS GRACE were calculated using the same method that was adopted for temporal trends. To analyze changes in the spatial distribution of GWS GRACE , the TWS GRACE and SMS data were resampled at 0.25 • × 0.25 • spatial grid scale. The SWS trend was estimated for each reservoir/lake area and interpolated to 0.25 • × 0.25 • grid area.

Validation of Groundwater Storage
Thiessen polygon and simple average approaches were used to calculate the GWS well timeseries used to validate GWS GRACE timeseries. In both approaches the specific yield (Sy) values by well were extracted from the TWDB groundwater availability models (GAM) as well as the available literature [68,[91][92][93][94][95]. In the first approach, the Thiessen polygons were constructed for the study area to estimate the area weighted GWS well timeseries using the following equation [96,97]: where S i is the Sy for each Thiessen polygon i, n is the total number of polygons, W i is the area of each polygon, and ∆h i is the average change in groundwater level in each polygon.
In the second approach, the GWL well anomaly was calculated for individual wells and then the average GWL well timeseries was calculated for the entire study area. Then, the GWS well was calculated using the following equation [98]:

Trend Calculations and Error Analysis
Short-and long-term trends in the TWS GRACE and GWS GRACE were calculated to characterize spatiotemporal variabilities. First, the annual cycle was calculated and removed from each timeseries before calculating the trend. This step is necessary as extreme events (i.e., flooding and drought) can lead to abrupt changes in the TWS GRACE that are not directly related to changes in the GWS. The trend in TWS GRACE and GWS GRACE was then calculated using the breakpoint detection algorithm described in detail in previous studies [38,[99][100][101].
The TWS GRACE timeseries errors and trend were calculated as described by Ahmed and Abdelmohsen [28]. Briefly, the timeseries was first fitted using annual, semi-annual, and trend terms, and the residuals were smoothed using a 13-month moving average and a second set of residuals was calculated. The error in monthly timeseries was calculated as a standard deviation of the second set of residuals. A Monte Carlo simulation approach was used to calculate the trend error [102,103]. This approach simulated multiple synthetic timeseries, each with a Gaussian-distributed population and a standard deviation similar to that of the second set of residuals. The standard deviation of the synthetic trends was assigned as an error in the TWS GRACE trend. Leakage errors in SH data over the study area was estimated to be less than 90% of the error bar calculated for the average TWS GRACE timeseries.

Terrestrial Water Storage Reconstruction Results
Overall, the ANN and MLR models show similar results and good performance in training, testing, and validation phases (Figure 3; Table 1). The ANN model shows better performance for the training and testing sets, while the MLR model performs best for the validation set (Table 1). Predicted TWS GRACE data from both models are highly correlated with the actual TWS GRACE for both testing and validation datasets (Figure 3b,c). Both models agree very well for the prediction of all TWS GRACE data gaps (Figure 3d). The MLR model output was used to fill the TWS GRACE data gaps and create a continuous data set since it outperformed the ANN model during the validation phase. and trend terms, and the residuals were smoothed using a 13-month moving average and a second set of residuals was calculated. The error in monthly timeseries was calculated as a standard deviation of the second set of residuals. A Monte Carlo simulation approach was used to calculate the trend error [102,103]. This approach simulated multiple synthetic timeseries, each with a Gaussian-distributed population and a standard deviation similar to that of the second set of residuals. The standard deviation of the synthetic trends was assigned as an error in the TWSGRACE trend. Leakage errors in SH data over the study area was estimated to be less than 90% of the error bar calculated for the average TWSGRACE timeseries.

Terrestrial Water Storage Reconstruction Results
Overall, the ANN and MLR models show similar results and good performance in training, testing, and validation phases (Figure 3; Table 1). The ANN model shows better performance for the training and testing sets, while the MLR model performs best for the validation set (Table 1). Predicted TWSGRACE data from both models are highly correlated with the actual TWSGRACE for both testing and validation datasets (Figure 3b,c). Both models agree very well for the prediction of all TWSGRACE data gaps (Figure 3d). The MLR model output was used to fill the TWSGRACE data gaps and create a continuous data set since it outperformed the ANN model during the validation phase.

Temporal Variation in Tresstrial Water Storage, Soil Moisture Storage, and Surface Water Storage
While we used the averaged TWS GRACE solution generated from three SH and two mascon solutions, we also examined the agreements of the individual solutions (Figure 4a). Examination of temporal variations in TWS GRACE shows that all five GRACE solutions show a similar trend with slightly different amplitude during the investigated period (Figure 4a). The TWS GRACE trends resulting from the CSR M, JPL M, GFZ SH, JPL SH, and CSR SH solutions are estimated to be −0.04 ± 0.07 km 3 ·yr −1 (p-value: 0.85), −0.04 ± 0.09 km 3 ·yr −1 (p-value: 0.87), −1.12 ± 0.09 km 3 ·yr −1 (p-value: <0.01), −0.54 ± 0.08 km 3 ·yr −1 (p-value: 0.02), −0.58 ± 0.08 km 3 ·yr −1 (p-value: 0.01), and −0.46 ± 0.08 km 3 ·yr −1 (p-value: 0.04), respectively. No obvious seasonal changes in the TWSGRACE were observed (Figure 4b), which contrary to other studies that suggest distinct seasonality in TWSGRACE in many basin across the world, with highs in winter and lows in summer [38,106]. Long et al. [83] note that the peak precipitation in Texas during spring and fall and low winter precipitatio causes the absence of such distinct seasonal cycle in TWSGRACE. For the Texas coastal are a decreasing trend in TWSGRACE is estimated from the complete timeseries at a rate of −0.4 ± 0.18 km 3 ·yr −1 (−0.2 ± 0.08 cm·yr −1 ) with a p-value 0.06 (Figure 4b). Spatially, the entire are experiences depletion or no change in TWSGRACE (Figure 5a). The highest depletion rate i TWSGRACE for the 2002-2019 period (−0.6 to −0.4 cm·yr −1 ) peaks in the central area of th study area, south of San Antonio and Austin (Figure 5a), an area experiencing increasin water demand and declining groundwater levels, particularly beginning with 2011, th driest year on record. Minimum to no changes in the TWSGRACE characterize the rest of th study area to the east and west of the central area. No obvious seasonal changes in the TWS GRACE were observed (Figure 4b), which is contrary to other studies that suggest distinct seasonality in TWS GRACE in many basins across the world, with highs in winter and lows in summer [38,106]. Long et al. [83] noted that the peak precipitation in Texas during spring and fall and low winter precipitation causes the absence of such distinct seasonal cycle in TWS GRACE . For the Texas coastal area, a decreasing trend in TWS GRACE is estimated from the complete timeseries at a rate of −0.43 ± 0.18 km 3 ·yr −1 (−0.2 ± 0.08 cm·yr −1 ) with a p-value 0.06 (Figure 4b). Spatially, the entire area experiences depletion or no change in TWS GRACE (Figure 5a). The highest depletion rate in TWS GRACE for the 2002-2019 period (−0.6 to −0.4 cm·yr −1 ) peaks in the central area of the study area, south of San Antonio and Austin (Figure 5a), an area experiencing increasing water demand and declining groundwater levels, particularly beginning with 2011, the driest year on record. Minimum to no changes in the TWS GRACE characterize the rest of the study area to the east and west of the central area. The SMS and SWS timeseries show large monthly variations (Figure 6a,b) that generally follow the TWSGRACE patterns ( Figure 4). However, there is a notable difference between SMS extracted from different LSMs and those extracted from different versions within the same model, especially during the extreme hydroclimatic periods. The observed variations between LSMs are attributed to the use of different forcing data, model parameters, and model structures [83]. The highest variability in SMS is associated with the NLDAS-Noah and GLDAS-Noah models, and the lowest with the VIC model, particularly during the extreme hydroclimatic time periods. Overall, average SMS declines sig- The SMS and SWS timeseries show large monthly variations (Figure 6a,b) that generally follow the TWS GRACE patterns (Figure 4). However, there is a notable difference between SMS extracted from different LSMs and those extracted from different versions within the same model, especially during the extreme hydroclimatic periods. The observed variations between LSMs are attributed to the use of different forcing data, model parameters, and model structures [83]. The highest variability in SMS is associated with the NLDAS-Noah and GLDAS-Noah models, and the lowest with the VIC model, particularly the most extreme drought year (i.e., 2011) and it decreases at a rate of −0.002 ± 0.01 km 3 ·yr −1 (0.004 ± 0.02 cm·yr −1 ) over the study period. Spatially, the SMS closely follows the climatic gradient (Figure 1c). An overwhelming majority of minimum to largest increases in SMS are characteristic of the northern half of the study area and to the east (0.35 to 0.15 cm·yr −1 ), where the highest precipitation depths are observed. The highest depletion rates occur in the western half of the study area (Figure 5b), where precipitation rates are about twofold lower than to the east, while evaporation rates are greatest. Surface water reservoirs are mainly concentrated in the northeast half of the area, with 24 out of 29 reservoirs (Figure 5c) showing no change or an increase in the SWS over the study period. As with SMS, the largest increases in the SWS occur to the east (0.4 to 0.1 cm·yr −1 ), including the greater Houston area. While very limited in number, two out of five reservoirs located in the western half of the area experience the largest depletion rates in the SWS (−0.71 to −0.3 cm·yr −1 ).

Temporal Variations in Groundwater Storage
Trend analyses indicate that the Texas coastal region experienced an overall decline in GWSGRACE at −0.35 ± 0.08 km 3 ·yr −1 (0.15 ± 0.04 cm·yr −1 ) (p-value < 0.01) for the investigated period, with large month-to-month variations (Figure 7). The lowest aquifer storage (−28.07 km 3  The most significant decline in the GWSGRACE occurred during the prolonged 2010-2015 drought period with a rate of −3.38 ± 0.43 km 3 ·yr −1 (1.54 ± 0.2 cm·yr −1 ). However, a very small declining trend in TWSGRACE (−0.1 ± 0.9 km 3 ·yr −1 ) was observed over the entire Spatially, the SMS closely follows the climatic gradient (Figure 1c). An overwhelming majority of minimum to largest increases in SMS are characteristic of the northern half of the study area and to the east (0.35 to 0.15 cm·yr −1 ), where the highest precipitation depths are observed. The highest depletion rates occur in the western half of the study area (Figure 5b), where precipitation rates are about twofold lower than to the east, while evaporation rates are greatest. Surface water reservoirs are mainly concentrated in the northeast half of the area, with 24 out of 29 reservoirs (Figure 5c) showing no change or an increase in the SWS over the study period. As with SMS, the largest increases in the SWS occur to the east (0.4 to 0.1 cm·yr −1 ), including the greater Houston area. While very limited in number, two out of five reservoirs located in the western half of the area experience the largest depletion rates in the SWS (−0.71 to −0.3 cm·yr −1 ).

Temporal Variations in Groundwater Storage
Trend analyses indicate that the Texas coastal region experienced an overall decline in GWS GRACE at −0.35 ± 0.08 km 3 ·yr −1 (0.15 ± 0.04 cm·yr −1 ) (p-value < 0.01) for the investigated period, with large month-to-month variations (Figure 7). The lowest aquifer storage (−28.07 km 3 ) was observed immediately after the end of the longest drought (i.e., in September 2015) (Figure 7). to a quick response in SMS and SWS following the precipitation events that ended the drought and initiated the groundwater recharge. Conversely, the similar TWSGRACE and GWSGRACE response to drought is explained by depletion in SMS accompanied by an increase in groundwater extraction as the SWS resource was depleted (from 0.75 km 3 to −5.5 km 3 ) by drought and excessive use. Inspection of the GWSGRACE spatial trends (Figure 5d) shows an overall decreasing trend, except along the western border of the study region. This is the opposite to the SMS trend that shows the largest depletion rate in this region ( Figure 5b). As with the TWS-GRACE, the largest depletion rate of the GWSGRACE is in the central coastal region (−0.9 to −0.45 cm·yr −1 ), an area where the change in SMS is minimum to none during the investigated period. Thus, the depletion in groundwater storage is captured with accuracy in the TWSGRACE in this area.

Validation of GWSGRACE
The GWLwell data has been used by a handful of studies to analyze changes in GWS and/or to validate GWSGRACE [33,96,97,107]. In this study, the estimated GWSwell was used to validate the GWSGRACE timeseries over the study area. The GWSwell from both Thiessen polygon and general average approaches show an overall significant increasing trend (4.6 ± 0.19 km 3 ·yr −1 and 2.23 ± 0.14 km 3 ·yr −1 , respectively; with p-value < 0.01) for the 2011-2019 period ( Figure 8). Comparatively, the GWSGRACE also shows an increasing trend, but at a lower rate (1.05 ± 0.22 km 3 ·yr −1 , Figure 8). The most significant decline in the GWS GRACE occurred during the prolonged 2010-2015 drought period with a rate of −3.38 ± 0.43 km 3 ·yr −1 (1.54 ± 0.2 cm·yr −1 ). However, a very small declining trend in TWS GRACE (−0.1 ± 0.9 km 3 ·yr −1 ) was observed over the entire drought period. It is expected that the difference in GWS GRACE and TWS GRACE trends is due to a quick response in SMS and SWS following the precipitation events that ended the drought and initiated the groundwater recharge. Conversely, the similar TWS GRACE and GWS GRACE response to drought is explained by depletion in SMS accompanied by an increase in groundwater extraction as the SWS resource was depleted (from 0.75 km 3 to −5.5 km 3 ) by drought and excessive use.
Inspection of the GWS GRACE spatial trends (Figure 5d) shows an overall decreasing trend, except along the western border of the study region. This is the opposite to the SMS trend that shows the largest depletion rate in this region ( Figure 5b). As with the TWS GRACE , the largest depletion rate of the GWS GRACE is in the central coastal region (−0.9 to −0.45 cm·yr −1 ), an area where the change in SMS is minimum to none during the investigated period. Thus, the depletion in groundwater storage is captured with accuracy in the TWS GRACE in this area.

Validation of GWS GRACE
The GWL well data has been used by a handful of studies to analyze changes in GWS and/or to validate GWS GRACE [33,96,97,107]. In this study, the estimated GWS well was used to validate the GWS GRACE timeseries over the study area. The GWS well from both Thiessen polygon and general average approaches show an overall significant increasing trend (4.6 ± 0.19 km 3 ·yr −1 and 2.23 ± 0.14 km 3 ·yr −1 , respectively; with p-value < 0.01) for the 2011-2019 period ( Figure 8). Comparatively, the GWS GRACE also shows an increasing trend, but at a lower rate (1.05 ± 0.22 km 3 ·yr −1 , Figure 8).

Validation of GWSGRACE
The GWLwell data has been used by a handful of studies to analyze changes in GWS and/or to validate GWSGRACE [33,96,97,107]. In this study, the estimated GWSwell was used to validate the GWSGRACE timeseries over the study area. The GWSwell from both Thiessen polygon and general average approaches show an overall significant increasing trend (4.6 ± 0.19 km 3 ·yr −1 and 2.23 ± 0.14 km 3 ·yr −1 , respectively; with p-value < 0.01) for the 2011-2019 period ( Figure 8). Comparatively, the GWSGRACE also shows an increasing trend, but at a lower rate (1.05 ± 0.22 km 3 ·yr −1 , Figure 8).  The major differences in trend, and the time of the lowest storage peaks, between GWS GRACE and GWS well , occur during the drought period (i.e., 2010-2015). The lowest GWS well using the Thiessen polygon and the general average methods is observed in 2011 (coinciding with the dip seen in the TWS GRACE ) and 2013, respectively. Comparatively the GWS GRACE drops at the end of the drought period, in 2015. These differences may arise from the lack of even distribution of the groundwater level monitoring locations within the study area. Most of the groundwater monitoring stations are located in the central part of the region and only a few in the eastern and western parts, potentially leading to more bias in the estimated GWS well towards conditions in the area with more coverage (e.g., central area). Additionally, in the absence of measured/calculated Sy data throughout the investigated area, the use of modelled Sy may also lead to larger uncertainties in the estimated GWS well . Further discussions that aid explaining these discrepancies between different estimates of GWS are included in the discussion section. Regardless of slight discrepancies between the two records, there is a significant positive correlation between the GWS GRACE and GWS well (CC = 0.48; p-value < 0.01) generated from the Thiessen polygon approach. Additionally, a stronger correlation was observed between the GWS GRACE and GWS well records (CC = 0.56, p-value < 0.01) using the general average approach.

Discussion and Conclusions
The short-term changes in TWS GRACE and GWS GRACE trends observed in this study are attributed mainly to climatic variabilities, which when compounded by anthropogenic stressors result in different response times in the storage compartments. Overall, the TWS GRACE shows an increasing tendency, while the GWS GRACE experiences multiple increasing and decreasing trends. These discrepancies are mainly due to the different response time to precipitation and evapotranspiration of the two types of storages. For instance, precipitation and evapotranspiration are the main driver of TWS changes in low latitudes [108,109], particularly because of the relatively quick response of SMS to climatic factors. To exemplify this, in this study the lowest TWS GRACE (−26.87 km 3 ) was observed during the peak of the extreme drought (i.e., 2011) ( Figure 10) with the lowest total annual precipitation (57.2 cm; Figure 10) that represented only 55% of the average total annual precipitation observed during the investigation period (103.9 cm). attributed mainly to climatic variabilities, which when compounded by anthropogenic stressors result in different response times in the storage compartments. Overall, the TWS-GRACE shows an increasing tendency, while the GWSGRACE experiences multiple increasing and decreasing trends. These discrepancies are mainly due to the different response time to precipitation and evapotranspiration of the two types of storages. For instance, precipitation and evapotranspiration are the main driver of TWS changes in low latitudes [108,109], particularly because of the relatively quick response of SMS to climatic factors. To exemplify this, in this study the lowest TWSGRACE (−26.87 km 3 ) was observed during the peak of the extreme drought (i.e., 2011) ( Figure 10) with the lowest total annual precipitation (57.2 cm; Figure 10) that represented only 55% of the average total annual precipitation observed during the investigation period (103.9 cm). While highly dependent on the region's hydrogeologic characteristics (i.e., aquifer permeabilities) and water use practices (i.e., groundwater extraction for irrigation and/or While highly dependent on the region's hydrogeologic characteristics (i.e., aquifer permeabilities) and water use practices (i.e., groundwater extraction for irrigation and/or human consumption), GWS is expected to have a delayed response to changing climatic conditions. The highest groundwater extraction rate (3 km 3 ·yr −1 ; Figure 10) for the study period occurred in the driest year on record (e.g., 2011), likely a result of increased groundwater usage for irrigation as precipitation amounts and surface water resources have been steadily declining since the beginning of the drought period in 2010. Nonetheless, the lowest annual average GWS GRACE (−12 km 3 ) was observed at the end of the prolonged drought, in 2015 ( Figure 10). The effects of groundwater abstractions on groundwater storage are not always straightforward and are a function of time following the initiation of stress [110] and the density of pumping wells. Initial stages of pumping cause a quick depletion in storage (e.g., year 2011 in this study), as a cone of depression forms to fulfill the pumping volumes [111]. In an area with dense network of pumping wells (e.g., the greater Houston area; Figure 1b), prolonged pumping may cause cones of depression to fuse with each other; thus, further pumping will result in an overall depletion of the area. If pumping rates remain fairly constant, an equilibrium is generally achieved as time progresses, and the change in the cone of depression and GWS is very small concerning its observation by GRACE.
During prolonged drought periods, pumping likely exceeds natural recharge, thus violating the quasi-equilibrium condition of the cone of depression. Rather, dewatering of the aquifer increases with time in the near vicinity of pumping, thus the lower GWS GRACE years after the most severe precipitation drought. Thus, GWS GRACE captures responses linked to climate-induced groundwater depletion.
Some areas within the Texas coastal region have had groundwater elevations substantially below sea level throughout the investigated period. For example, Jasechko et al. [17] found that 25% of the groundwater elevations within 10 km of the coast in the GCA are below sea level. Within the greater Houston area, average groundwater elevations from nine wells ranged between 30 m (early 2011) and 35 m (late 2011-early 2012) below mean sea level between 2011 and 2020. This confirms that changes in GWS in certain areas may not be significant if aquifer depletion has occurred before the beginning of this study's record. Groundwater elevations below mean sea level are an indication that sea water intrusion may have occurred and, given the hydrogeologic makeup of the area (e.g., fine sand and clay formations with low permeabilities) and the observed land subsidence (i.e., the Greater Houston area and other areas along the coast) [20], changes in these conditions are expected to be insignificant over the duration of the study [112,113]. However, storage depletion in other areas that have not been historically impacted by overproduction of aquifers (e.g., the central portion of the study area where groundwater elevations fluctuate between 34 m (late 2011) and 42 m (mid 2019) above mean sea level between 2011 and 2020) is expected to be captured within the study time frame. In systems like this, accelerated aquifer depletion is expected, particularly in periods of prolonged drought when the stress on groundwater is initiated in the early stages and increases over time with equilibrium likely attained in the later stages of the drought. In consequence, these systems are likely to weigh more significantly into the observed GWS GRACE changes. Additional research on the spatial changes of GWS in areas affected by sea water intrusion versus those that are impacted during drought conditions, like a regional groundwater flow and transport model, is necessary to further understand how GWS GRACE may be impacted.