A New Coupled Modeling Approach to Simulate Terrestrial Water Storage in Southern California

The study introduces a new atmosphere-land-aquifer coupled model and evaluates terrestrial water storage (TWS) simulations for Southern California between 2007 and 2016. It also examines the relationship between precipitation, groundwater, and soil moisture anomalies for the two primary aquifer systems in the study area, namely the Coastal Basin and the Basin and Range aquifers. Two model designs are introduced, a partially-coupled model forced by reanalysis atmospheric data, and a fully-coupled model, in which the atmospheric conditions were simulated. Both models simulate the temporal variability of TWS anomaly in the study area well (R2 ≥ 0.87, P < 0.01). In general, the partially-coupled model outperformed the fully-coupled model as the latter overestimated precipitation, which compromised soil and aquifer recharge and discharge. Simulations also showed that the drought experienced in the area between 2012 and 2016 caused a decline in TWS, evapotranspiration, and runoff of approximately 24%, 65%, and 11%, and 20%, 72% and 8% over the two aquifer systems, respectively. Results indicate that the models first introduced in this study can be a useful tool to further our understanding of terrestrial water storage variability at regional scales.


Introduction
Terrestrial water storage (TWS) is defined as the summation of water on the land surface and in the subsurface. It includes soil moisture and groundwater, as well as water stored in vegetation and snow. The ability to estimate TWS accurately is essential for hydrological studies and water resource assessment, particularly in arid and semi-arid regions where access to rainfall-fed surface water storage is limited. These areas are particularly vulnerable to climate extremes, such as droughts, which often further deteriorate TWS through enhanced evaporative losses and increased groundwater extraction [1][2][3].
The region of Southern California, located along the west coast of North America between the latitudes 31 • N and 36 • N, is mainly characterized by a semi-arid and mild-winter Mediterranean climate. This region has experienced a rapid population growth over the last 50 years [4], which has continuously increased the demand for agricultural and urban water uses in the region. Unlike the central and northern parts of the state, the contribution of snowfall to Southern California's TWS is negligible as wintertime temperatures are too warm for considerable snow to accumulate, except in a few high mountain peaks [5]. Furthermore, the local shrubland vegetation is sparsely distributed and does not hold substantial amounts of water [6]. Therefore, variations in TWS in the region are primarily driven by changes in soil moisture and groundwater contents.
Groundwater is one of the most valuable sources of freshwater in the world. Almost two billion people worldwide and fifty percent of the United States' populations use it for drinking and irrigation [7]. Current estimates indicate that groundwater provides 38% of the total water supply for Water 2020, 12, 808 3 of 14 staggered Arakawa C-grid-type horizontal coordinates. The model offers several radiation, convective, and microphysics scheme options and is currently maintained and supported by the Mesoscale and Microscale Meteorology Division at the National Center for Atmospheric Research [22,26].
The third version of the Simplified Simple Biosphere (SSIB) is a biophysically-based land surface model and was used in this study to simulate the land surface-atmosphere interactions by calculating the surface energy budget and surface water balance based on three soil layers and one vegetation canopy layer [23]. In addition to simulating processes such as photosynthesis-controlled transpiration, soil evaporation, and runoff, it also includes a multi-layer snow hydrology module to realistically simulate processes associated with snow load and melting, substantially enhancing the model's ability for soil hydrology studies [5].

The Aquifer Model
We used a modified version of the Simple Groundwater Model (SIMGM) to simulate the temporal variations of water stored in an aquifer [15]. In the SIMGM, an unconfined aquifer is defined as the portion of ground located below the soil column, and the variations of water stored in the aquifer (W A ) over time are calculated as: where Q in and Q out are the aquifer recharge and discharge (mm s −1 ), respectively. Q in is obtained through Darcy's Law as: where K bot , Ψ bot are the soil hydraulic conductivity and matric potential of the bottom soil layer; z bot is the depth of the bottom soil layer (1.5 m), and z w is the water table depth. Equation (2) accounts for downward water flux driven by gravitational drainage, as well as upward water flux driven by capillary forces. Measurements of aquifer hydraulic conductivity are available for only a few aquifers, therefore, to avoid uncertainties in determining the aquifer's hydraulic conductivity, the bottom soil layer's hydraulic conductivity was used when calculating the aquifer recharge [17]. The hydraulic conductivity within the soil column in the SSIB model was determined based on the soil texture [27] and the soil matric potential was calculated based on the empirical relationships [28]. Aquifer discharge, also known as baseflow or subsurface runoff, is parameterized as: where Q out.max is the maximum aquifer discharge, which happens when the water table is at its highest (z w = 0) and F is the subsurface runoff or baseflow decay factor. Q out.max is a function of the aquifer's topographic aspect and the maximum saturated fraction within a grid cell and is obtained through calibrations against water table depth data since Q out.max measurements are rarely available. A value of 5 × 10 −4 mm s −1 was used in this study, which was obtained through calibration against global runoff data through sensitivity tests and is described in [17]. We were aware that using a value obtained through global data calibration could introduce errors when calculating total aquifer storage; but it should be negligible when determining seasonal variations of the water table depth [15]. On the other hand, the baseflow decay factor (F) has an important role in aquifer storage anomalies, and therefore, its values must be determined through calibration against observations or remotely-sensed estimates for the study area. As shown in Section 2.7, we used satellite-based TWS anomalies to estimate this parameter.
Finally, simulated water table depth (z w ) can be obtained from the aquifer water storage (W a ) scaled by the aquifer specific yield, which is the fraction of water volume that can be drained by gravity form the aquifer. While specific yield ranged from 0.02 for clay to 0.27 for sandy soil, we used the same constant value utilized in the original model development (0.2) due to the lack of specific yield data in the study [15].

Model Coupling Approaches
This study describes the implementation and results of two different model-coupling approaches. The first is a partially-coupled model consisting of the SSIB land and the SIMGM aquifer models (SSIB-SIMGM henceforth), in which the atmospheric conditions are not simulated by the model. In this approach, hourly precipitation, atmospheric pressure, downward solar and terrestrial radiation, 30-m air temperature, humidity, and wind, obtained from the North American Regional Reanalysis (NARR) were used as forcing data for the model [29]. In the second approach, the SSIB-SIMGM was implemented in WRF to form a fully-coupled atmosphere-land-aquifer regional modeling system (WRF-SSIB-SIMGM), which in addition to the surface and subsurface layers, included 38 vertical atmospheric layers. Initial and lateral-boundary atmospheric conditions for this model were also taken from the NARR dataset.
Both models were configured for a domain set up on a 10-km horizontal resolution grid extending from 31 • N to 36 • N and from 121.5 • W to 115 • W, which covered Southern California and the adjoining Atlantic Ocean (Figure 1). MODIS-derived monthly leaf area index and vegetation cover fraction maps were used in all coupled and uncoupled simulations. Preliminary short-term simulations were conducted with WRF to determine the best physics option combination for the study area, which resulted in the following selection: Ferrier microphysics scheme [30], the Fu-Liou-Gu shortwave and longwave radiation schemes [31], the Yonsei University PBL scheme [32], and the Grell-Devenyi convective scheme [33]. Furthermore, a set of five simulations were performed with the fully-coupled model, starting from slightly different atmospheric initial conditions, to minimize the effects of uncertainties associated with the initial boundary conditions and the model internal variability.

Model Coupling Approaches
This study describes the implementation and results of two different model-coupling approaches. The first is a partially-coupled model consisting of the SSIB land and the SIMGM aquifer models (SSIB-SIMGM henceforth), in which the atmospheric conditions are not simulated by the model. In this approach, hourly precipitation, atmospheric pressure, downward solar and terrestrial radiation, 30-m air temperature, humidity, and wind, obtained from the North American Regional Reanalysis (NARR) were used as forcing data for the model [29]. In the second approach, the SSIB-SIMGM was implemented in WRF to form a fully-coupled atmosphere-land-aquifer regional modeling system (WRF-SSIB-SIMGM), which in addition to the surface and subsurface layers, included 38 vertical atmospheric layers. Initial and lateral-boundary atmospheric conditions for this model were also taken from the NARR dataset.
Both models were configured for a domain set up on a 10-km horizontal resolution grid extending from 31° N to 36° N and from 121.5° W to 115° W, which covered Southern California and the adjoining Atlantic Ocean (Figure 1). MODIS-derived monthly leaf area index and vegetation cover fraction maps were used in all coupled and uncoupled simulations. Preliminary short-term simulations were conducted with WRF to determine the best physics option combination for the study area, which resulted in the following selection: Ferrier microphysics scheme [30], the Fu-Liou-Gu shortwave and longwave radiation schemes [31], the Yonsei University PBL scheme [32], and the Grell-Devenyi convective scheme [33]. Furthermore, a set of five simulations were performed with the fully-coupled model, starting from slightly different atmospheric initial conditions, to minimize the effects of uncertainties associated with the initial boundary conditions and the model internal variability.

Terrestrial Water Storage Data
Global-gridded TWS estimates derived from temporal gravity field variations measured by the Gravity Recovery and Climate Experiment (GRACE) were used to validate the models [34]. GRACE TWS anomaly estimates are available at https://grace.jpl.nasa.gov [35]in three distinct datasets, each of which are obtained through different filtering and geophysical correction algorithms [36][37][38]. The data is available on a 1° × 1° latitude-longitude geographical resolution grid and represents the latest versions of GRACE estimates up to date. We utilized the average of all three datasets to calibrate the model's baseflow decay factor (Section 2.7) and to assess the model's TWS simulation performance. The use of GRACE estimates for modeling validation purposes has been employed in several studies [15,17,18,39].

Terrestrial Water Storage Data
Global-gridded TWS estimates derived from temporal gravity field variations measured by the Gravity Recovery and Climate Experiment (GRACE) were used to validate the models [34]. GRACE TWS anomaly estimates are available at https://grace.jpl.nasa.gov [35] in three distinct datasets, each of which are obtained through different filtering and geophysical correction algorithms [36][37][38]. The data is available on a 1 • × 1 • latitude-longitude geographical resolution grid and represents the latest versions of GRACE estimates up to date. We utilized the average of all three datasets to calibrate the model's baseflow decay factor (Section 2.7) and to assess the model's TWS simulation performance. The use of GRACE estimates for modeling validation purposes has been employed in several studies [15,17,18,39].

Model Domain
The model domain used for the validation was bounded by 31.5 • N and 36 • N, and 125 • W and 113.5 • W (Figure 1a). This area is characterized by Mediterranean, semi-arid, and desert climate types with warm, dry summers and cool, wet winters. Coastal regions are enclosed by mountain ranges and strongly influenced by maritime air masses. Precipitation peaks in the winter with variable snow accumulation in the higher terrain. Many large metropolitan areas lie in this section of the domain including Los Angeles and San Diego.
The interior portions of the domain are much drier and warmer with mostly semi-arid and arid climates. Precipitation is also greater in the winter months, but the North American monsoon occasionally brings rainfall to the region in the summer and early fall. The study area features two main aquifer systems namely the Coastal Basin aquifers located along the coast and the Basin and Range aquifers located inland (Figure 1b). Despite having different geohydrologic settings, geologic history, and land-and water-use characteristics, both aquifer systems are used to supplement water supply at different rates [40].

Terrestrial Water Initial Conditions
A spin-up approach was utilized to generate initial conditions of soil moisture and groundwater over the study area. In this approach, aquifer storage is started from an arbitrary value and the model run repeatedly through one year until the domain-average aquifer storage reaches equilibrium [18], which occurs when: where W A represents the domain-average aquifer water storage, and n indicates a day in the simulation. For this study, the aquifers were initialized as completely full throughout the domain; and repeated simulations were performed for the period between November 2004 and October 2005 using NARR as the forcing data and a globally-calibrated baseflow decay factor of 6.0 m −1 [17]. Results showed that the average aquifer storage reached equilibrium at n = 9000 days of simulation, or approximately after 24.6 years. Results from this spin-up were used to initialize both aquifer and soil moisture conditions for all simulations described hereafter. Furthermore, it should be noted that in addition to the spin-up run to generate soil and subsurface initial conditions, both models (SSIB-SIMGM and WRF-SSIB-SIMGM) were allowed one additional year (November 2006 to October 2007) of spin-up to guarantee consistency between soil and atmospheric initial conditions.

Baseflow Decay Factor Calibration
The baseflow decay factor (F) of an aquifer has an important role in determining terrestrial water storage. Due to the lack of observational data, the baseflow decay factor must be defined through calibration with remotely-sensed data. Previous studies show F values ranging from 1.25 to 8 m −1 [14,15,17]. We determined the optimum baseflow decay factor for two aquifer systems in our study area by performing partially-coupled model simulations for F values ranging from 1 to 8 m −1 and comparing simulated water table depth anomalies against measurements from 59 United States Geological Survey wells (Table S1). Table 1 summarizes the impact of the factor in the simulated water depth anomalies for the two main aquifer systems in the study area. Despite underestimating the water table decline in some months, simulations using a decay factor equal to 2 m −1 yielded the best results for both aquifer systems, with an average root-mean-square error (RMSE) of 4.6 mm for the Basin and Range aquifer system and 9.5 mm for the Coastal Basin aquifer system ( Figure 2 and Table 1). Therefore, this value was used in all subsequent simulations described in this study. This value was also similar to baseflow decay factors used in other similar studies [14,15]. Table 1. Impact of baseflow decay factor (m −1 ) on the SSIB-SIMGM simulations for the two main aquifer systems in the study area. Statistics represent 4-year domain-average terrestrial water storage (TWS) anomaly bias (mm), root-mean squared error (RMSE) (mm), and temporal correlation (TCOR). * and ** indicates correlation significance at 90% and 99% confidence levels, respectively.   Table S1.

Results
This section describes the 10-year simulations (2007-2016) for both modeling approaches. GRACE estimates were used to assess the model's performance in terms of TWS simulations. All anomalies were calculated based on the 2007-2010 average baseline. We started by examining the results from the partially-coupled (SSIB-SIMGM) model.  Table S1.

Results
This section describes the 10-year simulations (2007-2016) for both modeling approaches. GRACE estimates were used to assess the model's performance in terms of TWS simulations. All anomalies were calculated based on the 2007-2010 average baseline. We started by examining the results from the partially-coupled (SSIB-SIMGM) model.

SSIB-SIMGM Model
As previously described, atmospheric conditions and precipitation were not simulated in this model, rather, they were introduced as forcing data. Figures 3 and 4 show GRACE and modeled monthly TWS anomaly and mean precipitation averaged over the Coastal Basin and Basin and Range aquifer systems, respectively. In general, the model was able to capture the annual cycle of dry-season discharge and wet-season recharge of TWS in the study area. Temporal correlation between GRACE and SSIB-SIMGM modeled TWS anomalies for the two aquifer systems were 0.90 and 0.94, respectively. Both coefficients were significant at a 99.9% confidence interval (Table 2). Water 2020, 12, x FOR PEER REVIEW 7 of 15        In the first four years of the simulation, the model performed well for the Coastal Basin aquifers and captured the amplitude of the recharge-discharge annual cycle. However, it overestimated the TWS anomaly in 2011. A closer look at the monthly precipitation showed that December 2010 was an unusually wet month in coastal Southern California. Approximately 220.0 mm fell over the aquifer system that month, which was about five times the average rainfall in December for the area (44.1 mm month −1 ), making it the wettest December in 121 years for the Los Angeles metropolitan area. On average, the 2010−2011 rainy season received 74.3 mm of precipitation or about twice the average for the region. The above-average precipitation resulted in excessive TWS recharge for the coastal aquifers, which was overestimated by both models.
On the other hand, the model underestimated the amplitude of TWS anomalies in the Basin and Range aquifer system between 2007 and 2010 (Figure 4a). This was particularly evident in 2007 when the model missed the negative anomalies by the end of year. The large wet bias simulated in 2011 in the Coastal Basin aquifer was not seen in the Basin and Range aquifers, despite the above-average rainfall observed in both aquifers.
The year 2012 signaled the start of a steady decline in TWS for both aquifer systems, which continued until the end of the simulation. This decline was associated with a long drought observed in Southern California [24,25]. During those years, precipitation is well below the climatological averages throughout the region (Figures 3b and 4b). Despite simulating the prolonged drying of the aquifer, the model underestimated its magnitude, especially in the last two years of simulation, when GRACE registered its largest TWS decline values (Figures 3a and 4a). The SSIB-SIMGM model biases were more severe in the Basin and Range aquifers, and particularly large during the dry season when discharge is dominant. These results suggest that, during the drought, the model underestimated the baseflow discharge. The implementation of variable baseflow discharge factor could perhaps improve simulations under severe drought conditions. The 10-year average bias between modeled and GRACE TWS anomalies for the aquifer systems were approximately 3.8 mm and 6.3 mm month −1 ( Table 2). It should be noted that the GRACE estimates included groundwater withdrawal for irrigation and commercial use. These processes were not implemented in the models, which could amplify the model biases during the drought years in some locations.
To further assess the model's performance, we evaluated the average time lag between precipitation and TWS response peaks for both aquifer systems. TWS response time provides crucial insight about the model's ability to simulate important interactive mechanisms between the region's climate and hydrology, which may not be well represented in the model. The response time was obtained by calculating the correlation coefficients between GRACE and model TWS anomalies and precipitation for different monthly time lags, and identifying the lag corresponding to the maximum correlation coefficient [41,42]. While GRACE TWS anomalies peaked two months after precipitation in both aquifer systems, modeled TWS anomalies peaked one month after precipitation. Further investigation of the model dynamics are necessary to explain this behavior.
The spatial distribution of 10-year average TWS anomalies is shown in Figure 5a,b. GRACE estimates showed declining TWS throughout the entire study area. The strongest declines were found along the coastal areas between 33 • N and 35 • N, where values below −30 mm could be observed. The largest drying was located between Los Angeles and San Diego (Figure 1a), where TWS dropped by −55 mm over the 10 years. In general, the new SSIB-SIMGM model was able to simulate the main features of the TWS anomalies distribution, with the lowest values located along the coastal areas and highest along the border with Mexico (Figure 5b). Due to its higher spatial resolution, the model was able to significantly enhance the heterogeneity of the TWS anomaly distribution. TWS anomalies in the coastal areas are particularly difficult to assess through GRACE's coarse-resolution data due to the influence of the nearby ocean [36]. The 10-year domain-average TWS anomaly for GRACE and SSIB-SIMGM was −23.3 mm and −21.0 mm, respectively. In general, the partially-coupled model was able to simulate the temporal evolution and spatial distribution of TWS anomalies in Southern California.

WRF-SSIB-SIMGM Model
Unlike the partially-coupled model, the forcing data for the surface and aquifer models described in this section were simulated by the WRF model. Initial soil conditions and aquifer water storage for the simulations were the same as those used for the partially-coupled model, as were the domain size and resolution, land cover, and topographic distributions.
In general, results from the fully and partially coupled models were similar over the Coastal Basin aquifers (Figure 3a). The fully-coupled model overestimated the precipitation over that aquifer, between 2008 and 2010 (Figure 6a), which resulted in an overestimation of TWS recharge when compared to the partially-coupled model results and GRACE estimates. For the remaining years, the modeled precipitation biases were smaller and TWS results were consistent with the partiallycoupled model results. On average, the TWS anomaly's RMSE and temporal correlation compared to GRACE were respectively 34.7 mm month −1 and 0.87 (Table 2). In general, the new SSIB-SIMGM model was able to simulate the main features of the TWS anomalies distribution, with the lowest values located along the coastal areas and highest along the border with Mexico (Figure 5b). Due to its higher spatial resolution, the model was able to significantly enhance the heterogeneity of the TWS anomaly distribution. TWS anomalies in the coastal areas are particularly difficult to assess through GRACE's coarse-resolution data due to the influence of the nearby ocean [36]. The 10-year domain-average TWS anomaly for GRACE and SSIB-SIMGM was −23.3 mm and −21.0 mm, respectively. In general, the partially-coupled model was able to simulate the temporal evolution and spatial distribution of TWS anomalies in Southern California.

WRF-SSIB-SIMGM Model
Unlike the partially-coupled model, the forcing data for the surface and aquifer models described in this section were simulated by the WRF model. Initial soil conditions and aquifer water storage for the simulations were the same as those used for the partially-coupled model, as were the domain size and resolution, land cover, and topographic distributions.
In general, results from the fully and partially coupled models were similar over the Coastal Basin aquifers (Figure 3a). The fully-coupled model overestimated the precipitation over that aquifer, between 2008 and 2010 (Figure 6a), which resulted in an overestimation of TWS recharge when compared to the partially-coupled model results and GRACE estimates. For the remaining years, the modeled precipitation biases were smaller and TWS results were consistent with the partially-coupled model results. On average, the TWS anomaly's RMSE and temporal correlation compared to GRACE were respectively 34.7 mm month −1 and 0.87 (Table 2). The fully-coupled model also overestimated precipitation over the Basin and Range aquifer region, especially during the last three years of simulation (Figure 6b). The modeled average precipitation for those years was approximately twice as much as observations. The excessive precipitation produced by WRF-SSIB-SIMGM caused a positive bias in TWS anomaly compared to GRACE (Figure 4a). On average, TWS anomaly RMSE and temporal correlation compared to GRACE was 18.1 mm month −1 and 0.91 respectively (Table 2). Furthermore, similarly to the results obtained with the partially-coupled model, time lag between the fully-coupled model's precipitation and TWS anomaly maxima were one month shorter than those observed from GRACE TWS estimates.
In general, the fully-coupled model generated a weaker 10-year mean anomaly distribution than the partially-coupled model due to the wet bias in simulated rainfall (Figure 7). Along the coast, only a small area showed 10-year average anomalies below −120 mm month −1 , whereas both GRACE and the partially-coupled model showed a much more extensive area. Over the Basin and Range aquifer system, the fully-coupled model showed positive TWS anomalies, which were not consistent with either the GRACE estimates or the partially-coupled model results.

Discussion
The results presented show that the partially-coupled model formed by the SSIB land surface and SIMGM aquifer models was able to simulate the temporal and spatial variability of TWS anomalies in the study area. It provides a more detailed spatial representation of TWS anomalies when compared to GRACE data due to its higher spatial resolution. However, the results from the fully-coupled model clearly indicate that accurate precipitation information is essential to achieve accurate TWS simulations. Precipitation biases in the WRF-SSIB-SIMGM model during the last five years of simulation noticeably affected the quality of TWS anomaly results. Moreover, both models The fully-coupled model also overestimated precipitation over the Basin and Range aquifer region, especially during the last three years of simulation (Figure 6b). The modeled average precipitation for those years was approximately twice as much as observations. The excessive precipitation produced by WRF-SSIB-SIMGM caused a positive bias in TWS anomaly compared to GRACE (Figure 4a). On average, TWS anomaly RMSE and temporal correlation compared to GRACE was 18.1 mm month −1 and 0.91 respectively (Table 2). Furthermore, similarly to the results obtained with the partially-coupled model, time lag between the fully-coupled model's precipitation and TWS anomaly maxima were one month shorter than those observed from GRACE TWS estimates.
In general, the fully-coupled model generated a weaker 10-year mean anomaly distribution than the partially-coupled model due to the wet bias in simulated rainfall (Figure 7). Along the coast, only a small area showed 10-year average anomalies below −120 mm month −1 , whereas both GRACE and the partially-coupled model showed a much more extensive area. Over the Basin and Range aquifer system, the fully-coupled model showed positive TWS anomalies, which were not consistent with either the GRACE estimates or the partially-coupled model results. The fully-coupled model also overestimated precipitation over the Basin and Range aquifer region, especially during the last three years of simulation (Figure 6b). The modeled average precipitation for those years was approximately twice as much as observations. The excessive precipitation produced by WRF-SSIB-SIMGM caused a positive bias in TWS anomaly compared to GRACE (Figure 4a). On average, TWS anomaly RMSE and temporal correlation compared to GRACE was 18.1 mm month −1 and 0.91 respectively (Table 2). Furthermore, similarly to the results obtained with the partially-coupled model, time lag between the fully-coupled model's precipitation and TWS anomaly maxima were one month shorter than those observed from GRACE TWS estimates.
In general, the fully-coupled model generated a weaker 10-year mean anomaly distribution than the partially-coupled model due to the wet bias in simulated rainfall (Figure 7). Along the coast, only a small area showed 10-year average anomalies below −120 mm month −1 , whereas both GRACE and the partially-coupled model showed a much more extensive area. Over the Basin and Range aquifer system, the fully-coupled model showed positive TWS anomalies, which were not consistent with either the GRACE estimates or the partially-coupled model results.

Discussion
The results presented show that the partially-coupled model formed by the SSIB land surface and SIMGM aquifer models was able to simulate the temporal and spatial variability of TWS anomalies in the study area. It provides a more detailed spatial representation of TWS anomalies when compared to GRACE data due to its higher spatial resolution. However, the results from the fully-coupled model clearly indicate that accurate precipitation information is essential to achieve accurate TWS simulations. Precipitation biases in the WRF-SSIB-SIMGM model during the last five years of simulation noticeably affected the quality of TWS anomaly results. Moreover, both models

Discussion
The results presented show that the partially-coupled model formed by the SSIB land surface and SIMGM aquifer models was able to simulate the temporal and spatial variability of TWS anomalies in the study area. It provides a more detailed spatial representation of TWS anomalies when compared to GRACE data due to its higher spatial resolution. However, the results from the fully-coupled model clearly indicate that accurate precipitation information is essential to achieve accurate TWS simulations.
Precipitation biases in the WRF-SSIB-SIMGM model during the last five years of simulation noticeably affected the quality of TWS anomaly results. Moreover, both models have a shorter precipitation-TWS lagged response time (1 month) than that observed with GRACE (2 months). Nevertheless, the two models have the potential to be useful tools for the study of surface water budget in areas that lack extensive observational networks.
Since the partially-coupled model can reproduce the annual evolution of TWS well in both aquifers, it would be interesting to further examine the relationship between TWS anomaly and the two other components of surface water budget: runoff and evapotranspiration during a water year. A water year is defined as the time between October and September of the following year. Over both aquifers, evapotranspiration loss is the predominant component of the water budget (Figure 8a,b). Changes in evapotranspiration, TWS, and runoff account for approximately 65%, 24% and 11% of monthly precipitation on average for the Coastal Basin aquifer. For the Basin Range system those rates are 72%, 20%, and 8%, respectively. The skewed distribution of available water towards evapotranspiration reflects the semi-arid climate of these areas, which are characterized by long dry summers.
A water year is defined as the time between October and September of the following year. Over both aquifers, evapotranspiration loss is the predominant component of the water budget (Figure 8a,b). Changes in evapotranspiration, TWS, and runoff account for approximately 65%, 24% and 11% of monthly precipitation on average for the Coastal Basin aquifer. For the Basin Range system those rates are 72%, 20%, and 8%, respectively. The skewed distribution of available water towards evapotranspiration reflects the semi-arid climate of these areas, which are characterized by long dry summers.
It is also interesting to note that the partitioning of precipitation into the three water budget components varies from year to year. For example, in the 2010 water year in the Costal Basin aquifer, runoff and TWS together accounted for as much of the precipitation as evapotranspiration. However, in 2012, runoff and TWS added to about 10% of total evapotranspiration for that year. We believe that this variability could be associated with the type of precipitation event (long light versus short heavy), which impacts runoff and infiltration rates, as well as with the vegetation state (green versus brown) in the area, which affects photosynthesis and evapotranspiration rates.
Finally, we examined the partitioning of the TWS anomaly into soil moisture and aquifer water anomalies (Figure 8c,d). In general, soil moisture anomalies displayed a higher variability than groundwater. The results showed that average groundwater recharge occurred only in 2010 and 2011 for both aquifers, with the 2011 water year being the more significant recharging event. Groundwater discharge was more intense in 2013 and 2014 over the Coastal Basin, and for 2015 and 2016 over the Basin and Range aquifer system. In general, the soil moisture anomaly accounts for nearly 65% and 75% of the total TWS anomaly observed between 2007 and 2016.  It is also interesting to note that the partitioning of precipitation into the three water budget components varies from year to year. For example, in the 2010 water year in the Costal Basin aquifer, runoff and TWS together accounted for as much of the precipitation as evapotranspiration. However, in 2012, runoff and TWS added to about 10% of total evapotranspiration for that year. We believe that this variability could be associated with the type of precipitation event (long light versus short heavy), which impacts runoff and infiltration rates, as well as with the vegetation state (green versus brown) in the area, which affects photosynthesis and evapotranspiration rates.
Finally, we examined the partitioning of the TWS anomaly into soil moisture and aquifer water anomalies (Figure 8c,d). In general, soil moisture anomalies displayed a higher variability than groundwater. The results showed that average groundwater recharge occurred only in 2010 and 2011 for both aquifers, with the 2011 water year being the more significant recharging event. Groundwater discharge was more intense in 2013 and 2014 over the Coastal Basin, and for 2015 and 2016 over the Basin and Range aquifer system. In general, the soil moisture anomaly accounts for nearly 65% and 75% of the total TWS anomaly observed between 2007 and 2016.

Conclusions
We introduced two versions of a new atmosphere-land-aquifer modeling system and described the performance of simulated TWS anomalies in Southern California during a 10-year period marked by drought conditions. In the first version, the aquifer (SIMGM) and the land surface (SSIB) models were coupled and driven by the reanalysis data. Results indicated that this partially-coupled model was capable of simulating the TWS anomalies in the two main aquifer systems located in the study area. The model's TWS anomaly RMSE and temporal correlation calculated against GRACE estimates were 28.6 mm month −1 and 0.90 (P < 0.001) for the Coastal Basin aquifer system, and 14.3 mm month −1 and 0.94 (P < 0.001) for the Basin and Range aquifer system. In general, this model simulated the temporal variability of TWS anomaly well, but underestimated the decline in TWS towards the end of the period when the drought was more intense.
A fully-coupled version of the model formed by the addition of the WRF atmospheric model was also evaluated. In general, results from this model were less accurate. Comparison against GRACE data yields higher RMSE for both aquifer systems. On the other hand, temporal correlations for the TWS anomaly are similar to the partially-coupled model. The main cause of the lower accuracy in the fully-coupled model's results were the poor precipitation simulations, especially in the eastern portion of the study area, which degraded TWS results. This study suggests that accurate seasonal TWS anomaly forecasts may be difficult to achieve as meteorological models struggle to produce accurate regional precipitation forecasts.
The relationship between TWS, runoff, and evapotranspiration during the drought years was also examined based on the offline model results. On average, the total precipitation shortage during the drought years resulted in a TWS, evapotranspiration, and runoff decline of approximately 24%, 65%, and 11% for the Coastal Basin aquifer system, and 20%, 72%, and 8% for the Basin and Range system, respectively. In general, changes in the amount of soil moisture accounted for about 65% and 75% of the total TWS anomaly observed between 2007 and 2016 over the aquifer systems.
It should be noted that the baseflow decay factors used in these simulations were obtained through a simple calibration against water table measurements. We expect that a more advanced method for the estimation of this parameter could improve the results. Despite the biases, the results indicate that the newly coupled models could be an important tool to further our understanding of the role land cover and precipitation play in determining terrestrial water storage availability at regional and local scales.