Utility of Remotely Sensed Evapotranspiration Products to Assess an Improved Model Structure

Abstract: Hydrologic models applied for operational purposes carry a certain level of predictive uncertainty. Whether structural improvements reduce this uncertainty has not been well evaluated due to the lack of observational data. This study investigated the utility of remotely sensed evapotranspiration (RS-ET) products to quantitatively represent improvements in model predictions owing to structural improvements. Two versions of the Soil and Water Assessment Tool (SWAT), representative of original and improved versions, were calibrated against streamflow and RS-ET. The latter version, referred to as RSWAT, contains a new soil moisture module. We compared the outputs of these two versions that had the best performance metrics (Kling–Gupta Efficiency [KGE], Nash-Sutcliffe Efficiency [NSE], and Percent-bias [P-bias]). ET comparisons were conducted at two spatial scales by partitioning the RS-ET into two scales, while streamflow comparisons were only conducted at one scale. At the watershed level, SWAT and RSWAT produced similar metrics for daily streamflow (NSE of 0.29 and 0.37, P-bias of 1.7 and 15.9, and KGE of 0.47 and 0.49, respectively) and ET (KGE of 0.48 and 0.52, respectively). At the subwatershed level, the KGE of RSWAT (0.53) for daily ET was greater than that of SWAT (0.47). These findings demonstrated that RS-ET has the potential to increase prediction accuracy from model structural improvements and highlighted the utility of remotely sensed data in hydrologic modeling.


Introduction
Water resource management is important for resource allocation in agricultural and mixed land-use watersheds, to accommodate the rising water demand resulting from population increase. An improved understanding of water resource dynamics aids the development of effective adaptation strategies [1]. Hydrologic models are often used as a means to design and manage water resource systems [2,3,4]. Although the use of these models for operational purposes may provide practical solutions, there is a need for continuous effort to reduce the uncertainty involved in operational applications of hydrologic models [5,6]. This study capitalizes on remotely sensed data to assess an improved soil moisture module within a hydrologic model, the Soil and Water Assessment Tool (SWAT). Qi et al. [33] incorporated a physically based soil moisture module (i.e., the Richards equation-based module) into SWAT, producing RSWAT. Qi et al. [29] compared SWAT and RSWAT to test how well RSWAT replicates the partitioning of water into surface runoff and infiltration. They found that SWAT and RSWAT were similar in prediction capacity, likely due to the limitations of the data sources that constrained model outputs [29]. This study employs remotely sensed evapotranspiration (RS-ET) products as an additional constraint to discern differences between SWAT and RSWAT. Evapotranspiration (ET) is the sum of evaporation and transpiration, and a crucial component in water and energy exchange between atmospheric and terrestrial systems [34]. ET may be an indicator for accurately representing water partitioning on the land surface [35]. To quantitatively assess model performance, we adopted the Kling-Gupta Efficiency (KGE) measure for daily streamflow and ET as an objective function (see Section 2.5). Two additional metrics, the Nash-Sutcliffe Efficiency (NSE) and Percent-bias (P-bias), were used as objective functions for daily streamflow.
First, SWAT and RSWAT were calibrated against streamflow and watershed level RS-ET; their outputs were subsequently compared at the watershed level. Then, the ET outputs from SWAT and RSWAT were compared at a finer spatial level using a subwatershed level RS-ET. These comparisons were used to test the ability of RS-ET to further constrain outputs from different versions of SWAT, and improve the ability to discern performance differences resulting from different model structures.

Study Area
This study used the drainage area of Tuckahoe Creek, as defined by the United States Geological Survey (USGS) gauge station located near Ruthsburg, Maryland (USGS#01491500); this is referred to as the Tuckahoe Creek Watershed (TCW, 220 km²). The TCW is located within the upper region of the Choptank River watershed (CRW) within the Delmarva Peninsula (Figure 1a). This region is categorized as a temperate, humid climate zone receiving an annual average precipitation of 1200 mm [36]. Seasonal precipitation is evenly distributed throughout the year, while ET is low in the wet season (December to May) and high in the dry season (June to November) [37]. Land use in the TCW is dominated by croplands (54%) that cultivate corn, soybean, and winter wheat (Figure 1b). The use of irrigated water for corn and soybean has been increasing in this region, contributing to changes in ET dynamics during the dry season [38]. The remaining area in the TCW comprises forest (33%), pasture (8%), urban (4%), and water bodies (1%). Well-drained soils (hydrologic soil groups [HSGs] A and B) account for a slightly greater portion of the watershed (56%) relative to poorly drained soils (C and D, 44%).
Figure 1 (caption fragment; source [39]). Note: HSGs are characterized as follows: type A are well-drained soils with a water infiltration rate of 7.6-11.4 mm·h⁻¹; type B are moderately well-drained soils with 3.8-7.6 mm·h⁻¹; type C are moderately poorly-drained soils with 1.3-3.8 mm·h⁻¹; and type D are poorly-drained soils with 0-1.3 mm·h⁻¹ [40].

Description of SWAT
SWAT is a semi-distributed, watershed-scale water quality model capable of monitoring the impacts of environmental and anthropogenic changes on physical processes in an agricultural watershed [40]. The model partitions a watershed into subwatersheds, and further into hydrologic response units (HRUs); the latter are a unique combination of land use, soil, and slope within a subwatershed. All hydrological outputs were computed for individual HRUs. The partitioning between surface runoff and infiltration may be calculated using the Soil Conservation Service curve number (CN) procedure or the Green and Ampt infiltration method in SWAT. The CN method was used in this study as it is prevalent in the literature.
The CN method calculates the daily surface runoff ($Q_{surf}$, mm·d⁻¹) using the daily rainfall depth ($R_{day}$, mm·d⁻¹) and the retention parameter ($S$, mm·d⁻¹). The latter is determined by the CN and differs based on the land use, soil permeability, and antecedent soil water conditions:

$$Q_{surf} = \frac{(R_{day} - 0.2S)^2}{R_{day} + 0.8S} \tag{1}$$

$$S = 25.4\left(\frac{1000}{CN} - 10\right) \tag{2}$$

The retention parameter by soil profile water content is expressed as:

$$S = S_{max}\left(1 - \frac{SW}{SW + \exp(w_1 - w_2 \cdot SW)}\right) \tag{3}$$

where $S_{max}$ is the maximum value of retention on a given day (mm·d⁻¹); $SW$ is the soil water content of the entire profile excluding the amount of water held in the profile at wilting point; and $w_1$ and $w_2$ are shape coefficients.
Infiltrated water affects the soil water dynamics in SWAT. The daily soil water content in each layer was determined as follows:

$$\Delta SW_i = Q_{p,i-1} - Q_{p,i} - Q_{l,i} - E_{e,i} - E_{t,i} \tag{4}$$

where $\Delta SW_i$ is the change in soil water content (mm) at the ith soil layer; $Q_{p,i-1}$ is the percolation from the upper layer (mm); $Q_{p,i}$ is the percolation out of the current soil layer (mm); $Q_{l,i}$ is the lateral flow generated from the current soil layer (mm); and $E_{e,i}$ and $E_{t,i}$ are the evaporation and transpiration drawn from the current soil layer (mm), respectively. The percolation ($Q_{p,i}$) for the ith layer is expressed as:

$$Q_{p,i} = (SW_i - FC_i)\left[1 - \exp\left(\frac{-\Delta t \cdot K_{sat,i}}{SAT_i - FC_i}\right)\right] \tag{5}$$

where $SW_i$ is the soil water content of the ith layer (mm); $FC_i$ is the soil water content at field capacity (mm); $\Delta t$ is the length of the time step (h); $K_{sat,i}$ is the saturated hydraulic conductivity (mm·h⁻¹); and $SAT_i$ is the amount of water required for the complete saturation (mm) of the ith layer. Percolation occurs only when $SW_i$ exceeds $FC_i$. Percolation from the bottom of the soil profile enters groundwater, and lateral flow was modeled using a kinematic storage routing method based on the slope, slope length, and saturated conductivity.
SWAT has three methods for reference ET calculations [40]: the Penman-Monteith, Priestley-Taylor, and Hargreaves methods. We used the Penman-Monteith method to compute $ET_r$ as follows:

$$\lambda ET_r = \frac{\Delta (H_{net} - G) + \rho_{air}\, c_p\, (e_z^0 - e_z)/r_a}{\Delta + \gamma\,(1 + r_c/r_a)} \tag{6}$$

where $ET_r$ is the maximum transpiration rate (mm·d⁻¹); $\lambda$ is the latent heat of vaporization (MJ·kg⁻¹); $\Delta$ is the slope of the saturation vapor pressure-temperature curve (kPa·°C⁻¹); $H_{net}$ is the net radiation (MJ·m⁻²·d⁻¹); $G$ is the heat flux density to the ground (MJ·m⁻²·d⁻¹); $\rho_{air}$ is the density of air (kg·m⁻³); $c_p$ is the specific heat at constant pressure (MJ·kg⁻¹·°C⁻¹); $e_z^0$ is the saturation vapor pressure of air at height z (kPa); $e_z$ is the water vapor pressure of air at height z (kPa); $\gamma$ is the psychrometric constant (kPa·°C⁻¹); $r_c$ is the plant canopy resistance (s·m⁻¹); and $r_a$ is the diffusion resistance of the air layer (aerodynamic resistance) (s·m⁻¹). Further details are available in Neitsch et al. [40].
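The curve number partitioning and layer percolation described above can be illustrated with a minimal Python sketch. It assumes the standard forms given in the SWAT documentation [40], with an initial abstraction of 0.2·S and a 24 h time step; the function names and the example values are hypothetical and are not taken from the SWAT source code.

```python
import math


def retention_from_cn(cn):
    """Retention parameter S (mm) from a curve number: S = 25.4 * (1000 / CN - 10)."""
    if not 0.0 < cn <= 100.0:
        raise ValueError("CN must lie in (0, 100]")
    return 25.4 * (1000.0 / cn - 10.0)


def retention_from_soil_water(s_max, sw, w1, w2):
    """Retention parameter (mm) adjusted for the profile soil water content SW (mm)."""
    return s_max * (1.0 - sw / (sw + math.exp(w1 - w2 * sw)))


def surface_runoff(r_day, s):
    """Daily surface runoff (mm), assuming an initial abstraction of 0.2 * S."""
    initial_abstraction = 0.2 * s
    if r_day <= initial_abstraction:
        return 0.0  # all rainfall is abstracted, so no runoff is generated
    return (r_day - initial_abstraction) ** 2 / (r_day + 0.8 * s)


def percolation(sw_i, fc_i, sat_i, k_sat_i, dt_hours=24.0):
    """Percolation (mm) out of a layer; zero unless water exceeds field capacity."""
    if sw_i <= fc_i:
        return 0.0
    travel_time = (sat_i - fc_i) / k_sat_i  # hours needed to drain the excess water
    return (sw_i - fc_i) * (1.0 - math.exp(-dt_hours / travel_time))


if __name__ == "__main__":
    s = retention_from_cn(75.0)      # hypothetical curve number
    runoff = surface_runoff(50.0, s)  # hypothetical 50 mm rainfall day
    print(f"S = {s:.1f} mm, runoff = {runoff:.1f} mm, infiltration = {50.0 - runoff:.1f} mm")
```

Because the CN procedure only uses the daily rainfall depth, two days with equal totals but very different intensities produce the same runoff in this sketch.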

RSWAT
Qi et al. [33] generated a new version of SWAT by incorporating a modified Richards equation into the model to physically represent soil water content and moisture movement (Figure 2). The Richards equation-based soil moisture module was tested against field measurements and compared with the original soil moisture module at 10 stations within the CRW. The results show that the Richards equation-based module outperformed the original module in terms of simulations of daily soil moisture based on the improved R-squared and reduced biases [33]. Here, we briefly introduce the Richards equation-based soil moisture module; detailed information on model development and evaluation is provided in Qi et al. [33].
The modified Richards equation incorporated in RSWAT is as follows:

$$\frac{\partial \theta}{\partial t} = \frac{\partial}{\partial z}\left[k\,\frac{\partial (h - h_e)}{\partial z}\right] - Q \tag{7}$$

where $\theta$ is the volumetric soil water content (mm³·mm⁻³); $t$ is time (s); $z$ is the depth below the soil surface (mm; positive downward); $k$ is the hydraulic conductivity (mm·s⁻¹); $h$ is the soil matric potential (mm); $Q$ is the soil water sink term (mm·mm⁻¹·s⁻¹); and $h_e$ is the equilibrium soil matric potential (mm). Equation (7) was discretized into a set of tridiagonal equations solved using the method of Patankar [41]. The surface boundary condition was the infiltration determined using the CN method, and free-draining conditions were assumed for the bottom boundary condition. Evaporation, transpiration, and lateral flow are the sink terms in Equation (7) that were calculated using their corresponding SWAT functions. Instantaneous hydraulic conductivity was estimated using the Community Land Model [42], while saturated conductivity was measured and provided in each HRU based on the US Department of Agriculture (USDA) Natural Resources Conservation Service (NRCS) Soil Survey Geographic Database (SSURGO). According to Clapp and Hornberger [43] and Cosby et al. [44], soil matric potential is a function of water content and hydraulic conductivity. The equilibrium soil matric potential considers the impact of fluctuations in the water table, and was determined based on the method of Zeng and Decker [45].
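Tridiagonal systems such as the one obtained by discretizing the Richards equation over the soil layers are commonly solved with the tridiagonal matrix (Thomas) algorithm. The sketch below is a generic Python version of that algorithm, not the RSWAT implementation; the coefficient arrays are hypothetical and would in practice come from the discretization, with one row per soil layer.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i] * x[i-1] + diag[i] * x[i] + upper[i] * x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    matrix is assumed to be diagonally dominant.
    """
    n = len(diag)
    if not (len(lower) == len(upper) == len(rhs) == n):
        raise ValueError("all coefficient arrays must have the same length")

    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]

    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution from the bottom layer to the surface layer.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # Hypothetical three-layer system.
    print(solve_tridiagonal([0.0, 1.0, 1.0], [2.0, 3.0, 2.0], [1.0, 1.0, 0.0], [4.0, 10.0, 8.0]))
```

The solve takes a number of operations proportional to the number of layers, which keeps a daily update over many HRUs inexpensive.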

SWAT Input Data and Model Constraints
Meteorological and geospatial data were used to run SWAT (Table 1). SWAT input data consisted of the National Aeronautics and Space Administration (NASA) North American Land Data Assimilation System 2 (NLDAS2) forcing data, including hourly precipitation, temperature, solar radiation, relative humidity, and wind speed. The NLDAS2 data are generated from multiple observations and provide continental-scale data at a spatial resolution of 1/8° [46]. A light detection and ranging (LiDAR)-based digital elevation model of the USDA-Agricultural Research Service (Beltsville, MD, USA) was used to establish the topographic characteristics and divide modeling units. A soil map was downloaded from the SSURGO database, while the land use map utilized in this study was developed by Lee et al. [47]. This land use map characterized farmland configurations and annually cultivated crops using multi-year cropland data layers (CDLs). The spatial distribution of other types of land use was identified using multiple geospatial datasets (Table 1); Lee et al. [47] provide further details.
Streamflow and RS-ET were used as model constraints; daily streamflow records from 2010 to 2014 were obtained from USGS gauge station #01491500 (Figure 1a). RS-ET data were developed by the regional Atmosphere-Land Exchange Inverse (ALEXI) model [49,50] and the associated flux spatial-temporal disaggregation scheme (DisALEXI, Anderson et al., 2004). The 30 m daily RS-ET data from ALEXI/DisALEXI in the study area has been previously validated against in-situ eddy covariance flux tower measurements, with an average relative error of 10% [48]. Figure 3 presents examples of 30 m daily RS-ET data for the TCW; this data spans from January 2010 to December 2014 and was utilized as an additional model constraint. The watershed level average of the RS-ET was calculated for model calibration.
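As an illustration of the watershed-level averaging step, the following Python sketch averages one day of gridded ET over the pixels that fall inside a watershed mask. The text does not state how the average was computed, so the equal-area pixels (reasonable for a regular 30 m grid), the NaN coding of missing pixels, and the function name are all assumptions.

```python
import math


def watershed_mean_et(et_grid, mask):
    """Mean ET over pixels inside the watershed mask, skipping missing (NaN) pixels.

    et_grid: list of rows of daily ET values (mm/d) for one date.
    mask:    list of rows of booleans, True where the pixel lies inside the watershed.
    Returns NaN when no valid pixel falls inside the mask.
    """
    total = 0.0
    count = 0
    for et_row, mask_row in zip(et_grid, mask):
        for et, inside in zip(et_row, mask_row):
            if inside and not math.isnan(et):
                total += et
                count += 1
    return total / count if count else math.nan


if __name__ == "__main__":
    # Hypothetical 2 x 2 grid: one missing pixel and one pixel outside the watershed.
    grid = [[1.0, 2.0], [math.nan, 4.0]]
    inside = [[True, True], [True, False]]
    print(watershed_mean_et(grid, inside))
```

Passing a subwatershed mask instead of the watershed mask yields the finer-scale averages in the same way.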

Model Calibration
SWAT and RSWAT were calibrated at a daily time step over five years (2010-2014), as streamflow and RS-ET were available during this period. A warm-up period of two years (2008-2009) was used for the calibration. The simulation period was not split into calibration and validation periods due to the short observation record. The model simulation guidelines outlined by Arnold et al. [51] recommend the inclusion of different climate conditions (e.g., dry and wet) during the calibration period to identify optimal parameter values for sites of interest. In addition, studies comparing the performance of two models have often considered only a calibration period [28]. As RSWAT and SWAT were tested at the same study site, we used a five-year calibration period to identify the best parameter values for SWAT and RSWAT. Previous SWAT modeling studies conducted in the study area have demonstrated the sensitivity of streamflow and water quality parameters [36,52]. Based on these studies, we established 13 parameters to calibrate SWAT and RSWAT against streamflow and RS-ET (Table 2). We prepared 1000 parameter sets using the Latin hypercube sampling (LHS) method, which supports efficient identification of the optimal parameter set [53], and identified the parameter set producing the best model performance measures. The KGE was used to evaluate simulated streamflow and ET against the corresponding observations. KGE diagnostically decomposes the NSE and mean squared error (MSE) to provide a combined measure of the relative importance of correlation, bias, and variability for hydrological modeling [54]. KGE values range from −∞ to 1, where values closer to 1 indicate a stronger model performance:

$$KGE = 1 - \sqrt{(r - 1)^2 + \left(\frac{\sigma_s}{\sigma_o} - 1\right)^2 + \left(\frac{\mu_s}{\mu_o} - 1\right)^2} \tag{8}$$

where $r$ indicates the Pearson product-moment correlation coefficient; $\sigma_s/\sigma_o$ and $\mu_s/\mu_o$ indicate the variability ratio and bias between simulations and observations, respectively; $\sigma$ and $\mu$ indicate the standard deviation and mean of the variables, respectively; and the subscripts $s$ and $o$ indicate the simulations and observations, respectively.
Sustainability 2021, 13, 2375
We additionally used NSE and P-bias as metrics for daily streamflow. NSE is an indicator of how well simulated and observed values fit the 1:1 line, and it ranges from −∞ to 1 (one indicates a perfect fit) [55]. P-bias indicates a general tendency of model over- (or under-) prediction relative to observations; positive and negative values refer to model underestimation and overestimation, respectively [55]. Smaller and larger absolute values of P-bias indicate increased and decreased accuracy, respectively. The two metrics have been frequently adopted to assess daily performance measures [52,56] and they are calculated as follows:

$$NSE = 1 - \frac{\sum_{i=1}^{n}(O_i - S_i)^2}{\sum_{i=1}^{n}(O_i - \bar{O})^2} \tag{9}$$

$$P\text{-}bias = \frac{\sum_{i=1}^{n}(O_i - S_i)}{\sum_{i=1}^{n} O_i} \times 100 \tag{10}$$

where $O_i$ are observed and $S_i$ are simulated data; $\bar{O}$ is the mean of the observed values; and $n$ equals the number of observations. The "hydroGOF" package of the R program [57] was used to calculate KGE, NSE, and P-bias.
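The calibration workflow described above, in which parameter sets are drawn by LHS and each simulation is scored, can be sketched in plain Python as follows. This is an illustrative stand-in for the R "hydroGOF" functions used in the study, not a reproduction of them. The function names and parameter bounds are hypothetical, and the P-bias sign follows the convention stated in the text (positive values indicate underestimation); sign conventions differ between tools, so they should be checked before comparing values.

```python
import math
import random
from statistics import mean, pstdev


def pearson_r(sim, obs):
    """Pearson product-moment correlation between simulations and observations."""
    ms, mo = mean(sim), mean(obs)
    cov = sum((s - ms) * (o - mo) for s, o in zip(sim, obs))
    var_s = sum((s - ms) ** 2 for s in sim)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)


def kge(sim, obs):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson_r(sim, obs)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


def nse(sim, obs):
    """Nash-Sutcliffe Efficiency."""
    mo = mean(obs)
    return 1.0 - sum((o - s) ** 2 for s, o in zip(sim, obs)) / sum((o - mo) ** 2 for o in obs)


def pbias(sim, obs):
    """Percent bias; positive values indicate model underestimation."""
    return 100.0 * sum(o - s for s, o in zip(sim, obs)) / sum(obs)


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples parameter sets with one value per equal-width stratum per parameter."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One random point inside each of the n strata of [0, 1), in shuffled order.
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(points)
        columns.append([low + p * (high - low) for p in points])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    sim = [0.5, 1.0, 1.5, 2.0]
    print(kge(sim, obs), nse(sim, obs), pbias(sim, obs))
    print(latin_hypercube(5, [(35.0, 98.0), (0.0, 1.0)], seed=1))
```

In a calibration loop, each sampled tuple would be written to the model inputs, the model run, and the set with the highest score retained.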

Comparing the Prediction Capacity of SWAT and RSWAT
To demonstrate how representative the modified model structure (i.e., RSWAT) was for hydrologic variables, we conducted evaluations at two spatial levels (Figure 4). Simulated streamflow and watershed-level ET from SWAT and RSWAT were first examined using observed streamflow and RS-ET, respectively. A flow duration curve (FDC) was plotted using daily streamflow from SWAT and RSWAT to examine how the two models replicate streamflow during high- and low-flow periods compared to observations. Then, subwatershed-level assessments were conducted by comparing the subwatershed-level ET and the corresponding RS-ET divided by the subwatershed boundary. There are 19 subwatersheds in the TCW, ranging from 0.09 to 32 km² in area. The subwatershed boundary was delineated using the ArcSWAT interface for SWAT with an input DEM [58]. Subwatershed-level model outputs were directly obtained from the SWAT and RSWAT results that produced the best performance metrics at the watershed level. Once the daily subwatershed-level average of RS-ET was determined, the KGE values for individual subwatersheds over the simulation period were calculated for SWAT and RSWAT. Evaluation of streamflow predictions at the subwatershed level was not conducted due to the absence of subwatershed-level streamflow observations.

Streamflow and ET Predictions at the Watershed Level
Figure 5 presents the relationship between simulated streamflow and ET from SWAT and RSWAT and the observed streamflow and watershed-level RS-ET. SWAT produced a KGE of 0.47, NSE of 0.29, and P-bias of 1.7% for daily streamflow and a KGE of 0.48 for daily ET. RSWAT had slightly higher metrics, with a KGE of 0.49 and NSE of 0.37 for daily streamflow and a KGE of 0.52 for daily ET, while its P-bias (15.9%) was greater than that of SWAT. Overall, SWAT and RSWAT showed similar metrics: KGE values were similar between the two models, while the NSE and P-bias of SWAT indicated lower and greater accuracy than those of RSWAT, respectively.
Both models replicated the observations reasonably well. NSE and P-bias for daily streamflow are considered acceptable when NSE > 0.2 and P-bias is within ±25%, respectively [52,56]. SWAT and RSWAT satisfied those criteria for streamflow. Regarding KGE values, a previous study defined KGE > 0.5 as an acceptable performance for monthly streamflow [59]. Our streamflow KGE values were slightly lower than 0.5. Daily simulations are evaluated using relaxed criteria compared to longer time scale (e.g., monthly and annual) outputs since daily outputs depict detailed extreme values [51]. Therefore, our KGE values for daily streamflow seemed to be within an acceptable range. The KGE values for daily ET were greater than those of one previous study (i.e., 0.26) [60] while being lower than those of another previous study (i.e., 0.5-0.9) [61]. Less accurate results are likely due to the omission of plant parameters that account for a substantial portion of ET [61], and the exclusion of the impacts of irrigation. A previous study [62] showed that parameterizing forest in the model resulted in more accurate ET predictions relative to default parameters. For croplands, similar findings were observed in a previous study [63]. Approximately 87% of the study watershed is covered by either crops or forest; therefore, leaving plant parameters unadjusted may contribute to the low accuracy. However, ET values from the two models reflected the strong seasonality (e.g., high ET in summer seasons and low ET in winter seasons) of ET in this region [37] and were relatively well matched with RS-ET.
Peak streamflow was not well captured by SWAT and RSWAT, largely due to inherent model limitations. The CN method used in SWAT does not consider rainfall intensity and duration in partitioning surface runoff and infiltration; as such, simulated peak flows tend to be underestimated [64]. Localized precipitation not observed at weather stations has often caused model underestimation of peak streamflow in this region [36]. In addition, the semi-distributed SWAT structure oversimplifies water routing at the subwatershed level, producing inaccurate peak flow predictions [52]. This model error may also be due to the climate input data, NLDAS2; Qi et al. [65] compared SWAT simulations driven by different climate data, finding that SWAT results driven by NLDAS2 climate input data underestimated streamflow due to higher daily solar radiation, which led to overestimations of ET.
A comparison between SWAT and RSWAT outputs using the FDC showed that the latter provided good agreement with observations. This was particularly the case during low-streamflow conditions (i.e., flows exceeded more than 80% of the time; Figure 6). This may be partially due to the use of a simplified conceptual soil moisture model in SWAT, such as a bucket [40]. Shahrban et al. [66] also found that a hydrologic model using a bucket concept for soil moisture poorly simulated low-flow conditions relative to a model that used a continuous distribution of soil moisture according to vertical depth. Based on this FDC result, we postulate that RSWAT may have a greater capacity to replicate water partitioning processes than SWAT. In this region, low flows correspond to baseflow conditions, and the amount of baseflow is determined from water partitioning processes within upstream areas [67]. Another possible reason for this low-flow discrepancy between SWAT and RSWAT is the simultaneous constraint of model parameters using streamflow and RS-ET. Tobin and Bennett [68] found that the inclusion of ET as an additional constraint enabled the accurate capturing of actual flow patterns. Multiple constraints are known to improve the model's ability to predict hydrologic variables [59]. It seemed that the original SWAT received fewer benefits from multiple constraints owing to its limited model structure. However, the greater accuracy of baseflow patterns in RSWAT may be due to a combination of the improved model structure and the use of RS-ET as an additional constraint during model calibration.
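The flow duration curve comparison can be sketched in Python as follows. The exceedance probabilities use the Weibull plotting position, rank / (n + 1), which is an assumption because the text does not state which formula was used; the function names are hypothetical, and the 80% threshold mirrors the low-flow range discussed above.

```python
def flow_duration_curve(flows):
    """Return (exceedance_percent, flow) pairs ordered from the highest to the lowest flow."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    # Weibull plotting position: the flow ranked m-th is equaled or exceeded
    # about 100 * m / (n + 1) percent of the time.
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def low_flow_segment(fdc, threshold=80.0):
    """Keep the part of the curve exceeded more than `threshold` percent of the time."""
    return [(p, q) for p, q in fdc if p > threshold]


if __name__ == "__main__":
    daily_flow = [5.0, 1.0, 3.0, 9.0, 2.0, 7.0, 4.0, 8.0, 6.0]  # hypothetical values
    fdc = flow_duration_curve(daily_flow)
    print(fdc[0], fdc[-1])
    print(low_flow_segment(fdc))
```

Building one curve each for the observations, SWAT, and RSWAT and comparing the segments above the threshold isolates the baseflow-dominated part of the record.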
[61] had higher KGEs; this may be because parameters were adjusted for individual subwatersheds and a greater number of parameters affecting ET were considered [62]. However, the size of our study site was too small to conduct a subwatershed-level calibration. Becker et al. [60] conducted model calibration at a scale smaller than the subwatershed scale, finding that the average KGE (0.4) was relatively low; such assessments may fail to show improved model predictions. As per Becker et al. [60], model calibration at the finest spatial level (i.e., HRU) may be unsuitable when RS-ET is utilized, as the size and configuration of the HRU is extremely   [61] assessed subwatershed-level ET predictions using four different model configurations, finding that the KGEs ranged from 0.35 to 0.8. Compared to this study, Rajib et al. [61] had higher KGEs; this may be because parameters were adjusted for individual subwatersheds and a greater number of parameters affecting ET were considered [62]. However, the size of our study site was too small to conduct a subwatershed-level calibration. Becker et al. [60] conducted model calibration at a scale smaller than the subwatershed scale, finding that the average KGE (0.4) was relatively low; such assessments may fail to show improved model predictions. As per Becker et al. [60], model calibration at the finest spatial level (i.e., HRU) may be unsuitable when RS-ET is utilized, as the size and configuration of the HRU is extremely random for comparisons with RS-ET [60]. Conversion of HRU ET results to a grid was found as a promising way to make comparison between SWAT results and grid-format ET, but a grid size should be carefully selected [69]. The approach adopted in this study is considered reasonable to address our aim; however, the inclusion of parameters associated with ET is recommended for future studies.

ET Predictions at the Subwatershed Level
Overall, subwatershed-level metrics quantified model improvements better than watershed-level metrics. Spatial pattern evaluation using remotely sensed data has been reported as a promising means of revealing improved model performance [70]. We recommend that future studies adopting remotely sensed data use metrics at finer spatial scales to discern improvements in model output effectively.
The availability of RS-ET has advanced hydrologic modeling approaches. Model calibration against RS-ET aids the prediction of future agricultural water demand through an enhanced representation of crop water consumption [71]. As a direct constraint, RS-ET can support studies that assess the performance of different ET calculation methods based on observed streamflow [72,73]. A previous study observed enhanced model performance through the inclusion of RS-ET in multi-objective model calibration [74]. Wambura et al. [13] used RS-ET as an additional constraint alongside the primary constraint (streamflow), concluding that the addition of RS-ET reduced equifinality and predictive uncertainty. Parajuli et al. [75] demonstrated that model calibration based on RS-ET may be used to characterize hydrologic cycles in regions with limited meteorological data. Spatial calibration informed by RS-ET can increase overall model prediction accuracy [61]. The results of this study also demonstrate the value of RS-ET in assessing improved model structures, suggesting further potential for RS-ET to advance hydrologic modeling.
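The subwatershed-level comparison discussed above can be illustrated with a minimal Python sketch. It assumes the original 2009 formulation of KGE (the variant actually used is the one defined in Section 2.5 and may differ). The subwatershed IDs, the synthetic daily ET series, and the names `kge`, `kge_by_subwatershed`, `model_a`, and `model_b` are hypothetical and are not data or code from this study.

```python
import numpy as np


def kge(sim, obs):
    """Kling-Gupta Efficiency, assuming the 2009 form:
    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def kge_by_subwatershed(sim_by_sub, obs_by_sub):
    """Return {subwatershed_id: KGE} for series keyed by subwatershed."""
    return {sub: kge(sim_by_sub[sub], obs_by_sub[sub]) for sub in obs_by_sub}


# Synthetic daily ET (mm/day) for three hypothetical subwatersheds.
rng = np.random.default_rng(0)
days = np.arange(365)
rs_et = {
    sub: 2.5 + 2.0 * np.sin(2 * np.pi * (days - 100) / 365)
    + rng.normal(0.0, 0.3, days.size)
    for sub in (1, 2, 3)
}
# Two hypothetical model versions: model_b tracks the reference more closely.
model_a = {s: 0.80 * et + rng.normal(0.0, 0.6, days.size) for s, et in rs_et.items()}
model_b = {s: 0.95 * et + rng.normal(0.0, 0.3, days.size) for s, et in rs_et.items()}

kge_a = kge_by_subwatershed(model_a, rs_et)
kge_b = kge_by_subwatershed(model_b, rs_et)
mean_a = float(np.mean(list(kge_a.values())))
mean_b = float(np.mean(list(kge_b.values())))
print(f"mean KGE, model_a: {mean_a:.2f}; model_b: {mean_b:.2f}")
```

Computing the metric separately for each subwatershed and then averaging keeps local differences visible that a single watershed-average series could mask, which is the motivation for the finer-scale comparison.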
To better substantiate these findings, various improved model structures need to be tested with remotely sensed data in landscapes with different topographic and climatic conditions. As stated in the Introduction, ET is a key component of the hydrologic cycle. The use of RS-ET in hydrologic modeling can better characterize hydrologic dynamics in various landscapes. Alongside RS-ET, other remotely sensed hydrologic data, including soil moisture [76], are becoming increasingly available. These new data sources provide opportunities for hydrologic modelers to better understand hydrologic processes, establish spatially explicit parameters, assess model performance, and improve the capacity of hydrologic modeling tools for water resource management. Efforts to integrate remotely sensed data into hydrologic models will increase the credibility of modeled outputs in operational applications.

Conclusions
We employed RS-ET to evaluate the predictive capability of a modified SWAT (i.e., RSWAT) that contains an improved representation of soil moisture dynamics. Streamflow and watershed-level RS-ET were used to calibrate SWAT and RSWAT. Simulations with the best performance metrics from the two models were compared at the watershed and subwatershed levels. The comparisons were made to determine which model best predicted streamflow and watershed-level ET, and to understand how well the two models depict subwatershed-level ET. For subwatershed-level comparisons, RS-ET was partitioned by subwatershed boundaries and then compared with the subwatershed-level simulated ET. Three metrics (NSE, P-bias, and KGE) were used to assess model performance for streamflow, and one metric (KGE) for ET. There were two key findings. First, SWAT and RSWAT produced similar streamflow and ET at the watershed level, with similar performance metrics. KGE, NSE, and P-bias for daily streamflow were 0.47, 0.29, and 1.7% for SWAT and 0.49, 0.37, and 15.9% for RSWAT, respectively. SWAT and RSWAT had KGE values of 0.48 and 0.52 for daily ET, respectively. Second, differences between SWAT and RSWAT were more evident at the subwatershed level. RSWAT demonstrated increased prediction accuracy in most subwatersheds relative to SWAT, with a greater average KGE (0.53 for RSWAT versus 0.47 for SWAT). Previous studies that evaluated model improvements using observations at the watershed outlet often failed to show significant improvements in model predictions from upgraded model structures, mainly owing to limited observational data [28,29]. Our study overcame this limitation by including RS-ET, which made differences in the performance metric between the original and modified SWAT more apparent.
These findings demonstrate that the use of RS-ET as a further model constraint improves the ability to discern differences in prediction quality attributable to model structure improvements. Subwatershed-level metric comparisons reinforce the value of RS-ET data in informing the calibration process and discerning differences in model structure performance. Our study therefore highlights remotely sensed data as a practical means of supporting hydrologic models in generating results that approximate realistic conditions.
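For readers who wish to reproduce the streamflow metrics summarized in these conclusions, the following Python sketch computes NSE and P-bias. It is an illustration under stated assumptions, not the code used in this study: the function names `nse` and `pbias` and the example series are hypothetical, and the P-bias sign convention (positive values indicating underestimation) is assumed and should be checked against the definition in Section 2.5.

```python
import numpy as np


def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations about their mean."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def pbias(sim, obs):
    """Percent bias in %. Assumed convention: positive = model underestimates."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return 100.0 * np.sum(obs - sim) / np.sum(obs)


# Hypothetical daily streamflow (m^3/s); the simulation is 10% too low throughout.
observed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
simulated = 0.9 * observed

print(f"NSE = {nse(simulated, observed):.3f}")
print(f"P-bias = {pbias(simulated, observed):.1f}%")
```

NSE is sensitive to errors at high flows because residuals are squared, whereas P-bias reflects only the overall volume error, which is one reason the two metrics can rank models differently.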