Multi-Variable SWAT Model Calibration Using Satellite-Based Evapotranspiration Data and Streamflow

In this study, monthly streamflow and satellite-based actual evapotranspiration data (AET) were used to evaluate the Soil and Water Assessment Tool (SWAT) model for the calibration of an experimental sub-basin with mixed land-use characteristics in Athens, Greece. Three calibration scenarios were performed using streamflow (i.e., single variable), AET (i.e., single variable), and streamflow–AET data together (i.e., multi-variable) to provide insights into how different calibration scenarios affect the hydrological processes of a catchment with complex land use characteristics. The actual evapotranspiration data were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS). The calibration was achieved with the use of the SUFI-2 algorithm in the SWAT-CUP program. The results suggested that the single variable calibrations showed moderately better performance than the multi-variable calibration. However, the multi-variable calibration scenario displayed acceptable outcomes for both streamflow and actual evapotranspiration and indicated reasonably good streamflow estimations (NSE = 0.70; R2 = 0.86; PBIAS = 6.1%). The model under-predicted AET in all calibration scenarios during the dry season compared to MODIS satellite-based AET. Overall, this study demonstrated that satellite-based AET data, together with streamflow data, can enhance model performance and be a good choice for watersheds lacking sufficient spatial data and observations.


Introduction
Hydrological models have been extensively utilized to estimate the consequences of climate variability, land management practices, and policy directions at various temporal and spatial scales [1]. Model development requires a good comprehension of the watershed characteristics to achieve accurate model simulation [2,3]. Nonetheless, most basins are ungauged or inadequately gauged [4]. The absence of adequate observations affects the calibration process and further model improvement [5].
Hydrological model calibration is typically achieved with flow data at the outlet of the basin by choosing the most suitable values for input parameters and comparing simulated outcomes with observed data [6]. However, calibration focused on one variable only may aggregate all watershed processes together and intensify the occurrence of the equifinality problem (i.e., multiple parameter sets can reproduce a similar output) [7][8][9]. Using multiple variables (e.g., streamflow, evapotranspiration, soil moisture) in the calibration process attempts to overcome equifinality across multiple parameter sets [10][11][12].
In addition, unknown procedures to the modeler, such as unidentified discharges, agricultural activities, and dumping of construction materials, interfere with the natural behavior of the system and increase the uncertainty in streamflow calibration [13,14]. Therefore, incorporating remote sensing data in model calibration can increase model accuracy, capture the spatial and temporal heterogeneity of hydrological processes, and be a promising   (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11) in the sub-basins used for actual evapotranspiration calibration.
The sub-basin is an urban/peri-urban area. The dominant land cover types ar dential areas (34.1%), shrubland (15.9%), and agriculture (12.4%) (Figure 2b) [28]. A1 displays the land use categories of the study area at catchment level and Tab displays the land use categories of the study area at sub-basin level. The major so Cambisols, Regosols, Leptosols, and Luvisols [29]. These formations are generally h clay and sand contents with good soil permeability (Figure 2c).  The sub-basin is an urban/peri-urban area. The dominant land cover types are residential areas (34.1%), shrubland (15.9%), and agriculture (12.4%) (Figure 2b) [28]. Table  A1 displays the land use categories of the study area at catchment level and Table A2 displays the land use categories of the study area at sub-basin level. The major soils are Cambisols, Regosols, Leptosols, and Luvisols [29]. These formations are generally high in clay and sand contents with good soil permeability (Figure 2c). The sub-basin is an urban/peri-urban area. The dominant land cover types are residential areas (34.1%), shrubland (15.9%), and agriculture (12.4%) (Figure 2b) [28]. Table A1 displays the land use categories of the study area at catchment level and Table A2 displays the land use categories of the study area at sub-basin level. The major soils are Cambisols, Regosols, Leptosols, and Luvisols [29]. These formations are generally high in clay and sand contents with good soil permeability (Figure 2c).

Data Sources
The input data include a digital elevation model (DEM) at 30 m resolution from the website of the US Geological Survey [30], a land use map from the 100 m 2018 Corine Land Cover map [28], a soil map from the Food and Agriculture Organization Digital Soil Map of the World (30 arcseconds resolution) [31], and meteorological data from the National Observatory of Athens [27]. Daily rainfall data were obtained from 2015 to 2019. The daily measured streamflow data at the basin outlet (Monastiri gauging station) were available from 2018 to 2019 and were retrieved from Open Hydrosystem Information Network [32]. The actual evapotranspiration (AET) data were collected from the Moderate Resolution Imaging Spectroradiometer Global Evaporation [33] with a pixel resolution of 500 × 500 m [34].

The SWAT Hydrological Model
The SWAT (Soil and Water Assessment Tool) program is an open-source, physically based, continuous-time river basin model developed to estimate the influence of management practices on discharge, sediments, and agriculture in large complex basins [6,35]. The model runs on a daily time step, and its main variables are hydrology, weather, soil, land use, sediments, nutrients, bacteria, and pathogens.
In SWAT, the basin is divided into sub-basins, then into hydrologic response units (HRUs) with unique land use, soil, and slope characteristics [36]. The water balance is computed separately for each hydrologic response unit [37]. The water balance equation is estimated using the following (Equation (1)): where SW t is the soil water content (mm), SW o is the soil water content on day i in the previous period (mm), t is the time step (days), R day indicates the amount of precipitation on day i (mm), Q sur f represents the surface streamflow on day i (mm), E a indicates the AET on day i (mm), W seep is the percolation and bypass flow on day i (mm), and Q gw represents the return flow on day i (mm).

Model Setup
The QGIS interface of the SWAT model was utilized for model configuration [38]. The watershed was delineated into 25 sub-basins and 386 hydrological response units (HRUs). A 10% threshold was used for land use, soil, and slope to limit the influence of minor soil and land use types for each sub-basin. The Corine Land Cover land use classes [28] were converted to the SWAT land use classes [6]. The model was simulated from 2015 to 2019 and run on a daily time step. Two years (1 January 2015-31 December 2016) were set as a warm-up period. The potential evapotranspiration was calculated using the Penman-Monteith method, the surface runoff was estimated using the curve number method [39], and the channel routing was computed using the variable storage coefficient method [40].

Model Calibration and Sensitivity Analysis
The model was calibrated using the Sequential Uncertainty Fitting Algorithm (SUFI-2) in the SWAT-Calibration and Uncertainty Program (SWAT-CUP) [41]. In SUFI-2, the calibrating parameters are set according to literature and sensitivity analysis, and then the parameters sets are generated using Latin hypercube sampling (LHS) [14]. The significance of each parameter is defined with a t-test. Parameters with large t-stat and small p-value (p-value < 0.03) are identified as sensitive parameters [42].
Three calibration scenarios were conducted using monthly time series of streamflow from one station at the outlet of the study site and the MODIS satellite-based actual evapotranspiration (AET) from eleven sub-basins based on data availability [33]. The daily streamflow was converted to monthly for comparison reasons. The monthly average values were calculated by a resample function, and the missing values were estimated by an interpolation function using the linear method (Python pandas library). The actual evapotranspiration data from MODIS were in a geotiff format (raster). In addition, to compare the MODIS satellite-based AET (pixel values) to the SWAT simulated AET, an area-weighted averaging approach in QGIS (zonal statistics) was performed to create the aggregated monthly values for each sub-basin [23,43].
The scenarios include: (i) streamflow calibration (i.e., single variable), (ii) AET calibration (i.e., single variable), and (iii) both streamflow and AET calibration (i.e., multi-variable). Based on data availability, streamflow was calibrated from 2018 to 2019, and evapotranspiration was calibrated from 2017 to 2019. All the available data were used for model calibration to represent the wet and dry conditions properly (2017: 487 mm, 2018: 675 mm, 2019: 765 mm). The calibration process used 20 parameters linked to streamflow and evapotranspiration (Table 1), and their sensitivities were estimated. The original value ranges of the parameters and their sensitivities for each calibration scenario are displayed in Table 2. In the single variable calibrations, the two variables (i.e., streamflow and actual evapotranspiration) were calibrated separately, and the performance of the second variable was evaluated. In the multi-variable calibration the two variables were calibrated together using a multi-variable objective function and assigning equal weights to each variable [16,44]. The Nash-Sutcliffe model efficiency (NS) was used as an objective function, and 900 simulations per iteration were performed and up to three iterations. Table 1. Calibrated parameters. The method "r" (relative) indicates multiplying the current parameter value by a given value, the method "v" (replace) indicates replacing the current parameter value, and the method "a" (absolute) indicates adding a given value to the current parameter [14].
Hydrology 2022, 9, 112 where Q obs is the measured streamflow, Q sim is the simulated streamflow on the day i, Q obs is the mean of measured streamflow, and Q sim is the mean of simulated streamflow. R 2 varies from 0 to 1, where 0 indicates no correlation and 1 means perfect correlation and less error variance. NSE can vary from −∞ to 1, where values ≤ 0 show that the model is unreliable and values closer to 1 indicate a perfect fit between simulated and measured data. The best PBIAS value is 0. Positive values show that the model results are underestimated, and negative values show that the model results are overestimated. Model performance can be assessed as "satisfactory" for a monthly time step if R 2 > 0.60, NSE > 0.50, and PBIAS ≤ ±15% for watershed-scale models [48].

Sensitivity Analysis
The most sensitive parameters for each calibration scenario are shown in Table 2. Sensitive parameters are identified by p-value less than 0.03. Figure A1 shows the relative changes of the five most sensitive parameters for each calibration scenario versus objective function. Most variations of NSE values were found to be in the calibration with streamflow rather than single evapotranspiration and multi-variable calibrations.
In the streamflow calibration, the parameters with the highest sensitivity were lateral flow travel time (LAT_TTIME), moist bulk density of the soil layer (SOL_BD), groundwater delay time (GW_DELAY), saturated hydraulic conductivity (SOL_K), and average slope steepness (HRU_SLP). These parameters were connected to groundwater flow, runoff generation, and channel routing. In the AET calibration, the parameters with the highest sensitivity were soil evaporation compensation coefficient (ESCO), saturated hydraulic conductivity (SOL_K), moist bulk density of the soil layer (SOL_BD), average slope steepness (HRU_SLP), and plant uptake compensation coefficient (EPCO), which are mostly related to soil properties. In the multi-variable calibration, the most sensitive parameters were lateral flow travel time (LAT_TTIME), groundwater delay time (GW_DELAY), average slope steepness (HRU_SLP), moist bulk density of the soil layer (SOL_BD), saturated hydraulic conductivity (SOL_K), and soil evaporation compensation coefficient (ESCO). These parameters referred to both groundwater flow and soil properties.

Model Performance Evaluation
The model performance was assessed with criteria recommended by Moriasi et al. [48]. In all three calibration scenarios, the deviations from the observed values start to increase during the dry season and decline during the wet season. Table 3 displays the model performance for all three scenarios for all the sub-basins. Figure 3 presents the measured and simulated hydrographs at the outlet of the catchment (Monastiri gauging station) for all three scenarios. Finally, Figure 4 shows the measured and simulated AET for the entire study area at a catchment scale.

Variable
Station/Sub-Basin

Streamflow Calibration
The streamflow calibration showed good results in general for streamflow (NSE = 0.71; R 2 = 0.84; PBIAS = 5.6%) ( Table 3). The results regarding evapotranspiration were unsatisfactory for NSE (NSE within −0.17 to 0.57), except for sub-basin 8 (NSE = 0.57), and satisfactory for R 2 and PBIAS (R 2 > 0.58; PBIAS within −12.7% to 15.8%). Figure 3 designates a satisfying match between measured and simulated streamflow except for low flows during the dry season. However, the temporal dynamics of the hydrograph were generated correctly. Figure 4 indicates differences between observed and simulated AET. Streamflow calibration underestimated evapotranspiration in the wet season and overestimated evapotranspiration at the beginning of the dry season.

Actual Evapotranspiration Calibration
The AET calibration presented unsatisfactory performance for NSE for sub-basins 2-7 and satisfactory performance for sub-basins 1, 8, 9, 10, and 11 (NSE within −0.14 to 0.58) ( Table 3). In respect of R 2 and PBIAS, the results were satisfactory for all the sub-basins

Streamflow Calibration
The streamflow calibration showed good results in general for streamflow (NSE = 0.71; R 2 = 0.84; PBIAS = 5.6%) ( Table 3). The results regarding evapotranspiration were unsatisfactory for NSE (NSE within −0.17 to 0.57), except for sub-basin 8 (NSE = 0.57), and satisfactory for R 2 and PBIAS (R 2 > 0.58; PBIAS within −12.7% to 15.8%). Figure 3 designates a satisfying match between measured and simulated streamflow except for low flows during the dry season. However, the temporal dynamics of the hydrograph were generated correctly. Figure 4 indicates differences between observed and simulated AET. Streamflow calibration underestimated evapotranspiration in the wet season and overestimated evapotranspiration at the beginning of the dry season.

Actual Evapotranspiration Calibration
The AET calibration presented unsatisfactory performance for NSE for sub-basins 2-7 and satisfactory performance for sub-basins 1, 8, 9, 10, and 11 (NSE within −0.14 to 0.58) ( Table 3). In respect of R 2 and PBIAS, the results were satisfactory for all the subbasins (R 2 > 0.69; PBIAS within −5.3% to 15%). The performance for streamflow was unsatisfactory (NSE = 0.38; R 2 = 0.84; PBIAS = 8.3%). In particular, the observed and simulated AET values did not match well. Nevertheless, they showed a well-matched seasonal variation of evapotranspiration ( Figure 4). AET calibration underestimated low flows and overestimated high flows for the simulation period ( Figure 3). Performance statistics were generally better for streamflow than evapotranspiration in single-variable calibration scenarios.

Multi-Variable Calibration
The multi-variable scenario, using streamflow and MODIS satellite-based AET, showed satisfactory performance for streamflow (NSE = 0.70; R 2 = 0.86; PBIAS = 6.1%) and unsatisfactory (sub-basins 1-7) to satisfactory (sub-basins 8-11) performance for evapotranspiration (NSE within −0.10 to 0.69; R 2 > 0.55; PBIAS within −8.4% to 21.7%) ( Table 3). Simulated and observed streamflow values are much better than those for AET calibration and similar to streamflow calibration (Figure 3). Results for AET are related to those of AET calibration, showing underestimation of the simulated values ( Figure 4). Compared to single variable calibration scenarios, multi-variable calibration displayed similar NSE values obtained from single-variable calibration, and R 2 showed good performances (>0.75) for both variables (Table 3).

Major Water Balance Components
The major water balance components (i.e., actual evapotranspiration, water yield, and precipitation) are displayed in Figure 5. In general, average annual precipitation (643 mm) was slightly greater than combined water yield (WYLD) and actual evapotranspiration (AET) values. Actual evapotranspiration contributed a large amount of water loss from the watershed, about 60% in all scenarios. The total water yield of the multi-variable calibration is higher than the other two modeling scenarios. In particular, the total water yield was estimated to be 171 mm for flow calibration, 151 mm for AET calibration, and 186 mm for multi-variable calibration.
Simulated and observed streamflow values are much better than those for AET calibration and similar to streamflow calibration (Figure 3). Results for AET are related to those of AET calibration, showing underestimation of the simulated values (Figure 4). Compared to single variable calibration scenarios, multi-variable calibration displayed similar NSE values obtained from single-variable calibration, and R 2 showed good performances (>0.75) for both variables (Table 3).

Major Water Balance Components
The major water balance components (i.e., actual evapotranspiration, water yield, and precipitation) are displayed in Figure 5. In general, average annual precipitation (643 mm) was slightly greater than combined water yield (WYLD) and actual evapotranspiration (AET) values. Actual evapotranspiration contributed a large amount of water loss from the watershed, about 60% in all scenarios. The total water yield of the multi-variable calibration is higher than the other two modeling scenarios. In particular, the total water yield was estimated to be 171 mm for flow calibration, 151 mm for AET calibration, and 186 mm for multi-variable calibration.

Discussion
In this study, the SWAT hydrological model was used to interpret the behaviour of an urban/sub-urban environment and analyse its underlying mechanisms. The study area is a typical Mediterranean catchment prone to natural hazards such as floods, forest fires

Discussion
In this study, the SWAT hydrological model was used to interpret the behaviour of an urban/sub-urban environment and analyse its underlying mechanisms. The study area is a typical Mediterranean catchment prone to natural hazards such as floods, forest fires and their combined impact. Therefore, the mechanisms governing surface runoff and the interactions between the hydrological components should be analysed in depth for these vulnerable areas. The main objective was to investigate which parameters are more sensitive in a mixed land-use basin and to propose a multi-variable calibration procedure using both streamflow and satellite-based AET data for SWAT modelling.
The sensitivity analysis results showed that the parameters with the highest sensitivity for streamflow are connected to groundwater flow, runoff generation, and channel routing, and for actual evapotranspiration, they are linked to soil properties, respectively (Table 2, Figure A1). The differences in the sensitivity of the parameters are due to different data used in the calibration process. Similar outcomes were obtained by Sirisena et al. [16] and Moriasi et al. [49]. Sirisena et al. [16] concluded that the most sensitive parameters for evapotranspiration were connected to soil properties. Moriasi et al. [49] pointed out that the high sensitivity of the soil parameters to AET indicated a connection between actual evaporation and soil water.
Both variables present a slightly better performance in the single calibrations with streamflow and evapotranspiration than in the multi-variable calibration (Table 3). Nonetheless, the multi-variable calibration produced satisfactory results for both streamflow and AET and showed reasonably good streamflow estimations (NSE = 0.70). AET showed the best performance for forests (e.g., sub-basins 1, 8,9,10,11). This indicates that the MODIS data probably show better performance at simulating forests and semi-natural areas than in sub-urban areas, including more complex management systems. In addition, evapotranspiration algorithms are characterized by resolution issues, misclassification of land use, and data generation uncertainties [50]. Therefore, these algorithms may not correctly capture the land use changes (especially in sub-urban areas) and the available soil moisture on the ground.
In most sub-basins, MODIS satellite-based AET and SWAT simulated AET show that seasonal patterns match well, although the SWAT model under-predicted AET compared with the MODIS satellite-based AET during the dry season ( Figure 4, Table 3). These results are consistent with those of other studies [51,52]. A study in Morocco [51] and a study in Iran [52] suggested that the multi-variable calibration can produce good results for both variables. Nevertheless, the single variable calibrations showed better performance. The differences for all three scenarios intensify during the dry season and decline during the wet season (Figures 3 and 4). The underestimation of AET and the low baseflow, especially during the dry season (Figure 3), could suggest unknown water contributions in the study area. These deviations are probably also connected to soil replenishment and the crops' high water demand during the dry season. Furthermore, it is worth mentioning that MODIS satellite-based AET could include errors and underestimations or overestimations of the "true" AET, altering the model's water balance [53]. For instance, the higher MODIS satellite-based AET values in the AET calibration scenario led to lower water yield values than the streamflow calibration scenario ( Figure 5). Satellite-based evapotranspiration datasets use sensor-derived parameters (e.g., surface heat flux, latent heat flux) that may have several uncertainties. Model misrepresentations, errors in the inputs, and spatial and temporal scaling decrease the efficiency of the algorithms [54].
For both single variable calibrations, the simulated second variable (i.e., AET for streamflow calibration and streamflow for AET calibration) is not well represented. The unsatisfactory performance of the second variable in the single variable calibrations indicates the poor representation of the catchment's water balance. Many studies support that incorporating satellite data in the hydrological model calibration improved the estimation of water balance components regardless of the model performance improvement. A study in China [55] calibrated the SWAT model with GLEAM AET data and streamflow. Although streamflow only calibration produced reliable results, this approach grouped the hydrological process. Furthermore, a study by Immerzeel and Droogers [13] pointed out that incorporating AET data in hydrological model calibration reduces equifinality obtained from traditional streamflow only calibration. The water balance is best reproduced when both streamflow and AET are used in the calibration process [52]. Several studies [49,56] reported that satellite AET data could be used to constrain hydrological parameters that are highly sensitive to evapotranspiration.
In general, the calibration process is challenging because of the uncertainties that exist due to model simplification, processes that are not accounted by the model, and processes that are unknown to the modeler [14]. In this study, the main sources of uncertainty are connected to inaccuracies (i) in the quality of input data (climate, soil, and land cover resolution), (ii) in the model set up (aggregation and interpolation methods), (iii) in the choice of objective data and parameterization, (iv) observed data, and (v) processes unknown to the modeler which interfere with the natural system [42]. Observational errors in the precipitation data, MODIS actual evapotranspiration, and discharge, as well as the effects of elevation and topography, increase bias and generate variability. For example, errors in streamflow measurements can vary from 6% to 19% under different combinations of channel types and measurement techniques [57]. In addition, poor resolution and data generation uncertainties in evapotranspiration algorithms can induce biases between AET simulated values and satellite-based AET [17,50,58]. For instance, a study in Ethiopia [50] estimated the major water balance components of the Upper Blue Nile basin using the JGrass-NewAge hydrological system and remote sensing data (GLEAM and MODIS AET). This study concluded that the satellite-based data introduce bias in estimating the water budget. Another study by [58] conducted a multi-objective validation for West Africa river basins using remote sensing data. The results showed that MODIS satellite-based AET underestimated SWAT-simulated AET in arid areas. Finally, Dile et al. [17] evaluated AET outputs derived from AVHRR and MOD16 AET datasets using outputs from a SWAT model for Ethiopia. This study suggested that datasets did not agree well with the precipitation in regions with a bimodal precipitation pattern. Therefore, careful consideration should be given to analyzing data from satellite-based products. Further information is necessary to estimate the uncertainty in model outputs and improve the calibration results at HRU level.

Conclusions
This study used monthly streamflow and MODIS satellite-based AET data to calibrate the SWAT model. Three calibration scenarios were conducted with streamflow, AET, and streamflow-AET data to evaluate the simulated outputs. The sensitivity analysis showed that the most sensitive parameters for streamflow are related to groundwater flow, runoff generation, and channel routing, and for actual evapotranspiration, they are all connected to soil properties. The model performance results indicated that the single variable calibrations showed satisfactory performance only for the first variable that was simulated. The multi-variable calibration showed satisfactory performance for both streamflow and AET. The SWAT model generally under-predicted AET in all scenarios compared to MODIS satellite-based AET.
This research showed that combining streamflow and MODIS satellite-based AET data in the calibration process can improve model performance regarding streamflow and water balance and contribute to understanding the hydrological processes in a mixed land-use catchment. Furthermore, the use of satellite data in model calibration, as presented in this study, can be utilized in catchments lacking measured data or in catchments with similar hydrological and geomorphological characteristics. Future work should incorporate discharge, soil moisture, and HRU level AET data in a combined objective function at a high temporal resolution.

Acknowledgments:
The authors are grateful to Nancy Sammons, Information Technology Specialist of the SWAT team for her kind help with the SWAT software. Authors gratefully acknowledge the financial support from the Hellenic Foundation for Research and Innovation (HFRI).

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A
The following tables display the land use categories of the study area at catchment (Table A1) and sub-basin level (Table A2).  Table A2. Land use categories of the study area at sub-basin level. Artificial surfaces (i.e., urban fabric, industrial, commercial and transport units), agricultural areas (i.e., arable land, pastures and heterogeneous agricultural areas) and forests and semi natural areas (i.e., forests, scrub and herbaceous vegetation associations). The following figure shows the relative changes of the five most sensitive parameters for each calibration scenario versus objective function ( Figure A1). Figure A1. Calibrated values (x-axis) versus objective function NSE value range (y-axis); blue dots symbolize streamflow calibration, green dots symbolize AET calibration, and grey dots symbolize multi-variable calibration. The parameters (a-e) are the most sensitive parameters for streamflow calibration, the parameters (f-g) are the most sensitive parameters for AET calibration, and the parameters (k-o) are the most sensitive parameters for multi-variable calibration. Figure A1. Calibrated values (x-axis) versus objective function NSE value range (y-axis); blue dots symbolize streamflow calibration, green dots symbolize AET calibration, and grey dots symbolize multi-variable calibration. The parameters (a-e) are the most sensitive parameters for streamflow calibration, the parameters (f-j) are the most sensitive parameters for AET calibration, and the parameters (k-o) are the most sensitive parameters for multi-variable calibration.