Calibration of a Distributed Hydrological Model in a Data-Scarce Basin Based on GLEAM Datasets

Abstract: The calibration of hydrological models is often complex in regions with scarce data, and generally uses only site-based streamflow data. However, this approach yields highly generalised values for all model parameters and hydrological processes. It is therefore necessary to obtain more spatially heterogeneous observation data (e.g., satellite-based evapotranspiration (ET)) to calibrate such hydrological models. Here, Soil and Water Assessment Tool (SWAT) models were built to evaluate the advantages of using ET data derived from the Global Land surface Evaporation Amsterdam Methodology (GLEAM) to calibrate the models for the Bayinhe River basin in northwest China, which is a typical data-scarce basin. The results revealed the following: (1) A great effort was required to calibrate the SWAT models for the study area to obtain an improved model performance. (2) The SWAT model performance for simulating the streamflow and water balance was reliable when calibrated with streamflow only, but this method of calibration grouped the hydrological processes together and caused an equifinality issue. (3) The combination of the streamflow and GLEAM-based ET data for calibrating the SWAT model improved the model performance for simulating the streamflow and water balance. However, the equifinality issue remained at the hydrologic response unit (HRU) level.


Introduction
Distributed hydrological models are important tools for revealing the hydrological processes that occur in a changing environment [1]. However, a hydrological model includes many complex parameters, so various inputs are required to simulate hydrological processes accurately [2]. Less effort is needed to calibrate a hydrological model if the input data are reliable and comprehensive [3]; in data-scarce regions, however, calibration is complex and requires considerable effort. Recently, auto-calibration software and methods for hydrological models have been developed, for example, Parameter ESTimation (PEST) [4], the Shuffled Complex Evolution algorithm (SCE-UA) [5], and SWAT Calibration Uncertainty Programs (SWAT-CUP) [6]. These tools are convenient for model calibration; however, an auto-calibration typically produces more than one set of optimal parameters [7,8]. Moreover, although some of the parameters are reasonable, others are not, and it is difficult to evaluate whether the calibrated parameters are correct. The main output variables of a hydrological model are streamflow, evapotranspiration (ET), soil water content, surface runoff, groundwater flow, and lateral flow [7,9]. However, generally only site-based streamflow data are used to calibrate and validate a hydrological model because the other variables are difficult to observe [9]. Furthermore, in some watersheds there are only a few, unevenly distributed hydrological stations. As a result, an auto-calibrated model using site-based

Study Area
The Bayinhe River basin is an archetypical alpine inland river basin located at the north-eastern edge of the Qaidam basin. The river itself originates from the Zongwulong mountain range on the southern slopes of Mount Qilian (average elevation 4200 m). The river exits the mountains into the Zelinggou basin and then travels through the basin before flowing into Delingha City [21]. Finally, the river splits and flows east into Keluke Lake and west into Gahai Lake. According to statistics from the Delingha meteorological station, the mean annual precipitation in the Bayinhe River basin between 1999 and 2019 was 210 mm, and the mean annual temperature was 5.81 °C. The Bayinhe River basin is a typical ecologically fragile, arid and semi-arid plateau. The region's primary land cover types are desert and grassland with primarily light frigid calcic soil and dark frigid calcic soil. This study takes the upper and middle reaches of the Bayinhe River basin (above the Delingha hydrological station) as its study area (Figure 1).

Basic Data of the SWAT Model
Setting up the surface parameters for a SWAT model requires soil data, land use data, topographic data, and a drainage basin outlet point. Soil classification data (Figure 2a) were taken from the 1:1,000,000 soil type dataset of Qinghai Province, while the relevant soil hydrology data were taken from the Soils of Qinghai Province. Land use data (Figure 2b) were cropped from the 1:100,000 China land use dataset for 2015. To match the land use types in the SWAT hydrological model database, the land use types were reclassified as farmland, forest, grassland, water bodies, residential land, and bare land. Terrain data (Figure 1) had a 30 × 30 m resolution (ASTER GDEM). In addition, the Delingha hydrological station was selected as the basin's outlet point. Figure 2c delineates the sub-basins that comprise the middle and upper reaches of the Bayinhe River basin. This study divided the region into a total of 17 sub-basins.

The GLEAM includes a series of algorithms to calculate the components of surface ET based on satellite data for water and heat (Table 1). The algorithms calculate the potential ET based on the Priestley-Taylor method [22]. There are four modules in the GLEAM to calculate the different proportions of the ET process: the interception model, soil water module, stress module, and Priestley and Taylor model [14]. The GLEAM provides daily actual ET data at a 0.25° × 0.25° spatial resolution. Figure 3 illustrates the coverage of the GLEAM dataset for the study area.
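Because the GLEAM product is daily while the calibration reported later in this study compares monthly ET, the daily values have to be aggregated before they can be compared with model output. The following is a minimal sketch of such an aggregation, not the authors' processing chain. It assumes that the daily ET of one grid cell is available as (date, ET in mm/day) pairs; the function name and input layout are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def monthly_et_totals(
    daily: Iterable[Tuple[date, float]],
) -> Dict[Tuple[int, int], float]:
    """Sum daily ET (assumed to be in mm/day) into monthly totals (mm).

    Keys of the returned dictionary are (year, month) tuples.
    """
    totals: Dict[Tuple[int, int], float] = defaultdict(float)
    for day, et_mm in daily:
        totals[(day.year, day.month)] += et_mm
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    sample = [
        (date(2010, 1, 1), 0.5),
        (date(2010, 1, 2), 0.25),
        (date(2010, 2, 1), 1.0),
    ]
    for (year, month), total in sorted(monthly_et_totals(sample).items()):
        print(f"{year}-{month:02d}: {total} mm")
```

Where one 0.25° grid cell covers several sub-basins, the same monthly series would be compared against each of them.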

SWAT Model
SWAT is a semi-distributed hydrologic model; it first uses a conceptual model to estimate precipitation, streamflow, and sediment for each individual hydrologic response unit (HRU). After these calculations are complete, SWAT calculates the convergence of the river basin channels. Finally, the flow rate and the sediment and pollutant loads for the basin outlet section are obtained [9].
The simulation of watershed hydrology by SWAT can be divided into (1) the land-surface component of the water cycle (flow and slope convergence), which controls the input of water, sediment, and nutrients to the main channel of each sub-basin, and (2) the surface water component of the water cycle (channel convergence), which determines the transport of water, sediment, and other substances from the channel network to the basin outlet [9].
The water volume calculation of the SWAT model is based on the principle of water balance and follows Equation (1) [9]:
$$SWC_t = SWC_0 + \sum_{i=1}^{t}\left(R_{day} - W_{surf} - E_t - SC_{seep} - W_{gw}\right) \qquad (1)$$
where $SWC_t$ is the soil water content (mm); $SWC_0$ is the soil water content in the previous period (mm); $t$ is the model time step; $R_{day}$ is the amount of precipitation on day $i$ (mm); $W_{surf}$ is the surface streamflow on day $i$ (mm); $E_t$ is the actual ET (mm); $SC_{seep}$ is the soil seepage on day $i$ (mm); and $W_{gw}$ is the amount of base flow (mm).
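Equation (1) can be read as a running daily bookkeeping of the soil water store. The following is a minimal sketch of that bookkeeping, assuming the standard SWAT form in which precipitation is added and surface flow, actual ET, seepage, and base flow are subtracted each day. The class and function names are hypothetical and this is not SWAT source code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DailyFluxes:
    """Water fluxes for one day, all in mm (names mirror Equation (1))."""

    r_day: float    # precipitation
    w_surf: float   # surface streamflow
    e_t: float      # actual evapotranspiration
    sc_seep: float  # seepage out of the soil profile
    w_gw: float     # base flow


def soil_water_content(swc_0: float, fluxes: Iterable[DailyFluxes]) -> float:
    """Return the soil water content (mm) after applying all daily fluxes.

    Assumed form: SWC_t = SWC_0 + sum(R_day - W_surf - E_t - SC_seep - W_gw).
    """
    swc = swc_0
    for day in fluxes:
        swc += day.r_day - day.w_surf - day.e_t - day.sc_seep - day.w_gw
    return swc


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    days = [
        DailyFluxes(r_day=5.0, w_surf=1.0, e_t=2.0, sc_seep=0.5, w_gw=0.5),
        DailyFluxes(r_day=0.0, w_surf=0.0, e_t=1.5, sc_seep=0.25, w_gw=0.25),
    ]
    print(soil_water_content(100.0, days))
```

In this illustration the first day adds a net 1.0 mm and the second removes 2.0 mm, so the store ends 1.0 mm below its starting value.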

SWAT Model Calibration Strategy
Three SWAT models were built: (1) SWAT1 simulates the water balance of the entire study area (upper and middle reaches of the Bayinhe River) and is calibrated with the stream outflow from the middle reach; (2) SWAT2U simulates the water balance of the upper reach of the Bayinhe River and is calibrated with the GLEAM-based ET data; (3) SWAT2M simulates the water balance of the middle reach of the Bayinhe River and is calibrated with the GLEAM-based ET data and the stream outflow from the middle reach. For SWAT2M, the simulated stream outflow from the upper reach was used directly as the inflow to the middle reach. The auto-calibration tool SWAT-CUP, combined with a manual calibration strategy [9], was used to calibrate the SWAT models based on observed data. The Nash-Sutcliffe efficiency (NSE), percent bias (PBIAS), and coefficient of determination (R²) were used to evaluate the performance of the three SWAT models. Figure 3 shows the calibration strategy used for SWAT2U and SWAT2M. First, the upper and middle reaches were modelled separately. Second, the satellite-based ET data derived from the GLEAM dataset were used to calibrate the SWAT model that simulates the water balance of the upper reach of the Bayinhe River. Third, the simulated stream outflow from the upper reach was used as the inflow to the middle reach. The ET data and streamflow data were then used to calibrate the SWAT model for the middle reach of the Bayinhe River.

Parameter Sensitivity
In this study, 25 model parameters related to ET and streamflow were selected and their sensitivities were calculated. The sequential uncertainty fitting (SUFI2) algorithm was combined with the global sensitivity method in the SWAT-CUP software (Swiss Federal Institute of Aquatic Science and Technology, Duebendorf, Switzerland) to evaluate parameter sensitivities. The SUFI2 algorithm in SWAT-CUP, combined with a manual calibration strategy [9], was then used to calibrate the three SWAT models based on the parameter sensitivity results (Section 3.1).

Indicators for Evaluating the SWAT Model Simulation Result
Following [23], this study uses the NSE (Equation (2)), PBIAS (Equation (3)), and R² (Equation (4)) to evaluate the performance of the SWAT model.
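The three indicators can be sketched in a few lines of code. The sketch below assumes the commonly used definitions: NSE as one minus the ratio of the squared simulation error to the variance of the observations, PBIAS as the summed difference (observed minus simulated) relative to the summed observations, and R² as the squared Pearson correlation. The sign convention of PBIAS in particular varies between references, so it is an assumption here, and the function names are hypothetical.

```python
from typing import Sequence


def _check(obs: Sequence[float], sim: Sequence[float]) -> None:
    """Reject empty or mismatched series before computing any indicator."""
    if len(obs) == 0 or len(obs) != len(sim):
        raise ValueError("obs and sim must be non-empty and of equal length")


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs-sim)^2) / sum((obs-mean_obs)^2)."""
    _check(obs, sim)
    mean_obs = _mean(obs)
    error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - error / spread


def pbias(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Percent bias: 100 * sum(obs - sim) / sum(obs).

    Assumed sign convention: negative values mean the model overestimates.
    """
    _check(obs, sim)
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Coefficient of determination as the squared Pearson correlation."""
    _check(obs, sim)
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


if __name__ == "__main__":
    # Hypothetical monthly flows, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    print(nse(observed, simulated), pbias(observed, simulated),
          r_squared(observed, simulated))
```

A simulation that is offset from the observations by a constant illustrates why the indicators are used together: R² stays at 1 because the timing is perfect, while NSE and PBIAS both expose the bias.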
In Equations (2)-(4), $S_{i}^{\mathrm{obs}}$ represents the measured flow, $S_{i}^{\mathrm{sim}}$ represents the simulated flow, and $\bar{S}^{\mathrm{obs}}$ represents the mean of the measured flow. NSE values range from negative infinity to 1; the closer the NSE is to 1, the better and more credible the simulation results are. When the NSE is close to 0.5, the simulation results are broadly close to the observed values, which means that the overall model result is credible; however, the process simulation error is large. If the NSE is far less than 0, the model is not reliable. If the PBIAS value is between −10% and 10%, the model results are good. R² ranges from 0 to 1; the closer R² is to 1, the better and more credible the simulation results are [23].
Table 2 presents the ten most sensitive parameters of the SWAT1, SWAT2U, and SWAT2M models. The higher the absolute value of the t-Stat and the lower the p-value, the more sensitive the parameter is. The ten most sensitive parameters for the SWAT1 model were CN2 (the SCS runoff curve number), CH_K2 (the effective hydraulic conductivity in main-channel alluvium), SOL_BD (the moist bulk density), CH_N2 (Manning's "n" value for the main channel), SOL_K (the saturated hydraulic conductivity), SOL_AWC (the available water capacity of the soil layer), GW_REVAP (a groundwater "revap" coefficient), GWQMN (the threshold depth of water in the shallow aquifer required for return flow to occur), SLSUBBSN (the average slope length), and SMFMN (the annual minimum melt rate for snow). The ten most sensitive parameters for the SWAT2U model were CN2, SOL_BD, SOL_K, ESCO (a soil evaporation compensation factor), SLSUBBSN, GWQMN, SMFMN, SNOCOVMN (a snow-pack temperature lag factor), SNO50COV (the fraction of the snow volume in a given area that corresponds to 50% of the snow cover), and CH_N2. The ten most sensitive parameters for the SWAT2M model were CN2, SOL_BD, SLSUBBSN, SOL_K, HRU_SLP (the average slope steepness), ALPHA_BF (the baseflow alpha factor), SOL_AWC, ESCO, GW_REVAP, and GWQMN. The ten most sensitive parameters differed between the three SWAT models because different calibration data were used.
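The ranking rule used for the sensitivity results (a larger absolute t-Stat and a smaller p-value indicate a more sensitive parameter) can be sketched as a simple sort. This is only an illustration of the rule, not SWAT-CUP code; the record type, function name, and all numerical values are hypothetical and are not taken from Table 2.

```python
from typing import List, NamedTuple, Sequence


class Sensitivity(NamedTuple):
    """One row of a global sensitivity result."""

    name: str
    t_stat: float
    p_value: float


def rank_by_sensitivity(results: Sequence[Sensitivity]) -> List[Sensitivity]:
    """Order parameters from most to least sensitive.

    Primary key: larger |t-Stat| first; ties are broken by smaller p-value.
    """
    return sorted(results, key=lambda r: (-abs(r.t_stat), r.p_value))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    rows = [
        Sensitivity("ESCO", 1.5, 0.13),
        Sensitivity("CN2", -12.0, 0.0),
        Sensitivity("SOL_K", -3.2, 0.002),
    ]
    for row in rank_by_sensitivity(rows):
        print(row.name, row.t_stat, row.p_value)
```

The sign of the t-Stat is ignored in the ranking because only its magnitude reflects sensitivity.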

Non-Calibrated SWAT
The non-calibrated SWAT model results demonstrate how well the SWAT model predicts the streamflow before calibration, which indicates the effort required for calibration when using each configuration [3]. Figure 4 exhibits the monthly streamflow at the outlet of the middle reach simulated by the non-calibrated SWAT. The performance of the non-calibrated SWAT was poor: R² < 0.50, NSE < 0.50, and PBIAS < −20%. Moreover, the simulated and observed streamflow were not well matched in each year.

SWAT1 Performance
Figure 5 exhibits the monthly streamflow at the outlet of the middle reach simulated by the calibrated SWAT1 (R² = 0.74; NSE = 0.73; PBIAS = 0.6%). Compared to the non-calibrated SWAT (R² = 0.06; NSE = −0.11; PBIAS = −34.5%), the performance of the SWAT1 model in simulating the monthly streamflow improved by approximately 863.6% to 1133.3%. The simulated and observed streamflow were well matched except for some specific years (e.g., the fourth and sixth years), when the simulated streamflow was particularly low.

SWAT2U Performance
In the SWAT2U model, the ranges of some important and sensitive parameters (i.e., SMFMN, SNOCOVMN, SNO50COV, and ALPHA_BF) were assigned according to existing research on similar watersheds [23,24]. Figure 6 shows the annual average ET simulated by the SWAT2U model for the upper reach of the Bayinhe River. In most of the sub-basins, the observed ET derived from the GLEAM was higher than the ET simulated by the SWAT model. Table 3 presents the performance of the SWAT2U model for simulating the monthly ET in each sub-basin. All R² values were >0.90, all NSE values were >0.84, and all PBIAS values were within −20% to 20%. Hence, the SWAT2U model was well calibrated using the monthly ET data derived from the GLEAM. Figure 7 exhibits the simulated stream outflow from the upper reach. This part of the streamflow was the inflow for the middle reach.

Table 4 and Figure 8 show the SWAT2M model simulation results for the streamflow in the middle reach of the Bayinhe River. The R² and NSE values reached up to 0.78 and 0.75, respectively, and the PBIAS was within −20% to 20%. Moreover, the simulated and observed streamflows were well matched. The performance of the SWAT2M model for simulating the monthly ET was also good (Table 4: R² > 0.91; NSE > 0.78; PBIAS within −20% to 20%). The performance of the SWAT2M model for simulating the monthly streamflow at the outlet of the middle reach was better than that of the SWAT1 model. Figure 9 presents the monthly ET simulated by the SWAT2 model for the entire study area, whereby the simulated and observed ET were well matched.
Figure 10 presents the water balance of the SWAT1 and SWAT2 models (where SWAT2 = SWAT2U + SWAT2M). The total water yield of the SWAT1 model was clearly higher than that of the SWAT2 model, which resulted in a lower total simulated streamflow by the SWAT2 models. In addition, the precipitation and ET of the SWAT1 model were lower than those of the SWAT2 models. This was because different calibration data may result in different model parameter values and different water components. Compared to other research in similar study areas [7,24,25], the simulated water balance of the Bayinhe River basin by the SWAT1 and SWAT2 models was reasonable.

Discussion
The performance of the non-calibrated SWAT model was quite poor because the precipitation data were obtained from the only meteorological station in the watershed, located in the middle reaches of the Bayinhe River, which is not representative of the rainfall across the entire upper and middle reaches. This indicates that a great effort was required to calibrate the SWAT model to obtain a better model performance. The study area of the upper and middle reaches includes variations in climate, terrain, and land use. However, as mentioned, only one hydrological station exists in the entire watershed, at the outlet of the middle reaches. Therefore, we first calibrated the SWAT model (SWAT1) using only the observed streamflow at the outlet of the middle reach. Compared to the non-calibrated SWAT model, the performance of the calibrated SWAT1 model in simulating the monthly streamflow improved markedly. However, the traditional calibration using the site-based streamflow grouped the hydrological processes together, primarily because there was only one gauging station. This situation illustrates the need for more data with a high spatial resolution to calibrate the SWAT model for such a data-scarce area.
To address this issue, the SWAT2 model was calibrated with both the site-based streamflow and satellite-based ET data. In addition, the upper and middle reaches were calibrated separately (SWAT2U and SWAT2M). The SWAT2U model simulated the water balance of the upper reach of the Bayinhe River and was calibrated using the GLEAM-based ET data, whereas the SWAT2M model simulated the water balance of the middle reach of the Bayinhe River and was calibrated using the GLEAM-based ET data and the observed stream outflow from the middle reach. For SWAT2M, the simulated stream outflow from the upper reach was used directly as the inflow to the middle reach. The performances of the SWAT2U and SWAT2M models for simulating the monthly ET were very good. The performance of the SWAT2M model for simulating the monthly streamflow at the outlet of the middle reach was better than that of the SWAT1 model. Although other similar studies [16-18] used different ET data to calibrate their hydrological models, our finding that satellite-based ET data improved the model's performance is essentially the same as theirs. In our research, the GLEAM-based ET data played four roles in the calibration process, whereby the data: (1) distributed the hydrological processes of the study area (compared to SWAT1); (2) reduced the uncertainty of the SWAT model in this data-scarce area; (3) improved the performance of the SWAT model in simulating the streamflow and water balance; (4) improved the reliability of the model parameters.
As mentioned, the precipitation data used in the present study were obtained from the only meteorological station in the study area; thus, the spatial heterogeneity of precipitation was not considered. This may have contributed to the discrepancy between the simulated and observed streamflow in some specific months. If the spatial heterogeneity of precipitation had been taken into account, the model uncertainties may have been reduced. Although the use of the GLEAM-based ET data to calibrate the SWAT model improved the model's performance for simulating the streamflow, the performance was still not very good. Consequently, validated satellite-based precipitation data are needed for hydrological modelling in such data-scarce areas.
In this research, we assumed that the GLEAM-based ET data could provide an independent measure of ET. Although the dataset was validated for the entire area of China, the GLEAM is just a series of algorithms to calculate ET, and some deviation still exists in comparison to field-observed ET. Moreover, the spatial resolution of the GLEAM-based ET data is 0.25° × 0.25°, with one data point corresponding to one or more sub-basins. The use of the GLEAM dataset was able to reduce the grouping of hydrological processes that occurred during the model calibration. However, an equifinality issue may still have occurred at the HRU scale [16]. Further study is therefore required to downscale the GLEAM-based ET data and improve the calibration results in HRUs.

Conclusions
In this research, three SWAT models (SWAT1, SWAT2U, and SWAT2M) were built to evaluate the advantages of using ET data derived from the GLEAM to separately calibrate the widely used SWAT model for the upper and middle reaches of a data-scarce area, the Bayinhe River basin. The results showed that: (1) A great effort was required to calibrate the SWAT model for the Bayinhe River basin to obtain a better model performance; (2) The performance of the SWAT model in simulating the streamflow and water balance was reliable when calibrated with streamflow only; however, this calibration method grouped the hydrological processes together and caused an equifinality issue; (3) The combination of the streamflow and GLEAM-based ET data for the SWAT model calibration improved the model's performance for simulating the streamflow and water balance. However, the equifinality issue remained at the HRU level.