Improving Hydrologic Simulations of a Small Watershed through Soil Data Integration

Abstract: The effects of soil data sources on the performance of hydrologic model simulations remain poorly understood compared to the effects of other data inputs. This paper investigated the effects of different soil datasets in simulating streamflow and sediment yield using the Soil and Water Assessment Tool (SWAT). Furthermore, potential improvements in watershed simulations were evaluated by integrating field-measured soil parameters (user soil) with global soil datasets. Five soil datasets, namely user soil, AfSIS (Africa Soil Information Service), Food and Agriculture Organization (FAO), and two integrated soils (User-AfSIS and User-FAO) produced by assimilating the user soil with the latter two, were evaluated. The benefits of the user soil in improving streamflow simulations to better replicate observed flow were greater at daily time steps than at monthly time steps. Compared to the individual AfSIS and FAO soils, their integration with the user soil improved the daily Nash-Sutcliffe Efficiency (NSE) by 0.19 and 0.17 during model calibration, respectively. Overall, all soils performed similarly in monthly sediment yield simulations, and the global soils improved when integrated with the user soil. Based on selected rainfall events, the watershed response time was less than 1 h, which indicates a quick runoff response. This paper showed that the streamflow and sediment yield simulation performance of freely available global soil datasets can be improved through integration with locally measured soil information. The availability of local soil information is therefore critical for daily hydrologic model simulations, which in turn are critical for planning effective soil and water management practices at plot and field scales.


Introduction
Land and water management is at a critical junction in the Ethiopian highlands. Traditional land management practices (e.g., fallowing, shifting cultivation) are becoming less common due to shortage of arable land [1][2][3][4][5]. This is primarily due to the increased population and the need for more farmlands and/or animal grazing [6]. Agricultural practices in most parts of the country are rainfed, and water scarcity is rampant during most of the months while runoff and erosion are common during the short rainy monsoon period [7]. Since hydrological, climate, and soil data are scarcely available in the region, land and water management efforts rely on model simulations to understand hydrological processes by fitting discharge and sediment yield at watershed outlets [8][9][10][11]. While some models, such as the Soil and Water Assessment Tool (SWAT) and Water Erosion Prediction Project (WEPP) provide distributed

Study Site
This study was carried out in the Anjeni watershed, a small (113 ha) agricultural watershed, in the northwestern Ethiopian highlands (Figure 1). The Anjeni watershed was selected for this study because of the availability of long-term streamflow and sediment concentration records at its watershed outlet (Figure 1). In Ethiopia, continuous hydro-meteorological monitoring data are rarely available at a watershed scale to support hydrological process understanding. However, at Anjeni watershed, streamflow, sediment, and other watershed biophysical data were collected through the Soil Conservation and Research Program [26][27][28], which is a collaborative project between the Ethiopian Ministry of Agriculture and the Swiss Agency for Development and Cooperation (SDC). The watershed has been serving as a hydrological and erosion monitoring site under this program. Most of the watershed area (>80%) is covered with soils such as Alisols, Nitisols, and Cambisols that were developed from basalt and volcanic ash [29]. The bottom part of the watershed has relatively deeper Alisol soils; the mid and gentle slope parts of the watershed are covered by moderately deep Nitisols; while the steep slope and upper parts of the watershed have shallow Regosols and Leptosols [29]. On average, the watershed receives ~1600 mm rainfall. Daily average temperature ranges between 9 °C and 23 °C.

Hydro-Climatic Data
The SWAT model requires daily climatic data to simulate watershed processes. Daily rainfall and maximum/minimum temperature data for the period 1984-1994 were used to set up the model. The rainfall and temperature data were obtained from a weather station near the outlet of the Anjeni watershed (Figure 2). The SWAT model also requires solar radiation, wind speed, and relative humidity data, which were generated from the Climate Forecast System Reanalysis data [30]. The watershed has a unimodal rainfall pattern, and most of the rain occurs during the monsoon rainy season, which lasts from June to September [28].
Water 2020, 12, x FOR PEER REVIEW

Streamflow and Sediment Data
Streamflow and sediment concentration data were monitored at the outlet of the Anjeni watershed. The streamflow has been measured continuously since 1984 [27]. Sediment concentration was monitored by grab-sampling 1 L water samples at 10- or 30-min intervals. The sampling interval was decided based on the color change of the stream water. The total sediment load was calculated from the sediment concentration in the water sample and the corresponding flow volume, which was generated using the rating curve of the watershed. Detailed information about runoff and sediment measurements at the outlet of the Anjeni watershed is presented in Bayabil et al. [26,27].
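The load calculation described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function name, the units (g/L for concentration, m3/s for discharge), and the assumption that each grab sample represents the whole sampling interval are all hypothetical choices. The rating curve itself is not given in the text, so discharge is taken as an input.

```python
def sediment_load_kg(concentrations_g_per_l, discharges_m3_per_s, interval_min):
    """Sum sediment mass (kg) over equally spaced grab samples.

    Assumption: each sample represents the full sampling interval
    (10 or 30 min in the monitoring scheme described in the text).
    """
    if len(concentrations_g_per_l) != len(discharges_m3_per_s):
        raise ValueError("need one discharge value per concentration sample")
    interval_s = interval_min * 60.0
    total_kg = 0.0
    for conc, discharge in zip(concentrations_g_per_l, discharges_m3_per_s):
        volume_m3 = discharge * interval_s  # flow volume during the interval
        # 1 g/L equals 1 kg/m^3, so mass (kg) = concentration * volume
        total_kg += conc * volume_m3
    return total_kg


if __name__ == "__main__":
    # Two hypothetical samples taken 10 minutes apart
    print(sediment_load_kg([2.0, 4.0], [0.5, 1.0], interval_min=10))
```

Dividing the accumulated mass by the watershed area (113 ha) would give a yield per hectare comparable to the Mg ha−1 figures reported later.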

Soil Data Sources and Integration Approach
This study evaluated the simulation performance of five soil datasets: three single-source datasets, namely field measurements (user soil), the Africa Soil Information Service (AfSIS) dataset [31], and the FAO Harmonized soil dataset [32], and two integrated soils produced by combining the user soil with the latter two.
The user soil dataset used in this study was previously used by Bayabil et al. [23,28]. The user soil database was developed based on field measurements at 42 locations in the Anjeni watershed by taking soil samples from the top 30 cm. Field soil measurements consisted of bulk density, infiltration, soil depth, soil texture, organic carbon content, and pH. Bayabil et al. [28] presented a detailed description of the soil physical and chemical properties and the measurement techniques used. Point-based soil information was used to develop a spatial soil map that covers the entire Anjeni watershed using the Thiessen polygon method in ArcGIS 10.4. The Thiessen polygon method generates polygons with one sampling point per polygon and assigns the soil property values measured at a sampling point to the area within its polygon. As a result, 42 unique soil polygons were produced, one for each soil sampling point. The AfSIS data provide soil information at 250 m spatial resolution and contain most of the soil parameters for six soil layers [31,33], while the FAO dataset is available at 1 km resolution and has two layers [32]. Figure 2 presents the spatial coverage of AfSIS and FAO soils for the Anjeni watershed. The watershed is represented by a single soil type in the FAO soil dataset [32] and by four soil types in the AfSIS soil dataset (Figure 2).
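The Thiessen polygon idea amounts to assigning every location to its nearest sampling point. The sketch below illustrates that rule on a regular grid in plain Python; it is not the ArcGIS 10.4 workflow used in the study, and the function names, the dictionary layout of a sample, and the coordinates are hypothetical.

```python
import math


def nearest_sample(x, y, samples):
    """Return the id of the sampling point closest to (x, y)."""
    closest = min(samples, key=lambda s: math.hypot(x - s["x"], y - s["y"]))
    return closest["id"]


def thiessen_grid(samples, xs, ys):
    """Label each grid cell centre with the id of its nearest sampling point.

    The cells sharing one id approximate that point's Thiessen polygon.
    """
    return [[nearest_sample(x, y, samples) for x in xs] for y in ys]


if __name__ == "__main__":
    # Two hypothetical sampling points, coordinates in metres
    points = [{"id": "S1", "x": 0.0, "y": 0.0}, {"id": "S2", "x": 10.0, "y": 0.0}]
    for row in thiessen_grid(points, xs=[1.0, 4.0, 6.0, 9.0], ys=[0.0, 5.0]):
        print(row)
```

Each cell then inherits the soil properties measured at the point whose id it carries, which is how 42 sampling points lead to 42 soil polygons.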
The three soil datasets have different soil physical and hydraulic parameters. A summary of soil properties at different depths is presented in Table 1. Bayabil et al. [23] reported that the AfSIS and FAO datasets can provide a diverse estimate of soil parameters but fail to capture the spatial variability of actual field conditions. Moreover, they showed that user soils and global soils each have advantages and disadvantages in hydrologic modeling applications. Unlike the global soil databases, the user soil database has a finer resolution but contains soil parameter measurements only for the top 30 cm. Therefore, two more datasets were created to leverage the finer resolution of the user soil and the multiple-layer information of the global soil datasets. The new soil datasets were created by replacing the top 30 cm soil information of the global datasets (AfSIS and FAO) with the soil parameters from the user soil. For example, integrating a soil that has two layers at a certain location in the FAO dataset with the respective user soil results in a soil type whose top layer carries information from the user soil and whose second layer carries information from the FAO soil dataset (Figure 3). The study, therefore, used these five soil datasets to run the SWAT model and evaluate how well each dataset simulates streamflow and sediment yield in a case study in the Anjeni watershed.
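The integration rule described above (user soil in the top 30 cm, global soil below) can be sketched as a small profile-merging function. The layer representation, the function name, and the handling of a global layer that straddles the 30 cm boundary (it is trimmed so that it starts at 30 cm) are assumptions made for illustration; the property values in the usage example are hypothetical, not taken from Table 1.

```python
def integrate_profiles(user_top, global_layers, top_depth_cm=30.0):
    """Replace the top `top_depth_cm` of a global profile with user soil data.

    `user_top` is a dict of measured properties for the surface layer.
    `global_layers` is a list of dicts with 'top_cm', 'bottom_cm' and
    property keys, ordered from the surface downward.
    """
    merged = [dict(user_top, top_cm=0.0, bottom_cm=top_depth_cm, source="user")]
    for layer in global_layers:
        if layer["bottom_cm"] <= top_depth_cm:
            continue  # layer lies entirely inside the replaced zone
        kept = dict(layer, source="global")
        # Assumption: a layer crossing the boundary is trimmed to start at it
        kept["top_cm"] = max(layer["top_cm"], top_depth_cm)
        merged.append(kept)
    return merged


if __name__ == "__main__":
    # Hypothetical two-layer global profile and one user-soil measurement
    global_profile = [
        {"top_cm": 0.0, "bottom_cm": 30.0, "clay_pct": 40.0},
        {"top_cm": 30.0, "bottom_cm": 100.0, "clay_pct": 45.0},
    ]
    for layer in integrate_profiles({"clay_pct": 55.0}, global_profile):
        print(layer)
```

Applied to every soil polygon, this keeps the spatial detail of the user soil at the surface while retaining the deeper layers that only the global datasets provide.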
Table 1. Summary of soil physical and hydrologic properties at different depths for the user (1 depth), AfSIS (7 depths), and FAO soil databases (2 depths). ρd, AWC, and Ks refer to bulk density, available water holding capacity, and hydraulic conductivity, respectively. Clay, Silt, and Sand refer to percentage textural classes.

Land Use and Digital Elevation Model
Besides the soil data, land use and a digital elevation model (DEM) are the basic spatial data required by the SWAT model to discretize the watershed and define the Hydrological Response Units (HRUs). The DEM and land use data were developed under the Soil Conservation and Research Program (SCRP). The DEM, developed by the Center for Development and Environment (CDE) at the University of Berne, Switzerland, has a spatial resolution of 2 m, which helped to capture the river networks accurately. The land use map was developed by recording the crop type in each plot and fully represents field conditions.

SWAT Model Setup
The SWAT model discretization with a threshold area of one hectare yielded 37 sub-basins. Three slope classes (<4%, 4-8%, and >8%) were created for Hydrologic Response Unit (HRU) formation. Multiple HRUs were created within each sub-basin. The model setup produced a total of 465, 201, and 164 HRUs for the user, AfSIS, and FAO soils, respectively.
SWAT has different options to calculate different biophysical processes in a watershed. This study used the Soil Conservation Service's curve number (CN) method to estimate surface runoff. The Penman-Monteith method was used to calculate potential evapotranspiration. The routing of water in the channels was determined using the variable storage routing method.
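For readers unfamiliar with the curve number method, the sketch below evaluates the standard SCS relation in metric units, S = 25.4(1000/CN − 10) and Q = (P − Ia)^2 / (P − Ia + S) with Ia = 0.2S. It is an illustration under stated assumptions: the function name is hypothetical, the curve number in the usage example is not a value from this study, and the sketch does not attempt to reproduce SWAT's day-to-day adjustment of the curve number with soil moisture.

```python
def scs_runoff_mm(rain_mm, curve_number, ia_ratio=0.2):
    """Surface runoff depth (mm) from the SCS curve number equation.

    S is the retention parameter (mm) and Ia = ia_ratio * S the initial
    abstraction; no runoff occurs until rainfall exceeds Ia.
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    retention_mm = 25.4 * (1000.0 / curve_number - 10.0)
    initial_abstraction = ia_ratio * retention_mm
    if rain_mm <= initial_abstraction:
        return 0.0
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention_mm)


if __name__ == "__main__":
    # Hypothetical 50 mm storm on a surface with CN = 80
    print(round(scs_runoff_mm(50.0, 80), 2))
```

Because the retention parameter depends on the curve number assigned to each combination of land use and soil, the soil dataset directly influences the simulated runoff.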
The Soil and Water Assessment Tool (SWAT) model is a physically based watershed model with features that capture the spatial variability of biophysical parameters at Hydrological Response Unit (HRU) levels, which is the smallest unit in a watershed represented with similar land use, soil type, and elevation class. Although the model requires daily climate data, it can run at a daily or monthly timestep. SWAT has been used throughout the world to test various watershed related processes ranging from commonly used streamflow and sedimentation to nutrient transport, climate change, best management practices, chemical transport and cycling, and farming practices.
The SWAT model uses the following equation (Equation (1)) to simulate the water balance of a watershed [34]:
$$SW_t = SW_0 + \sum_{i=1}^{t} \left( P_i - R_i - ET_i - W_i - Q_i \right) \tag{1}$$
where $SW_t$ is the final soil water content at time $t$ (mm), $SW_0$ is the initial soil water content at time $i$ (mm), $t$ is time (days), $P_i$ is the amount of precipitation on day $i$ (mm), $R_i$ is the amount of surface runoff on day $i$ (mm), $ET_i$ is the amount of evapotranspiration on day $i$ (mm), $W_i$ is the percolation of water entering the vadose zone from the soil profile on day $i$ (mm), and $Q_i$ is the amount of return flow on day $i$ (mm).
Similarly, SWAT simulates sediment losses from a landscape due to rainfall and runoff using the Modified Universal Soil Loss Equation (MUSLE) developed by Williams [35], which is a modified version of the Universal Soil Loss Equation. Williams [35] used the following equation (Equation (2)):
$$SY = 11.8 \left( R_i \times R_{peak} \times A_h \right)^{0.56} \times K \times C \times P \times LS \times CFRG \tag{2}$$
where $SY$ is sediment yield (tons/day), $R_i$ is surface runoff volume (mm/ha), $R_{peak}$ is peak runoff rate (m³/s), $A_h$ is the area of the HRU (ha), and $CFRG$ is the coarse fragment factor. The other parameters of the equation follow the description by Wischmeier and Smith [36]: $K$ is the soil erodibility factor (0.013 ton m² h/(m³ ton cm)), $LS$ is the length/slope factor, $C$ is the crop cover and management factor, and $P$ is the practice factor.
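The two relations can be evaluated numerically as in the sketch below. It assumes the standard SWAT forms, namely SW_t = SW_0 + Σ(P_i − R_i − ET_i − W_i − Q_i) for the water balance and SY = 11.8 (R_i · R_peak · A_h)^0.56 · K · C · P · LS · CFRG for MUSLE; the function names and all input values are hypothetical and serve only to show how the terms combine.

```python
def soil_water_mm(sw0, precip, runoff, et, percolation, return_flow):
    """Final soil water content (mm) after accumulating daily fluxes."""
    sw = sw0
    for p, r, e, w, q in zip(precip, runoff, et, percolation, return_flow):
        sw += p - r - e - w - q  # daily gain minus daily losses
    return sw


def musle_sediment_yield(runoff_mm, peak_rate_m3s, area_ha, k, c, p, ls, cfrg):
    """Daily sediment yield (tons) for one HRU using the MUSLE form above."""
    energy_term = (runoff_mm * peak_rate_m3s * area_ha) ** 0.56
    return 11.8 * energy_term * k * c * p * ls * cfrg


if __name__ == "__main__":
    # Two hypothetical days: one wet, one dry
    print(soil_water_mm(100.0, [10.0, 0.0], [2.0, 0.0], [3.0, 3.0],
                        [1.0, 1.0], [0.5, 0.5]))
    # Hypothetical HRU and event values
    print(musle_sediment_yield(12.0, 0.3, 1.5, k=0.25, c=0.2, p=1.0,
                               ls=1.8, cfrg=1.0))
```

Note that the soil enters the sediment calculation twice: directly through K and CFRG, and indirectly through the runoff terms, which depend on the soil hydraulic properties.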

Model Calibration, Validation, and Sensitivity Analysis
A total of 12 model parameters were selected for streamflow calibration and validation, while 6 parameters were selected for sediment calibration (Table 2). Model parameters considered for calibration were selected based on literature recommendations [14,37-39]. In Table 2, the qualifiers "a_", "v_", and "r_" represent an absolute increase, a replacement, and a relative change to the original parameter values, respectively.
The SWAT model parameters were calibrated using the Sequential Uncertainty Fitting version 2 (SUFI-2) algorithm in the SWAT Calibration Uncertainty Prediction (SWAT-CUP) tool [40,41]. In SUFI-2, the level of uncertainty for a particular model is evaluated using the p-factor and the r-factor. The p-factor gives the fraction of observed data bracketed within the 95 percent prediction uncertainty (95 PPU) band of the model, while the r-factor estimates the thickness of the 95 PPU band. A p-factor close to 1 and an r-factor close to 0 suggest a reasonable level of model uncertainty [42]. The model was calibrated using observed streamflow and sediment data for the periods 1988-1992 and 1990-1994, respectively. Model simulation for the period 1984-1987 was used for model warm-up. Daggupati et al. [43] recommend a model warm-up period of at least three years to properly initiate and balance stocks within the watershed. The model was validated using independently observed streamflow data for the period 1993-1994.
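The two uncertainty measures can be computed from the observations and the lower and upper limits of the 95 PPU band, as sketched below. The sketch follows the commonly cited SUFI-2 definitions (fraction of observations inside the band, and mean band width divided by the standard deviation of the observations); the function names are hypothetical, the use of the population standard deviation is an assumption, and the numbers in the usage example are made up.

```python
import statistics


def p_factor(observed, lower, upper):
    """Fraction of observations that fall inside the 95 PPU band."""
    inside = sum(1 for o, lo, up in zip(observed, lower, upper) if lo <= o <= up)
    return inside / len(observed)


def r_factor(observed, lower, upper):
    """Mean width of the 95 PPU band relative to the spread of observations.

    Assumption: the population standard deviation is used as the divisor.
    """
    mean_width = sum(up - lo for lo, up in zip(lower, upper)) / len(observed)
    return mean_width / statistics.pstdev(observed)


if __name__ == "__main__":
    # Hypothetical observations with lower and upper band limits
    obs = [1.0, 2.0, 3.0, 4.0]
    low = [0.0, 2.5, 2.0, 3.0]
    high = [2.0, 3.0, 4.0, 5.0]
    print(p_factor(obs, low, high), round(r_factor(obs, low, high), 3))
```

A wide band raises the p-factor at the cost of a larger r-factor, which is why the two are read together.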
Model performance was evaluated using the Nash-Sutcliffe Efficiency (NSE), which is a normalized statistic that determines the relative magnitude of the residual variance compared to the measured data variance [44]. An NSE value of 1 refers to a perfect match between observed and simulated values, and an NSE value between 0 and 1 is considered an acceptable level of model performance. An NSE value < 0 indicates that the observed mean is a better predictor than the model [45].
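As a concrete illustration, the sketch below computes NSE as 1 − Σ(O − S)^2 / Σ(O − Ō)^2, together with percent bias (PBIAS), which is reported alongside NSE in the results. The PBIAS form used here, 100 · Σ(O − S) / ΣO with positive values indicating underestimation, is the convention of Moriasi et al. and is assumed rather than stated in the text; the function names and sample values are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def pbias(observed, simulated):
    """Percent bias; positive values mean the model underestimates (assumed convention)."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0]
    print(nse(obs, obs))              # perfect simulation
    print(nse(obs, [2.0, 2.0, 2.0]))  # simulation equal to the observed mean
    print(round(pbias(obs, [0.5, 2.0, 2.5]), 2))
```

Because the squared residuals are dominated by peak flows, daily NSE values are more sensitive to mistimed or overestimated peaks than monthly values.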

Long-Term Observed Rainfall, Discharge and Sediment Yield
Long-term data analysis showed that the Anjeni watershed receives an average annual rainfall of ~1600 mm and generates a runoff of ~700 mm. The average annual sediment yield was ~26.0 Mg ha−1. Since the watershed has a unimodal rainfall pattern, most of the streamflow and sediment losses occur during the rainy monsoon period of June to September (Figure 4). Sediment yield is greater towards the end of the rainy season, suggesting a significant contribution of gully erosion once the soils become saturated, perhaps at the bottom part of the watershed [3,5,46].
The average soil loss from the Anjeni watershed is smaller than the average erosion rate that Hurni (1988) estimated for the Ethiopian highlands (42 Mg ha−1); however, comparably sized watersheds such as the Maybar watershed have far lower soil erosion, at 7.4 Mg ha−1 [47]. Soil losses between 31 Mg ha−1 and 530 Mg ha−1 were reported in Ethiopian watersheds where severe gully erosion is prevalent [48].

Effects of Soil Data Source on Streamflow Simulation
Uncalibrated model simulations from the five soil datasets (i.e., three independent and two integrated soil datasets) consistently overestimated streamflow at the Anjeni watershed at both daily and monthly time steps (Figure 5). While daily simulations appear to significantly overestimate peak and low flows (Figure 5a), the monthly simulations seem to follow the pattern of the observed hydrograph and capture low flows better (Figure 5b). Yet, the peak flows were significantly overestimated by all soil datasets even in the monthly simulations (Figure 5b).
It is common practice to calibrate model parameters to mimic actual biophysical conditions in watersheds. Moreover, the unsatisfactory model performance with the uncalibrated parameters (Figure 5) called for calibration to fine-tune the model parameters and thereby improve the streamflow simulation efficiency at the watershed outlet. Ayana et al. [49] reported that, without calibration, AfSIS soils yield only marginal improvements in streamflow simulations compared to coarse-resolution soil data inputs. In this study, model calibration significantly improved streamflow simulations in the Anjeni watershed (Figure 6). Calibration reduced the overestimation of peak flows, especially for monthly simulations (Figure 6b). Overall, the simulated streamflow hydrographs replicated the observed hydrographs well during both the calibration and validation periods (Figure 6).
Compared to the AfSIS and FAO soils, the user soil performed best in simulating streamflow at a daily time step, while the AfSIS soil was the poorest. At daily time steps, the order of performance in terms of NSE is the user, FAO, and AfSIS soil datasets during both the calibration and validation periods (Table 3). Replacing the top 30 cm information of the AfSIS and FAO soils with information from the user soil greatly improved the model's performance at daily time steps. The User-AfSIS and User-FAO datasets performed better than their counterpart global soils (AfSIS and FAO) in simulating daily streamflow: NSE values increased by 0.19 and 0.17 during model calibration and by 0.03 and 0.02 during model validation, respectively (Table 3). Similarly, integrating the user soil with the AfSIS and FAO soils reduced PBIAS by 16% and 10% during model calibration, respectively. The improvements from soil data integration were larger for the AfSIS soil than for the FAO soil in daily simulations. Overall, based on the Moriasi et al. [45] model performance scale, daily streamflow simulations have 'very good' ratings, except for simulations with the individual AfSIS and FAO soils, which resulted in NSE values of 0.45 and 0.55 during the model calibration period, respectively (Table 3).
At the monthly time step, the user soil, whether used alone or integrated with the AfSIS and FAO soils, did not improve streamflow simulations compared to the AfSIS and FAO soils (Table 3). The AfSIS and FAO soils, when used independently, provided better model performance for monthly streamflow simulations than the user soil during both calibration and validation. This suggests that the shallow depth of the measured user soil could not capture the water budget at the relatively longer time step. These findings have significant implications for modeling and for land and water management planning efforts in the region.
For example, for process understanding, which relies on analysis at daily or sub-daily time steps in small watersheds, the use of detailed soil data at finer resolutions is warranted; however, for water balance studies, which are often conducted at monthly or annual scales, coarser-resolution data that may have a deeper soil profile may provide reasonable estimates. While this study showed that integrating fine-resolution data into coarser global datasets improves model simulation performance at the daily time step, other studies have reported that the use of finer-resolution soil datasets may not improve model performance. For example, Ye et al. [50], based on field-observed soil data in a sub-humid watershed in China, reported that fine-resolution soil data inputs did not improve streamflow simulations at a larger watershed using the SWAT model. Similarly, Moriasi et al. [21] observed no significant differences in monthly streamflow prediction between fine- and coarse-resolution soil data inputs in their study at three sub-watersheds within the Fort Cobb Reservoir Experimental watershed in Oklahoma.
The Anjeni watershed has a very quick response time after storm events due to its short time of concentration, mostly less than one hour even during dry months (Figure 7). This suggests that soil and water management planning that primarily intends to reduce runoff and sediment loss, and to improve moisture availability for crop production, should rely on daily or sub-daily model simulations.
In the Anjeni watershed, monthly simulations are less promising for soil and water management planning that aims to alleviate drought impacts on crop productivity. Since monthly simulations aggregate the daily hydrologic processes, they mask the daily variations of the water budget components (e.g., soil moisture, runoff). However, modelers must balance the intended uses of model outputs against the data resolution requirements (e.g., soil data), since data collection is costly, time-consuming, and at times impractical to cover large areas [23]. For example, the authors of [18] highlighted that the preparation of fine-resolution soil data for model calibration requires greater effort and suggested that the benefits of fine-resolution data should be weighed against the resources needed. Moreover, high-resolution data may also increase the computational needs, especially in larger watersheds [18,22]. For example, although Anjeni is a relatively small watershed (113 ha), the user soil resulted in almost three times the number of HRUs (465) compared to the FAO soil (164), while the AfSIS soil had 201 HRUs. This suggests that as the watershed size and data resolution increase, the number of HRUs will increase considerably, thereby requiring substantial computational resources and time to run simulations.

Effects of Soil Data Source on Sediment Yield Simulation
Similar to the streamflow simulation, the uncalibrated SWAT model could not effectively simulate sediment yield. The uncalibrated model substantially underestimated the peak sediment yield compared to the observed sediment yield. However, model calibration improved the sediment yield simulation (Figure 8). The NSE values for the sediment yield simulations with the different soil datasets varied between 0.72 and 0.77 (Table 4). Based on NSE values, all soil datasets performed similarly. According to Moriasi et al. [45], sediment yield simulations by the AfSIS soil dataset were 'good', while the user soil integrated with the AfSIS soil provided a 'very good' model performance with an NSE value of 0.77 (Table 4).
The user soil, when used alone or integrated with the AfSIS and FAO soils, provided better p-factor values (~0.26). The p-factor represents the proportion of observed data points bracketed within the 95PPU of the model simulations [42]. In contrast, the AfSIS and FAO soils provided lower p-factor values of 0.17 and 0.12, respectively. Although the p-factor values were unsatisfactory, the sediment yield simulations provided small r-factor values, which suggest a small level of model uncertainty. Abbaspour [42] suggested that a higher p-factor together with a smaller r-factor indicates a smaller level of model uncertainty.
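The two uncertainty measures can be illustrated with a short sketch, assuming the usual SUFI-2 definitions: the p-factor is the fraction of observations falling inside the 95PPU band, and the r-factor is the average band width divided by the standard deviation of the observations. The data values below are hypothetical and chosen only to make the arithmetic easy to follow; the use of the sample standard deviation is also an assumption.

```python
import statistics


def p_factor(observed, lower, upper):
    """Fraction of observations bracketed by the 95PPU band [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


def r_factor(observed, lower, upper):
    """Mean 95PPU band width divided by the standard deviation of observations."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return mean_width / statistics.stdev(observed)  # sample std. dev. (assumed)


# Hypothetical monthly values, for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
lower = [1.0, 4.5, 5.0, 5.0]
upper = [3.0, 5.5, 7.0, 7.0]

p = p_factor(observed, lower, upper)
r = r_factor(observed, lower, upper)
print(f"p-factor = {p:.2f}, r-factor = {r:.2f}")
```

A narrow band (small r-factor) that still misses most observations (small p-factor), as reported for sediment yield here, means the band is tight but offset from the observed data.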
Regardless of the soil dataset, the sediment yield simulations consistently underestimated the observed sediment yield. The average annual simulated sediment yields for the period 1990-1993 for the user, AfSIS, and FAO soils were 16.4, 18.2, and 15.8 Mg ha⁻¹, respectively, while the average annual observed sediment yield for the same period was 26.1 Mg ha⁻¹. Underestimation of sediment yield by the SWAT model was also reported in the similarly sized Maybar watershed in the northern Ethiopian highlands [51]. The SWAT model's consistent underestimation of peak sediment yield may be related to its limited ability to account for soil erosion from gullies. Field observations confirmed that active gully erosion exists in the Anjeni watershed and that its contribution to total soil loss is significant, especially once the soils become saturated [3,10,46]. Nyssen et al. [52] reported that gullying is among the dominant erosion processes in most areas of the Ethiopian highlands.
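The size of the underestimation can be expressed as a percentage of the observed yield. The sketch below uses only the annual averages reported above; the sign convention (positive means the model underestimates) follows the common percent-bias convention and is an assumption here, as is the function name.

```python
# Average annual sediment yields for 1990-1993 reported in the text (Mg/ha).
observed_yield = 26.1
simulated_yield = {"User": 16.4, "AfSIS": 18.2, "FAO": 15.8}


def underestimation_pct(obs, sim):
    """Percent by which the simulated value falls short of the observed value."""
    return 100.0 * (obs - sim) / obs


shortfall = {
    name: underestimation_pct(observed_yield, sim)
    for name, sim in simulated_yield.items()
}
for name, pct in shortfall.items():
    print(f"{name}: simulated yield is {pct:.1f}% below observed")
```

On these figures, the simulated annual sediment yield is roughly 30% to 40% below the observed value for all three individual soils.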

Water Budget
Annual precipitation and potential evapotranspiration were 1656 and 1160 mm, respectively. The simulated water yields with the AfSIS and FAO soil datasets were equal. The user soil, when used alone or integrated with the other two soils, produced the highest water yield but the smallest actual evapotranspiration (AET) (Table 5). Smaller AET estimates for the user soil (and its integrated soils) suggest a more water-limited environment, which is more representative of actual conditions in the Anjeni watershed than the conditions simulated with the other two soils. The AfSIS soil produced the highest surface runoff both when used alone and when integrated with the user soil (Table 5). Transmission loss (TLoss) and groundwater revap did not differ significantly among the soil dataset simulations.

In Table 5, Water yield is the average annual water leaving the watershed (mm), AET is actual evapotranspiration (mm), Surf_Q is surface runoff (mm), GW_Q is deep aquifer groundwater contribution to streamflow (mm), Perc is water that percolates past the root zone (mm), and Lat_Q is lateral flow (mm).
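As a rough guide to how these components relate, SWAT documentation defines water yield as surface runoff plus lateral flow plus groundwater flow, minus transmission losses and pond abstractions. The sketch below applies that relation; the component values are hypothetical placeholders, not the values in Table 5, and the function name is illustrative.

```python
def water_yield(surf_q, lat_q, gw_q, tloss=0.0, pond_abstraction=0.0):
    """Water yield (mm) from its components, following the SWAT definition:
    surface runoff + lateral flow + groundwater flow - losses."""
    return surf_q + lat_q + gw_q - tloss - pond_abstraction


# Hypothetical annual component values in mm, for illustration only.
example = water_yield(surf_q=300.0, lat_q=150.0, gw_q=250.0, tloss=5.0)
print(f"Water yield = {example:.0f} mm")
```

Under this relation, two soil datasets can give equal water yields while partitioning the flow differently among surface runoff, lateral flow, and groundwater.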

Sensitivity of Model Parameters
The curve number (CN) and soil bulk density (SOL_BD) were the top-ranking and most sensitive model parameters for simulating streamflow with the three individual soils and with the two integrated soils (User-AfSIS and User-FAO) (Figure 9). Overall, parameter rankings and sensitivity analysis showed that model parameters related to soil hydraulic properties were significantly sensitive in streamflow simulation, especially at daily time steps. The effects of integrating the user soil with the coarse-resolution AfSIS and FAO soils were apparent in the soil hydraulic parameters, which are known to affect hydrological processes at daily and sub-daily time steps. In contrast, model parameters related to groundwater showed higher ranking and sensitivity in monthly simulations, since groundwater takes several days to respond. As a result, the groundwater 'revap' coefficient (GW_REVAP) and groundwater delay (GW_DELAY) were the most sensitive parameters for monthly streamflow simulations. Based on field observations, soils in the study watershed are poorly managed and have compacted topsoil. Since surface runoff generation is primarily affected by the hydrological properties of the topsoil, the presence of such poorly managed topsoils led to a quick watershed response time. As such, daily simulations are appropriate to represent the actual hydrological processes in the watershed and thereby inform policy and management decisions related to soil and water conservation.

Conclusions
This study showed that the user soil, which has a finer spatial resolution and better represents the spatial variability of soil hydrologic properties, improved hydrologic simulations when used independently or integrated with global soil datasets (AfSIS and FAO), especially at a daily time step. Moreover, the integration of locally measured user soil data with coarser-resolution global soil datasets (AfSIS and FAO) can enhance streamflow simulations, especially at a daily time step. However, coarser-resolution soil datasets such as the AfSIS and FAO soils could be effectively used to simulate monthly streamflow. This suggests that such coarse-resolution soil datasets could be applicable for watershed water balance studies and sediment yield simulations, which are often performed at monthly or annual timescales. This study concluded that finer-resolution, locally collected soil datasets could be optimal for daily hydrologic simulations; however, freely available global soil datasets can also be integrated with locally measured soil information to achieve better hydrological simulations at daily time steps than the individual global soil datasets alone. Hydrologic simulations should capture actual biophysical processes such as watershed response time, soil moisture dynamics, and runoff generation, as these factors are critical in soil and water management decision making. In rainfed farming systems, which are predominant in the study area, a few dry or wet days may have a significant impact on crop growth, yield, and produce quality. As such, field-measured, finer-resolution user soil data that improve the understanding of daily hydrological processes will be of significant importance in soil and water management decisions.