Multi-site Validation of the SWAT Model on the Bani Catchment: Model Performance and Predictive Uncertainty

The objective of this study was to assess the performance and predictive uncertainty of the Soil and Water Assessment Tool (SWAT) model on the Bani River Basin, at catchment and subcatchment levels. The SWAT model was calibrated using the Generalized Likelihood Uncertainty Estimation (GLUE) approach. Potential Evapotranspiration (PET) and biomass were used to verify the accuracy of model outputs. Global Sensitivity Analysis (GSA) was used to identify important model parameters. Results indicated a good performance of the global model at daily as well as monthly time steps, with adequate predictive uncertainty. PET was found to be overestimated, but biomass was better predicted in agricultural land and forest. Surface runoff is the dominant process in streamflow generation in that region. Individual calibration at subcatchment scale yielded better performance than application of the global parameter sets. These results support further studies on regionalization for prediction in ungauged basins.


Introduction
Water resources managers face challenges in many river basins across the world because of limited data availability. Anthropogenic activities add further uncertainty to this task by inducing changes to land and climate at different scales [1,2]. This situation is more pronounced in developing countries, where many river basins have no runoff data [3-7] and the existing records are of questionable quality or, at best, short or incomplete.
The Niger River basin is no exception to that rule. The general situation of insufficient data is exacerbated by a deterioration of measurement networks. In the 1980s and 1990s, for instance, hydrometric stations were reduced to a minimum and many were abandoned (e.g., [8]). To prevent further degradation of the hydrologic observing system, the Niger Basin Authority (NBA) set up the Niger-HYCOS project, one of whose specific objectives is to improve data quality in the Niger Basin. For this purpose, the project supports the installation and management of 105 hydrometric stations shared by the nine countries drained by the river, and contributes to capacity building in national hydrological services.
In its fifth assessment report on regional aspects of climate change, the Intergovernmental Panel on Climate Change [9] has shown that adaptation to climate change in Africa is confronted with a number of challenges, among which is a significant data gap. Too many basins lack the reliable data needed to assess, in detail, the impacts of climate change on different components of the hydrological cycle and to develop adaptation strategies for each specific impact. Thus, it is germane to predict hydrological variables in ungauged basins in order to build high adaptive capacity by improving: (i) water resources knowledge, planning, and management; (ii) identification and implementation of strategies of adaptation to climate change in the water sector; and (iii) ecological studies for sustainable development.
Applying rainfall-runoff models and then transferring model parameters from gauged to ungauged catchments is a long-standing method [10] for flow prediction in ungauged basins. It was highlighted during the decade on Prediction in Ungauged Basins (PUB), launched in 2003 by the International Association of Hydrological Sciences (IAHS) and concluded by the PUB Symposium held in 2012. This is the framework of the present study, in which the Soil and Water Assessment Tool (SWAT) model was calibrated on the Bani catchment (Niger River basin) and the most sensitive model parameters were estimated.
Many studies have successfully applied the SWAT model to river basins in West Africa. Examples include calibration of the SWAT model on the Niger basin [11-16], the Volta basin [12-15,17-19] and the Oueme catchment in Benin [15,20-22]. However, few published papers address the application of the SWAT model to the Bani catchment. For instance, Schuol and Abbaspour [12] and Schuol et al. [14] applied the SWAT model to selected watersheds in West Africa, including the Niger basin, modeled monthly values of river discharge (blue water) as well as soil water (green water), and clearly showed the uncertainty of the model results. They developed and applied a daily weather generator algorithm [13] that uses 0.5 degree monthly weather statistics from the Climatic Research Unit (CRU) to obtain time series of daily precipitation as well as minimum and maximum temperatures for each sub-basin. These generated weather data were then used as input for model setup, and the authors concluded that "discharge simulations using generated data were superior to the simulations using available measured data from local climate stations". The reported Nash coefficient values vary widely between sub-basins and were mainly presented as average intervals, which limits our understanding of model performance at finer spatial (sub-basin) and temporal (daily) scales.
Laurent and Ruelland [23] successfully calibrated SWAT on the Bani catchment using daily measured climate data. They interpolated precipitation data on a regular grid by the Inverse Distance Weighted (IDW) method, which has proven to yield better results than kriging, Thiessen and spline methods, especially when a hydrological model is used [24]. To show the model performance, Laurent and Ruelland [23] reported both discharge and biomass calibration results on an average annual basis, but did not assess model calibration uncertainty. Moreover, both above-mentioned studies interpolated the input data outside the model framework to obtain a time series of daily weather data for each sub-basin. However, the results of interpolation methods are strongly influenced by the density and spatial distribution of the measurement stations used in the interpolation [25]. Such a density of data is not always available in developing countries.
Against this background, the objective of this study was to assess the performance of the SWAT model and its predictive uncertainty on the Bani at catchment and subcatchment levels. More specifically, this meant to: (i) set up a hydrological model for the Bani catchment using the SWAT program; (ii) calibrate the model at the catchment outlet at daily and monthly time steps and assess the predictive performance and uncertainty; (iii) evaluate the spatial performance of the watershed-wide model within the catchment by validating it at two internal stations; and (iv) calibrate the model at the sub-catchments separately and provide a comparative assessment of the model performance at different spatial scales.
The originality of this study lies in assessing the daily performance of the SWAT model at the whole catchment outlet and at two internal stations. Another important contribution of this paper is the use of evapotranspiration (the most important component of the water balance after rainfall, especially under warm climates) to verify that model outputs are reasonable, a check that has not been considered by any previous study in the region. In addition, we used point rain gauge data (as per SWAT's standard procedure), as opposed to the areal precipitation used in previous studies [12-15,24,26,27] on the same basin, in order to preserve the real data conditions (limited in time and space) to the extent possible.

The Study Area
The Bani is the major tributary of the Upper Niger River. Its drainage basin is located principally in Mali but extends to a lesser extent into Cote d'Ivoire and Burkina Faso, and covers an area of about 100,000 km² at the Douna gauging station (Figure 1). The Bani watershed was chosen for this study, on the one hand, because of the relatively high quality and availability of its data compared to the regional situation. It thus constitutes an appropriate gauged catchment for different hydro-climatic variables. On the other hand, this watershed has not been affected by major hydraulic structures able to significantly modify its flow regime, which makes the hydrological modeling of the catchment more convenient.
The catchment's topography (Figure 1) is characterized by a gentle relief, with elevation ranging from 826 m in the south and the center-east to 249 m at the outlet in the north. According to FAO (2003) [28], the major soil groups are Luvisol, Acrisol, and Nitosol (Figure 2a). Based on the USGS Global Land Cover Characterization (GLCC) version 2.0 [29], agricultural land constitutes the dominant land use category, followed by savannah and forest (Figure 2b). The Bani catchment is characterized by a Sudano-Sahelian climatic regime. The river flows from south to north along a strong rainfall gradient. Annual precipitation varies from 1250 mm at Odienne to 615 mm at Segou (average of the period 1981-2000). The average annual discharge recorded at the Douna gauging station between 1981 and 2000 was 184 m³·s⁻¹, which is equivalent to 58 mm of surface runoff depth for an average annual precipitation of 1000 mm. The smallest runoff values were recorded during the years 1983, 1984, and 1987. An abrupt decrease in rainfall, attributed to climate change, occurred in 1970-1971 and persisted for two decades [27,30], with a severe impact on water resources. A decrease of more than 60% in discharge at Douna [27,31] and a lower contribution of baseflow to the annual flood [32,33] have been reported since the 1970s. Concerning future climate change impacts, the Bani basin is projected to experience a substantial decrease in rainfall and runoff, especially in the long term [27].
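The equivalence between the mean discharge and the runoff depth quoted above can be checked with a short calculation. The following is a minimal sketch; the function name `runoff_depth_mm` and the 365.25-day year are illustrative assumptions, not part of the study.

```python
def runoff_depth_mm(mean_discharge_m3s: float, area_km2: float,
                    days_per_year: float = 365.25) -> float:
    """Convert a mean annual discharge into an equivalent annual runoff depth."""
    seconds_per_year = days_per_year * 86400.0
    annual_volume_m3 = mean_discharge_m3s * seconds_per_year
    area_m2 = area_km2 * 1.0e6
    # Volume divided by area gives a depth in metres; convert to millimetres.
    return annual_volume_m3 / area_m2 * 1000.0


depth = runoff_depth_mm(184.0, 100_000.0)   # values reported for Douna
runoff_coefficient = depth / 1000.0          # relative to 1000 mm of rainfall
print(round(depth), round(runoff_coefficient, 3))
```

With the reported values, the depth rounds to 58 mm, which corresponds to a runoff coefficient of roughly 6% of the average annual precipitation.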

Model Description
SWAT is a river basin, or watershed, scale model developed to predict the impact of land management practices on water, sediment, and agricultural chemical yields in large, complex watersheds with varying soils, land use, and management conditions over long periods of time [34]. The model is semi-distributed, physically based and computationally efficient, uses readily available inputs and enables users to study long-term impacts [35]. For a detailed description of SWAT, see the Soil and Water Assessment Tool input/output documentation, version 2012 [36], and the Theoretical Documentation, Version 2009 [37].
ArcSWAT (an ArcGIS extension) is a graphical user interface for the SWAT model. In the present study, the recent version, ArcSWAT2012, was used to build the hydrological model of the Bani catchment. SWAT divides a basin into sub-basins, which are further discretized into hydrologic response units (HRUs) based on unique soil-land use-slope combinations. The subdivision of the watershed enables the model to reflect differences in evapotranspiration for various crops and soils. Runoff is predicted separately for each HRU and routed to obtain the total runoff for the watershed. This increases accuracy and gives a much better physical description of the water balance [37].
Various hydrological models exist, and there is no strict guideline for selecting one. The SWAT model uses a modified version of the Curve Number method, which was developed in the US specifically for calculating surface runoff generation. The model is therefore especially suitable for regions where overland flow makes up a high share of total runoff. Another advantage of the SWAT model is that it allows a number of different physical processes (hydrologic, sediment, pollutants) to be simulated in a watershed. It has previously been validated for several large-scale watersheds in different climate contexts across the globe and has performed satisfactorily even in data-poor and complex catchments (e.g., [38,39]). SWAT is also very flexible in that specific and appropriate soil and land use information for the watershed to be modeled can be added to its database. In addition, in this context, it is worthwhile to use a low-cost or free model that West African national hydrological services can afford given their economic constraints.

Input Data and Databases
The SWAT model for the Bani was constructed using weather data and the globally and freely available spatial information described in Table 1. Daily precipitation data from 11 rain gauges, as well as daily maximum and minimum temperature from five weather stations located mainly within the catchment, were used as input. The location and spatial distribution of the input precipitation and temperature stations are shown in Figure 1.
It is worth noting the low spatial density of the measuring network, with one rain gauge for more than 9000 km². Precipitation data are complete at most sites; at the few remaining sites, the maximum percentage of missing data in a year varies between 8.5% and 100%. Many more missing values are recorded in the temperature data. The collected climate time series were of varying lengths. Thus, a common period of observation from 1981 to 2000 was first determined. The retained data then underwent a thorough quality control as recommended by the World Meteorological Organization (WMO) in the guide to climatological practices, third edition [40]. Three procedures were applied: (1) completeness check; (2) plausible value check; and (3) consistency check. The aim of these checks is to detect erroneous data in order to correct them or, if that is not possible, to delete them. Missing values were filled by the weather generator at run time. For this purpose, the Excel macro WGNmaker4 [41] was used to calculate the weather station statistics needed to generate representative daily climatic data.
Two different databases were used to set up the model. The SWAT database is composed of the crop database and the user soils database, both included in swat2012.mdb. They are named crop1 and soil1, respectively. Crop1 was kept at its default, whereas soil1 was filled with soils transferred from mwswat2009.mdb (the database of the MapWindow interface for SWAT). The second database is composed of crop2 and soil2. Four land use categories define crop2: forest, savannah-bush, savannah, and steppe, whereas six major soil groups are added to soil2: Acrisol, Cambisol, Gleysol, Lithosol, Luvisol ferrique, and Nitosol. A detailed description of this database can be found in [23]. For calibration purposes, we used daily river discharge data at the Douna, Bougouni and Pankourou stations covering the period 1981-2000, obtained from AGRHYMET and the National Hydraulic Direction of Mali. The period 1981-1997 was kept for the calibration and validation processes as it exhibits few gaps. The small remaining gaps were filled by simple linear interpolation.
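The gap filling by linear interpolation mentioned above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the procedure actually used in the study: missing days are represented by `None`, and gaps at the very start or end of the series (which have no bracketing observations) are left unfilled. The function name `fill_gaps_linear` is hypothetical.

```python
from typing import List, Optional


def fill_gaps_linear(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between neighbours."""
    filled = list(series)
    last_valid = None  # index of the most recent observed value
    for i, value in enumerate(series):
        if value is None:
            continue
        if last_valid is not None and i - last_valid > 1:
            start = series[last_valid]
            step = (value - start) / (i - last_valid)
            # Fill every missing day between the two bracketing observations.
            for k in range(last_valid + 1, i):
                filled[k] = start + step * (k - last_valid)
        last_valid = i
    return filled


# Hypothetical daily discharges (m3/s) with a two-day gap and a trailing gap.
print(fill_gaps_linear([10.0, None, None, 16.0, None]))
```

Leaving leading and trailing gaps untouched avoids extrapolating beyond the observed record.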

Model Setup
The catchment was delineated and divided into sub-catchments based on the DEM. A stream network was superimposed on the DEM in order to accurately delineate the location of the streams. The threshold drainage area was kept at its default value, and additional outlets were added at the locations of the stream gauging stations to enable comparison of measured discharge with SWAT results. The whole catchment was thus discretized into 28 sub-catchments, which were further subdivided into 181 HRUs based on soil, land use, and slope combinations. Further parameters were edited through the general watershed parameters and SWAT simulation menus and are reported in Table 2. Four simulations were performed based on combinations of the land use and soil databases: crop1soil1, crop1soil2, crop2soil1, and crop2soil2. A Nash-Sutcliffe Efficiency (NSE) [45] was then calculated at Douna by comparing measured discharges against each default simulation, and the simulation yielding the highest NSE value was kept for the calibration and validation processes.

Calibration and Validation Procedures
It is commonly accepted in hydrology to split the measured data either temporally or spatially for calibration and validation [36]. In addition to the split-sample method, a split-location calibration and validation approach was used because the global parameter set is not expected to be optimal for sub-catchment processes, in view of the high heterogeneity in climate, topography, soil, and land use that characterizes such a large watershed. This approach is especially needed when prediction at data-sparse sites is foreseen [46,47]. In the split-sample approach, the model was calibrated using discharge data measured solely at the catchment outlet by splitting the homogeneous period mentioned in Section 2.3 into two datasets: two-thirds for calibration (1983-1992) and the remaining third for validation (1993-1997). To implement the split-location method, the model was calibrated at Douna and then validated at the intermediate gauging stations (Bougouni and Pankourou) by running the model over the same period (1983-1992), using the same behavioral parameter sets determined at the outlet.
Calibration was thereafter performed at the Bougouni and Pankourou stations individually, and the two modeling frameworks allowed a comparative analysis of model performance and predictive uncertainty across scales. At this step, the calibration at Bougouni did not succeed within a realistic range of the Curve Number (CN). The daily CN calculation method was therefore changed to the plant ET method for the simulation at Bougouni, because the soil moisture method has been found to predict too much runoff in shallow soils [36]. An additional parameter (CNCOEF), required by the plant ET method, was fixed to 0.5 in the Edit SWAT input menu.
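To illustrate why the CN value is so influential on simulated runoff, the sketch below implements the classical SCS Curve Number equation with a fixed CN and an initial abstraction of 0.2 S. It is only the base relationship; SWAT's daily adjustment of the retention parameter (by soil moisture or plant ET, with CNCOEF) is not reproduced here, and the function name `scs_runoff_mm` is an illustrative assumption.

```python
def scs_runoff_mm(rain_mm: float, cn: float) -> float:
    """Daily surface runoff (mm) from the classical SCS Curve Number equation."""
    if not 0.0 < cn <= 100.0:
        raise ValueError("CN must lie in (0, 100]")
    retention = 25.4 * (1000.0 / cn - 10.0)   # retention parameter S in mm
    initial_abstraction = 0.2 * retention      # commonly assumed as 0.2 * S
    if rain_mm <= initial_abstraction:
        return 0.0                             # all rainfall is abstracted
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)


# Hypothetical 50 mm storm under three curve numbers.
for cn in (60, 80, 95):
    print(cn, round(scs_runoff_mm(50.0, cn), 1))
```

In this sketch, runoff from the same storm rises steeply as CN increases, which is consistent with CN2 being a key calibration parameter for surface runoff.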
Calibration/validation, uncertainty analysis, and sensitivity analysis were performed within the SWAT Calibration and Uncertainty Programs, SWAT-CUP version 2012 [48], using the Generalized Likelihood Uncertainty Estimation (GLUE) procedure [49]. GLUE is a Monte Carlo based method for model calibration and uncertainty analysis. It was constructed to partly account for the non-uniqueness of model parameters. GLUE requires a large number of model runs with different combinations of parameter values chosen randomly and independently from the prior distribution in the parameter space. The prior distributions of the selected parameters are assumed to be uniform over their respective ranges, since the real distribution of each parameter is unknown. By comparing predicted and observed responses, each set of parameter values is assigned a likelihood value. The likelihood function selected here is principally the NSE, as it is very commonly used and is included in SWAT-CUP for GLUE performance assessment. In this study, the number of model runs was set to 10,000 and the total sample of simulations was split into "behavioral" and "non-behavioral" based on a threshold value of 0.5, the minimum NSE recommended by [50] for streamflow simulation to be judged satisfactory on a monthly time step. In that case, only simulations that yielded an NSE ≥ 0.5 are considered behavioral and kept for further analysis.
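The GLUE steps described above (uniform priors, random sampling, NSE as likelihood, behavioral threshold of 0.5) can be sketched on a toy problem. Everything below is hypothetical: the two-parameter linear-store model `toy_model` stands in for a SWAT run, the rainfall series and parameter ranges are invented, and only 500 runs are used instead of 10,000. It is not the SWAT-CUP implementation.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical rainfall series (mm/day) used only to drive the toy model.
RAIN = [0, 12, 30, 5, 0, 0, 22, 41, 8, 0, 0, 3, 17, 0, 0, 26, 9, 0, 0, 0]


def toy_model(params: Dict[str, float], rain: Sequence[float] = RAIN) -> List[float]:
    """Toy linear-store runoff model; a stand-in for one SWAT simulation."""
    flow: List[float] = []
    previous = 0.0
    for p in rain:
        previous = params["k"] * previous + (1.0 - params["k"]) * params["coef"] * p
        flow.append(previous)
    return flow


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency used here as the likelihood measure."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def glue(obs: Sequence[float], priors: Dict[str, Tuple[float, float]],
         n_runs: int, threshold: float = 0.5, seed: int = 42):
    """Sample uniform priors and keep the behavioral runs (NSE >= threshold)."""
    rng = random.Random(seed)
    behavioral = []
    for _ in range(n_runs):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in priors.items()}
        score = nse(obs, toy_model(params))
        if score >= threshold:
            behavioral.append((score, params))
    return behavioral


observed = toy_model({"coef": 0.3, "k": 0.6})   # synthetic "observations"
runs = glue(observed, {"coef": (0.0, 1.0), "k": (0.0, 0.95)}, n_runs=500)
print(len(runs), "behavioral runs out of 500")
```

The retained list of behavioral parameter sets is what the uncertainty band and the split-location validation are later built from.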
In the calibration procedure, we included 12 parameters that govern the surface runoff and baseflow processes. An approximate value of the baseflow alpha factor (ALPHA_BF) was determined by applying the baseflow filter program developed by [51] and modified by [52] to the streamflow data measured at the three outlets. One novelty of this study was to include the Manning's roughness coefficient for overland flow (OV_N) and the average slope length (SLSUBBSN), parameters that are not commonly used in calibration. The reason behind this choice was to correct the tendency of the model to delay the runoff, as detected by graphical analysis. The remaining parameters were chosen based on the literature [53-55], and their adjustment ranges were taken from the SWAT Input/Output version 2012 document (e.g., [56]).

Model Performance and Uncertainty Evaluation
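To evaluate model performance, both statistical and graphical techniques were used, as recommended by [50] on the basis of previously published studies. The following quantitative statistics were chosen: NSE to quantify the relative magnitude of the residual variance ("noise") compared to the measured data variance, PBIAS for the water balance error, and R² to describe the degree of collinearity between simulated and measured data. They are given for the best simulation. The NSE, R² and PBIAS were determined using the following equations:

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} \left( Y_i^{obs} - Y_i^{sim} \right)^2}{\sum_{i=1}^{n} \left( Y_i^{obs} - \bar{Y}^{obs} \right)^2}$$

$$R^2 = \frac{\left[ \sum_{i=1}^{n} \left( Y_i^{obs} - \bar{Y}^{obs} \right) \left( Y_i^{sim} - \bar{Y}^{sim} \right) \right]^2}{\sum_{i=1}^{n} \left( Y_i^{obs} - \bar{Y}^{obs} \right)^2 \sum_{i=1}^{n} \left( Y_i^{sim} - \bar{Y}^{sim} \right)^2}$$

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{n} \left( Y_i^{sim} - Y_i^{obs} \right)}{\sum_{i=1}^{n} Y_i^{obs}}$$

where $Y_i^{sim}$ and $Y_i^{obs}$ are the ith simulated and observed discharge, respectively, $\bar{Y}^{sim}$ and $\bar{Y}^{obs}$ are the mean values of simulated and observed discharge, respectively, and n is the total number of observations.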
The NSE varies between −∞ and 1 (1 inclusive), with NSE = 1 being the optimal value. The optimal value of PBIAS is 0, with low PBIAS in absolute value indicating accurate model simulation. Positive values indicate model overestimation bias, and negative values indicate model underestimation bias. R² ranges from 0 to 1, with higher values indicating less error variance; values greater than 0.5 are considered acceptable.
In the present study, model performance for a monthly time step is judged satisfactory if NSE > 0.50 and PBIAS is within ±25% for discharge [50], and if the graphical analysis reveals good agreement between predicted and measured hydrographs.
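The three statistics and the satisfaction criteria can be sketched as below. This is an illustrative implementation that follows the sign convention stated above (positive PBIAS meaning overestimation); the function names and the small discharge series are hypothetical.

```python
from typing import Sequence


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - residual / sum((o - mean_obs) ** 2 for o in obs)


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Coefficient of determination (squared linear correlation)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


def pbias(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Percent bias; positive values mean the model overestimates."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def is_satisfactory(obs: Sequence[float], sim: Sequence[float]) -> bool:
    """Monthly criteria used in the text: NSE > 0.50 and |PBIAS| < 25%."""
    return nse(obs, sim) > 0.50 and abs(pbias(obs, sim)) < 25.0


# Hypothetical monthly discharges (m3/s); the simulation is 10% too high.
obs = [10.0, 40.0, 120.0, 60.0, 20.0]
sim = [1.1 * q for q in obs]
print(round(nse(obs, sim), 3), round(r_squared(obs, sim), 3), round(pbias(obs, sim), 1))
```

The example shows that a simulation can be perfectly correlated with observations (R² of 1) while still carrying a systematic volume bias, which is why PBIAS is reported alongside NSE and R².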
The GLUE prediction uncertainty was then quantified by two indices referred to as the P-factor and the R-factor [57]. The P-factor represents the percentage of observed data bracketed by the 95% predictive uncertainty (95PPU) band of the model, calculated at the 2.5% and 97.5% levels of the cumulative distribution of an output variable obtained through Latin hypercube sampling. The R-factor is the ratio of the average width of the 95PPU band to the standard deviation of the measured variable. For uncertainty assessment, a P-factor > 0.5 (i.e., more than half of the observed data should be enclosed within the 95PPU band) and an R-factor < 1 (i.e., the average width of the 95PPU band should be less than the standard deviation of the measured data) are considered adequate for this study, especially given the limited data availability.
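The two uncertainty indices can be computed from an ensemble of behavioral simulations as in the following sketch. Several simplifications are assumed here and are not taken from the study: percentiles are unweighted (GLUE proper weights each behavioral run by its likelihood), they are obtained by linear interpolation between order statistics, and the sample standard deviation is used. All names are illustrative.

```python
import statistics
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile for q in [0, 1]."""
    ordered = sorted(values)
    rank = q * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def p_and_r_factor(obs: Sequence[float],
                   ensemble: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (P-factor, R-factor) for the 95PPU band of an ensemble."""
    inside = 0
    widths: List[float] = []
    for t, observed in enumerate(obs):
        values_at_t = [run[t] for run in ensemble]
        low = percentile(values_at_t, 0.025)
        high = percentile(values_at_t, 0.975)
        widths.append(high - low)
        if low <= observed <= high:
            inside += 1
    p_factor = inside / len(obs)                      # share of bracketed data
    r_factor = (sum(widths) / len(widths)) / statistics.stdev(obs)
    return p_factor, r_factor


# Hypothetical observations and two behavioral simulations.
obs = [2.0, 3.0, 4.0]
ensemble = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
p, r = p_and_r_factor(obs, ensemble)
print(p, round(r, 2))
```

A wide band trivially brackets more observations, so the two indices are read together: a high P-factor is only informative when the R-factor remains small.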

Sensitivity Analysis
A Global Sensitivity Analysis (GSA) was performed after 10,000 simulations on the 12 parameters included in the calibration process. Only GSA is allowed with GLUE in SWAT-CUP, and it can be performed after one iteration. A t-test is then used to identify the relative significance of each parameter. The t-stat provides a measure of sensitivity and the p-value determines the significance of the sensitivity. A parameter with a larger t-stat in absolute value is more sensitive, and a p-value close to zero indicates greater significance [48].
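One common way to obtain such t-statistics is to regress the objective function on the sampled parameter values and divide each regression coefficient by its standard error. The sketch below follows that idea with ordinary least squares in plain Python; it is an assumption-laden illustration, not SWAT-CUP's code, and p-values are omitted because they require the Student t distribution, which is not in the standard library. The synthetic data and all names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def invert(matrix: List[List[float]]) -> List[List[float]]:
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def t_statistics(params: Sequence[Sequence[float]],
                 objective: Sequence[float]) -> List[float]:
    """t-stat of each parameter in a linear regression of the objective on them."""
    n = len(params)
    x = [[1.0] + list(row) for row in params]          # add intercept column
    p = len(x[0])
    xtx = [[sum(x[k][i] * x[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(x[k][i] * objective[k] for k in range(n)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    residuals = [objective[k] - sum(b * v for b, v in zip(beta, x[k]))
                 for k in range(n)]
    sigma2 = sum(e * e for e in residuals) / (n - p)   # residual variance
    # Skip the intercept; report one t-stat per parameter.
    return [beta[j] / math.sqrt(sigma2 * inv[j][j]) for j in range(1, p)]


# Synthetic example: the objective depends on the first parameter only.
rng = random.Random(1)
samples = [[rng.random(), rng.random()] for _ in range(200)]
scores = [2.0 * a + 0.0 * b + rng.gauss(0.0, 0.1) for a, b in samples]
t_values = t_statistics(samples, scores)
print([round(t, 1) for t in t_values])
```

In this synthetic case the first parameter receives a much larger absolute t-stat than the second, mirroring how the ranking of parameters is read from the sensitivity table.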

Verification of Model Outputs
To evaluate the accuracy of the SWAT model in predicting PET, we considered the model's average annual basin output, which was computed by the Hargreaves method [58], and compared it to PET values calculated with two other methods: the FAO Penman-Monteith method and the pan evaporation method. The estimates from those three methods are hereinafter referred to as PEThar (average annual PET estimated by the Hargreaves method), PETpen (average annual PET estimated by the Penman-Monteith method) and PETpan (average annual PET estimated by the pan evaporation method). The modified Penman method is taken herein as the standard because it was considered to offer the best results with the minimum possible error [59]. Average observed 10-day PETpen values were collected and used to compute the average annual value over the calibration-validation period. Monthly observed pan evaporation data were used to estimate PETpan. Doorenbos and Pruitt [60] related pan evaporation to reference evapotranspiration, ET0 (or PET), using empirically derived coefficients. PET can be obtained by:

$$\mathrm{PET} = K_p \times E_{pan}$$

where PET is the potential evapotranspiration in mm·day⁻¹, $E_{pan}$ is the pan evaporation in mm·day⁻¹, and $K_p$ is the pan coefficient, an adjustment factor that depends on mean relative humidity, wind speed, and ground cover.
As the pan coefficient in the Bani catchment could not be determined exactly, owing to a lack of information about the pan environment and the climate, the average value of 0.7 [61] was used in this study. The PBIAS was again used as the evaluation criterion, representing the deviation of the predicted PET from the one considered as the baseline.
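The pan conversion and the deviation criterion can be illustrated with a few lines of code. This is a minimal sketch: the function names are hypothetical, the 8 mm/day pan reading is invented, and the two annual totals in the second call are the PETpan and PETpen values reported in the results.

```python
def pet_from_pan(pan_evaporation_mm: float, pan_coefficient: float = 0.7) -> float:
    """PET estimated from pan evaporation with a constant pan coefficient Kp."""
    return pan_coefficient * pan_evaporation_mm


def percent_deviation(predicted: float, baseline: float) -> float:
    """PBIAS for a single aggregated value; positive means overestimation."""
    return 100.0 * (predicted - baseline) / baseline


# Hypothetical daily pan reading of 8 mm gives 5.6 mm of PET with Kp = 0.7.
print(round(pet_from_pan(8.0), 2))

# Deviation of the pan-based annual estimate (1755 mm) from PETpen (1737 mm).
print(round(percent_deviation(1755.0, 1737.0), 1))
```

A deviation of about 1% between the pan-based and Penman-based estimates supports using either as the reference against which the Hargreaves output is judged.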

Global Model Performance
In the preliminary analyses, we tested different land use and soil databases and kept for subsequent analysis the simulation with the database combination crop2soil2, which yielded the highest default (i.e., before calibration) performance (NSE = 0.09). The impact of the land use database was not very significant, but the type of soil database used to set up the model was decisive in obtaining a simulation with the smallest overall error. SWAT-CUP output results are presented as the 95PPU as well as the best simulation (Table 3). Overall, calibration and validation of the SWAT hydrological model on the Bani catchment at the Douna outlet yielded good results in terms of NSE and R² for both daily and monthly time steps. For daily calibration, 364 simulations returned an NSE ≥ 0.5, against 588 for monthly calibration, and were thus considered behavioral. Very good NSE and R² values, greater than 0.75, were obtained for the best simulations. Moreover, the performance is slightly lower for daily calibration than for monthly calibration, but always higher for the validation period. Only one year (1984) out of 10 showed very low performance, with an NSE of 0.23.
The water balance prediction can be considered accurate at a daily time step but becomes barely satisfactory for monthly calibration, which is characterized by higher PBIAS values that reflect increasing errors in the prediction. For example, the PBIAS values increased from daily to monthly time intervals: from −12% to −16% in the calibration period and from −23% to −27% in the validation period (Figure 3). With regard to high flow events, visual analysis of the simulated and observed hydrographs in Figure 3 shows that the timing of peaks is well reproduced, although the simulation tends to underestimate peak flows, especially during dry years (e.g., 1983, 1984, and 1987).

Verification of Average Annual Basin Values
Table 4 reports the average annual basin values simulated by the SWAT model on the Bani catchment and described in Section 2.8. However, no data are available to enable a full verification of all model outputs at the watershed scale. We therefore focused on PET and biomass, for which regional values exist. On average, PEThar presented a positive PBIAS of 11% compared with the observed PETpen, here equal to 1737 mm, and the latter is very close to PETpan, estimated at 1755 mm. These results give a clear indication that PET is overestimated by the SWAT model over the Bani catchment, an overestimation that can be attributed to the Hargreaves method used here by the model to compute PET. To further investigate the model's accuracy, we evaluated predicted biomass values over the calibration/validation period (Table 4) against reported values for the study area. Simulated biomass was on average 4.3 ton·ha⁻¹ for forest and 1.45 ton·ha⁻¹ for agricultural land, and both are in the ranges of observed values in the region (the observed biomass ranges between 2-4 and 2-3 ton·ha⁻¹ for forest and cultivated land, respectively [23,62]). Nevertheless, this component is considerably underestimated for savannah, with a simulated value of 0.4 ton·ha⁻¹ compared to an observed value that varies between 0.8 and 2 ton·ha⁻¹ [62].

Sensitivity Analysis
Sensitivity analysis is performed for a wide range of uses. Based on the 12 selected SWAT parameters (ALPHA_BF being fixed), a GSA was used here to identify sensitive and important model parameters in order to better understand which hydrological processes dominate streamflow generation in the Bani catchment.
Sensitivity analysis results for the 10,000 simulations are summarized in Table 5. The three most sensitive parameters (CN2, OV_N, and SLSUBBSN) are directly related to surface runoff, which reflects the dominance of this process in streamflow generation in the Bani catchment. Processes occurring at the soil level came second, as shown by the sensitivity of ESCO and SOL_AWC. Groundwater parameters ranked last, demonstrating the low contribution of groundwater to the flows measured at the Douna outlet. The same sensitive parameters were identified by daily and monthly calibrations, with different ranks only for the soil parameters (ESCO and SOL_AWC).

Spatial Validation
The results of the spatial validation differed according to location (Figure 4). At Pankourou, the same parameter sets determined at Douna produced a good simulation on a monthly basis (satisfactory for daily validation), while predictive uncertainty remained adequate, and all met our requirements (NSE > 0.5, P-factor > 0.5 and R-factor < 1). In addition, the water balance was reasonably predicted at both time steps. In contrast, a complete loss of model performance was recorded at Bougouni, with unsatisfactory NSE values and more uncertainty related to input discharge, as expressed by a lower percentage of observed data (P-factor = 0.55 and 0.57 for daily and monthly validation, respectively) inside the 95PPU band (Figure 4). Accordingly, substantial uncertainty could be attributed to the observed discharge at Bougouni.

The Subcatchment Model
Statistical evaluation results of the subcatchment calibration are presented in Table 3, and time series of observed and simulated hydrographs are shown in Figures 5 and 6. Good to very good performance was obtained at Pankourou, with accurate predictive uncertainty. However, the validation period remained unsatisfactorily simulated at Bougouni. A comparative analysis of the catchment and subcatchment calibration performances yielded the following results:

• When calibrated separately, the prediction at Pankourou was slightly better, and the prediction at Bougouni greatly improved, compared to when the catchment-wide model was applied.
• The total uncertainty of the model is smaller at Pankourou (smaller R-factor and larger P-factor) than at the whole catchment, but larger at Bougouni.
• The water balance is better simulated at both internal stations than at the watershed-wide level, as shown by smaller PBIAS values, except in the validation period at Bougouni.
• The model performance in terms of NSE and R² was higher at the watershed-wide level than at the sub-watershed level.
Overall, these results revealed that further calibration at the internal gauging stations was synonymous with gain of performance at the subcatchment level.on a monthly basis (satisfactory for daily validation) whereas predictive uncertainty remained adequate and all met our requirements (NSE > 0.5, P-factor > 0.5 and R-factor < 1).In addition, the water balance was reasonably predicted at both time steps.In contrast, it has been recorded a complete loss of model performance at Bougouni with unsatisfactory NSE values and more uncertainty related to input discharge as expressed by a lower percentage of observed data (P-factor = 0.55 et 0.57 for daily and monthly validation) inside the 95PPU band (Figure 4).Accordingly, important uncertainty could be attributed to observed discharge at Bougouni.

The Subcatchment Model
Statistical evaluation results of the subcatchment calibration are presented in Table 3, and time series of observed and simulated hydrographs are shown in Figures 5 and 6. Good to very good performance was obtained at Pankourou with accurate predictive uncertainty. However, the validation period remained unsatisfactorily simulated at Bougouni. A comparative analysis of the catchment and subcatchment calibration performances gave the following results:

• When calibrated separately, the prediction at Pankourou was slightly better, and the prediction at Bougouni greatly improved, compared with applying the catchment-wide model.
• The total uncertainty of the model is smaller at Pankourou (smaller R-factor and larger P-factor) than for the whole catchment, but larger at Bougouni.
• The water balance is better simulated at both internal stations than at the watershed-wide level, as shown by smaller PBIAS values, except in the validation period at Bougouni.
• Model performance in terms of NSE and R² was higher at the watershed-wide level than at the sub-watershed level.

Overall, these results show that further calibration at the internal gauging stations improved performance at the subcatchment level.

Water 2016, 8, 178

Model Predictive Uncertainty
In the global model, the predictive uncertainty, as indicated by the P-factor and R-factor, is adequate, although it is larger during peak flow and recession periods (reflected by a wider 95PPU band). On a daily basis, for instance, 61% of the observed discharge data are bracketed by a narrow 95PPU band, as shown by an R-factor < 1 (Table 6). The uncertainty band is, however, very wide during 1984 (Figure 3).
Predictive uncertainty decreases from Douna to Pankourou: the percentage of observed discharge bracketed by the 95PPU band increased to 68%, while the width of the uncertainty band decreased to 0.41 for the daily calibration (Table 6). The same trend was observed for the monthly calibration. At Bougouni, the uncertainty band clearly narrowed (for daily and monthly calibration), but at the expense of bracketing fewer observed data. For instance, the P-factor and R-factor decreased from 0.65 to 0.58 and from 0.65 to 0.54, respectively, when moving from Douna to Bougouni for the monthly calibration.
Moreover, the uncertainty band widened with increasing time step (daily to monthly), as shown by higher R-factor values at Douna and Pankourou (from 0.59 to 0.65 and from 0.41 to 0.45, respectively). However, the uncertainty band was narrower during the validation period than during the calibration period for all stations (Table 6).
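To make the two indices concrete, the sketch below computes a 95PPU band from an ensemble of simulations and derives the P-factor and R-factor from it. It assumes the usual definitions (P-factor: share of observations between the 2.5th and 97.5th percentiles of the ensemble; R-factor: mean band width divided by the standard deviation of the observations). The function names, the linear-interpolation percentile, and the use of the population standard deviation are choices made for this illustration and may differ from the SWAT-CUP implementation.

```python
import statistics


def percentile(values, q):
    """Linear-interpolation percentile of a list, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def ppu_band(ensemble):
    """ensemble[k][t] is the discharge of simulation k at time step t.
    Returns the 2.5th and 97.5th percentile series (the 95PPU band)."""
    lower, upper = [], []
    for t in range(len(ensemble[0])):
        column = [sim[t] for sim in ensemble]
        lower.append(percentile(column, 2.5))
        upper.append(percentile(column, 97.5))
    return lower, upper


def p_factor(observed, lower, upper):
    """Fraction of observations that fall inside the band."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


def r_factor(observed, lower, upper):
    """Mean band width relative to the spread of the observations."""
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(observed)
    return mean_width / statistics.pstdev(observed)
```

A wide band pushes the P-factor toward 1 but also inflates the R-factor, which is why the two indices are read together.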

Model Performance
In an effort to assess the performance of the SWAT model on the Bani catchment, we calibrated and validated the model at multiple sites on daily and monthly time steps using measured climate data. There was no statistically significant difference in model performance between time steps. Using the guidelines given in Moriasi et al. [50], the overall performance of the SWAT model in terms of NSE and R² can be judged as very good, especially considering the limited data conditions in the study area. On a monthly basis, we obtained at the Douna outlet an NSE value of 0.79 for the calibration period (0.85 for the validation period). These values are higher than those reported by Schuol and Abbaspour [12] and Schuol et al. [14] at the same outlet. Schuol and Abbaspour [12] reported a negative NSE (between −1 and 0) for the monthly calibration and a value between 0 and 0.7 for monthly validation, while Schuol et al. [14] obtained an NSE between 0 and 0.70 for both monthly calibration and validation. However, Laurent and Ruelland [23] reported better performance (NSE values varying between 0.81 and 0.91 for the calibration and validation periods, respectively), but on a coarser time step (average annual basis). The water balance is less well simulated, especially for the monthly time step, with a PBIAS greater than 25% in absolute value.
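For readers who want to reproduce the statistics, a minimal sketch of NSE, percent bias (PBIAS), and a performance rating follows. The class limits are the commonly cited monthly ones from the Moriasi et al. guidelines (very good above 0.75, good above 0.65, satisfactory above 0.50); that [50] applies exactly these limits is an assumption, and the function names are illustrative.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared
    errors to the variance of the observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def pbias(observed, simulated):
    """Percent bias; with this sign convention a positive value
    means the model underestimates the observed volume."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


def rate_monthly_nse(value):
    """Performance class for a monthly NSE (assumed class limits)."""
    if value > 0.75:
        return "very good"
    if value > 0.65:
        return "good"
    if value > 0.50:
        return "satisfactory"
    return "unsatisfactory"


# Monthly NSE values reported in the text for the Douna outlet.
for period, value in (("calibration", 0.79), ("validation", 0.85)):
    print(period, rate_monthly_nse(value))
```

Under these assumed limits, both monthly NSE values at Douna fall in the highest class, while an absolute PBIAS above 25% would fall in the lowest class for streamflow.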
The quantified prediction uncertainty is surprisingly satisfactory (Table 6). At the end of the daily calibration, the model was able to account for 61% of the observed discharge data (65% for monthly calibration) in a narrow uncertainty band. These results are close to those of Schuol et al. [14], who estimated that the 95PPU bracketed between 60% and 80% of the observed discharge data for monthly calibration (40% to 60% for monthly validation). However, one explanation for the small uncertainty band we obtained is that the predictive uncertainty derived by GLUE depends largely on the threshold value used to separate "behavioral" from "non-behavioral" parameter sets [63,64].
A high threshold value (as in this case) will generally lead to a narrower uncertainty band [65][66][67], but at the cost of bracketing fewer observed data within the 95PPU band. In addition, GLUE accounts only partly for total uncertainty, namely the part due to the possible non-uniqueness (or equifinality) of parameter sets during calibration, and could therefore underestimate total model uncertainty [68]. For instance, Sellami et al. [69] showed that the GLUE predictive uncertainty band was wider and enclosed more observations when uncertainty in the discharge data was explicitly considered. Engeland and Gottschalk [70] demonstrated that the structural uncertainty of a conceptual water balance model was larger than its parameter uncertainty. Despite these limitations of GLUE, we succeeded in enclosing most of the observed data within a narrow uncertainty band (the sought balance between the two indices), which increases confidence in the model results. These are encouraging results showing, on the one hand, the good performance of the SWAT model on a large Soudano-Sahelian catchment under limited data and varying climate conditions and, on the other hand, the capability of the observed climate and hydrological input data of this catchment, even though contested, to provide reliable information about the hydro-meteorological systems prevailing in the region.
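The effect of the behavioral threshold can be illustrated with a toy GLUE-style experiment. Everything here is hypothetical: the one-parameter model q = a × forcing, the data, and the candidate grid are invented for the illustration, and the band is taken as the plain minimum-to-maximum envelope of the behavioral simulations rather than the likelihood-weighted 95PPU used in a full GLUE analysis.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency, used here as the likelihood measure."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def glue_band_width(observed, forcing, candidates, threshold):
    """Keep the parameter values whose NSE reaches the threshold and
    return (number of behavioral sets, mean width of their envelope)."""
    behavioral = []
    for a in candidates:
        simulated = [a * f for f in forcing]  # toy model, not SWAT
        if nse(observed, simulated) >= threshold:
            behavioral.append(simulated)
    if not behavioral:
        return 0, 0.0
    widths = []
    for t in range(len(observed)):
        column = [sim[t] for sim in behavioral]
        widths.append(max(column) - min(column))
    return len(behavioral), sum(widths) / len(widths)


# Hypothetical forcing and observations.
forcing = [0.0, 5.0, 20.0, 35.0, 15.0, 5.0, 0.0]
observed = [0.2, 2.4, 10.5, 17.0, 8.0, 2.2, 0.1]
candidates = [i / 100 for i in range(10, 100)]  # a from 0.10 to 0.99

loose = glue_band_width(observed, forcing, candidates, threshold=0.3)
strict = glue_band_width(observed, forcing, candidates, threshold=0.8)
print("threshold 0.3:", loose)
print("threshold 0.8:", strict)
```

Raising the threshold retains fewer parameter sets and narrows the envelope, which is the mechanism the text points to when interpreting a small uncertainty band.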
The model did not perform well during 1984 in particular (lower performance and larger uncertainty). This loss of performance can be attributed to a disruption of the rainfall-runoff relationship following the consecutive drought years of the early 1980s.
The over-predicted PET on the Bani catchment could be attributed to the Hargreaves method, which may give a higher estimate of PET than the actual value. Ruelland et al. [28] applied a temperature-based method given by Oudin et al. [71] and obtained an estimate of PET (1723 mm) similar to the values calculated herein by the Penman and pan evaporation methods, corroborating our results. These results demonstrate the value of pan evaporation measurements for estimating PET and suggest that the simple pan evaporation method is suited to the study area and can be used when the climatic data required by the Penman method are missing.
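For reference, the Hargreaves estimate depends only on air temperature and extraterrestrial radiation. The sketch below uses the widely published form of the equation with the factor 0.408 to convert radiation from MJ m⁻² day⁻¹ to mm/day of evaporated water. The function name is illustrative, the radiation value must be supplied by the caller, and the inputs in the usage lines are hypothetical rather than Bani catchment data. SWAT's internal formulation is equivalent in structure but is not reproduced here.

```python
import math


def hargreaves_pet(t_max, t_min, ra_mj):
    """Daily potential evapotranspiration (mm/day), Hargreaves equation.

    t_max, t_min: daily maximum and minimum air temperature (deg C)
    ra_mj: extraterrestrial radiation (MJ m-2 day-1), supplied by the caller
    """
    t_mean = (t_max + t_min) / 2.0
    ra_mm = 0.408 * ra_mj  # radiation expressed as equivalent evaporation
    temp_range = max(t_max - t_min, 0.0)  # guard against inverted inputs
    return 0.0023 * ra_mm * (t_mean + 17.8) * math.sqrt(temp_range)


# Hypothetical hot, dry-season day.
pet = hargreaves_pet(t_max=38.0, t_min=24.0, ra_mj=38.0)
print(round(pet, 2), "mm/day")
```

Because the daily temperature range enters through a square root and no humidity or wind term is used, the method can depart from Penman-type estimates in regions with large diurnal temperature ranges.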
As far as biomass is concerned, its underestimation in savannah could be explained by an inappropriate specification of the land use map categories to be modeled by SWAT as savannah, or by inaccurate savannah characteristics added to the SWAT database that directly affect biomass production, such as the BIO_E and LAI parameters, among others.

Impact of Spatial and Temporal Scales on the Model Uncertainty
Results showed that transferring the model parameters from the catchment outlet (Douna) to the internal gauging stations performs reasonably well only when the donor and target catchments are similar. The catchments controlled by Douna and Pankourou give a clear example of such physical proximity, where precipitation, soil and land use vary smoothly between the two catchments. However, the SWAT model parameters determined at the outlet could not reproduce the measured discharge at Bougouni well, mainly because of more significant spatial dissimilarities: Bougouni is situated in a more humid zone dominated by forest, whereas Douna is more arid. Moreover, individual calibration at the subcatchment scale led to a narrower uncertainty band with more observed discharge data enclosed in it, which is the sought balance between the two indices. Hence, predictive uncertainty was found to decrease with decreasing spatial scale. This finding can be attributed to lower heterogeneity in hydrological variables in smaller catchments. These results show the importance of calibrating hydrological models at a finer spatial scale to ensure that the predominant processes in each subcatchment are captured, which is particularly relevant for large-area catchments.
Concerning the effect of temporal scale, the validation period is characterized by less predictive uncertainty than the calibration period. One possible explanation is that 1993-1997 was a more humid period than 1983-1992 and is therefore characterized by less variability in precipitation. In contrast, when moving from daily to monthly calibration, the uncertainty of the model, in terms of uncertainty band width, increased. This could be attributed to the cumulative effect of uncertainty in the daily discharge data used to compute monthly discharge, resulting in larger monthly uncertainty.
Overall, because prediction uncertainty decreases with decreasing spatial and temporal scales, it is important to develop a more efficient hydro-meteorological data collection system in the basin to account for the spatial and temporal variability of the hydro-meteorological systems prevailing in the region, especially under changing climate and land use conditions.

Advance in Understanding of Hydrological Processes
The GSA confirms what has already been reported on and around the Bani catchment about the contribution of hydrological processes to streamflow generation. In order to better understand the origin of flows at Kolondieba (a tributary of the Bani River), Dao et al. [72] showed that the groundwater contribution to the hydrodynamic equilibrium at the outlet of the Kolondieba watershed is small and that direct flow from the soil surface governs the runoff process. This can be explained by the combined effect of a general depletion of shallow aquifers, due to the reduction in precipitation across West Africa since the great drought of the 1970s, and a concurrent increase in the recession coefficient of the Bani River, as demonstrated by Bamba et al. [32] and Mahé [73], with a resulting decrease in the baseflow contribution to total flow in both absolute and relative terms.

Spatial Performance
The results of the different calibration and validation techniques showed that the predictive ability of the SWAT model varies across scales. Firstly, model performance in terms of NSE and R² was higher at the watershed-wide level than at the sub-watershed level. However, this could be attributed to compensation between positive and negative errors of processes occurring at a larger scale [74,75]. This suggests that calibrating a model only at the basin outlet leads to overconfidence in its performance at the sub-basin scale. Secondly, individual calibration of subcatchment processes improved, as expected, model accuracy in predicting flows at the internal gauging stations, because heterogeneity decreases with decreasing area [76]; this is especially beneficial when the donor and receiver catchments are substantially different. Finally, predictive uncertainty appears to decrease with decreasing spatial scale, but to increase with humidity, as shown by the lower performance recorded at Bougouni. The inability of the model to perform during the validation period at Bougouni could be attributed to the structure of the validation period, which differs substantially from that of the calibration period: it is composed solely of average to wet years, whereas dry, average, and wet years all occur during the calibration period.
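The error compensation argument can be shown with a deliberately extreme toy case: two subbasins whose simulation errors are equal and opposite, so that they cancel when the flows are summed at the outlet. The discharge series and error values are invented for the illustration, and routing delays between subbasins and outlet are ignored.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


# Hypothetical observed discharge at two internal stations.
obs_sub1 = [2.0, 6.0, 12.0, 7.0, 3.0]
obs_sub2 = [1.0, 4.0, 9.0, 5.0, 2.0]

# Hypothetical errors: added in subbasin 1, subtracted in subbasin 2.
error = [1.0, -1.5, 2.0, -1.0, 0.5]
sim_sub1 = [o + e for o, e in zip(obs_sub1, error)]
sim_sub2 = [o - e for o, e in zip(obs_sub2, error)]

# Outlet flow taken as the simple sum of the two subbasins (no routing).
obs_outlet = [a + b for a, b in zip(obs_sub1, obs_sub2)]
sim_outlet = [a + b for a, b in zip(sim_sub1, sim_sub2)]

print("NSE subbasin 1:", nse(obs_sub1, sim_sub1))
print("NSE subbasin 2:", nse(obs_sub2, sim_sub2))
print("NSE outlet:    ", nse(obs_outlet, sim_outlet))
```

The outlet score is perfect here even though neither subbasin is simulated perfectly, which is the reason an outlet-only evaluation can overstate how well internal processes are represented.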
These results have an important role to play in the calibration and validation approaches of large-area watershed models and constitute a first step toward model parameter regionalization for prediction in ungauged basins.
Generally speaking, it is well known that in recent decades the Niger River basin has suffered from a serious degradation of its natural resources, which in turn has led to severe environmental issues. In response, different agreements and collaborations on water and climate data sharing have been established between the nine countries sharing the basin through national and international programs. Thus, the need to reinforce the existing framework of integrated, coordinated, and sustainable water management strategies in the Bani basin, and therefore in the Niger River Basin, is more urgent than ever.
This study is a step in that long-term direction: an integrated water management tool has been developed and spatially validated on the Bani catchment, which allows the future effects of land use and climate change scenarios on water resources to be investigated.

Conclusions
In this study, the performance of the widely used SWAT model was evaluated on the Bani catchment using both split-sample and split-location calibration and validation techniques on daily and monthly intervals. The model was calibrated at the Douna outlet and at two internal stations. Freely available global data and daily observed climate and discharge data were used as inputs for model simulation and calibration. Calibration, validation, uncertainty, and sensitivity analyses were performed with GLUE within SWAT-CUP. Both graphical and statistical techniques were used to evaluate the hydrologic calibration results. Evapotranspiration and biomass production outputs were verified and compared to regional values to make sure these components were reasonably predicted. Sensitivity analysis contributed to a better understanding of the hydrological processes occurring in the study area.
Final results showed good SWAT model performance in predicting daily as well as monthly discharge at Douna, with acceptable predictive uncertainty despite the poor data density and the strong gradients of climate and land use characterizing the study catchment. However, the daily calibration resulted in less predictive uncertainty than the monthly calibration. The performance of the model is somewhat lower at the internal subcatchment level when the global parameter sets are applied, especially at the subcatchment with higher humidity and dominated by forest. However, subcatchment calibration increased model performance at the intermediate gauging stations and decreased total uncertainty. With regard to predicted PET, this component is overestimated by the model when the Hargreaves method is applied in this region, while biomass production remained low in the savannah land use category. The GSA revealed the predominance of surface and subsurface processes in the streamflow generation of the Bani River.
Overall, this study has shown the validity of the SWAT model for representing, at the global level, the hydrological processes of a large-scale Soudano-Sahelian catchment in West Africa. Given the high spatial variability of climate, soil, and land use characterizing the catchment, additional calibration is nevertheless needed at the subcatchment level to ensure that the predominant processes are captured in each subcatchment. Accordingly, the importance of spatially distributed hydrological measurements is demonstrated; such measurements constitute the backbone of any progress in understanding and modeling hydrological processes. The calibrated SWAT model for the Bani can be used to assess the current and future impacts of climate and land use change on the water resources of the catchment, information increasingly needed by water resources managers. With this information, an adaptation strategy in response to current and future impacts can be proposed and the vulnerability of the population can therefore be reduced. More widely, this study can support the transfer of model parameters from the Bani subcatchments to ungauged basins with similar characteristics, and thus the prediction of discharge without the need for any measurement. These findings are very useful, especially in West Africa, where many river basins are ungauged or poorly gauged.

Figure 1. Location of the Bani catchment at the Douna outlet. The altitude and the monitoring network of the catchment are also shown.

Figure 2. (a) Soil attributes and (b) land use categories of the Bani catchment.

Figure 3. Simulated and observed hydrographs at Douna station at (a) daily and (b) monthly time steps along with calculated statistics on calibration and validation periods.

Figure 4. Spatial validation of the SWAT model on the Bani catchment. The model was run at Pankourou ((a) daily and (b) monthly time steps) and at Bougouni ((c) daily and (d) monthly time steps) using the same behavioral parameter sets determined at the Douna outlet over the period 1983-1992.

Figure 5. Simulated and observed hydrographs at Pankourou station at (a) daily and (b) monthly time steps along with calculated statistics on calibration and validation periods.

Figure 6. Predicted and measured discharges at Bougouni station at (a) daily and (b) monthly intervals during the calibration and validation periods with their corresponding statistics.

Table 1. Input data of the SWAT model for the Bani catchment.

Table 2. Input methods for SWAT model simulation on the Bani catchment.

Table 3. Model performance statistics for the Bani catchment at Douna, Pankourou, and Bougouni discharge gauging stations.

Table 4. Average annual basin values of precipitation (P), evapotranspiration (ET), potential evapotranspiration (PET), and biomass as SWAT outputs on the Bani catchment.
a Average annual PET estimated by the Hargreaves method (herein used by SWAT).

Table 5. Summary of the SWAT model parameters calibrated on the Bani catchment at Douna on a daily time interval.
* Determined from observed discharges by applying the baseflow filter program. ND: Not Determined.

Table 6. Predictive uncertainty indices of the SWAT model for the Bani catchment at Douna, Pankourou, and Bougouni discharge gauging stations.